A data science framework for transforming electronic health records into real-world evidence
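Basic information
- Award number: 10664706
- Principal investigator:
- Amount: $89,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-08-03 to 2025-07-31
- Project status: Ongoing
- Source: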
- Keywords: 3-Dimensional; Acceleration; Algorithms; Bayesian Network; Benchmarking; Biometry; California; Chronic; Classification; Clinic Visits; Clinical Data; Clinical Research; Clinical Trials; Complex; Data; Data Reporting; Data Science; Data Set; Dedications; Disease; Drug Approval; E-learning; Effectiveness; Elderly; Electronic Health Record; Eligibility Determination; Endoscopy; Equity; Exclusion; Food and Drug Administration Drug Approval; Future; Goals; Healthcare; Immune System Diseases; Joints; Learning; Machine Learning; Malignant Neoplasms; Masks; Measurement; Measures; Mentors; Methods; Modeling; Natural Language Processing; Nature; New Drug Approvals; Patient Representative; Patients; Pattern; Performance; Pharmaceutical Preparations; Population; Predisposition; Pregnancy; Publishing; Race; Randomized, Controlled Trials; Recording of previous events; Research Subjects; Sample Size; San Francisco; Selection for Treatments; Semantics; Severities; Source; Structure; Subgroup; Symptoms; Testing; Text; Time; Training; Treatment Effectiveness; Ulcerative Colitis; Uncertainty; Universities; Work; algorithm training; career; career development; cohort; computerized tools; cost; data harmonization; data integration; electronic structure; heterogenous data; improved; in silico; innovation; insight; learning strategy; meetings; outcome prediction; patient health information; patient subsets; randomized trial; reconstruction; support tools; tool; treatment effect
Project summary
Randomized controlled trials (RCTs) are the gold standard in clinical research but are subject to many
limitations, including high costs, limited generalizability, and small sample sizes in patient subgroups. By
contrast, electronic health records (EHRs) are widely available and contain information on large and
representative patient cohorts. However, because they capture the uncontrolled observations of many
clinicians, they are highly susceptible to bias. The recent availability of raw data from RCTs has created a
unique opportunity to integrate these data with EHR data and to develop methods that exploit the distinct
advantages of each data source.
We propose to identify the zone of overlap between these data sources and to build bridges between their
data representations. These bridges could enable us to better emulate randomized trials using EHR data and
to measure the same effects seen in the trials. In turn, this would allow us to study subgroups that were
excluded from the pivotal trials associated with new drug approvals by the FDA.
We will test these ideas in the context of ulcerative colitis (UC) and scale to other diseases in future work. We
have obtained access to the raw data from 12 RCTs in UC (N=6,226). These data contain timed and structured
measurements of disease activity, including the Mayo score, a composite score of patient symptoms and
endoscopic severity. We have also obtained access to the EHR data of 3,270 UC patients treated at the
University of California San Francisco. These records contain information similar to that in the RCTs, but
largely in unstructured form. In addition, EHR assessments tend to be incomplete relative to trials because of
the cost and invasiveness of some tests. We will address this problem of unharmonized and incomplete EHR
data in three aims.
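To make the incompleteness problem concrete, the following is a minimal sketch, not the project's actual data model. It assumes the conventional four-component form of the Mayo score (stool frequency, rectal bleeding, endoscopic findings, and physician's global assessment, each graded 0 to 3), which is general background and is not spelled out in this summary. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed component names; the summary itself only mentions "Mayo subscores".
SUBSCORES = ("stool_frequency", "rectal_bleeding", "endoscopy", "physician_global")


@dataclass
class MayoAssessment:
    """One disease-activity assessment; None marks a subscore that was not recorded."""
    stool_frequency: Optional[int] = None
    rectal_bleeding: Optional[int] = None
    endoscopy: Optional[int] = None
    physician_global: Optional[int] = None

    def __post_init__(self) -> None:
        # Each recorded subscore must be an integer grade from 0 to 3.
        for name in SUBSCORES:
            value = getattr(self, name)
            if value is not None and value not in (0, 1, 2, 3):
                raise ValueError(f"{name} must be 0-3 or None, got {value!r}")

    def missing(self) -> List[str]:
        """Names of the subscores that were not recorded."""
        return [name for name in SUBSCORES if getattr(self, name) is None]

    def total(self) -> Optional[int]:
        """Sum of all four subscores (0-12), or None if any subscore is missing."""
        if self.missing():
            return None
        return sum(getattr(self, name) for name in SUBSCORES)


# A trial visit typically records every component.
trial_visit = MayoAssessment(2, 1, 3, 2)
# A routine clinic visit may capture symptoms only (no endoscopy performed).
clinic_visit = MayoAssessment(stool_frequency=2, rectal_bleeding=1)
```

Under these assumptions, `trial_visit.total()` gives a complete score, while `clinic_visit.total()` returns `None` and `clinic_visit.missing()` lists the components that would have to be imputed before the two sources could be compared.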
In Aim 1, we will harmonize the RCT data into an analysis-ready format. We will also develop text
classification tools to transform free-text EHR data into Mayo subscores, and validate these tools against
data from a second center. In Aim 2, we will integrate the RCT and EHR data, train algorithms to impute
RCT-based representations of the patient state from partial measurements made in EHRs, and test them under
conditions typifying real-world data capture. In Aim 3, we will use these algorithms to harmonize EHR data,
validate them as a tool to recover the same effects as RCTs, and study new patient subgroups.
The applicant will carry out these aims and train in biostatistics, natural language processing, machine
learning, and overall career development. With the help of his mentors, he will launch a career dedicated to
developing and disseminating methods for learning from complex clinical data, and in so doing, promote a
future of better healthcare for all patients.
Project outputs
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Algorithmic Identification of Treatment-Emergent Adverse Events From Clinical Notes Using Large Language Models: A Pilot Study in Inflammatory Bowel Disease.
- DOI: 10.1002/cpt.3226
- Publication date: 2024
- Journal:
- Impact factor: 6.7
- Authors: Silverman, Anna L; Sushil, Madhumita; Bhasuran, Balu; Ludwig, Dana; Buchanan, James; Racz, Rebecca; Parakala, Mahalakshmi; El-Kamary, Samer; Ahima, Ohenewaa; Belov, Artur; Choi, Lauren; Billings, Monisha; Li, Yan; Habal, Nadia; Liu, Qi; Tiwari, Jawahar; B
- Corresponding author: B
Assessing the Impact of COVID-19 on IBD Outcomes Among Vulnerable Patient Populations in a Large Metropolitan Center.
- DOI: 10.1093/ibd/izad041
- Publication date: 2023-03
- Journal:
- Impact factor: 4.9
- Authors: F. Odufalu; Justin L Sewell; Vivek A. Rudrapatna; M. Somsouk; U. Mahadevan
- Corresponding authors: F. Odufalu; Justin L Sewell; Vivek A. Rudrapatna; M. Somsouk; U. Mahadevan
Other publications by Vivek A Rudrapatna
Robust measurement of the real world effectiveness of Tofacitinib for the treatment of Ulcerative Colitis using electronic health records: a protocol and statistical analysis plan v1
- DOI:
- Publication date: 2019
- Journal:
- Impact factor: 0
- Authors: Vivek A Rudrapatna; Atul J. Butte
- Corresponding author: Atul J. Butte