Robust methods for missing data in electronic health records-based studies
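Basic information
- Grant number: 10181873
- Principal investigator:
- Amount: $566,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-04-12 to 2025-03-31
- Project status: Not yet concluded
- Source: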
- Keywords: Address; Attention; Caring; Clinical; Cohort Studies; Complex; Data; Data Provenance; Electronic Health Record; Eligibility Determination; Ethics; Face; Health Personnel; Health system; Literature; Longitudinal Studies; Measurement; Methodology; Methods; Modeling; Observational Study; Outcome; Patient Care; Patients; Probability; Research; Research Design; Research Personnel; Sampling; Selection Bias; Series; Statistical Methods; System; Techniques; Time; Weight; bariatric surgery; base; cohort; cost effective; design; epidemiology study; experience; flexibility; innovation; novel; opportunity cost; prospective; public health research; randomized trial; semiparametric; tool
Project summary
Electronic health record (EHR) data represent a huge opportunity for cost-efficient clinical and public health
research, especially when a randomized trial or a prospective observational study is not feasible or ethical. EHR
systems, however, are typically developed to support clinical and/or billing activities. As such, substantial care
is needed when using EHR data to address a particular scientific question. Here, an important potential threat
to validity is missing data. Moreover, since EHR data are not collected for any particular research question, it
will often be the case that measurements that are critical to answering the question will be unavailable in the
record of some patients. This, in turn, requires researchers to contend with the potential for selection bias and
compromised generalizability.
Towards addressing issues of missing data in an EHR, researchers could, in principle, appeal to a vast
statistical literature and use standard methods such as multiple imputation (MI), inverse-probability weighting
(IPW) or doubly robust (DR) estimation. These methods, however, have generally been developed outside of the
EHR context. As such, they typically fail to acknowledge the complexity of the EHR data, in particular the many
decisions made by patients and health care providers that give rise to 'complete data' in the EHR, known as
the data provenance. Because of the disconnect between this complexity and the settings for which most missing
data methods are developed, the application of standard missing data methods to EHR-based studies will often
fail to resolve selection bias and generalizability will remain compromised.
Unfortunately, in contrast to confounding bias, very little attention has been paid to developing methods for
missing data that are specifically tailored to the complexity of EHR-based studies. We will begin to address this
gap by developing, implementing and evaluating a suite of novel, innovative statistical tools including: Aim 1: A
unified framework for robust causal inference in unmatched and matched EHR-based cohort studies with missing
confounder data; Aim 2: A formal, robust framework for causal inference in emulated target trials based on EHR
data; Aim 3: A novel blended analysis framework for missing data in EHR-based studies that combines MI and
IPW in an innovative and unique way; Aim 4: A novel double-sampling strategy for when the EHR data are
suspected to be missing-not-at-random.
The proposed aims are motivated by challenges the investigative team has faced in a series of EHR-based
studies of long-term outcomes among patients who have undergone bariatric surgery. Throughout this research,
we will use data from one of these studies, the DURABLE study, which has rich demographic and longitudinal
clinical information from three Kaiser Permanente health systems on ≈45,000 patients who underwent bariatric
surgery between 1997 and 2015, as well as on ≈1,636,000 non-surgical enrollees during that time period.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by SEBASTIEN HANEUSE
Other grants by SEBASTIEN HANEUSE
Robust methods for missing data in electronic health records-based studies
- Grant number: 10390382
- Fiscal year: 2021
- Funding amount: $566,800
- Project category:
Robust methods for missing data in electronic health records-based studies
- Grant number: 10589133
- Fiscal year: 2021
- Funding amount: $566,800
- Project category:
Clustered semi-competing risks analysis in quality of end-of-life care studies
- Grant number: 8612275
- Fiscal year: 2014
- Funding amount: $566,800
- Project category:
Clustered semi-competing risks analysis in quality of end-of-life care studies
- Grant number: 8805834
- Fiscal year: 2014
- Funding amount: $566,800
- Project category:
Design and Inference for Hybrid Ecological Studies
- Grant number: 7434489
- Fiscal year: 2007
- Funding amount: $566,800
- Project category:
Design and Inference for Hybrid Ecological Studies
- Grant number: 7626310
- Fiscal year: 2007
- Funding amount: $566,800
- Project category:
Design and Inference for Hybrid Ecological Studies
- Grant number: 7185366
- Fiscal year: 2007
- Funding amount: $566,800
- Project category:
Similar grants from the National Natural Science Foundation of China (NSFC)
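Attenuation-recovery patterns and regulation mechanisms of driver supervisory attention under human-machine co-driving
- Grant number: 52302425
- Year approved: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund project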
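Micro-level mechanisms of dynamic capability formation in multinational enterprises under deglobalization: an executive attention allocation perspective
- Grant number: 72302220
- Year approved: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund project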
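Attention-aware online collaborative calibration of vehicle-mounted multimodal sensors
- Grant number: 42301468
- Year approved: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund project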
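Measurement and early warning of systemic financial risk based on a two-stage attention deep learning method
- Grant number: 72301101
- Year approved: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund project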
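Transformer-based tunnel lining crack detection using multiple sparse self-attention mechanisms
- Grant number: 62301339
- Year approved: 2023
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund project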
Similar overseas grants
Executive functions in urban Hispanic/Latino youth: exposure to mixture of arsenic and pesticides during childhood
- Grant number: 10751106
- Fiscal year: 2024
- Funding amount: $566,800
- Project category:
Paid Sick Leave Mandates and Mental Healthcare Service Use
- Grant number: 10635492
- Fiscal year: 2023
- Funding amount: $566,800
- Project category:
Addressing Gaps in Language Access Services through a Patient-Centered Decision-Support Tool
- Grant number: 10699030
- Fiscal year: 2023
- Funding amount: $566,800
- Project category:
Promesa: Urban gardening and peer nutritional counseling to improve HIV care outcomes among people with food insecurity in the Dominican Republic
- Grant number: 10698434
- Fiscal year: 2023
- Funding amount: $566,800
- Project category:
A Dry Electrode for Universal Accessibility to EEG
- Grant number: 10761609
- Fiscal year: 2023
- Funding amount: $566,800
- Project category: