Open Health Natural Language Processing Collaboratory
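Basic Information
- Grant number: 9385056
- Principal investigator:
- Amount: $1.5896 million
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-09-01 to 2022-08-31
- Project status: Completed
- Source:
- Keywords: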
Project Summary
One of the major barriers in leveraging Electronic Health Record (EHR) data for clinical and translational
science is the prevalent use of unstructured or semi-structured clinical narratives for documenting clinical
information. Natural Language Processing (NLP), which extracts structured information from narratives, has
received great attention and has played a critical role in enabling secondary use of EHRs for clinical and
translational research. As demonstrated by large-scale efforts such as ACT (Accrual of patients for Clinical
Trials), eMERGE, and PCORnet, using EHR data for research rests on the capabilities of a robust data and
informatics infrastructure that allows the structuring of clinical narratives and supports the extraction of clinical
information for downstream applications. Current successful NLP use cases often require a strong informatics
team (with NLP experts) to work with clinicians to supply their domain knowledge and build customized NLP
engines iteratively. This requires close collaboration between NLP experts and clinicians, which is not feasible at
institutions with limited informatics support. Additionally, the usability, portability, and generalizability of the
NLP systems are still limited, partially due to the lack of access to EHRs across institutions to train the
systems. Limited EHR data availability also restricts the training available to improve workforce competence
in clinical NLP. We aim to address the above challenges by extending our existing collaboration among
multiple CTSA hubs on open health natural language processing (OHNLP) to share distributional information of
NLP artifacts (i.e., words, n-grams, phrases, sentences, concept mentions, concepts, and text segments)
acquired from real EHRs across multiple institutions. We will leverage the advanced privacy-preserving
computing infrastructure of iDASH (integrating Data for Analysis, Anonymization, and SHaring) for privacy-
preserving data analysis models and will partner with diverse communities including Observational Health Data
Sciences and Informatics (OHDSI), Precision Medicine Initiative (PMI), PCORnet, and Rare Diseases Clinical
Research Network (RDCRN) to demonstrate the utility of NLP for translational research. This CTSA innovation
award RFA provides us with a unique opportunity to address the challenges facing clinical NLP. Through
strong partnerships with multiple research communities and the research team's leadership roles in
clinical NLP, we envision that the successful delivery of this project will broaden the utilization of clinical NLP
across the research community. There are four aims planned: i) obtain PHI-suppressed NLP artifacts with
retained distribution information across multiple institutions and assess the privacy risk of accessing PHI-
suppressed artifacts, ii) generate a synthetic text corpus for exploratory analysis of clinical narratives and
assess its utility in NLP tasks leveraging various NLP challenges, iii) develop privacy-preserving computational
phenotyping models empowered with NLP, and iv) partner with diverse communities to demonstrate the utility
of our project for translational research.
Project Outcomes
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

No data available
Data last updated: 2024-06-01
Other grants by Xiaoqian Jiang
Robust privacy preserving distributed analysis platform for cancer research: addressing data bias and disparities
- Grant number: 10642562
- Fiscal year: 2023
- Funding amount: $1.5896 million
- Project category:
Harmonizing multiple clinical trials for Alzheimer's disease to investigate differential responses to treatment via federated counterfactual learning
- Grant number: 10714797
- Fiscal year: 2023
- Funding amount: $1.5896 million
- Project category:
iDASH Genome Privacy and Security Competition Workshop
- Grant number: 10614292
- Fiscal year: 2023
- Funding amount: $1.5896 million
- Project category:
Decentralized differentially-private methods for dynamic data release and analysis
- Grant number: 10740597
- Fiscal year: 2023
- Funding amount: $1.5896 million
- Project category:
Decentralized differentially-private methods for dynamic data release and analysis
- Grant number: 10367349
- Fiscal year: 2022
- Funding amount: $1.5896 million
- Project category:
Finding combinatorial drug repositioning therapy for Alzheimer's disease and related dementias
- Grant number: 10615684
- Fiscal year: 2020
- Funding amount: $1.5896 million
- Project category:
Finding combinatorial drug repositioning therapy for Alzheimer's disease and related dementias
- Grant number: 10598207
- Fiscal year: 2020
- Funding amount: $1.5896 million
- Project category:
Finding combinatorial drug repositioning therapy for Alzheimer's disease and related dementias
- Grant number: 10133501
- Fiscal year: 2020
- Funding amount: $1.5896 million
- Project category:
Finding combinatorial drug repositioning therapy for Alzheimer's disease and related dementias
- Grant number: 10377455
- Fiscal year: 2020
- Funding amount: $1.5896 million
- Project category:
Decentralized differentially-private methods for dynamic data release and analysis
- Grant number: 9239100
- Fiscal year: 2017
- Funding amount: $1.5896 million
- Project category:
Similar NSFC (National Natural Science Foundation of China) grants
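Research on prediction models and algorithms for disease-associated microbes based on an adaptive hierarchical multi-layer graph attention mechanism
- Grant number:
- Approval year: 2022
- Funding amount: CNY 540,000
- Project category: General Program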
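Research on prediction models and algorithms for disease-associated microbes based on an adaptive hierarchical multi-layer graph attention mechanism
- Grant number: 62272064
- Approval year: 2022
- Funding amount: CNY 540,000
- Project category: General Program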
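Research on key algorithms for attention-guided precise indoor positioning in complex scenes
- Grant number: 62102459
- Approval year: 2021
- Funding amount: CNY 240,000
- Project category: Young Scientists Fund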
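Research on key algorithms for attention-guided precise indoor positioning in complex scenes
- Grant number:
- Approval year: 2021
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund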
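Research and application of a deep sparse graph attention convolutional ensemble model and incremental learning algorithms for power system security assessment
- Grant number:
- Approval year: 2020
- Funding amount: CNY 550,000
- Project category: General Program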
Similar overseas grants
Dynamic neural coding of spectro-temporal sound features during free movement
- Grant number: 10656110
- Fiscal year: 2023
- Funding amount: $1.5896 million
- Project category:
Computer-Aided Triage of Body CT Scans with Deep Learning
- Grant number: 10585553
- Fiscal year: 2023
- Funding amount: $1.5896 million
- Project category:
Development of a Novel Virtual Reality Treatment for Emerging Adults with ADHD
- Grant number: 10721084
- Fiscal year: 2023
- Funding amount: $1.5896 million
- Project category:
Quantitative imaging of choroid plexus function and neurofluid circulation in Alzheimer's Disease Related Dementia
- Grant number: 10718346
- Fiscal year: 2023
- Funding amount: $1.5896 million
- Project category:
Identifying and addressing missingness and bias to enhance discovery from multimodal health data
- Grant number: 10637391
- Fiscal year: 2023
- Funding amount: $1.5896 million
- Project category: