National NLP Clinical Challenges (n2c2): Challenges in Natural Language Processing for Clinical Narratives
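Basic information
- Grant number: 10393499
- Principal investigator:
- Amount: $20,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2019
- Funding country: United States
- Project period: 2019-05-15 to 2024-04-30
- Project status: Completed
- Source: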
- Keywords: Access to Information; Address; American; Clinic; Clinical; Collaborations; Communities; Community Developments; Complement; Data; Data Science; Data Set; Development; Educational workshop; Electronic Health Record; Evaluation; Fostering; Future; Future Generations; Goals; Gold; Grant; Growth; Hand; Head; Healthcare; Improve Access; Individual; Informatics; Institution; Israel; Journals; Knowledge; Measures; Medical Informatics; Medical center; Methodology; Natural Language Processing; Outcome; Paper; Peer Review; Performance; Privacy; Publications; Publishing; Records; Research; Research Personnel; Rest; Running; Series; Source; Structure; System; Systems Development; Targeted Research; Technology; Time; United States National Institutes of Health; United States National Library of Medicine; Universities; base; biomedical informatics; clinical development; computerized; falls; head-to-head comparison; health data; indexing; medical schools; meetings; practical application; programs; symposium; usability; working group
Project abstract
Project Summary and Abstract
Narratives of electronic health records (EHRs) contain useful information that is difficult to automatically extract,
index, search, or interpret. Natural language processing (NLP) technologies can extract this information and
convert it into a structured format that is more readily accessible by computerized systems. However, the
development of NLP systems is contingent on access to relevant data, and EHRs are notoriously difficult to obtain
for privacy reasons. Despite recent efforts to de-identify and release narrative EHRs for research,
these data are still very rare. As a result, clinical NLP as a field has lagged behind. To address this problem,
we have organized thirteen shared tasks since 2006, accompanied by workshops and journal publications. Twelve
of these shared tasks have focused on the development of clinical NLP systems and the remaining one on the
usability of these systems. We have covered both depth and breadth in terms of shared tasks, preparing tasks
that study cutting-edge NLP problems on a variety of EHR data from multiple institutions. Our shared tasks are
the longest-running series of clinical NLP shared tasks, with ever-growing EHR data sets, tasks, and participation.
Our three most popular data sets have been cited 495 (2010 data), 284 (2006 de-id data), and 274 (2009 data)
times, respectively, representing hundreds of articles that have come out of these three data sets alone. Our
goal in this proposal is to continue the efforts we started in 2006 under i2b2 shared task challenges (i2b2, NIH
NLM U54LM008748, PI: Kohane and R13 LM011411, PI: Uzuner) to de-identify EHRs, annotate them with gold-
standard annotations for clinical NLP tasks, and release them to the research community for the development
and head-to-head comparison of clinical NLP systems, for the advancement of the state of the art. Continuing
our efforts under National NLP Clinical Challenges (n2c2) based at the Health Data Science program of the
newly established Department of Biomedical Informatics at Harvard Medical School, we aim to form partnerships
with the community to grow the shared task efforts in several ways: (1) grow the available de-identified EHR data
sets through partnerships that can contribute to the volume and variety of the data, and (2) grow the available
gold-standard annotations in terms of depth and breadth of NLP tasks. Given these aims and partnerships, we
plan to hold a series of shared tasks. We will complement these shared tasks with workshops that meet in
conjunction with the Fall Symposium of the American Medical Informatics Association and with journal special
issues so that advancement of the state of the art can be sped up and future generations can build on the past.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Ozlem Uzuner
Joint learning methods for event and relation extraction from clinical narratives
- Grant number: 10507223
- Fiscal year: 2022
- Funding amount: $20,000
- Project category:
National NLP Clinical Challenges (n2c2): Challenges in Natural Language Processing for Clinical Narratives
- Grant number: 10670801
- Fiscal year: 2019
- Funding amount: $20,000
- Project category:
Leveraging Unlabeled and Pseudo Data for Clinical Information Extraction
- Grant number: 9813134
- Fiscal year: 2019
- Funding amount: $20,000
- Project category:
National NLP Clinical Challenges (n2c2): Challenges in Natural Language Processing for Clinical Narratives
- Grant number: 9759499
- Fiscal year: 2019
- Funding amount: $20,000
- Project category:
Challenges in Natural Language Processing in Clinical Text
- Grant number: 9597333
- Fiscal year: 2017
- Funding amount: $20,000
- Project category:
Challenges in Natural Language Processing for Clinical Narratives
- Grant number: 8400218
- Fiscal year: 2012
- Funding amount: $20,000
- Project category:
Challenges in Natural Language Processing for Clinical Narratives
- Grant number: 8913773
- Fiscal year: 2012
- Funding amount: $20,000
- Project category:
Challenges in Natural Language Processing for Clinical Narratives
- Grant number: 8722031
- Fiscal year: 2012
- Funding amount: $20,000
- Project category:
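Similar National Natural Science Foundation of China (NSFC) grants
Ontology-driven spatial semantic modeling of address data and address matching methods
- Grant number: 41901325
- Approval year: 2019
- Funding amount: 220,000 yuan
- Project category: Young Scientists Fund
Research on spatiotemporal-sequence-driven neuromorphic vision object recognition algorithms
- Grant number: 61906126
- Approval year: 2019
- Funding amount: 240,000 yuan
- Project category: Young Scientists Fund
Research on memory safety defense techniques for memory attack targets
- Grant number: 61802432
- Approval year: 2018
- Funding amount: 250,000 yuan
- Project category: Young Scientists Fund
Research on optimized address mapping table design and memory access optimization for large-capacity solid-state drives
- Grant number: 61802133
- Approval year: 2018
- Funding amount: 230,000 yuan
- Project category: Young Scientists Fund
Research on IP-address-driven multipath routing and traffic transmission control
- Grant number: 61872252
- Approval year: 2018
- Funding amount: 640,000 yuan
- Project category: General Program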
Similar overseas grants
PREP-MT: Providing Research Education for Postbaccalaureate Trainees in Montana
- Grant number: 10772282
- Fiscal year: 2023
- Funding amount: $20,000
- Project category:
COVID-19 Telehealth Policies' Impact on Provision of Alcohol and Substance Use Disorder Services at Federally Qualified Health Centers
- Grant number: 10662155
- Fiscal year: 2023
- Funding amount: $20,000
- Project category:
Reading Bees: Adapting and Testing a Mobile App Designed to Empower Families to Read more Interactively with Children in Distinct Geographical and Cultural Contexts
- Grant number: 10729773
- Fiscal year: 2023
- Funding amount: $20,000
- Project category:
Improving Access to Diabetes Information for Deaf and Hard of Hearing Populations
- Grant number: 10729891
- Fiscal year: 2023
- Funding amount: $20,000
- Project category:
Comprehensive Pediatric Phenotyping for Evidence-Based Diagnosis in Genetic Disease
- Grant number: 10644205
- Fiscal year: 2023
- Funding amount: $20,000
- Project category: