Advanced End-to-End Relation Extraction with Deep Neural Networks
Basic information
- Grant number: 10615695
- Principal investigator:
- Amount: $332,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-07-01 to 2025-03-31
- Project status: Ongoing
- Source:
- Keywords: Adverse event; Architecture; Area; Benchmarking; Bioinformatics; Biology; Biomedical Research; Black race; Classification; Clinical; Code; Collaborations; Combination Drug Therapy; Communication; Communities; Complex; Computer software; Data; Data Set; Dependence; Disease; Distant; Drug Interactions; Encapsulated; Etiology; Evaluation; Fostering; Funding; Future; Generations; Genes; Growth; Hand; Heart; Information Retrieval; Information Sciences; Intramural Research; Joints; Knowledge Discovery; Label; Language; Lead; Link; Literature; Manuals; Maps; Measures; Methodology; Methods; Modeling; Molecular; Names; Natural Language Processing; Patients; Peer Review; Performance; Periodicals; Pharmaceutical Preparations; Physicians; Positioning Attribute; Process; Reporting; Research; Research Personnel; Resources; Review Literature; Scientist; Semantics; Software Tools; Source; Standardization; Structure; Supervision; System; Terminology; Testing; Text; Training; Translational Research; Trees; United States National Library of Medicine; biomedical data science; clinical care; deep neural network; improved; insight; interest; knowledge base; knowledgebase; machine learning method; natural language; neural; neural network; neural network architecture; new therapeutic target; novel; off-label use; protein protein interaction; side effect; social media; supervised learning; syntax; transfer learning
Project abstract
Relations linking various biomedical entities constitute a crucial resource that enables biomedical data science
applications and knowledge discovery. Relational information spans the translational science spectrum going
from biology (e.g., protein–protein interactions) to translational bioinformatics (e.g., gene–disease associations),
and eventually to clinical care (e.g., drug–drug interactions). Scientists report newly discovered relations in natural
language through peer-reviewed literature, and physicians may communicate them in clinical notes. More
recently, patients are also reporting side effects and adverse events on social media. With exponential growth in
textual data, advances in biomedical natural language processing (BioNLP) methods are gaining prominence for
biomedical relation extraction (BRE) from text. Most current efforts in BRE follow a pipeline approach containing
named entity recognition (NER), entity normalization (EN), and relation classification (RC) as subtasks. They
typically suffer from error snowballing (errors in one component of the pipeline leading to more downstream errors),
resulting in lower performance of the overall BRE system. This situation has led to the different
BRE subtasks being evaluated in isolation. In this proposal we make a strong case for strictly end-to-end evaluations
where relations are to be produced from raw text. We propose novel deep neural network architectures that
model BRE in an end-to-end fashion and directly identify relations and corresponding entity spans in a single
pass. We also extend our architectures to n-ary and cross-sentence settings where more than two entities may
need to be linked even as the relation is expressed across multiple sentences. We also propose to create two
new gold standard BRE datasets, one for drug–disease treatment relations and another, a first-of-its-kind dataset,
for combination drug therapies. Our main hypothesis is that our end-to-end extraction models will yield superior
performance when compared with traditional pipelines. We test this through (1) intrinsic evaluations based
on standard performance measures with several gold standard datasets and (2) extrinsic application-oriented
assessments of relations extracted with use-cases in information retrieval, question answering, and knowledge
base completion. All software and data developed as part of this project will be made available for public use and
we hope this will foster rigorous end-to-end benchmarking of BRE systems.
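
As an illustration of what a strictly end-to-end evaluation could look like, the sketch below scores predicted relations against gold relations, counting a prediction as correct only when both entity spans and the relation label match exactly. This is a minimal sketch under stated assumptions, not the project's actual evaluation code: the `Relation` type, the `strict_prf` function, the character-offset span convention, and the example labels are all hypothetical.

```python
from typing import NamedTuple, Set, Tuple


class Relation(NamedTuple):
    """A hypothetical relation record: two entity spans plus a label."""
    head: Tuple[int, int]  # (start, end) character offsets of the first entity
    tail: Tuple[int, int]  # (start, end) character offsets of the second entity
    label: str             # relation type, e.g. "TREATS" (illustrative only)


def strict_prf(gold: Set[Relation], pred: Set[Relation]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 where a prediction counts only if both
    spans and the label match a gold relation exactly."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (made up): the second prediction has a span boundary error,
# so it earns no credit even though the relation label is right.
gold = {
    Relation((0, 7), (25, 33), "TREATS"),
    Relation((40, 48), (60, 69), "INTERACTS_WITH"),
}
pred = {
    Relation((0, 7), (25, 33), "TREATS"),
    Relation((40, 47), (60, 69), "INTERACTS_WITH"),
}

precision, recall, f1 = strict_prf(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy case a single upstream span error costs a whole relation, which is the kind of error propagation that evaluating subtasks in isolation would not expose.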
Project outcomes
Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Improved biomedical word embeddings in the transformer era.
- DOI:
- Published: 2021-08
- Journal:
- Impact factor: 4.5
- Authors: Noh, Jiho; Kavuluru, Ramakanth
- Corresponding author: Kavuluru, Ramakanth
Joint Learning for Biomedical NER and Entity Normalization: Encoding Schemes, Counterfactual Examples, and Zero-Shot Evaluation.
- DOI:
- Published: 2021-08
- Journal:
- Impact factor: 0
- Authors: Noh, Jiho; Kavuluru, Ramakanth
- Corresponding author: Kavuluru, Ramakanth
End-to-End Models for Chemical-Protein Interaction Extraction: Better Tokenization and Span-Based Pipeline Strategies.
- DOI:
- Published: 2023-06
- Journal:
- Impact factor: 0
- Authors: Ai, Xuguang; Kavuluru, Ramakanth
- Corresponding author: Kavuluru, Ramakanth
Acquisition of a Lexicon for Family History Information: Bidirectional Encoder Representations From Transformers-Assisted Sublanguage Analysis.
- DOI:
- Published: 2023-06-27
- Journal:
- Impact factor: 3.2
- Authors: Wang, Liwei; He, Huan; Wen, Andrew; Moon, Sungrim; Fu, Sunyang; Peterson, Kevin J; Ai, Xuguang; Liu, Sijia; Kavuluru, Ramakanth; Liu, Hongfang
- Corresponding author: Liu, Hongfang
COVID-19 Event Extraction from Twitter via Extractive Question Answering with Continuous Prompts.
- DOI:
- Published: 2024-01-25
- Journal:
- Impact factor: 0
- Authors: Jiang, Yuhang; Kavuluru, Ramakanth
- Corresponding author: Kavuluru, Ramakanth
Other grants by Venkata Naga Ramakanth Kavuluru
Fast and fine: NLP methods for near real-time and fine-grained overdose surveillance
- Grant number: 10590000
- Fiscal year: 2022
- Funding amount: $332,700
- Project category:
Advanced End-to-End Relation Extraction with Deep Neural Networks
- Grant number: 10386881
- Fiscal year: 2020
- Funding amount: $332,700
- Project category:
Advanced End-to-End Relation Extraction with Deep Neural Networks
- Grant number: 10200889
- Fiscal year: 2020
- Funding amount: $332,700
- Project category:
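Similar grants from the National Natural Science Foundation of China
Research on the spatiotemporal elements and expression system of "shared architecture"
- Grant number:
- Approval year: 2019
- Funding amount: CNY 630,000
- Project category: General Program
Research on renewal design strategies for ordinary buildings based on the everyday efficiency of urban space
- Grant number: 51778419
- Approval year: 2017
- Funding amount: CNY 610,000
- Project category: General Program
Research on holistic architecture for livable environments
- Grant number: 51278108
- Approval year: 2012
- Funding amount: CNY 680,000
- Project category: General Program
The formation and evolution of planetary systems in dense star clusters
- Grant number: 11043007
- Approval year: 2010
- Funding amount: CNY 100,000
- Project category: Special Fund Program
Application of novel vanadium oxide nano-assembled structures in smart energy saving
- Grant number: 20801051
- Approval year: 2008
- Funding amount: CNY 180,000
- Project category: Young Scientists Fund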
Similar overseas grants
TeleLine: Plug-n-Play Inline Respiratory Remote Data Acquisition System
- Grant number: 10603124
- Fiscal year: 2023
- Funding amount: $332,700
- Project category:
Adolescent trauma produces enduring disruptions in sleep architecture that lead to increased risk for adult mental illness
- Grant number: 10730872
- Fiscal year: 2023
- Funding amount: $332,700
- Project category:
Value of Sleep Metrics in Predicting Opioid-Use Disorder Treatment Outcomes: Leadership and Data Coordinating Center
- Grant number: 10783610
- Fiscal year: 2023
- Funding amount: $332,700
- Project category:
A translational bioinformatics approach to elucidate and mitigate polypharmacy induced adverse drug reactions
- Grant number: 10507532
- Fiscal year: 2022
- Funding amount: $332,700
- Project category:
Longitudinal neuroimaging and statistical genetics modeling of substance use and trauma-related phenotypes
- Grant number: 10592238
- Fiscal year: 2022
- Funding amount: $332,700
- Project category: