Crowd-Assisted Deep Learning (CrADLe) Digital Curation to Translate Big Data into Precision Medicine
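Basic Information
- Grant number: 9979659
- Principal investigator:
- Amount: $467,200
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-08-01 to 2022-07-31
- Project status: Completed
- Source: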
- Keywords: Algorithms; Alzheimer's Disease; Animal Model; Artificial Intelligence; Big Data; Big Data to Knowledge; Biological; Biological Assay; Categories; Cell Line; Cell model; Classification; Clinical; Collaborations; Communities; Controlled Vocabulary; Crowding; Data; Data Set; Defect; Deposition; Diagnosis; Disease; Disease model; Drug Modelings; E-learning; Effectiveness; Engineering; Funding; Funding Agency; Future; Gene Expression; Gene Targeting; Genomics; Human; Image; Intelligence; Label; Link; Logic; Machine Learning; Malignant Neoplasms; Maps; MeSH Thesaurus; Measures; Medicine; Meta-Analysis; Metadata; Methods; Modeling; Molecular; Molecular Profiling; National Research Council; Natural Language Processing; Ontology; Pathway interactions; Patients; Pattern; Peer Review; Performance; Pharmaceutical Preparations; Physicians; Problem Solving; PubMed; Public Domains; Publications; Resources; Sampling; Scientific Inquiry; Scientist; Source; Specific qualifier value; Speed; Text; The Cancer Genome Atlas; Training; Translating; United States National Institutes of Health; Validation; Work; base; big biomedical data; biomarker discovery; burden of illness; cell type; classical conditioning; computer program; crowdsourcing; deep learning; deep learning algorithm; digital; disease phenotype; experimental study; genomic data; human disease; improved; knockout gene; large scale data; novel therapeutics; open data; potential biomarker; precision medicine; programs; public repository; specific biomarkers
Project Abstract
PROJECT SUMMARY/ABSTRACT
The NIH and other agencies are funding high-throughput genomics (‘omics) experiments that deposit
digital samples of data into the public domain at breakneck speed. These high-quality data measure the
‘omics of diseases, drugs, cell lines, model organisms, etc. across the complete gamut of experimental factors
and conditions. The importance of these digital samples is further illustrated by linked peer-reviewed
publications that demonstrate their scientific value. However, meta-data for digital samples is recorded as free
text without the biocuration necessary for in-depth downstream scientific inquiry.
Deep learning is a revolutionary machine intelligence paradigm that allows an algorithm to program
itself, thereby removing the need to explicitly specify rules or logic. Whereas physicians and scientists once
needed to first understand a problem in order to program computers to solve it, deep learning algorithms optimally tune
themselves to solve problems. Given enough example data to train on, deep learning machine intelligence
outperforms humans on a variety of tasks. Today, deep learning achieves state-of-the-art performance for image
classification and, most importantly for this proposal, for natural language processing.
This proposal is about engineering Crowd-Assisted Deep Learning (CrADLe) machine intelligence to
rapidly scale the digital curation of public digital samples. We will first use our NIH BD2K-funded Search Tag
Analyze Resource for Gene Expression Omnibus (STARGEO.org) to crowd-source human annotation of open
digital samples. We will then develop and train deep learning algorithms for STARGEO digital curation based
on learning the free-text meta-data associated with each digital sample. Given the ongoing deluge of biomedical
data in the public domain, CrADLe may be the only way to scale digital curation towards a
precision medicine ideal.
Finally, we will demonstrate the biological utility of leveraging CrADLe for digital curation with two large-scale
and independent molecular datasets: 1) The Cancer Genome Atlas (TCGA), and 2) The Accelerating
Medicines Partnership-Alzheimer’s Disease (AMP-AD). We posit that CrADLe digital curation of open samples
will augment these two distinct disease projects with a host of big data to fuel the discovery of potential biomarkers
and gene targets. Therefore, successful funding and completion of this work may greatly reduce the burden of
disease on patients by enhancing the efficiency and effectiveness of digital curation for biomedical big data.
Project Outcomes
Journal articles (14)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Translational Radiomics: Defining the Strategy Pipeline and Considerations for Application-Part 1: From Methodology to Clinical Implementation.
- DOI: 10.1016/j.jacr.2017.12.008
- Published: 2018-03
- Journal:
- Impact factor: 0
- Authors: Shaikh F; Franc B; Allen E; Sala E; Awan O; Hendrata K; Halabi S; Mohiuddin S; Malik S; Hadley D; Shrestha R
- Corresponding author: Shrestha R
Dissecting novel mechanisms of hepatitis B virus related hepatocellular carcinoma using meta-analysis of public data.
- DOI: 10.4251/wjgo.v14.i9.1856
- Published: 2022-09-15
- Journal:
- Impact factor: 3
- Authors: Aljabban J; Rohr M; Syed S; Cohen E; Hashi N; Syed S; Khorfan K; Aljabban H; Borkowski V; Segal M; Mukhtar M; Mohammed M; Boateng E; Nemer M; Panahiazar M; Hadley D; Jalil S; Mumtaz K
- Corresponding author: Mumtaz K
Translational Radiomics: Defining the Strategy Pipeline and Considerations for Application-Part 2: From Clinical Implementation to Enterprise.
- DOI: 10.1016/j.jacr.2017.12.006
- Published: 2018-03
- Journal:
- Impact factor: 0
- Authors: Shaikh F; Franc B; Allen E; Sala E; Awan O; Hendrata K; Halabi S; Mohiuddin S; Malik S; Hadley D; Shrestha R
- Corresponding author: Shrestha R
Data Analytics of Electronic Health Records to Enhance Care of Coronary Artery Disease in Younger Women with Avoiding Possible Delay in Treatment.
- DOI: 10.3233/shti220277
- Published: 2022-06-06
- Journal:
- Impact factor: 0
- Authors: Panahiazar, Maryam; Bishara, Andrew M; Chern, Yorick; Alizadehsani, Roohallah; Latif, Omar S; Hadley, Dexter; Beygui, Ramin E
- Corresponding author: Beygui, Ramin E
Precision Diagnosis Of Melanoma And Other Skin Lesions From Digital Images.
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Bhattacharya, Abhishek; Young, Albert; Wong, Andrew; Stalling, Simone; Wei, Maria; Hadley, Dexter
- Corresponding author: Hadley, Dexter
Other grants by Dexter D Hadley
Crowd-Assisted Deep Learning (CrADLe) Digital Curation to Translate Big Data into Precision Medicine
- Grant number: 10063300
- Fiscal year: 2017
- Funding amount: $467,200
- Project category:
Crowd-Assisted Deep Learning (CrADLe) Digital Curation to Translate Big Data into Precision Medicine
- Grant number: 9403171
- Fiscal year: 2017
- Funding amount: $467,200
- Project category:
Similar overseas grants
Fluency from Flesh to Filament: Collation, Representation, and Analysis of Multi-Scale Neuroimaging data to Characterize and Diagnose Alzheimer's Disease
- Grant number: 10462257
- Fiscal year: 2023
- Funding amount: $467,200
- Project category:
Traumatic Brain Injury Anti-Seizure Prophylaxis in the Medicare Program
- Grant number: 10715238
- Fiscal year: 2023
- Funding amount: $467,200
- Project category:
Brain Digital Slide Archive: An Open Source Platform for data sharing and analysis of digital neuropathology
- Grant number: 10735564
- Fiscal year: 2023
- Funding amount: $467,200
- Project category:
Deciphering the Glycan Code in Human Alzheimer's Disease Brain
- Grant number: 10704673
- Fiscal year: 2023
- Funding amount: $467,200
- Project category:
Enhanced Medication Management to Control ADRD Risk Factors Among African Americans and Latinos
- Grant number: 10610975
- Fiscal year: 2023
- Funding amount: $467,200
- Project category: