Biology-aware machine learning methods for characterizing microbiome genotype and phenotype
Basic information
- Award number: 10275055
- Principal investigator:
- Amount: $344,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-09-15 to 2026-08-31
- Project status: Ongoing
- Source:
- Keywords: Adopted, Algorithms, Area, Awareness, Biological, Biology, Biomedical Research, Characteristics, Computing Methodologies, Data, Data Set, Disease, Environment, Epidemiology, Genome, Genomics, Genotype, Goals, High Performance Computing, Immunology, Knowledge, Laboratories, Machine Learning, Measurable, Measures, Metagenomics, Methods, Modernization, Organism, Phenotype, Phylogenetic Analysis, Phylogeny, Process, Recording of previous events, Research, Sampling, Sequence Alignment, Shapes, Statistical Data Interpretation, Techniques, Testing, Trees, Update, Work, comparative, deep learning, design, genome-wide, improved, interest, machine learning method, microbiome, microbiome analysis, multiple data sources, statistics
Project abstract
PROJECT SUMMARY
The Mirarab laboratory designs leading computational methods for answering biological and biomedical questions, focusing on scalability and accuracy. These methods span several areas (e.g., microbiome profiling, multiple sequence alignment, and phylogenomics), and a common thread among them is evolutionary modeling. The lab has developed scalable and accurate methods for reconstructing evolutionary histories (i.e., phylogenies) and using these histories in downstream biomedical applications. Reconstructing phylogenies is a fundamental goal and a precursor to many biological analyses. Methods developed by this lab (e.g., ASTRAL) are at the forefront of modern genome-wide phylogenetics. Moreover, biomedical research increasingly uses evolutionary histories in diverse areas such as microbiome analyses, immunology, epidemiology, and comparative genomics. While the lab has previously focused more on inferring species histories, it has recently started to shift its focus to developing methods for microbiome analyses. The inference and use of evolutionary histories in analyzing environmental microbiome samples present a unique set of challenges.
In the next five years, the Mirarab lab will focus on designing, testing, and applying improved methods for statistical analyses of microbiome data. These methods will target two questions. (i) Profiling: What organisms constitute a given sample? (ii) Association: How are samples different in their organismal composition, and how do these differences connect to measurable characteristics of their environment? While both questions have been subject to considerable research, many computational challenges remain, providing an opportunity for better methods to make a significant impact. Instead of focusing solely on new algorithms, the lab will also work on building better reference datasets and combining data from multiple sources. Thus, the project aims to harness unprecedented computational power, large available datasets, and recent advances in machine learning to improve the state of the art dramatically. The project will not use off-the-shelf machine learning methods in a black-box fashion. Instead, it will develop methods that incorporate biological knowledge (e.g., of evolutionary relationships) into machine learning methods in a principled, biologically motivated fashion.
The lab will pursue several ambitious goals for both profiling and association questions. The project will (i) create methods to infer a continuously updated reference alignment and tree encompassing all sequenced prokaryotic genomes (half a million currently) to be used for profiling, (ii) build methods for ultra-sensitive sample profiling, (iii) use deep learning to connect data obtained using amplicon sequencing and metagenomics, (iv) build discordance-aware phylogenetic measures of sample differentiation, and (v) develop machine learning methods for associating a profiled microbiome with phenotypes of interest such as disease. These new methods will draw on statistics, machine learning, discrete optimization, and high-performance computing. Consistent with the goals of MIRA, the project may explore new, unforeseen opportunities if they fit its general goals.
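To make the idea of a phylogenetic measure of sample differentiation in goal (iv) concrete, the following is a minimal sketch of a well-known existing measure in the style of unweighted UniFrac: the fraction of branch length in a reference tree that leads to taxa found in only one of two samples. It is not the discordance-aware measure the project proposes, and the toy tree, taxon names, and function names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical toy reference tree: node -> (parent, branch length to parent).
# The root has no parent. Real reference trees would be far larger.
Tree = Dict[str, Tuple[Optional[str], float]]

TOY_TREE: Tree = {
    "root": (None, 0.0),
    "X": ("root", 1.0),
    "Y": ("root", 2.0),
    "t1": ("X", 1.0),
    "t2": ("X", 1.0),
    "t3": ("Y", 1.0),
    "t4": ("Y", 1.0),
}


def branches_covered(tree: Tree, taxa: Set[str]) -> Set[str]:
    """Return the nodes whose parent branch lies on a path from an observed taxon to the root."""
    covered: Set[str] = set()
    for taxon in taxa:
        node: Optional[str] = taxon
        # Walk toward the root; stop early once we reach an already-covered branch.
        while node is not None and node not in covered:
            parent, _ = tree[node]
            if parent is not None:
                covered.add(node)
            node = parent
    return covered


def unweighted_unifrac(tree: Tree, sample_a: Set[str], sample_b: Set[str]) -> float:
    """Branch length unique to one sample divided by branch length covered by either sample."""
    a = branches_covered(tree, sample_a)
    b = branches_covered(tree, sample_b)
    total = sum(tree[n][1] for n in a | b)
    if total == 0:
        return 0.0  # Neither sample covers any branch; treat them as identical.
    unique = sum(tree[n][1] for n in a ^ b)
    return unique / total


if __name__ == "__main__":
    # Two samples sharing taxon t2 but differing in t1 and t3.
    print(unweighted_unifrac(TOY_TREE, {"t1", "t2"}, {"t2", "t3"}))
```

The measure is 0 for samples with identical taxon sets and 1 for samples that share no branch of the tree. Because it depends on a single reference tree, it ignores discordance among gene trees, which is the gap goal (iv) targets.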
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Siavash Mir arabbaygi
Biology-aware machine learning methods for characterizing microbiome genotype and phenotype
- Award number: 10696960
- Fiscal year: 2021
- Amount: $344,700
- Project category:
Biology-aware machine learning methods for characterizing microbiome genotype and phenotype
- Award number: 10810437
- Fiscal year: 2021
- Amount: $344,700
- Project category:
Biology-aware machine learning methods for characterizing microbiome genotype and phenotype
- Award number: 10798957
- Fiscal year: 2021
- Amount: $344,700
- Project category:
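Similar National Natural Science Foundation of China (NSFC) grants
Highly scalable space-time parallel domain decomposition algorithms for carbon dioxide sequestration and their large-scale applications
- Award number: 12371366
- Approval year: 2023
- Amount: CNY 435,000
- Project category: General Program
Structure-preserving algorithms for nonlocal Klein-Gordon-Schrödinger equations on unbounded domains
- Award number: 12301508
- Approval year: 2023
- Amount: CNY 300,000
- Project category: Young Scientists Fund
Constrained multi-objective swarm intelligence algorithms based on deep reinforcement learning and their application to multi-area heat and power dispatch
- Award number: 62303197
- Approval year: 2023
- Amount: CNY 300,000
- Project category: Young Scientists Fund
Automated algorithm design for collaborative scheduling of multi-area cellular production lines
- Award number: 62303204
- Approval year: 2023
- Amount: CNY 300,000
- Project category: Young Scientists Fund
Region-weighted non-rigid registration algorithms for constructing three-dimensional target reference data for facial defect restoration
- Award number:
- Approval year: 2022
- Amount: CNY 520,000
- Project category: General Program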
Similar overseas grants
Accelerating genomic analysis for time critical clinical applications
- Award number: 10593480
- Fiscal year: 2023
- Amount: $344,700
- Project category:
Single viewpoint panoramic imaging technology for colonoscopy
- Award number: 10580165
- Fiscal year: 2023
- Amount: $344,700
- Project category:
Biomarker-Guided Evaluation of Glycated Testing Modalities for Dysglycemia among Persons Living with HIV (BEGET)
- Award number: 10751444
- Fiscal year: 2023
- Amount: $344,700
- Project category:
Tele-Sox: A Tele-Medicine solution based on wearables and gamification to prevent Venous thromboembolism in Oncology Geriatric Patients
- Award number: 10547300
- Fiscal year: 2023
- Amount: $344,700
- Project category:
Optimization and Validation of a Cost-effective Image-Guided Automated Extracapsular Extension Detection Framework through Interpretable Machine Learning in Head and Neck Cancer
- Award number: 10648372
- Fiscal year: 2023
- Amount: $344,700
- Project category: