MIM: Machine Learning, Systems Modeling, and Experimental Approaches to Understand the Universal Rules of Life of Microbiota Using Marine Time Series Data

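Basic information

  • Award number:
    2125142
  • Principal investigator:
  • Amount:
    $2.5007 million
  • Institution:
  • Institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-01-01 to 2026-12-31
  • Project status:
    Ongoing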

Project abstract

Understanding the relationships among microbial organisms, the functioning of their genes and communities, and the environment, and how these relationships reflect universal rules of life, remains an essential problem in microbiology. The team of investigators is leveraging metagenomic data, the collective DNA content of an entire community, from a marine time series to address these questions. Although the science of metagenomics is over a decade old, many aspects of metagenomic datasets still require new approaches to extract valuable information. The research team will apply state-of-the-art integrative machine learning, systems modeling, and experimental approaches to existing and newly generated time series metagenomic data to better understand the interaction networks in microbial communities and their impacts on microbial community function, which have major implications for understanding the global cycling of elements and the processing of energy in ecosystems. The machine learning and mathematical modeling tools developed in this proposal should provide new avenues for fundamental analysis of metagenomes. The theory and computational tools will also directly benefit the statistics and machine learning communities working on causal inference, as well as ecological modeling. Ultimately, these tools will help investigators uncover the universal rules of life within microbiomes from many different environments, including those present in animals and plants. The project will provide interdisciplinary training in data science, computer science, statistics, computational biology, environmental biology, and ecology for postdoctoral fellows and graduate, undergraduate, and high school students, with an emphasis on underrepresented groups. Software tools developed during the project will be disseminated to the community.

Over the past two decades, the San Pedro Ocean Time (SPOT) Series associated with the University of Southern California Microbial Observatory has collected time series marker gene, metagenomic, and metatranscriptomic data at different time scales (daily, weekly, monthly, and seasonally) across various depths, locations, and perturbations (pristine and polluted) in the ocean. With this rich time series data, the research team will develop machine learning, systems modeling, and experimental approaches to understand the universal rules of life of microbial communities. The specific aims of this project are to (1) develop machine learning approaches to identify all microbes, known or novel, within the microbial communities, as well as the hosts of mobile genetic elements such as viruses and plasmids, through metagenomic read assembly and binning; (2) further investigate Granger graphical models with knockoff false discovery control and apply the resulting computational tools to the SPOT data to identify causal relationships among the known microbial genomes, metagenome-assembled genomes, and environmental factors; (3) based on the causal networks constructed in the first two aims, develop mechanistic models of the processes driving organism abundances and community structure, such as competition, cross-feeding, virus-host interactions, grazing, and physical transport, and develop a predictive framework for application to diverse and future ecosystems; and (4) experimentally validate the predicted virus-host interactions using proximity-ligation experiments, as well as the dynamics and emergent properties of the microbial communities. User-friendly software packages that automate the procedures for analyzing metagenomic data will be developed.

Co-funding for this research was provided by the Biological Oceanography and Mathematical Biology programs. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
HiFine: integrating Hi-C-based and shotgun-based methods to refine binning of metagenomic contigs
  • DOI:
    10.1093/bioinformatics/btac295
  • Publication date:
    2022-05-12
  • Journal:
  • Impact factor:
    5.8
  • Authors:
    Du, Yuxuan;Sun, Fengzhu
  • Corresponding author:
    Sun, Fengzhu
Normalizing Metagenomic Hi-C Data and Detecting Spurious Contacts Using Zero-Inflated Negative Binomial Regression
  • DOI:
    10.1089/cmb.2021.0439
  • Publication date:
    2022-01-12
  • Journal:
  • Impact factor:
    1.7
  • Authors:
    Du, Yuxuan;Laperriere, Sarah M.;Sun, Fengzhu
  • Corresponding author:
    Sun, Fengzhu

Other publications by Fengzhu Sun

HiCzin: Normalizing metagenomic Hi-C data and detecting spurious contacts using zero-inflated negative binomial regression
  • DOI:
    10.1101/2021.03.01.433489
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yuxuan Du;S. Laperriere;J. Fuhrman;Fengzhu Sun
  • Corresponding author:
    Fengzhu Sun
Alignment-Free Sequence Comparison Based on Next Generation Sequencing Reads: Extended Abstract
On the use of population-based registries in the clinical validation of genetic tests for disease susceptibility
  • DOI:
    10.1097/00125817-200005000-00005
  • Publication date:
    1999
  • Journal:
  • Impact factor:
    8.8
  • Authors:
    Quanhe Yang;M. Khoury;S. Coughlin;Fengzhu Sun;Dana Flanders
  • Corresponding author:
    Dana Flanders
Dynamic programming algorithms for haplotype block partitioning: applications to human chromosome 21 haplotype data
Exploring Treatment-Driven Subclonal Evolution of Prognostic and Predictive Triple Biomarkers: Dual Gene Fusions and Chimeric RNA Variants in Subtypes of Acute Myeloid Leukemia Patients with KMT2A Rearrangement
  • DOI:
    10.1182/blood-2024-204217
  • Publication date:
    2024-11-05
  • Journal:
  • Impact factor:
  • Authors:
    Homa Dadrastoussi;Yi Xu;Shengwen calvin Li;Jeffrey Xiao;Qian Liu;Durga Cherukuri;Yan Liu;Saied Mirshahidi;Jane Xu;Xuelian Chen;Julian Olea;Kaijin Wu;Kevin R. Kelly;Xi Zhang;Fengzhu Sun;Cristina Maria Ghiuzeli;Esther G Chong;Hisham Abdel-Azim;Mark Reeves;David Baylink
  • Corresponding author:
    David Baylink

Other grants by Fengzhu Sun

Inference of Markovian Properties of Molecular Sequences Using Shotgun Reads and Applications
  • Award number:
    1518001
  • Fiscal year:
    2015
  • Award amount:
    $2.5007 million
  • Award type:
    Continuing Grant
Computational and Mathematical Study in Protein Interactions and Functions
  • Award number:
    0241102
  • Fiscal year:
    2003
  • Award amount:
    $2.5007 million
  • Award type:
    Continuing Grant

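Similar NSFC (National Natural Science Foundation of China) grants

Co-adaptive learning of contact surfaces and grasping strategies for complex robotic manipulation
  • Award number:
    52305030
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Multi-granularity graph machine learning and decision-making based on partial-order neighborhoods
  • Award number:
    62366008
  • Approval year:
    2023
  • Award amount:
    CNY 330,000
  • Award type:
    Regional Science Fund
Machine-learning-based study of the environmental behavior and ecotoxicity of EPFRs in soil porous media
  • Award number:
    42377385
  • Approval year:
    2023
  • Award amount:
    CNY 490,000
  • Award type:
    General Program
Machine-learning-based identification of suicide risk among university students
  • Award number:
    32300917
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Autonomous learning mechanisms for multi-robot visual perception
  • Award number:
    62373009
  • Approval year:
    2023
  • Award amount:
    CNY 510,000
  • Award type:
    General Program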

Similar overseas grants

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Award number:
    2337776
  • Fiscal year:
    2024
  • Award amount:
    $2.5007 million
  • Award type:
    Continuing Grant
RII Track-4:NSF: Physics-Informed Machine Learning with Organ-on-a-Chip Data for an In-Depth Understanding of Disease Progression and Drug Delivery Dynamics
  • Award number:
    2327473
  • Fiscal year:
    2024
  • Award amount:
    $2.5007 million
  • Award type:
    Standard Grant
CC* Campus Compute: UTEP Cyberinfrastructure for Scientific and Machine Learning Applications
  • Award number:
    2346717
  • Fiscal year:
    2024
  • Award amount:
    $2.5007 million
  • Award type:
    Standard Grant
Learning to create Intelligent Solutions with Machine Learning and Computer Vision: A Pathway to AI Careers for Diverse High School Students
  • Award number:
    2342574
  • Fiscal year:
    2024
  • Award amount:
    $2.5007 million
  • Award type:
    Standard Grant
Collaborative Research: Conference: DESC: Type III: Eco Edge - Advancing Sustainable Machine Learning at the Edge
  • Award number:
    2342498
  • Fiscal year:
    2024
  • Award amount:
    $2.5007 million
  • Award type:
    Standard Grant