Collaborative Research: CIBR: CloudForest: A Portable Cyberinfrastructure Workflow To Advance Biological Insight from Massive, Heterogeneous Phylogenomic Datasets

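Basic information

  • Award number:
    1934157
  • Principal investigator:
  • Amount:
    $231,600
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Standard Grant
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Start and end dates:
    2019-09-01 to 2023-08-31
  • Project status:
    Completed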

Project abstract

Variation across inferred gene trees is arguably the most consistent and striking observation from empirical phylogenomic studies, yet many unanswered questions remain about the causes of this variation. The questions persist in part because modern phylogenetic inference is still deeply influenced by a decades-old paradigm. Data from one or a few genes were typically gathered at the same time, combined into a single dataset, and analyzed by a single program that estimated a shared tree. While the size and complexity of datasets have changed radically in recent years, many aspects of this general workflow persist. Most current approaches do not naturally integrate inferences from different sources, whether different studies or software packages, and even cutting-edge methods that model differences in gene histories still summarize these histories as a single "species tree" topology. More versatile tools are needed to understand the heterogeneity inherent to modern genomic datasets. Key to this versatility is the ability to flexibly and seamlessly move between different stages of a phylogenetic workflow, from inference of individual gene trees to exploration of the genome-wide phylogenetic landscape and, ultimately, to learning about the biological processes that have shaped variation across the genome. Each of these stages may rely on different analytical tools and software.

The major aim of this project is to develop a cyberinfrastructure workflow called CloudForest to address outstanding challenges in phylogenomics and provide researchers with a set of streamlined tools to explore and understand variation in evolutionary history across different regions of the genome (i.e., gene tree variation). CloudForest will allow users to leverage diverse computing resources ranging from laptops to HPC clusters to cloud-based resources like JetStream or Amazon Web Services. CloudForest will meet many of the outstanding needs of empirical phylogenomic studies, such as (1) visualizing variation across gene trees, (2) revealing structure in sets of trees (forests), (3) conducting hypothesis tests regarding the causes of gene-tree variation, and (4) detecting genes that may have outlying (and potentially aberrant) histories. By addressing these challenges in a consistent way across computing platforms, CloudForest will allow biologists to make efficient use of any computational resource at their disposal with workflows appropriate for addressing a variety of important, unresolved questions in both evolutionary biology and other applied fields. This project also aims to advance broader goals by (1) supporting broad educational and training opportunities for researchers from around the world in the use of advanced computing solutions, (2) actively promoting the involvement and achievements of researchers from underrepresented groups in computational biology, (3) providing unique, interdisciplinary training opportunities for graduate students at the intersection of computing, math, and biology, (4) contributing to the development of an interactive and visually rich website for learning about phylogenetics and phylogenomics, and (5) facilitating applied phylogenetic research that will advance human health and well-being. A public-facing website for this project can be found at https://github.com/jwilgenb/CloudForest.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Community Detection by a Riemannian Projected Proximal Gradient Method
  • DOI:
    10.1016/j.ifacol.2021.06.115
  • Publication date:
    2020-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Meng Wei;Wen Huang;K. Gallivan;P. Dooren
  • Corresponding authors:
    Meng Wei;Wen Huang;K. Gallivan;P. Dooren
Analysis of the Neighborhood Pattern Similarity Measure for the Role Extraction Problem
  • DOI:
    10.1137/20m1358785
  • Publication date:
    2020-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Melissa Marchand;K. Gallivan;Wen Huang;P. Dooren
  • Corresponding authors:
    Melissa Marchand;K. Gallivan;Wen Huang;P. Dooren
Simplifying Transformations for a Family of Elastic Metrics on the Space of Surfaces
A limited-memory Riemannian symmetric rank-one trust-region method with an efficient algorithm for its subproblem
  • DOI:
    10.1016/j.ifacol.2021.06.118
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Wen Huang;K. Gallivan
  • Corresponding authors:
    Wen Huang;K. Gallivan

Other grants held by Kyle Gallivan

Collaborative Research: ABI Innovation: Quantifying and Exploiting the Structure of Phylogenetic Tree Space Through Network Analyses
  • Award number:
    1262476
  • Fiscal year:
    2013
  • Amount:
    $231,600
  • Award type:
    Standard Grant
ITR/AP: Collaborative Research: Model Reduction of Dynamical Systems for Real Time Control
  • Award number:
    0324944
  • Fiscal year:
    2003
  • Amount:
    $231,600
  • Award type:
    Continuing Grant
Efficient Algorithms for Large Scale Dynamical Systems
  • Award number:
    9912415
  • Fiscal year:
    2000
  • Amount:
    $231,600
  • Award type:
    Continuing Grant
High Performance Computing for Large Scale Systems
  • Award number:
    9619596
  • Fiscal year:
    1997
  • Amount:
    $231,600
  • Award type:
    Standard Grant
High Performance Computing for Large Scale Systems
  • Award number:
    9796315
  • Fiscal year:
    1997
  • Amount:
    $231,600
  • Award type:
    Standard Grant

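Similar NSFC grants

Electrodeposited magnesium-hydrogen coatings on titanium-based bone implant surfaces and their osteogenesis-promoting properties
  • Award number:
    52371195
  • Approval year:
    2023
  • Amount:
    CNY 500,000
  • Award type:
    General Program
Pathogenic mechanism of the CLMP-mediated Connexin45-β-catenin complex in congenital short bowel syndrome
  • Award number:
    82370525
  • Approval year:
    2023
  • Amount:
    CNY 490,000
  • Award type:
    General Program
Key technologies for highly sensitive sensing with spoof localized surface plasmons and miniaturization of the sensing system
  • Award number:
    62371132
  • Approval year:
    2023
  • Amount:
    CNY 490,000
  • Award type:
    General Program
Mechanisms by which preferential flow affects the hydrothermal stability of permafrost along the China-Russia crude oil pipeline
  • Award number:
    42301138
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Bidirectional regulation of the interfacial layer and electrolyte for stabilizing zinc anodes
  • Award number:
    52302289
  • Approval year:
    2023
  • Amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund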

Similar overseas grants

Collaborative Research: CIBR: Leaping the Specimen Digitization Gap: Connecting Novel Tools, Machine Learning and Public Participation to Label Digitization Efforts
  • Award number:
    2027241
  • Fiscal year:
    2021
  • Amount:
    $231,600
  • Award type:
    Standard Grant
Collaborative Research: CIBR: Leaping the Specimen Digitization Gap: Connecting Novel Tools, Machine Learning and Public Participation to Label Digitization Efforts
  • Award number:
    2027234
  • Fiscal year:
    2021
  • Amount:
    $231,600
  • Award type:
    Standard Grant
Collaborative Research: CIBR: Incorporating Crystallography and Cryo-EM Tools in Foldit
  • Award number:
    2051305
  • Fiscal year:
    2021
  • Amount:
    $231,600
  • Award type:
    Standard Grant
Collaborative Research: CIBR: Incorporating Crystallography and Cryo-EM tools into Foldit
  • Award number:
    2051282
  • Fiscal year:
    2021
  • Amount:
    $231,600
  • Award type:
    Standard Grant
Collaborative Research: CIBR: The OpenBehavior Project
  • Award number:
    1948181
  • Fiscal year:
    2021
  • Amount:
    $231,600
  • Award type:
    Continuing Grant