TRTech-PGR: Connecting sequences to functions within and between species through computational modeling and experimental studies
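Basic information
- Award number: 2107215
- Principal investigator:
- Amount: $1.4 million
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-12-01 to 2024-11-30
- Project status: Completed
- Source:
- Keywords:
Project abstract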
Life as we know it would be impossible without plants. They are a source of food, oxygen, timber, fiber, and medicine. Therefore, improving plant traits, such as yield, nutritional quality, and resilience, is crucial for sustainable production of plant products. Key to our ability to improve plants is a thorough understanding of how plant DNA controls traits. For example, corn DNA contains ~2 billion letters, and different sets of these letters affect different plant traits. But we have limited knowledge about which letters matter and how they control traits. When we do have a good understanding of the connection between DNA and traits, such understanding is limited to a handful of model plants chosen for their relative ease of study. Thus, to have more complete knowledge of how plants work, we will connect DNA sequences with the traits they control using machine learning, an artificial intelligence-based approach in which computers are used to uncover hidden patterns from a wide range of biological data. In addition, we will apply transfer learning to translate knowledge from one plant species to another so we can later transfer what we know about model plants to other species. The outcome of the project will be computer programs that can predict the connections between DNA sequence and traits and transfer information across species. Using these programs, scientists can better understand how plants work, and this knowledge can ultimately be used to create more productive and resilient plants.
The rapid growth in omics data has led to discoveries transforming plant science. However, as more genomes become available, connecting sequences to their functions globally remains challenging. Thus, our first goal is to build and validate computational models that can predict sequence functions. The second project goal is to develop and apply transfer learning to address sequence-to-function problems across species and environments.
To achieve the first goal, existing multi-omics and phenotype data from four model species (Arabidopsis, maize, rice, and tomato) will be integrated with machine learning to address two sequence-to-function problems: predictions of biological process functions such as enzyme or signaling pathway membership, and physiological and morphological phenotypes. These prediction models will be dissected using model interpretation methods to provide mechanistic insights through understanding why and how the models work. To achieve our second goal, using the same data from target model species and addressing the same focal problems, transfer learning strategies will be developed and optimized to assess how knowledge can be best transferred across species and environments. There is relatively abundant experimental data available for the four model species we will focus on, and by holding out different amounts and types of data, a wide range of “data-poor” scenarios can be recreated and evaluated. For both project goals, the predictions will be validated with holdout experimental data independent from data used for modeling and new data from genetic experiments conducted for this project.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
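The hold-out idea in the abstract (recreating "data-poor" scenarios by withholding different amounts of data from a data-rich species) can be illustrated with a minimal sketch. This is not the project's actual pipeline: the function name `data_poor_splits`, the chosen fractions, and the placeholder gene identifiers are all hypothetical, and only the Python standard library is used.

```python
import random


def data_poor_splits(sample_ids, fractions, test_fraction=0.2, seed=0):
    """Return a fixed test set plus nested training subsets of shrinking size.

    Hypothetical helper for illustration only: each entry in `fractions`
    simulates a scenario in which only that share of the training pool
    is available for model building.
    """
    rng = random.Random(seed)          # seeded for reproducible splits
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)

    # Reserve a test set that stays identical across all scenarios.
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    test_ids = shuffled[:n_test]
    train_pool = shuffled[n_test:]

    # Smaller fractions are prefixes of larger ones, so subsets are nested.
    subsets = {}
    for frac in fractions:
        n_keep = max(1, int(round(len(train_pool) * frac)))
        subsets[frac] = train_pool[:n_keep]
    return test_ids, subsets


# Placeholder identifiers standing in for labeled genes or accessions.
genes = [f"gene_{i}" for i in range(100)]
test_ids, subsets = data_poor_splits(genes, fractions=[1.0, 0.5, 0.1])
for frac, ids in subsets.items():
    print(f"training fraction {frac}: {len(ids)} samples; test set: {len(test_ids)}")
```

Keeping the test set fixed while shrinking the training pool means that any drop in prediction performance can be attributed to data scarcity rather than to a different evaluation set.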
Project outcomes
Journal articles (16)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Selection-enriched genomic loci (SEGL) reveals genetic loci for environmental adaptation and photosynthetic productivity in Chlamydomonas reinhardtii
- DOI: 10.1016/j.algal.2022.102709
- Published: 2022-04-21
- Journal:
- Impact factor: 5.1
- Authors: Lucker, Ben F.; Temple, Joshua A.; Kramer, David M.
- Corresponding author: Kramer, David M.
Computational prediction of plant metabolic pathways
- DOI: 10.1016/j.pbi.2021.102171
- Published: 2022-01-22
- Journal:
- Impact factor: 9.5
- Authors: Wang, Peipei; Schumacher, Ally M.; Shiu, Shin-Han
- Corresponding author: Shiu, Shin-Han
Plant Science Knowledge Graph Corpus: a gold standard entity and relation corpus for the molecular plant sciences
- DOI: 10.1093/insilicoplants/diad021
- Published: 2023
- Journal:
- Impact factor: 3.1
- Authors: Lotreck, Serena; Segura Abá, Kenia; Lehti-Shiu, Melissa D.; Seeger, Abigail; Brown, Brianna N. I.; Ranaweera, Thilanka; Schumacher, Ally; Ghassemi, Mohammad; Shiu, Shin-Han; Marshall-Colon, Amy (ed.)
- Corresponding author: Marshall-Colon, Amy (ed.)
Challenges and opportunities to build quantitative self-confidence in biologists
- DOI: 10.1093/biosci/biad015
- Published: 2023-04-29
- Journal:
- Impact factor: 10.1
- Authors: Cuddington, Kim; Abbott, Karen C.; White, Easton R.
- Corresponding author: White, Easton R.
Graph Neural Networks for Multimodal Single-Cell Data Integration
- DOI: 10.1145/3534678.3539213
- Published: 2022-01-01
- Journal:
- Impact factor: 0
- Authors: Wen, Hongzhi; Ding, Jiayuan; Tang, Jiliang
- Corresponding author: Tang, Jiliang
Other publications by Shin-Han Shiu
PTEMD: a novel method for identifying polymorphic transposable elements via scanning of high-throughput short reads
- DOI:
- Published:
- Journal:
- Impact factor: 4.1
- Authors: Stephen Obol Opiyo; Ning Jiang; Shin-Han Shiu; Guo-Liang Wang
- Corresponding author: Guo-Liang Wang
Other grants of Shin-Han Shiu
RESEARCH-PGR: Combining machine learning and experimental analysis to define trichome and root-specific gene regulatory networks in cultivated tomato and related Solanaceae species
- Award number: 2218206
- Fiscal year: 2023
- Funding amount: $1.4 million
- Project type: Standard Grant
Collaborative Research: Assessing the connections between genetic interactions, environments, and phenotypes in Arabidopsis thaliana
- Award number: 2210431
- Fiscal year: 2022
- Funding amount: $1.4 million
- Project type: Standard Grant
NRT-HDR: Intersecting computational and data science to address grand challenges in plant biology
- Award number: 1828149
- Fiscal year: 2018
- Funding amount: $1.4 million
- Project type: Standard Grant
Collaborative Research: Fitness effects of loss-of-function mutations in duplicate genes
- Award number: 1655386
- Fiscal year: 2017
- Funding amount: $1.4 million
- Project type: Standard Grant
Computational and Experimental Studies of Plastid Functional Networks
- Award number: 1119778
- Fiscal year: 2011
- Funding amount: $1.4 million
- Project type: Continuing Grant
Experimental Characterization of Novel Coding Small ORFs in the Arabidopsis thaliana Genome
- Award number: 0749634
- Fiscal year: 2008
- Funding amount: $1.4 million
- Project type: Standard Grant
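Similar grants from the National Natural Science Foundation of China (NSFC)
Investigating the role of the Hippo pathway effectors Yap1/Wwtr1 in decidualization by constructing a Pgr-Cas9 tool mouse
- Award number: 32370913
- Approval year: 2023
- Funding amount: CNY 500,000
- Project type: General Program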
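Mechanisms by which the PGR5/PGRL1 proteins of marine diatoms sense and adapt to fluctuating light
- Award number: 42276146
- Approval year: 2022
- Funding amount: CNY 560,000
- Project type: General Program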
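Mechanism by which KLF12 inhibits progesterone-induced differentiation of endometrial cancer cells by regulating the expression of PGR and GDF10
- Award number:
- Approval year: 2021
- Funding amount: CNY 550,000
- Project type: General Program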
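Mechanism by which HBP1 regulation of PGR transcriptional activity acts in embryo implantation and pregnancy maintenance
- Award number: 82160296
- Approval year: 2021
- Funding amount: CNY 340,000
- Project type: Regional Science Fund Project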
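Mechanism by which KLF12 inhibits progesterone-induced differentiation of endometrial cancer cells by regulating the expression of PGR and GDF10
- Award number: 82172819
- Approval year: 2021
- Funding amount: CNY 550,000
- Project type: General Program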
Similar overseas grants
Collaborative Research: RESEARCH-PGR: Development of epigenetic editing for crop improvement
- Award number: 2331437
- Fiscal year: 2024
- Funding amount: $1.4 million
- Project type: Standard Grant
Collaborative Research: TRTech-PGR TRACK: Discovery and characterization of small CRISPR systems for virus-based delivery of heritable editing in plants.
- Award number: 2334028
- Fiscal year: 2024
- Funding amount: $1.4 million
- Project type: Standard Grant
RESEARCH-PGR: Cycling to low-temperature tolerance
- Award number: 2332611
- Fiscal year: 2024
- Funding amount: $1.4 million
- Project type: Continuing Grant
TRTech-PGR: PlantTransform: Boosting Agrobacterium-mediated transformation efficiency in the orphan crop tef (Eragrostis tef) for trait improvement
- Award number: 2327906
- Fiscal year: 2024
- Funding amount: $1.4 million
- Project type: Standard Grant
Collaborative Research: RESEARCH-PGR: Development of epigenetic editing for crop improvement
- Award number: 2331438
- Fiscal year: 2024
- Funding amount: $1.4 million
- Project type: Standard Grant