NSFDEB-NERC: Machine learning tools to discover balancing selection in genomes from spatial and temporal autocorrelations
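Basic information
- Award number: 2302258
- Principal investigator:
- Amount: $648,100
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-07-01 to 2026-06-30
- Project status: Ongoing
- Source:
- Keywords: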
Project abstract
Understanding why individuals within species are so genetically diverse is a fundamental problem in evolutionary biology and genetics. This individual genetic diversity, and its causes, has important consequences for biodiversity conservation, agricultural biology, and biomedicine. Balancing selection is a process that promotes and maintains genetic diversity over time. Despite a few well-known examples, however, little is known about recent or fleeting balancing selection, likely because its genetic clues are subtle and difficult to distinguish from those left by other adaptive and nonadaptive processes. Detecting balancing selection in genome data is further complicated by technical issues, such as missing or degraded DNA sequence data, which are not accounted for by current methods. The primary goal of this project is to tackle these challenges by designing state-of-the-art tools based on recent advances in artificial intelligence, which provide strategies for identifying signals of past evolutionary events in genetic data. These tools will be made freely available in a public repository, enabling widespread use. In addition, this project will actively engage local high school students in coding and machine learning through the iDeepLearn summer workshop, and other students from groups underrepresented in STEM through outreach programs at the FAU campus high school. Together, these planned activities will facilitate future advancements in our understanding of balancing selection across diverse taxonomic groups, as well as foster participation of traditionally underrepresented high school students in STEM research.
Detecting balancing selection is enhanced by using temporally sampled genetic data often accessed from ancient DNA, which presents numerous technical hurdles. This research seeks to develop novel machine- and deep-learning methods that can identify genomic signatures of recent and transient balancing selection from spatially and temporally sampled genetic data, while accounting for technical issues encountered by researchers working with ancient DNA and nonmodel organisms. The project will specifically address detecting balancing selection from data that are incomplete, low-quality, unphased, or pooled under settings for which there is uncertainty in genetic and demographic parameters. Developed methods will be applied to human, mosquito, and fruit fly testbeds, as these study systems have evidence for diverse modes of balancing selection, and publicly available datasets with characteristics of the technical hurdles the project seeks to overcome. These methods will also be implemented as open-source tools applicable to a wide range of data types common across model and nonmodel organisms, empowering future studies of adaptation by removing barriers imposed by limitations of data quality and demographic knowledge, and ultimately leading to novel insights in the understanding of adaptive history across the tree of life. Workshops on machine learning in population genomics will be developed and delivered for high school girls as part of the iDeepLearn summer program at Florida Atlantic University. In both the US and the UK, multiple STEM-related career events aimed at secondary school pupils will be developed and delivered. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Michael DeGiorgio
SG: Inferring phylogenies under ancestral population structure
- Award number: 1949268
- Fiscal year: 2019
- Award amount: $648,100
- Project type: Standard Grant
Collaborative Research: Understanding the Deep Ancestry of the Indigenous People of North America
- Award number: 1925825
- Fiscal year: 2019
- Award amount: $648,100
- Project type: Standard Grant
Collaborative Research: Understanding the Deep Ancestry of the Indigenous People of North America
- Award number: 2001063
- Fiscal year: 2019
- Award amount: $648,100
- Project type: Standard Grant
SG: Inferring phylogenies under ancestral population structure
- Award number: 1753489
- Fiscal year: 2018
- Award amount: $648,100
- Project type: Standard Grant
NSF Postdoctoral Fellowship in Biology FY 2011
- Award number: 1103639
- Fiscal year: 2011
- Award amount: $648,100
- Project type: Fellowship Award
Similar overseas grants
NSFGEO-NERC: Imaging the magma storage region and hydrothermal system of an active arc volcano
- Award number: NE/X000656/1
- Fiscal year: 2025
- Award amount: $648,100
- Project type: Research Grant
NSFDEB-NERC: Spatial and temporal tradeoffs in CO2 and CH4 emissions in tropical wetlands
- Award number: NE/Z000246/1
- Fiscal year: 2025
- Award amount: $648,100
- Project type: Research Grant
NSFGEO-NERC: Magnetotelluric imaging and geodynamical/geochemical investigations of plume-ridge interaction in the Galapagos
- Award number: NE/Z000254/1
- Fiscal year: 2025
- Award amount: $648,100
- Project type: Research Grant
Collaborative Research: NSFDEB-NERC: Warming's silver lining? Thermal compensation at multiple levels of organization may promote stream ecosystem stability in response to drought
- Award number: 2312706
- Fiscal year: 2024
- Award amount: $648,100
- Project type: Standard Grant
Collaborative Research: NSFGEO/NERC: After the cataclysm: cryptic degassing and delayed recovery in the wake of Large Igneous Province volcanism
- Award number: 2317936
- Fiscal year: 2024
- Award amount: $648,100
- Project type: Continuing Grant