Collaborative Research: Development of New Statistical Methods for Genome-Wide Association Studies


Basic Information

Project Abstract

Advances in high-throughput sequencing technologies now make cost-effective analysis of whole genomes possible. The genomes of any two humans are 99.9% identical, and differences in the remaining 0.1% determine the diversity of human traits. For example, DNA sequence differences account for 80% of the variability in human height. Current technology allows these sequence polymorphisms between individuals to be identified and then correlated with differences in a given trait. When done at a genome-wide level with a large population of individuals, such genome-wide association studies (GWASes) can be a useful tool for identifying key genes that control specific traits. However, this approach requires powerful and accurate statistical and computational methods that can search through a massive amount of sequencing data and correctly identify the DNA differences associated with the phenotypic trait of interest. The project will (1) provide statistical methods for understanding relationships between DNA sequence differences and the full range of diversity observed in a population, and (2) provide corresponding computational tools suitable for use by biologists and biomedical specialists in their own population studies. The research will produce intermediate methodological and theoretical results that lay the foundation for the final output, and it will apply the developed methods to real experimental data to demonstrate their utility. In addition to these research outcomes, the project will support the training of students in the field, including women and underrepresented minorities.

GWAS estimates the correlation between phenotypic traits and sequence polymorphisms to identify genetic variants highly associated with specific traits. Single nucleotide polymorphisms (SNPs) are the most common type of genetic variant, and sequencing technologies allow for large-scale collection of SNP information. The project team will develop new GWAS models and methods to find trait-affecting variants with more power and accuracy. Specifically, the new methods will improve on existing approaches in three ways. First, they will allow observed traits to be modeled with any probability distribution in the exponential family. This extension ensures that the statistical models are biologically meaningful and interpretable. Second, the new methods will exploit different Bayesian priors, especially contemporary Bayesian priors for ultra-high-dimensional model selection, that share information across the entire genome for stable statistical inference. Theoretical results on the Bayesian priors used in these methods will also be developed. Third, a stochastic search algorithm will be developed to search efficiently through the massive model space for model selection. This ensures that the new methods are practical and useful, since an analysis can be completed within a reasonably short time frame. It also eliminates the subjective significance thresholds that are now commonly used in GWAS despite having no theoretical support. The methods will be implemented in software tools that will be freely available to statisticians, biologists, and biomedical researchers. This project is funded jointly by the Division of Mathematical Sciences Mathematical Biology Program and the Statistics Program. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
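The idea of a stochastic search over the model space can be illustrated with a small sketch. The code below is not the project's algorithm: it assumes a Gaussian trait (one member of the exponential family), scores each candidate SNP subset with BIC instead of a Bayesian prior, and uses a simple Metropolis-style add/remove move. The function names `bic_linear` and `stochastic_search`, the simulated genotypes, and the choice of causal SNPs (columns 3 and 17) are all hypothetical and chosen only for illustration.

```python
import numpy as np


def bic_linear(X, y, subset):
    """BIC of a least-squares fit of y on an intercept plus the chosen SNP columns."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in sorted(subset)])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    return n * np.log(rss / n) + design.shape[1] * np.log(n)


def stochastic_search(X, y, n_iter=2000, seed=0):
    """Random add/remove search over SNP subsets; returns the lowest-BIC subset visited."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    current = set()
    current_bic = bic_linear(X, y, current)
    best, best_bic = set(current), current_bic
    for _ in range(n_iter):
        j = int(rng.integers(p))
        proposal = current ^ {j}  # flip whether SNP j is in the model
        proposal_bic = bic_linear(X, y, proposal)
        # Metropolis-style acceptance, treating exp(-BIC/2) as an approximate model score.
        if np.log(rng.random()) < (current_bic - proposal_bic) / 2:
            current, current_bic = proposal, proposal_bic
            if current_bic < best_bic:
                best, best_bic = set(current), current_bic
    return sorted(best), best_bic


# Simulated example: 300 individuals, 50 SNPs coded 0/1/2, two causal SNPs (assumed).
rng = np.random.default_rng(1)
genotypes = rng.binomial(2, 0.3, size=(300, 50)).astype(float)
trait = 1.0 * genotypes[:, 3] - 1.0 * genotypes[:, 17] + rng.normal(size=300)

selected, score = stochastic_search(genotypes, trait)
print("selected SNP columns:", selected)
```

Because the search compares whole models rather than testing SNPs one at a time, no per-SNP significance threshold is needed in this sketch; the trade-off is that the result depends on the scoring rule and on how long the search runs.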

Project Outcomes

Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Fast and scalable computations for Gaussian hierarchical models with intrinsic conditional autoregressive spatial random effects
  • DOI: 10.1016/j.csda.2021.107264
  • Published: 2021-04
  • Journal:
  • Impact factor: 0
  • Authors: Marco A. R. Ferreira; Erica M. Porter; C. Franck
  • Corresponding authors: Marco A. R. Ferreira; Erica M. Porter; C. Franck
Objective Bayesian Model Selection for Spatial Hierarchical Models with Intrinsic Conditional Autoregressive Priors
  • DOI: 10.1214/23-ba1375
  • Published: 2023
  • Journal:
  • Impact factor: 4.4
  • Authors: Porter, Erica M.; Franck, Christopher T.; Ferreira, Marco A.
  • Corresponding author: Ferreira, Marco A.
Bayesian model selection for generalized linear mixed models
  • DOI: 10.1111/biom.13896
  • Published: 2023-06-27
  • Journal:
  • Impact factor: 1.9
  • Authors: Xu, Shuangshuang; Ferreira, Marco A. R.; Franck, Christopher T.
  • Corresponding author: Franck, Christopher T.

Other publications by Marco Ferreira

Comparing Metaheuristic Algorithms for Error Detection in Java Programs
LEGITIMIDAD DEL PROCESO DE PARTICIPACIÓN POPULAR: UNA INVESTIGACIÓN DE LAS PRÁCTICAS DE PLANIFICACIÓN PÚBLICA EN BRASIL
[Legitimacy of the Popular Participation Process: An Investigation of Public Planning Practices in Brazil]
  • DOI: 10.24965/reala.v0i2.10189
  • Published: 2014
  • Journal:
  • Impact factor: 0
  • Authors: Marco Ferreira; Ambrozina de Abreu Pereira Silva; Anderson de Oliveira Reis
  • Corresponding author: Anderson de Oliveira Reis
An Exploratory Study for the Development of a Survey on Learning Team Process, Impact, and Tutor’s Role (PIT) in Facilitating Online Learning
Perceptions of Giftedness and Classroom Practice with Gifted Children – an Exploratory Study of Primary School Teachers
  • DOI: 10.17583/qre.8097
  • Published: 2021
  • Journal:
  • Impact factor: 1.4
  • Authors: José Reis; Marco Ferreira; Gustau Olcina‐Sempere; B. Marques
  • Corresponding author: B. Marques
Distribution of prolactin receptors suggests an intraductal role for prolactin in the mouse and human mammary gland, a finding supported by analysis of signaling in polarized monolayer cultures
  • DOI:
  • Published: 2011
  • Journal:
  • Impact factor: 3.6
  • Authors: E. Ueda; Kuang; Virginia Nguyen; Marco Ferreira; S. André; A. Walker
  • Corresponding author: A. Walker

Other grants by Marco Ferreira

Collaborative Research: Development of New Statistical Methods for Genome-Wide Association Studies
  • Award number: 1853549
  • Fiscal year: 2019
  • Award amount: $157,000
  • Award type: Standard Grant
Bayesian Optimal Sequential Design for Random Function Estimation
  • Award number: 0907064
  • Fiscal year: 2009
  • Award amount: $157,000
  • Award type: Standard Grant

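Similar National Natural Science Foundation of China (NSFC) grants

Research on the "Double Reduction" Behavior of Large-Scale Rice Farmers and Intervention Strategies in the Context of Green Agricultural Development: A Vertical Collaboration Perspective
  • Grant number:
  • Year approved: 2022
  • Funding amount: CNY 450,000
  • Grant type: General Program
Research on the "Double Reduction" Behavior of Large-Scale Rice Farmers and Intervention Strategies in the Context of Green Agricultural Development: A Vertical Collaboration Perspective
  • Grant number: 72273141
  • Year approved: 2022
  • Funding amount: CNY 450,000
  • Grant type: General Program
Research on the Mechanisms of Group Cognitive Development in Online Collaborative Learning: Computational Modeling, Analytic Feedback, and Instructional Intervention
  • Grant number: 62177041
  • Year approved: 2021
  • Funding amount: CNY 470,000
  • Grant type: General Program
Research on Emergency Organization Assignment and Collaboration Optimization Strategies for Sustainable Development in Fuzzy Environments
  • Grant number: 71904021
  • Year approved: 2019
  • Funding amount: CNY 205,000
  • Grant type: Young Scientists Fund
Research on Cognitive Development and Learning Methods for Human-Robot Collaborative Task Planning
  • Grant number: 61906203
  • Year approved: 2019
  • Funding amount: CNY 250,000
  • Grant type: Young Scientists Fund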

Similar overseas grants

Collaborative Research: RESEARCH-PGR: Development of epigenetic editing for crop improvement
  • Award number: 2331437
  • Fiscal year: 2024
  • Award amount: $157,000
  • Award type: Standard Grant
Collaborative Research: Broadening Instructional Innovation in the Chemistry Laboratory through Excellence in Curriculum Development
  • Award number: 2337028
  • Fiscal year: 2024
  • Award amount: $157,000
  • Award type: Continuing Grant
Collaborative Research: CAS: Exploration and Development of High Performance Thiazolothiazole Photocatalysts for Innovating Light-Driven Organic Transformations
  • Award number: 2400166
  • Fiscal year: 2024
  • Award amount: $157,000
  • Award type: Continuing Grant
Collaborative Research: Broadening Instructional Innovation in the Chemistry Laboratory through Excellence in Curriculum Development
  • Award number: 2337027
  • Fiscal year: 2024
  • Award amount: $157,000
  • Award type: Continuing Grant
Collaborative Research: RESEARCH-PGR: Development of epigenetic editing for crop improvement
  • Award number: 2331438
  • Fiscal year: 2024
  • Award amount: $157,000
  • Award type: Standard Grant