Bayesian Joint Estimation of Alignment and Phylogeny
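Basic information
- Grant number: 8116012
- Principal investigator:
- Amount: $295,400
- Host institution:
- Host institution country: United States
- Project type:
- Fiscal year: 2008
- Funding country: United States
- Project period: 2008-08-01 to 2013-07-31
- Project status: Completed
- Source: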
- Keywords: Address; Algorithms; Bioinformatics; Biology; Biomedical Research; Communicable Diseases; Comparative Study; Data; Data Set; Educational process of instructing; Evolution; Fostering; Genes; Genetic; Heterogeneity; Human; Joints; Knowledge; Life; Mathematics; Methods; Modeling; Molecular; Molecular Biology; Motivation; Mutate; Organism; Phylogenetic Analysis; Phylogeny; Play; Process; Property; Protein Family; Role; Sequence Alignment; Sequence Analysis; Sequence Homologs; Side; Sister; Specific qualifier value; Taxon; Techniques; Testing; Time; Training; Trees; Variant; Virus; comparative genomics; conditioning; improved; insertion/deletion mutation; interest; life history; markov model; novel; reconstruction; tool; user friendly software
Project abstract
DESCRIPTION (provided by applicant): Phylogenetic reconstruction is an invaluable tool for studying molecular sequences. Starting from a description of how the characters in the sequences mutate over time, the methods attempt to uncover the sequences' relatedness. Common applications range from describing the evolutionary histories of living organisms in evolutionary biology to estimating genetic distances and constructing protein families in molecular biology and bioinformatics. Standard reconstruction methods rely on sequence alignments that specify which characters in the sequences are homologous, deriving from common ancestors. A fundamental difficulty is that sequence alignments are not directly observed; they are inferred properties of the raw sequence data and must be estimated along with the phylogeny. Current tools handle this inference sequentially, first determining a sometimes poor estimate of the alignment and then conditioning on the truth of that alignment to reconstruct the phylogeny. This project provides practical tools for end-users to simultaneously infer alignment and phylogeny, side-stepping biases that sequential estimation introduces. The tools assume both a character substitution model and an insertion/deletion (indel) process through which characters are added or removed, generating an alignment. Further, these indels supply previously under-utilized information from the data to infer phylogenies. Major advances make this phylo-alignment framework useful for real-life datasets. The framework draws heavily on hidden Markov models, Bayesian computation and clever parameter integration to produce a computationally efficient inference engine. Expert prior knowledge helps inform the indel process. From this, realistic priors enable Bayes factor tests to address whether specific indels are shared by descent or are homoplastic, reducing controversy over their value in phylogenetics. Modeling assumptions better reflect the underlying biology. Allowing spatial variation in the indel process provides more accurate phylogenies and alignments. The extensions also provide for heterogeneity tests to identify evolutionarily interesting sequence regions. Examples of the methods span all time-scales of evolution, from billions of years, to infer early branches in the Tree of Life, to a matter of months, to describe the diversification of rapidly evolving viruses within infected hosts.
This project markedly impacts many fields across biomedical research. For example, the project furnishes mathematical and statistical training in bioinformatics, which will play a prime role in discovery during the 21st century, and rigorous inference tools employing phylo-alignment deliver improved molecular and comparative studies, a more accurate understanding of human evolution, and new perspectives from which to battle infectious diseases.
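The abstract contrasts conditioning on one estimated alignment with treating the alignment as unobserved. As a rough illustration only, the sketch below uses a textbook three-state pair hidden Markov model (match, gap in one sequence, gap in the other) for two sequences and compares the probability of the single best alignment with the probability summed over all alignments. It is not the project's model, which also involves a phylogeny, a substitution model, and priors. The function name `alignment_probability` and every numeric parameter (`delta`, `epsilon`, `tau`, and the emission values) are assumptions chosen for the example.

```python
from typing import Callable, List


def pair_emission(a: str, b: str) -> float:
    """Joint probability of two aligned DNA characters (assumed values).

    4 identical pairs * 0.22 + 12 differing pairs * 0.01 = 1.0
    """
    return 0.22 if a == b else 0.01


def single_emission(a: str) -> float:
    """Probability of a character emitted opposite a gap (uniform over ACGT)."""
    return 0.25


def alignment_probability(
    x: str,
    y: str,
    combine: Callable[[List[float]], float] = sum,
    delta: float = 0.05,    # assumed gap-opening probability
    epsilon: float = 0.4,   # assumed gap-extension probability
    tau: float = 0.01,      # assumed termination probability
) -> float:
    """Dynamic programming over a three-state pair HMM.

    combine=sum gives the forward algorithm: P(x, y) summed over all alignments.
    combine=max gives the Viterbi score: P(x, y, best single alignment).
    """
    n, m = len(x), len(y)
    # f_match[i][j]: score of prefixes x[:i], y[:j] ending with x[i-1] aligned to y[j-1]
    f_match = [[0.0] * (m + 1) for _ in range(n + 1)]
    # f_gap_y[i][j]: ending with x[i-1] aligned to a gap
    f_gap_y = [[0.0] * (m + 1) for _ in range(n + 1)]
    # f_gap_x[i][j]: ending with y[j-1] aligned to a gap
    f_gap_x = [[0.0] * (m + 1) for _ in range(n + 1)]
    f_match[0][0] = 1.0  # the start state behaves like the match state

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                f_match[i][j] = pair_emission(x[i - 1], y[j - 1]) * combine([
                    (1 - 2 * delta - tau) * f_match[i - 1][j - 1],
                    (1 - epsilon - tau) * f_gap_y[i - 1][j - 1],
                    (1 - epsilon - tau) * f_gap_x[i - 1][j - 1],
                ])
            if i > 0:
                f_gap_y[i][j] = single_emission(x[i - 1]) * combine([
                    delta * f_match[i - 1][j],
                    epsilon * f_gap_y[i - 1][j],
                ])
            if j > 0:
                f_gap_x[i][j] = single_emission(y[j - 1]) * combine([
                    delta * f_match[i][j - 1],
                    epsilon * f_gap_x[i][j - 1],
                ])

    return tau * combine([f_match[n][m], f_gap_y[n][m], f_gap_x[n][m]])


if __name__ == "__main__":
    marginal = alignment_probability("ACGT", "AGT", combine=sum)
    best = alignment_probability("ACGT", "AGT", combine=max)
    print(f"summed over all alignments: {marginal:.3e}")
    print(f"best single alignment only: {best:.3e}")
```

The summed probability is never smaller than the best-alignment probability, because the best alignment is just one term of the sum. The gap between the two is the probability mass that a condition-on-one-alignment analysis leaves out.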
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Marc A. Suchard
Statistical innovation to integrate sequences and phenotypes for scalable phylodynamic inference
- Grant number: 10584588
- Fiscal year: 2021
- Funding amount: $295,400
- Project type:
Statistical innovation to integrate sequences and phenotypes for scalable phylodynamic inference
- Grant number: 10390334
- Fiscal year: 2021
- Funding amount: $295,400
- Project type:
Statistical innovation to integrate sequences and phenotypes for scalable phylodynamic inference
- Grant number: 10177121
- Fiscal year: 2021
- Funding amount: $295,400
- Project type:
Consortium for Viral Systems Biology Modeling Core
- Grant number: 10579085
- Fiscal year: 2018
- Funding amount: $295,400
- Project type:
Consortium for Viral Systems Biology Modeling Core
- Grant number: 10374718
- Fiscal year: 2018
- Funding amount: $295,400
- Project type:
Consortium for Viral Systems Biology Modeling Core
- Grant number: 10310604
- Fiscal year: 2018
- Funding amount: $295,400
- Project type:
Bayesian Joint Estimation of Alignment and Phylogeny
- Grant number: 7596504
- Fiscal year: 2008
- Funding amount: $295,400
- Project type:
Bayesian Joint Estimation of Alignment and Phylogeny
- Grant number: 7660485
- Fiscal year: 2008
- Funding amount: $295,400
- Project type:
Bayesian Joint Estimation of Alignment and Phylogeny
- Grant number: 7883433
- Fiscal year: 2008
- Funding amount: $295,400
- Project type:
Bayesian Joint Estimation of Alignment and Phylogeny
- Grant number: 8302280
- Fiscal year: 2008
- Funding amount: $295,400
- Project type:
Similar grants from the National Natural Science Foundation of China (NSFC)
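Research on algorithms for improving the quality of m6A-seq data analysis based on deep learning and multiple-instance learning
- Grant number: 61902323
- Approval year: 2019
- Funding amount: CNY 260,000
- Project type: Young Scientists Fund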
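Research on heterogeneity decomposition of high-throughput data from complex tissues and its application algorithms
- Grant number: 61902061
- Approval year: 2019
- Funding amount: CNY 220,000
- Project type: Young Scientists Fund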
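Research on matrix factorization algorithms for multi-omics data aimed at oncogene identification
- Grant number: 61902215
- Approval year: 2019
- Funding amount: CNY 270,000
- Project type: Young Scientists Fund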
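Research on algorithms for predicting key cancer gene modules by integrating gene mutation and differential expression data
- Grant number: 61902390
- Approval year: 2019
- Funding amount: CNY 250,000
- Project type: Young Scientists Fund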
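Research on efficient algorithms for large-scale protein function prediction
- Grant number: 61872094
- Approval year: 2018
- Funding amount: CNY 650,000
- Project type: General Program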
Similar overseas grants
Elucidating causal mechanisms of ethanol-induced analgesia in BXD recombinant inbred mouse lines
- Grant number: 10825737
- Fiscal year: 2023
- Funding amount: $295,400
- Project type:
Discovering clinical endpoints of toxicity via graph machine learning and semantic data analysis
- Grant number: 10745593
- Fiscal year: 2023
- Funding amount: $295,400
- Project type:
Connecting the universe of proteins to address annotation inequality in the microbial proteome
- Grant number: 10658439
- Fiscal year: 2023
- Funding amount: $295,400
- Project type:
Can one size fit all? - High-Resolution 3D Genome Spatial Organization Inference with Generalizable Models
- Grant number: 10707587
- Fiscal year: 2023
- Funding amount: $295,400
- Project type: