Integrated variation detection annotation and analysis for high-throughput seque
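Basic information
- Grant number: 8448070
- Principal investigator:
- Amount: $344,600
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2012
- Funding country: United States
- Project period: 2012-03-23 to 2017-02-28
- Project status: Completed
- Source: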
- Keywords: Algorithms; Base Sequence; Benchmarking; Biological; Categories; Code; Communities; Computational algorithm; Computer software; Computing Methodologies; Copy Number Polymorphism; DNA Sequence; Data; Data Analyses; Data Sources; Databases; Detection; Development; Disease; Disease susceptibility; Functional RNA; Gene Frequency; Genes; Genetic Variation; Genome; Genomics; Genotype; Glean; Human; Informatics; Knowledge; Lead; Location; Metabolic Pathway; Methods; Modeling; Ontology; Pathway Analysis; Pathway interactions; Phenotype; Population; Property; Reading; Regulator Genes; Research Personnel; Resources; Role; Sampling; Scientific Advances and Accomplishments; Scoring Method; Sequence Alignment; Software Tools; Source; Statistical Methods; System; Testing; Update; Variant; Work; base; data format; dosage; experience; genetic variant; genome-wide; high throughput analysis; human subject; improved; markov model; method development; open source; population based; simulation; technique development; tool; transcriptome sequencing; vector
Project abstract
DESCRIPTION (provided by applicant): High-throughput sequencing (HTS) data on the genomes of a diverse range of species are being produced at an unprecedented rate. However, the development of computational and statistical approaches for handling these data lags behind, creating a gap between the massive data being generated and the biological knowledge that could be gleaned from them. Here we propose to develop an integrated system for genetic variation detection, annotation and analysis for HTS data, thereby reducing the critical gap faced by the community. In Aim 1, we will develop a hidden Markov model (HMM) based computational algorithm that incorporates multiple sources of information, including sequence depth, allelic dosage, population allele frequency and paired-end read distance, for reliable yet efficient detection of copy number variations (CNVs). Given a large list of SNPs, indels and CNVs, researchers face the challenge of identifying a subset of functionally important variants. In Aim 2, we will develop a comprehensive functional annotation pipeline to annotate the functional importance of coding and non-coding variants, utilizing database information from many large-scale genomics projects, and generate a "functional vector" for each variant. These functional vectors can help biologists interpret sequencing results and help statistical geneticists develop informed association tests using sequencing data. Appropriate statistical methods are needed to analyze population-level sequencing data in order to identify genomic variants that may contribute to disease susceptibility or phenotypic variability. In Aim 3, we will develop a hierarchical modeling strategy, which utilizes functional vector information for each variant, to perform association tests on genes, genomic regions, or biological pathways, such as ontology categories and gene regulatory/metabolic pathways.
Finally, in Aim 4, we will test the properties of each approach via simulation and real data analysis, and develop, distribute and support freely available software packages implementing the proposed methods. We believe that well-documented and supported software implementations will allow other researchers to gain the maximum information from the methodological and scientific advances that result from this project. Successful completion of the aims will enable researchers to fully investigate the massive amounts of sequencing data that have been or will be generated, thus contributing to our understanding of how genetic variants influence phenotypic variability.
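As a rough illustration of the HMM idea in Aim 1, the sketch below runs Viterbi decoding over per-window read depths to assign a copy number to each window. It is a minimal sketch under stated assumptions, not the applicants' algorithm: it uses sequence depth only (the proposal also names allelic dosage, population allele frequency and paired-end read distance), and the three copy-number states, the Poisson emission model, the diploid baseline depth of 30 and the stay probability of 0.98 are all hypothetical choices made for the example.

```python
import math

# Assumed for illustration: three copy-number states and a fixed
# probability of staying in the same state between adjacent windows.
COPY_NUMBERS = [1, 2, 3]
STAY_PROB = 0.98


def log_poisson(k, mean):
    """Log of the Poisson probability of observing count k given the mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def viterbi_copy_number(depths, diploid_depth=30.0):
    """Return the most likely copy number for each read-depth window."""
    if not depths:
        return []
    n_states = len(COPY_NUMBERS)
    switch_prob = (1.0 - STAY_PROB) / (n_states - 1)
    log_trans = [
        [math.log(STAY_PROB if i == j else switch_prob) for j in range(n_states)]
        for i in range(n_states)
    ]
    # Expected depth scales linearly with copy number (assumption).
    means = [diploid_depth * cn / 2.0 for cn in COPY_NUMBERS]

    # Start from a uniform prior over states.
    scores = [math.log(1.0 / n_states) + log_poisson(depths[0], m) for m in means]
    backpointers = []
    for depth in depths[1:]:
        new_scores, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: scores[i] + log_trans[i][j])
            pointers.append(best_i)
            new_scores.append(
                scores[best_i] + log_trans[best_i][j] + log_poisson(depth, means[j])
            )
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path back from the highest-scoring final state.
    state = max(range(n_states), key=lambda i: scores[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [COPY_NUMBERS[s] for s in path]


# Made-up depths: a run of roughly half-depth windows in the middle.
depths = [31, 28, 30, 14, 16, 15, 13, 29, 32, 30]
print(viterbi_copy_number(depths))  # [2, 2, 2, 1, 1, 1, 1, 2, 2, 2]
```

The high stay probability is what keeps a single noisy window from being called as a CNV: a state change has to be supported by enough consecutive windows to outweigh the two transition penalties.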
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Kai Wang
Other grants by Kai Wang
Dietary prevention for colorectal cancer: targeting the bile acid/gut microbiome axis
- Grant number: 10723195
- Fiscal year: 2023
- Funding amount: $344,600
- Project category:
Novel bioinformatics methods to detect DNA and RNA modifications using Nanopore long-read sequencing
- Grant number: 10792416
- Fiscal year: 2023
- Funding amount: $344,600
- Project category:
Improving chemical exposome target prediction by application of Coupled Matrix/Tensor-Matrix/Tensor Completion algorithms
- Grant number: 10734136
- Fiscal year: 2023
- Funding amount: $344,600
- Project category:
Detection and annotation of structural variants from long-read sequencing
- Grant number: 10378720
- Fiscal year: 2019
- Funding amount: $344,600
- Project category:
Integrated Variation Detection Annotation and Analysis
- Grant number: 9402354
- Fiscal year: 2016
- Funding amount: $344,600
- Project category:
UNDERSTANDING THE FUNCTIONAL IMPACTS OF GENETIC VARIANTS IN MENTAL DISORDERS
- Grant number: 9389287
- Fiscal year: 2016
- Funding amount: $344,600
- Project category:
Role of MTA3 in trophoblast function and placental development
- Grant number: 8919934
- Fiscal year: 2014
- Funding amount: $344,600
- Project category:
Integrated variation detection annotation and analysis for high-throughput seque
- Grant number: 8813611
- Fiscal year: 2012
- Funding amount: $344,600
- Project category:
Integrated variation detection annotation and analysis for high-throughput seque
- Grant number: 8220672
- Fiscal year: 2012
- Funding amount: $344,600
- Project category:
Integrated variation detection annotation and analysis for high-throughput seque
- Grant number: 8628856
- Fiscal year: 2012
- Funding amount: $344,600
- Project category:
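Similar grants from the National Natural Science Foundation of China (NSFC)
Molecular dynamics simulation and first-principles calculation of the physical properties of DNA
- Grant number: 90203013
- Approval year: 2002
- Funding amount: CNY 210,000
- Project category: Major Research Plan
Study of the molecular evolution and reproductive behavior of the Drosophila D. nasuta subgroup in mainland China
- Grant number: 39670395
- Approval year: 1996
- Funding amount: CNY 120,000
- Project category: General Program
Study of species differentiation and regional distribution of Anopheles sinensis in China
- Grant number: 39570647
- Approval year: 1995
- Funding amount: CNY 85,000
- Project category: General Program
Sequence analysis of the canine c-yes oncogene
- Grant number: 39570554
- Approval year: 1995
- Funding amount: CNY 90,000
- Project category: General Program
Exploring the evolution of the rice antibacterial gene family through changes in base sequence
- Grant number: 39270054
- Approval year: 1992
- Funding amount: CNY 50,000
- Project category: General Program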
Similar overseas grants
Proteasomal recruiters of PAX3-FOXO1 Designed via Sequence-Based Generative Models
- Grant number: 10826068
- Fiscal year: 2023
- Funding amount: $344,600
- Project category:
Understanding genomic stability between generations by assessing mutational burdens in single sperms
- Grant number: 10740598
- Fiscal year: 2023
- Funding amount: $344,600
- Project category:
Method development for simultaneous automatic assignment and structure determination in protein NMR
- Grant number: 10373305
- Fiscal year: 2022
- Funding amount: $344,600
- Project category:
A Comprehensive Genomic Community Resource of Transcriptional Regulation
- Grant number: 10411262
- Fiscal year: 2022
- Funding amount: $344,600
- Project category:
A Comprehensive Genomic Community Resource of Transcriptional Regulation
- Grant number: 10842047
- Fiscal year: 2022
- Funding amount: $344,600
- Project category: