Identifying structural variants influencing human health in population cohorts
Basic information
- Grant number: 10889519
- Principal investigator:
- Amount: $400,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-09-01 to 2024-08-31
- Project status: Completed
- Source:
- Keywords: Algorithm Design; Alleles; Base Pairing; Collaborations; Complex; Computer software; Computing Methodologies; Copy Number Polymorphism; DNA Sequence; Data; Data Set; Detection; Disease; Distant; Exclusion; Gene Frequency; Genetic; Genetic Diseases; Genetic Polymorphism; Genome; Genomics; Genotype; Haplotypes; Health; Human; Human Genome; Individual; Inherited; Japan; Letters; Linkage Disequilibrium; Maps; Methods; Mutation; Outcome; Participant; Phenotype; Polymorphism Analysis; Population; Publications; Research; Research Personnel; Resources; SNP array; SNP genotyping; Sampling; Signal Transduction; Single Nucleotide Polymorphism; Statistical Algorithm; Statistical Methods; Structure; Variant; Veterans; biobank; cohort; computational pipelines; data integration; detection sensitivity; exome; exome sequencing; experience; genetic association; genetic resource; genetic variant; genome sequencing; genome wide association study; genomic data; improved; insertion/deletion mutation; insight; neural network; phenotypic data; programs; repository; statistics; targeted treatment; trait; whole genome
Project Summary/Abstract
Large-scale biobank resources of genetic and phenotypic data hold great promise for revealing insights into disease genetics and enabling genetically-informed, targeted therapeutics. To more fully realize this potential, new statistical methods are needed to recover latent information about genomic structural variants – i.e., polymorphisms modifying >50 base pairs of DNA sequence – within these data sets. Because of their large size, structural variants collectively contribute more base pairs of variation within an individual’s genome than single-nucleotide polymorphisms (SNPs) or short indels. However, structural variants have been difficult to identify and genotype from the SNP-array and short-read sequencing data generated by biobanks to date.
We will undertake a research program to develop a new suite of “haplotype-informed” statistical algorithms designed to accurately and efficiently genotype structural variants in large biobank data sets. This approach will leverage the fact that population-polymorphic genetic variants are typically carried by multiple individuals within a large cohort who co-inherited an extended SNP-haplotype. Identification of such shared haplotypes will enable information about a structural variant carried by one individual to inform detection of the same variant carried by other distantly related individuals, simultaneously facilitating structural variant genotyping, variant harmonization, and haplotype-resolved analysis.
This project will have three specific aims. First, we will develop haplotype-informed computational methods that improve detection sensitivity and genotyping accuracy for several classes of structural variation using short-read sequencing data. These methods will be particularly helpful for analysis of short copy-number variants (CNVs) from exome-sequencing data and for analysis of multi-allelic CNVs and large repeats from exome- or genome-sequencing data. Second, we will develop methods for imputing structural variants into genotype-phenotype association data sets – a statistical approach that has been extremely effective in genome-wide association studies (GWAS) of SNPs and indels but has been difficult to apply to structural variants. We will develop new methods to impute structural variants from short- or long-read-based reference panels and will also develop a pipeline for imputing and fine-mapping structural variant associations into GWAS summary statistics. Third, we will genotype structural variants in multiple large genetic biobank data sets and identify associated health outcomes. We will return haplotype-resolved structural variant call sets for use by other researchers. We anticipate that these efforts will reveal new structural variant polymorphisms with large phenotypic effects, augment existing biobank resources, and enable imputation into further data sets.
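The "haplotype-informed" idea in the abstract, where evidence for a structural variant in one carrier informs detection in others who share the same extended SNP haplotype, can be illustrated with a toy sketch. This is not the project's algorithm: the function names, the exact-match neighbour rule, the `max_mismatches` parameter, and the example scores are all hypothetical, and real methods would use probabilistic haplotype matching over much longer phased segments.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of SNP positions at which two haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def haplotype_informed_scores(
    haplotypes: List[Sequence[int]],
    raw_scores: List[float],
    max_mismatches: int = 0,
) -> List[float]:
    """Average each haplotype's noisy SV evidence with that of its
    haplotype neighbours (those within max_mismatches SNP differences).

    Hypothetical simplification: a plain mean over neighbours stands in
    for a proper statistical model of shared ancestry.
    """
    smoothed = []
    for hap in haplotypes:
        neighbours = [
            raw_scores[j]
            for j, other in enumerate(haplotypes)
            if hamming(hap, other) <= max_mismatches
        ]
        # Each haplotype matches itself, so the list is never empty.
        smoothed.append(sum(neighbours) / len(neighbours))
    return smoothed


# Five phased haplotypes over five SNPs flanking a candidate SV (made-up data).
haplotypes = [
    (0, 1, 1, 0, 1),
    (0, 1, 1, 0, 1),
    (0, 1, 1, 0, 1),
    (1, 0, 0, 1, 0),
    (1, 0, 0, 1, 0),
]
# Noisy per-haplotype evidence that the SV is present (made-up values).
raw_scores = [0.9, 0.4, 0.8, 0.1, 0.2]

smoothed = haplotype_informed_scores(haplotypes, raw_scores)
print(smoothed)
```

In this toy input the second haplotype has weak evidence on its own (0.4), but pooling with the two haplotypes that share its SNP background pulls its score toward theirs, which is the intuition behind borrowing information across distantly related carriers.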
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Po-Ru Loh
Leveraging biobank-scale whole-genome sequencing for polygenic risk prediction
- Grant number: 10716534
- Fiscal year: 2023
- Funding amount: $400,000
- Project category:
Fast and powerful extensions of mixed model methods for GWAS
- Grant number: 8712922
- Fiscal year: 2014
- Funding amount: $400,000
- Project category:
Fast and powerful extensions of mixed model methods for GWAS
- Grant number: 8974184
- Fiscal year: 2014
- Funding amount: $400,000
- Project category:
Fast and powerful extensions of mixed model methods for GWAS
- Grant number: 9186420
- Fiscal year: 2014
- Funding amount: $400,000
- Project category:
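Similar grants from the National Natural Science Foundation of China (NSFC)
Construction of an allele aggregation network model and its application to leaf trichome development
- Grant number: 32370714
- Approval year: 2023
- Funding amount: CNY 500,000
- Project category: General Program
Mechanisms of growth heterosis in Liriodendron based on allelic imbalance in expression
- Grant number: 32371910
- Approval year: 2023
- Funding amount: CNY 500,000
- Project category: General Program
Mutant-allele-specific knockout as a treatment for type 1 and type 2 long QT syndrome, studied using human induced pluripotent stem cells
- Grant number: 82300353
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Mechanisms by which different ACR11A alleles regulate the low-temperature stress response in tomato
- Grant number: 32302535
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund
Mechanism by which phoPQ allelic differences mediate the coexistence of distinct subpopulations in polymyxin heteroresistance of Enterobacter
- Grant number: 82302575
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund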
Similar overseas grants
New approaches for leveraging single-cell data to identify disease-critical genes and gene sets
- Grant number: 10768004
- Fiscal year: 2023
- Funding amount: $400,000
- Project category:
Structural Variation analysis of Orofacial Cleft associated genomic regions in African and Asian populations
- Grant number: 10643334
- Fiscal year: 2023
- Funding amount: $400,000
- Project category:
Targeting DNA Mismatches for Auger Electron Radiotherapy
- Grant number: 10751210
- Fiscal year: 2023
- Funding amount: $400,000
- Project category:
Develop new bioinformatics infrastructures and computational tools for epitranscriptomics data
- Grant number: 10633591
- Fiscal year: 2023
- Funding amount: $400,000
- Project category:
Core C - Mutagenesis, Screening and Cryopreservation
- Grant number: 10642552
- Fiscal year: 2023
- Funding amount: $400,000
- Project category: