A scalable, integrative, multi-omic analysis platform
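Basic information
- Grant number: 9295640
- Principal investigator:
- Amount: $156,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-05-01 to 2019-04-30
- Project status: Completed
- Source: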
- Keywords: ATAC-seq, Address, Affect, Algorithmic Analysis, Algorithmic Software, Algorithms, Alleles, Architecture, Arithmetic, Biological Assay, Catalogs, ChIP-seq, Code, Collection, Complex, Computer software, Custom, Data, Data Aggregation, Data Analytics, Data Set, Development, Disease, Enhancers, Foundations, Future, Gene Expression, Gene Expression Regulation, Gene Frequency, Genes, Genetic, Genetic Variation, Genetic study, Genome, Genomic Segment, Genomics, Genotype, Genotype-Tissue Expression Project, Haplotypes, Hereditary Disease, Heritability, Heterozygote, Hour, Human Genome, Imagery, Individual, Internet, Light, Liquid substance, Metadata, Methods, Names, Phase, Phenotype, Population, Process, Publishing, Rare Diseases, Research, Research Training, Running, Scheme, Sequence Alignment, System, Technology, Tissues, Training, Trans-Omics for Precision Medicine, Utah, Variant, Work, base, cell type, cohort, data integration, disorder risk, epigenomics, experimental study, genetic variant, genome annotation, genome browser, genome sequencing, improved, indexing, innovation, insight, novel, operation, programs, rare variant, risk variant, search engine, statistics, tool, trait, transcriptome sequencing, web interface, whole genome
Project abstract
PROJECT SUMMARY
Despite decades of effort, only a small portion of the heritability of genetic disorders can currently be explained.
Two explanations for this gap are that the underlying genetic variants are rare and currently unknown, and that we
have a poor understanding of the impact of the variants we do know, in particular those residing outside of
the coding regions. Addressing these issues requires both larger cohorts and more whole-genome functional
assays (e.g., RNA-seq, ChIP-seq, ATAC-seq). In recognition of this, projects like the Center for Common
Genetic Disorders (CCGD), the Trans-Omics for Precision Medicine (TOPMed) Program, and ENCODE are
gathering massive amounts of genetic data across many different individuals and tissues. In
aggregate, these data will dramatically improve our power to understand how variation affects genomic
architecture. The challenge is that these data are vast, complex, and multidimensional, and current methods
cannot operate at this scale.
This proposal addresses this challenge by splitting the data into two distinct types, genotypes
and genome annotations, and developing technologies that are optimized to store and search each type
independently. These two highly scalable methods, which will be extremely valuable on their own, will then be
integrated into a single system that enables queries across variation, gene expression, and regulation. For
example, consider the question, “Are there any tissues where de novo variants in cases have a differential
enrichment versus those in controls?” This question is decomposed into a genotype query that produces two
sets of variants: de novo variants in cases and de novo variants in controls. The sets then serve as input queries into a
genome annotation search across all putative enhancers in all tissues.
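The two-stage decomposition described above can be illustrated with a minimal Python sketch. It is not the proposed system or GQT: the genotype table, trio list, enhancer intervals, and both function names are hypothetical and invented for illustration, and plain dictionaries stand in for the compressed indexes the proposal describes. The first function plays the role of the genotype query, the second the genome annotation search.

```python
# Toy data only: every sample, variant, and enhancer below is made up.
# Genotypes map (chromosome, position) -> {sample: alternate-allele count}.
GENOTYPES = {
    ("chr1", 100): {"kid_a": 1, "mom_a": 0, "dad_a": 0, "kid_b": 0, "mom_b": 0, "dad_b": 0},
    ("chr1", 120): {"kid_a": 1, "mom_a": 1, "dad_a": 0, "kid_b": 0, "mom_b": 0, "dad_b": 0},
    ("chr1", 130): {"kid_a": 1, "mom_a": 0, "dad_a": 0, "kid_b": 0, "mom_b": 0, "dad_b": 0},
    ("chr1", 450): {"kid_a": 0, "mom_a": 0, "dad_a": 0, "kid_b": 1, "mom_b": 0, "dad_b": 0},
}

# Each trio is (child, mother, father, status of the child).
TRIOS = [
    ("kid_a", "mom_a", "dad_a", "case"),
    ("kid_b", "mom_b", "dad_b", "control"),
]

# Putative enhancers per tissue as half-open intervals (chromosome, start, end).
ENHANCERS = {
    "heart": [("chr1", 50, 150)],
    "liver": [("chr1", 400, 500)],
}


def de_novo_variants(genotypes, trios, status):
    """Genotype query: variants carried by a child of the given status
    and by neither of that child's parents."""
    hits = set()
    for variant, calls in genotypes.items():
        for child, mother, father, trio_status in trios:
            if trio_status != status:
                continue
            if calls[child] > 0 and calls[mother] == 0 and calls[father] == 0:
                hits.add(variant)
    return hits


def count_enhancer_hits(variants, enhancers):
    """Annotation query: for each tissue, count the variants that fall
    inside at least one enhancer interval of that tissue."""
    counts = {}
    for tissue, intervals in enhancers.items():
        counts[tissue] = sum(
            any(chrom == c and start <= pos < end for c, start, end in intervals)
            for chrom, pos in variants
        )
    return counts
```

Calling `de_novo_variants(GENOTYPES, TRIOS, "case")` and then passing the result to `count_enhancer_hits(..., ENHANCERS)` gives per-tissue counts for cases; repeating the call with `"control"` gives the comparison set. A real analysis would follow this with a statistical enrichment test and would replace the linear scans with indexed searches, both of which are omitted here.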
This proposal builds upon both my recently published Genotype Query Tools (GQT), a method that
achieved vast speedups over other methods by operating directly on a compressed genotype index, and my
past research and training in genome arithmetic algorithms, for which I have published multiple novel
algorithms. Up to now I have focused on methods, so while the K99 phase of this project will include
development, it will have a distinct focus on the analysis of disease cohorts. This additional training will be the
foundation of an independent research program that will unlock the potential of large-scale genomics and
functional data sets, providing fast and fluid integration of phenotype, genotype, and functional
data.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Ryan M Layer
Mining Thousands of Genomes to Classify Somatic and Pathogenic Structural Variants
- Grant number: 10709480
- Fiscal year: 2022
- Funding amount: $156,800
- Project category:
Mining Thousands of Genomes to Classify Somatic and Pathogenic Structural Variants
- Grant number: 10453323
- Fiscal year: 2022
- Funding amount: $156,800
- Project category:
A scalable, integrative, multi-omic analysis platform
- Grant number: 9769844
- Fiscal year: 2018
- Funding amount: $156,800
- Project category:
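Similar National Natural Science Foundation of China (NSFC) grants
Ontology-driven spatial semantic modeling of address data and address matching methods
- Grant number: 41901325
- Approval year: 2019
- Funding amount: 220,000 CNY
- Project category: Young Scientists Fund project
Research on neuromorphic vision object recognition algorithms driven by spatiotemporal sequences
- Grant number: 61906126
- Approval year: 2019
- Funding amount: 240,000 CNY
- Project category: Young Scientists Fund project
Research on memory-safety defense techniques for memory attack targets
- Grant number: 61802432
- Approval year: 2018
- Funding amount: 250,000 CNY
- Project category: Young Scientists Fund project
Research on address mapping table design optimization and memory access optimization for large-capacity solid-state drives
- Grant number: 61802133
- Approval year: 2018
- Funding amount: 230,000 CNY
- Project category: Young Scientists Fund project
Research on IP-address-driven multipath routing and traffic transmission control
- Grant number: 61872252
- Approval year: 2018
- Funding amount: 640,000 CNY
- Project category: General Program project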
Similar overseas grants
Sex, Physiological State, and Genetic Background Dependent Molecular Characterization of Circuits Governing Parental Behavior
- Grant number: 10661884
- Fiscal year: 2023
- Funding amount: $156,800
- Project category:
Characterizing HIV-1 reservoirs in the central nervous system
- Grant number: 10772268
- Fiscal year: 2023
- Funding amount: $156,800
- Project category:
Gene regulatory networks in early lung epithelial cell fate decisions
- Grant number: 10587615
- Fiscal year: 2023
- Funding amount: $156,800
- Project category:
Distinct Glycophenotypes with Abnormal Signaling Define a Subpopulation of B cells Responsible for Production of Galactose-Deficient IgA1, the Main Autoantigen in IgA Nephropathy
- Grant number: 10563618
- Fiscal year: 2023
- Funding amount: $156,800
- Project category:
Understanding the effects of Gender Affirming Hormone Therapy (GAHT) on immune function using a systems immunology approach
- Grant number: 10749957
- Fiscal year: 2023
- Funding amount: $156,800
- Project category: