High dimensional statistical data integration for studying regulatory variation
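Basic information
- Grant number: 9344668
- Principal investigator:
- Amount: $325,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2007
- Funding country: United States
- Project period: 2007-04-26 to 2020-06-30
- Project status: Completed
- Source: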
- Keywords: Address; Binding; Binding Sites; Bioconductor; Biologic Characteristic; Cells; ChIP-seq; Chromatin; Collection; Communities; Computer Analysis; Computer software; DNA; DNA Methylation; DNA-Protein Interaction; Data; Data Set; Data Sources; Derivation procedure; Development; Diagnosis; Disease; Elements; Galaxy; Generations; Genetic; Genetic Transcription; Genome; Genomics; Genotype; Histones; Human; Individual; International; Investigation; Joints; Knowledge; Letters; Location; Maps; Messenger RNA; Methodology; Methods; Phenotype; Protein Isoforms; RNA; RNA analysis; RNA-Binding Proteins; RNA-Protein Interaction; Regulation; Repetitive Sequence; Research; Research Personnel; Resources; Sampling; Source; Statistical Data Interpretation; Statistical Methods; Statistical Models; Technology; Tissues; Training; Untranslated RNA; Validation; Variant; base; cell type; crosslinking and immunoprecipitation sequencing; data integration; epigenome; experience; experimental study; genetic variant; genome wide association study; genome-wide; genomic data; genomic profiles; high dimensionality; high throughput technology; histone modification; human disease; improved; innovation; next generation sequencing; novel; protein profiling; reference genome; simulation; tool; trait; transcription factor; whole genome
Project Summary
Next-generation sequencing (NGS) technologies have revolutionized the fields of genetics and
genomics by allowing rapid and inexpensive sequencing of billions of bases. Although
basic analysis tools for each individual data type are abundant, statistical methods that
can integrate different sources of data to address key, challenging questions are
lacking. We propose to develop integrative methods for critical, widely used applications
that urgently require reliable statistical integration tools. At the core of our methods is
the effective integration of multiple appropriate data types with novel statistical methods.
First, although large numbers of protein-DNA interactions and histone modifications have
been mapped to date, systematic methods that allow users to query these data and
generate testable hypotheses are lacking. Second, in parallel with the generation of
(epi)genomic profiles, genome-wide association studies (GWAS) have been successful
at identifying disease- and trait-associated genetic variants (GVs). However, our ability to
identify causal variants and elucidate the mechanisms by which genotypes influence
phenotypes is hampered by significant obstacles. Third, although the utility of reads that
map to multiple locations on the reference genome (multi-reads) has been well
established for some NGS applications such as RNA-seq and ChIP-seq, all analysis
methods for the emerging data type CLIP-seq, which interrogates RNA-binding proteins,
rely on using only reads that map uniquely to the reference genome (uni-reads), leading to
unreliable inference. We plan to address these critical challenges by developing (1) fast
and scalable integrative statistical methods for joint analysis of multiple ChIP-seq
datasets, enabling both inference at the level of individual datasets and identification of
joint effects; (2) a statistical analysis framework for integrating GWAS results with the
increasing number of genome-wide maps of functional annotations; and (3) an integrative
multi-read mapping framework for studying RNA-protein interactions through CLIP-seq
experiments. The projects will be accomplished through a combination of methodological
development, simulation, computational analysis, and experimental validation. Methods
will be developed and evaluated using datasets from ENCODE and REMC as well
as novel datasets from collaborators. Statistical resources generated from the project will
be disseminated through publicly available software. Collectively, these aims will significantly
improve the utility of genome-wide data types that are available to researchers.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Sunduz Keles
Statistical methods for co-expression network analysis of population-scale scRNA-seq data
- Grant number: 10740240
- Fiscal year: 2023
- Funding amount: $325,000
- Project category:
Functionally relevant mapping of human GWAS SNPs on model organisms
- Grant number: 10056966
- Fiscal year: 2020
- Funding amount: $325,000
- Project category:
Statistical Power Calculations for ChIP-seq experiments
- Grant number: 8284083
- Fiscal year: 2012
- Funding amount: $325,000
- Project category:
High dimensional statistical data modeling and integration for studying regulatory variation
- Grant number: 10413927
- Fiscal year: 2007
- Funding amount: $325,000
- Project category:
Statistical Analysis Methods and Software for ChIP-seq Data
- Grant number: 8605900
- Fiscal year: 2007
- Funding amount: $325,000
- Project category:
Statistical Analysis Methods and Software for ChIP-seq Data
- Grant number: 8785690
- Fiscal year: 2007
- Funding amount: $325,000
- Project category:
Statistical Methods for the Analysis of ChIP-chip Data
- Grant number: 7253510
- Fiscal year: 2007
- Funding amount: $325,000
- Project category:
Statistical Analysis Methods and Software for ChIP-seq Data
- Grant number: 8370723
- Fiscal year: 2007
- Funding amount: $325,000
- Project category:
Statistical Methods for the Analysis of ChIP-chip Data
- Grant number: 7799293
- Fiscal year: 2007
- Funding amount: $325,000
- Project category:
High dimensional statistical data modeling and integration for studying regulatory variation
- Grant number: 10610872
- Fiscal year: 2007
- Funding amount: $325,000
- Project category:
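Similar NSFC grants
Molecular mechanism by which cap-binding protein regulates ethylene signal transduction
- Grant number:
- Approval year: 2021
- Funding amount: 580,000 CNY
- Project category:
Optimizing and engineering α-conotoxins using a new molecular disulfide-stapling strategy
- Grant number: 82104024
- Approval year: 2021
- Funding amount: 300,000 CNY
- Project category: Young Scientists Fund
Molecular mechanisms by which the CST protein complex regulates telomerase removal and C-strand fill-in during telomere replication
- Grant number: 31900521
- Approval year: 2019
- Funding amount: 260,000 CNY
- Project category: Young Scientists Fund
Function and mechanism of action of the Wdr47 protein in neuronal polarization
- Grant number: 31900503
- Approval year: 2019
- Funding amount: 260,000 CNY
- Project category: Young Scientists Fund
Mechanism of action of ID1 (Inhibitor of DNA binding 1) in foot-and-mouth disease virus infection
- Grant number: 31672538
- Approval year: 2016
- Funding amount: 620,000 CNY
- Project category: General Program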
Similar overseas grants
The Role of Glycosyl Ceramides in Heart Failure and Recovery
- Grant number: 10644874
- Fiscal year: 2023
- Funding amount: $325,000
- Project category:
Development of antibody drug conjugates as pan-filo antivirals
- Grant number: 10759731
- Fiscal year: 2023
- Funding amount: $325,000
- Project category:
Unraveling how Lipophilic Modulators Alter pLGIC Function via Interactions with the M4 Transmembrane Helix
- Grant number: 10785755
- Fiscal year: 2023
- Funding amount: $325,000
- Project category:
Targeting Trained Immunity in Trauma-Induced Immune Dysregulation
- Grant number: 10714384
- Fiscal year: 2023
- Funding amount: $325,000
- Project category:
A novel role of cholesterol and SR-BI in adipocyte biology
- Grant number: 10733720
- Fiscal year: 2023
- Funding amount: $325,000
- Project category: