A Comprehensive Genomic Community Resource of Transcriptional Regulation
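Basic information
- Grant number: 10842047
- Principal investigator:
- Amount: $202,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-06-01 to 2027-03-31
- Project status: Ongoing
- Source: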
- Keywords: ATAC-seq; Algorithms; Atlases; Automobile Driving; Base Pairing; Benchmarking; Binding; CRISPR/Cas technology; Catalogs; Cells; ChIP-seq; Chromatin; Code; Collaborations; Collection; Communities; Community Outreach; Computer Models; DNA; DNA Sequence; Data; Data Analyses; Data Set; Development; Disease; Education and Outreach; Educational workshop; Elements; Epigenetic Process; Exons; Functional disorder; Future; Genes; Genomics; Histones; Human; Human BioMolecular Atlas Program; Human Genome; Human Genome Project; Human body; Individual; International; Interruption; Maps; Mediating; Methods; Modeling; Nematoda; Online Systems; Organism; Pattern; Physiology; Process; Quality Control; Registries; Regulatory Element; Research; Research Personnel; Resolution; Resources; Role; Scheme; Signal Transduction; Specific qualifier value; Techniques; Technology; Testing; Time; Tissues; Training; Trans-Omics for Precision Medicine; Transcriptional Regulation; Untranslated RNA; Variant; Visualization; Work; base; cell type; community building; community setting; data analysis pipeline; data repository; deep learning; deep learning model; deep sequencing; design; epigenome; epigenomics; experimental study; follow-up; genome wide association study; in silico; in vivo Model; machine learning model; model development; novel; online resource; outreach; predictive modeling; public repository; repository; sequence learning; syntax; tool; trait; transcription factor
Project Summary/Abstract
The Human Genome Project (HGP) completed the first draft human genome sequence two decades ago. The
HGP revealed that human complexity arises from only approximately 20,000 coding genes, roughly the same
number as much simpler organisms such as nematodes. Intricate patterns of transcriptional regulation mediated
by non-coding regulatory elements specify the myriad cell types and states required for human complexity.
Genome-wide association studies have subsequently identified thousands of disease-associated variants, many
of which interrupt the function of these non-coding elements to disrupt transcriptional regulation. Thus, in order
to better understand human physiology and pathophysiology, comprehensive atlases of regulatory elements are
essential. Many previous efforts, including the International Human Epigenome Consortium (IHEC), the
FANTOM Consortium, the Roadmap Epigenomics Project, and the ENCODE Project, have aimed to build
comprehensive collections of regulatory elements, as well as computational models to better predict regulatory
activity and understand the sequence features underlying regulatory function. ENCODE (2003-2022) is a large-
scale consortium effort which aims to annotate every functional non-coding element of the human genome;
during our work on the project, we built a Registry of approximately 1 million human candidate cis-regulatory
elements (cCREs). We further developed deep-learning approaches which model the transcription factor motif
syntax that underlies element function at base-pair resolution and built two web-based resources, SCREEN and
Factorbook, to make our results accessible to the scientific community. Here, we propose to extend this
framework to build the Community Resource for Transcriptional Regulation (CRTR), a comprehensive atlas of
non-coding regulatory elements and machine-learning models which will encompass community and consortium
deep-sequencing data, both bulk and single cell, across a broad array of cell types and states. Our project has
five aims. First, we aim to curate community and consortium data for inclusion in CRTR and perform uniform
processing and quality control. Second, we aim to train deep-learning sequence models on bulk epigenetic
datasets to identify transcription factor motif syntax driving regulatory element activity in distinct tissues and cell
types. Third, we aim to train sequence models on single cell datasets to identify transcription factor motif syntax
driving transcriptional regulation in high-resolution cell states and during cell state transitions. Fourth, we aim to
use the aforementioned results to build comprehensive benchmark datasets and machine-learning model
collections, which will aid future analysts in designing new models to predict regulatory readouts. Fifth, we aim
to build a state-of-the-art web-based user interface to enable users to perform integrative analyses and in silico
experimentation with CRTR, and hold workshops and other outreach to maximize the impact of the resource and
its accessibility to the broader scientific community.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Anshul Kundaje
Multi-Omics DACC: The Data Analysis and Coordination Center for the collaborative multi-omics for health and disease initiative
- Grant number: 10744561
- Fiscal year: 2023
- Funding amount: $202,800
- Project category:
A Comprehensive Genomic Community Resource of Transcriptional Regulation
- Grant number: 10411262
- Fiscal year: 2022
- Funding amount: $202,800
- Project category:
A Comprehensive Genomic Community Resource of Transcriptional Regulation
- Grant number: 10625529
- Fiscal year: 2022
- Funding amount: $202,800
- Project category:
Identifying causal genetic variants and molecular mechanisms impacting mental health
- Grant number: 10571911
- Fiscal year: 2021
- Funding amount: $202,800
- Project category:
Identifying causal genetic variants and molecular mechanisms impacting mental health
- Grant number: 10380573
- Fiscal year: 2021
- Funding amount: $202,800
- Project category:
Predicting context-specific molecular and phenotypic effects of genetic variation through the lens of the cis-regulatory code
- Grant number: 10659170
- Fiscal year: 2021
- Funding amount: $202,800
- Project category:
Predicting context-specific molecular and phenotypic effects of genetic variation through the lens of the cis-regulatory code
- Grant number: 10297562
- Fiscal year: 2021
- Funding amount: $202,800
- Project category:
Predicting context-specific molecular and phenotypic effects of genetic variation through the lens of the cis-regulatory code
- Grant number: 10474459
- Fiscal year: 2021
- Funding amount: $202,800
- Project category:
Multi-omic functional assessment of novel AD variants using high-throughput and single-cell technologies
- Grant number: 10684210
- Fiscal year: 2021
- Funding amount: $202,800
- Project category:
Multi-omic functional assessment of novel AD variants using high-throughput and single-cell technologies
- Grant number: 10436207
- Fiscal year: 2021
- Funding amount: $202,800
- Project category:
Similar grants from the National Natural Science Foundation of China
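Research on an integrated remote-sensing retrieval algorithm for multiple components of shortwave radiation at the surface and the top of the atmosphere
- Grant number: 42371342
- Approval year: 2023
- Funding amount: 520,000 CNY
- Project category: General Program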
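Integrated optimization model and dual decomposition algorithm for flexible high-speed railway train timetables
- Grant number: 72361020
- Approval year: 2023
- Funding amount: 270,000 CNY
- Project category: Regional Science Fund Project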
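Algorithm design and analysis for stochastic density functional theory
- Grant number: 12371431
- Approval year: 2023
- Funding amount: 435,000 CNY
- Project category: General Program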
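Research on operational risk identification algorithms and proactive intervention methods for large trucks on expressways based on holographic traffic data
- Grant number: 52372329
- Approval year: 2023
- Funding amount: 490,000 CNY
- Project category: General Program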
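Research on efficient algorithms for solving adversarial team games with imperfect information
- Grant number: 62376073
- Approval year: 2023
- Funding amount: 510,000 CNY
- Project category: General Program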
Similar overseas grants
New software tools for differential analysis of single-cell genomics perturbation experiments
- Grant number: 10735033
- Fiscal year: 2023
- Funding amount: $202,800
- Project category:
Cell type harmonization of single cell data in HuBMAP and GTEx
- Grant number: 10777089
- Fiscal year: 2023
- Funding amount: $202,800
- Project category:
Precision Medicine Digital Twins for Alzheimer’s Target and Drug Discovery and Longevity
- Grant number: 10727793
- Fiscal year: 2023
- Funding amount: $202,800
- Project category:
A Comprehensive Genomic Community Resource of Transcriptional Regulation
- Grant number: 10411262
- Fiscal year: 2022
- Funding amount: $202,800
- Project category: