Predicting 3D physical gene-enhancer interactions through integration of GTEx and 4DN data
Basic information
- Grant number: 10776871
- Principal investigator:
- Amount: $298,200
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-09-20 to 2024-09-19
- Project status: Completed
- Source:
- Keywords: 3-Dimensional; ATAC-seq; Address; Affect; Algorithms; Biological Assay; CRISPR interference; Cells; ChIP-seq; Chromatin; Chromatin Modeling; Chromosomes; Computer Analysis; DNA; DNase I hypersensitive sites sequencing; Data; Databases; Elements; Enhancers; Epigenetic Process; Frequencies; Funding; Gene Expression; Gene Expression Regulation; Gene Targeting; Genes; Genome; Genomics; Genotype; Genotype-Tissue Expression Project; Health; Hi-C; Human; Investigation; Link; Location; Machine Learning; Maps; Methods; Modeling; Molecular Conformation; Polymers; Principal Investigator; Quantitative Trait Loci; Regulatory Element; Resources; Structure; Techniques; Testing; Tissues; Training; Trust; Untranslated RNA; Validation; Variant; causal variant; cell type; chromosome conformation capture; computerized tools; cost; data integration; data resource; data standards; deep learning; deep learning model; epigenomics; gene interaction; genetic variant; genome sequencing; genome wide association study; genome-wide; genome-wide analysis; histone modification; improved; innovation; insight; large scale simulation; machine learning prediction; programs; risk variant; simulation; tool; transcriptome sequencing; trustworthiness; whole genome
Project abstract
Program Director/Principal Investigator (Liang, Jie):
PROJECT SUMMARY/ABSTRACT
We will develop computational tools that facilitate investigation of the fundamental relationship
between gene expression and genome topology. Specifically, we will develop machine learning tools
that link enhancers to their target genes at genome-wide scale. The ability to establish
relationships between enhancers and their target genes is critically important, as it will aid our
understanding of gene regulation and help relate noncoding risk variants
from GWAS to potential causal genes. Our approach will be based on 3D polymer models of
chromatin interactions derived from Hi-C data in the Common Fund 4D Nucleome (4DN) database,
and will integrate data from the Common Fund-supported Genotype-Tissue Expression (GTEx)
database, as well as data from the ENCODE database. We will 1) construct a trusted, high-quality
database of candidate enhancer-gene target pairs. We will then 2) use this database to train a
machine learning predictor that can predict enhancer-gene target pairs at genome-wide scale. For 1),
we will develop a pipeline to identify a small set of critical specific chromatin 3D interactions through
simulation of large-scale folding of 3D chromatin ensembles. This small set of specific interactions will
be tested for whether it is sufficient for chromatin folding. We will then computationally identify enhancers
based on epigenetic histone modifications and chromatin accessibility data from ENCODE as well as the
Roadmap Epigenomics Project. We will then select enhancers containing eQTLs from the GTEx
database, which are known to affect the expression of the target gene. The end result will be a
high-quality and trustworthy database of enhancer-gene pairs, which will be defined by the predicted
critical specific 3D physical chromatin interactions connecting the eQTL-containing enhancer and the
target gene. For 2), we will develop a machine learning predictor that predicts enhancer-gene
interactions from genomic, epigenomic, and Hi-C data at genome-wide scale. We will combine
epigenetic data with genomic information (such as sequence motifs of TFs) as features. We will then
train a machine learning predictor using hold-out and cross-validation on the constructed
database of enhancer-target gene pairs from 1). The efficacy of the predictor will then be assessed
against gold-standard CRISPRi-FlowFISH data. We will then carry out large-scale
computations and construct databases of predicted enhancer-gene relationships for selected cell
types. Overall, we will demonstrate the significant added power of integrating two important Common
Fund data resources and will provide tools to facilitate understanding of the relationship between
genome topology and gene expression. Our computational tools will lead to new insight into the
relationship between genome structure and genome function, which is important for improving human health.
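The selection logic in step 1) can be illustrated with a minimal Python sketch. It is not the project's pipeline. It assumes enhancers, eQTLs, gene TSS positions, and the set of critical 3D interactions are already available as simple in-memory records, and that interactions are stored as binned locus pairs. All names (`Enhancer`, `Eqtl`, `candidate_pairs`, `BIN_SIZE`), the gene labels, and the coordinates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

BIN_SIZE = 5_000  # assumed resolution of the 3D interaction map, in base pairs


@dataclass(frozen=True)
class Enhancer:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


@dataclass(frozen=True)
class Eqtl:
    chrom: str
    pos: int   # variant position
    gene: str  # gene whose expression the variant is associated with


def to_bin(pos: int) -> int:
    """Map a genomic position to its interaction-map bin index."""
    return pos // BIN_SIZE


def candidate_pairs(enhancers, eqtls, gene_tss, interactions):
    """Keep (enhancer, gene) pairs where the enhancer contains an eQTL for the
    gene AND a listed 3D interaction joins an enhancer bin to the gene's TSS bin.

    interactions: set of (chrom, bin_lo, bin_hi) tuples with bin_lo <= bin_hi.
    gene_tss: dict mapping gene name -> (chrom, tss_position).
    """
    pairs = set()
    for enh in enhancers:
        enh_bins = range(to_bin(enh.start), to_bin(enh.end - 1) + 1)
        for q in eqtls:
            # The eQTL must fall inside the enhancer interval.
            if q.chrom != enh.chrom or not (enh.start <= q.pos < enh.end):
                continue
            tss = gene_tss.get(q.gene)
            if tss is None or tss[0] != enh.chrom:
                continue
            tss_bin = to_bin(tss[1])
            # Require a 3D interaction between the enhancer and the TSS.
            if any((enh.chrom, *sorted((b, tss_bin))) in interactions
                   for b in enh_bins):
                pairs.add((enh, q.gene))
    return sorted(pairs, key=lambda p: (p[0].chrom, p[0].start, p[1]))


# Toy inputs (entirely made up).
enhancers = [Enhancer("chr1", 10_000, 12_000), Enhancer("chr1", 50_000, 51_000)]
eqtls = [Eqtl("chr1", 11_500, "GENE_A"), Eqtl("chr1", 50_200, "GENE_B")]
gene_tss = {"GENE_A": ("chr1", 102_000), "GENE_B": ("chr1", 300_000)}
interactions = {("chr1", 2, 20)}  # enhancer bin 2 contacts TSS bin 20

result = candidate_pairs(enhancers, eqtls, gene_tss, interactions)
for enh, gene in result:
    print(enh, gene)
```

In this toy input only the first enhancer is retained: the second also contains an eQTL, but no listed 3D interaction connects it to the TSS of its gene. Requiring both lines of evidence is what would make such a training set conservative.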
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Jie Liang
Constructing High-Resolution Ensemble Models of 3D Single-Cell Chromatin Conformations of eQTL Loci from Integrated Analysis of 4DN-GTEx Data towards Structural Basis of Differential Gene Expression
- Grant number: 10357063
- Fiscal year: 2021
- Amount: $298,200
- Project category:
Models and Algorithms for Beta-Barrel Membrane Proteins and Stochastic Networks
- Grant number: 10395949
- Fiscal year: 2018
- Amount: $298,200
- Project category:
Models and Algorithms for Beta-Barrel Membrane Proteins and Stochastic Networks
- Grant number: 9923024
- Fiscal year: 2018
- Amount: $298,200
- Project category:
Constructing Ensembles of 3D Structures of Igh Locus and Predicting Novel Chromosomal Interactions
- Grant number: 9317936
- Fiscal year: 2017
- Amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 8728892
- Fiscal year: 2007
- Amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 8506731
- Fiscal year: 2007
- Amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 7213136
- Fiscal year: 2007
- Amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 7777302
- Fiscal year: 2007
- Amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 7356031
- Fiscal year: 2007
- Amount: $298,200
- Project category:
High-Accuracy Models of Proteins from Remote Homology
- Grant number: 7495070
- Fiscal year: 2007
- Amount: $298,200
- Project category:
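Similar grants from the National Natural Science Foundation of China (NSFC)
Mining enhancers that regulate andrographolide biosynthesis in the Andrographis paniculata genome using an ATAC-seq strategy
- Grant number:
- Approval year: 2022
- Amount: CNY 330,000
- Project category: Regional Science Fund Project
Molecular mechanisms of C4 photosynthesis regulation studied with single-cell ATAC-seq
- Grant number:
- Approval year: 2021
- Amount: CNY 300,000
- Project category: Young Scientists Fund Project
ATAC-seq-based study of the molecular mechanism by which cross-reacting material 197 regulates TFEB-mediated autophagy to inhibit endometriosis invasion
- Grant number: 82001520
- Approval year: 2020
- Amount: CNY 240,000
- Project category: Young Scientists Fund Project
Single-cell RNA and ATAC sequencing analysis of heterogeneity in muscle stem cell activation and proliferation
- Grant number: 31900570
- Approval year: 2019
- Amount: CNY 240,000
- Project category: Young Scientists Fund Project
Molecular mechanisms of human placental syncytiotrophoblast formation and their association with the onset of preeclampsia
- Grant number: 31900602
- Approval year: 2019
- Amount: CNY 240,000
- Project category: Young Scientists Fund Project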
Similar overseas grants
Defining molecular mechanisms by which stimulant evoked dopamine drives inflammation and neuronal dysfunction in neuroHIV
- Grant number: 10685160
- Fiscal year: 2023
- Amount: $298,200
- Project category:
Regulation of Adherent Cell Proliferation by Matrix Viscoelasticity
- Grant number: 10735701
- Fiscal year: 2023
- Amount: $298,200
- Project category:
Simultaneous mapping of somatic mosaicism and kb-resolution 3D genome in single cells.
- Grant number: 10660575
- Fiscal year: 2023
- Amount: $298,200
- Project category: