Systematic Identification of Core Regulatory Circuitry from ENCODE Data
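Basic information
- Grant number: 10238262
- Principal investigator:
- Amount: $572,300
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-02-01 to 2023-01-31
- Project status: Completed
- Source: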
- Keywords: Base Pairing; Base Sequence; Binding Sites; Biological Assay; Cell Differentiation process; Cells; Chromatin; Collaborations; Communities; Coupled; DNA Sequence; Data; Data Set; Disease; Encapsulated; Enhancers; Genetic; Genomics; Human; Luciferases; Machine Learning; Maps; Methods; Modeling; Mouse Cell Line; Mus; Nucleic Acid Regulatory Sequences; Regulatory Element; Reporter; Resolution; Testing; Tissues; Training; Untranslated RNA; Validation; Variant; Vocabulary; Weight; blind; cell type; computerized tools; design; epigenomics; experimental study; programs
Project abstract
While much progress has been made in generating high-quality chromatin state and accessibility data from the
ENCODE and Roadmap consortia, accurately identifying cell-type specific enhancers from these data remains a
significant challenge. We have recently developed a computational approach (gkmSVM) to predict regulatory
elements from DNA sequence, and we have shown that when gkmSVM is trained on DNase I hypersensitive site (DHS) data from each
of the human and mouse ENCODE and Roadmap cells and tissues, it can predict both cell-specific enhancer
activity and the impact of regulatory variants (deltaSVM) with greater precision than alternative approaches.
The gkmSVM model encapsulates a set of cell-type specific weights describing the regulatory binding
site vocabulary controlling chromatin accessibility in each cell type. A striking observation is that the significant
gkmSVM weights are generally identifiable with a small (~20) set of TF binding sites which vary by cell-type,
consistent with the hypothesis that cell-type specific expression programs are controlled by a small set
of core factors tightly coupled in mutually interacting regulatory circuits. Perturbations of these core regulators
enable transitions between stable differentiated cell-type states of this genetic circuit. Here, we will use
gkmSVM to systematically identify the core regulatory circuitry in all existing ENCODE and Roadmap human
and mouse cell lines and tissues, and produce DNA sequence based genomic regulatory maps and fine-scale
predictions of core regulator binding sites within predicted regulatory regions. We will generate binding
site models for core regulators in each cell type and assess the accuracy of our predictions through direct
experimental validation. The value of this map critically depends on its accuracy, so we demonstrate that
gkmSVM predictions consistently outperform alternative methods in massively parallel enhancer reporter and
luciferase validation assays, in blind community assessments of regulatory element predictions (CAGI), and in
predicting validated causal disease associated variants. In contrast, we show that methods using PWM
descriptions of TF binding sites are significantly less accurate. We will produce base-pair resolution predictions
of the cell specific TF binding sites (TFBS) within broader regulatory regions detected by multiple ENCODE
epigenomic mapping datasets, and test these TFBS predictions in collaboration with Functional
Characterization Centers (FCC). Our regulatory maps will help design and inform focused experiments
probing regulatory mechanisms, and aid in the interpretation of disease associated non-coding variants.
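
The variant-impact idea behind deltaSVM can be illustrated with a minimal sketch. It assumes that a table of per-k-mer weights has already been derived from a trained sequence model, and it scores a variant as the change in summed weights between the alternate and reference alleles. The function names (`score_sequence`, `delta_svm`), the weight table `toy_weights`, and the sequences are hypothetical and chosen only for illustration. The sketch also uses plain contiguous k-mers rather than the gapped k-mer features of gkmSVM, so it is not the authors' implementation.

```python
from typing import Dict


def score_sequence(seq: str, weights: Dict[str, float], k: int) -> float:
    """Sum the weights of every contiguous k-mer in seq.

    k-mers missing from the weight table contribute 0.
    """
    return sum(weights.get(seq[i:i + k], 0.0) for i in range(len(seq) - k + 1))


def delta_svm(ref_window: str, alt_window: str,
              weights: Dict[str, float], k: int) -> float:
    """Score a variant as score(alt) - score(ref).

    Both windows are assumed to have length 2k - 1 and to be centred on the
    variant, so that every k-mer overlapping the variant is counted.
    """
    if len(ref_window) != len(alt_window):
        raise ValueError("reference and alternate windows must have equal length")
    return (score_sequence(alt_window, weights, k)
            - score_sequence(ref_window, weights, k))


# Hypothetical weights for k = 3 (made-up values, not from a trained model).
toy_weights = {"GAT": 1.5, "ATA": 0.5, "GCT": -1.0}

# The variant changes the centre base of the 5-bp window from A to C.
effect = delta_svm("AGATA", "AGCTA", toy_weights, k=3)
print(effect)
```

In this toy case the reference allele carries two positively weighted k-mers and the alternate allele carries one negatively weighted k-mer, so the score is negative, which would be read as a predicted loss of regulatory activity.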
Project outputs
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Loop competition and extrusion model predicts CTCF interaction specificity.
- DOI: 10.1038/s41467-021-21368-0
- Published: 2021-02-16
- Journal:
- Impact factor: 16.6
- Authors: Xi W; Beer MA
- Corresponding author: Beer MA
Embryonic loss of human females with partial trisomy 19 identifies region critical for the single active X.
- DOI: 10.1371/journal.pone.0170403
- Published: 2017
- Journal:
- Impact factor: 3.7
- Authors: Migeon BR; Beer MA; Bjornsson HT
- Corresponding author: Bjornsson HT
Other publications by Michael A Beer
Machine Learning Sequence Modeling Identifies Gene Regulatory Responses to Bone Marrow Stromal Interactions in Multiple Myeloma
- DOI: 10.1182/blood-2023-186981
- Published: 2023-11-02
- Journal:
- Impact factor:
- Authors: Milad Razavi-Mohseni; Dustin Shigaki; Michael A Beer
- Corresponding author: Michael A Beer
Other grants by Michael A Beer
Sequence-based Machine Learning for Inference of Dynamic Cell State Gene Network Models
- Grant number: 10665735
- Fiscal year: 2022
- Funding amount: $572,300
- Project category:
Genomic control of gene regulatory networks governing early human lineage decisions
- Grant number: 10297375
- Fiscal year: 2021
- Funding amount: $572,300
- Project category:
Genomic control of gene regulatory networks governing early human lineage decisions
- Grant number: 10833813
- Fiscal year: 2021
- Funding amount: $572,300
- Project category:
Genomic control of gene regulatory networks governing early human lineage decisions
- Grant number: 10471939
- Fiscal year: 2021
- Funding amount: $572,300
- Project category:
Genomic control of gene regulatory networks governing early human lineage decisions
- Grant number: 10630157
- Fiscal year: 2021
- Funding amount: $572,300
- Project category:
Genomic control of gene regulatory networks governing early human lineage decisions
- Grant number: 10840531
- Fiscal year: 2021
- Funding amount: $572,300
- Project category:
SVM-based Analysis of the Fine Scale Structure of Regulatory Elements
- Grant number: 9097757
- Fiscal year: 2013
- Funding amount: $572,300
- Project category:
SVM-based Analysis of the Fine Scale Structure of Regulatory Elements
- Grant number: 8556758
- Fiscal year: 2013
- Funding amount: $572,300
- Project category:
SVM-based Analysis of the Fine Scale Structure of Regulatory Elements
- Grant number: 9304811
- Fiscal year: 2013
- Funding amount: $572,300
- Project category:
SVM-based Analysis of the Fine Scale Structure of Regulatory Elements
- Grant number: 8889287
- Fiscal year: 2013
- Funding amount: $572,300
- Project category:
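Similar NSFC grants
Characteristics of the macroscopic fundamental diagram of urban traffic based on microscopic simulation
- Grant number: 11672289
- Approval year: 2016
- Funding amount: CNY 560,000
- Project category: General Program
Evaluating access to essential medicines under the multiple policies of the new health care reform: development and empirical testing of indicators and methods
- Grant number: 71473170
- Approval year: 2014
- Funding amount: CNY 630,000
- Project category: General Program
Follow-up study of the long-term impact of the National Essential Medicines System on health service utilization and rational drug use
- Grant number: 71273016
- Approval year: 2012
- Funding amount: CNY 550,000
- Project category: General Program
Research on combinatorial identities
- Grant number: 11226295
- Approval year: 2012
- Funding amount: CNY 30,000
- Project category: Tianyuan Fund for Mathematics
Basic atomic cluster sequences of the Ti-Al alloy system and design of metastable-phase high-temperature structural materials
- Grant number: 50471058
- Approval year: 2004
- Funding amount: CNY 250,000
- Project category: General Program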
Similar overseas grants
Dissecting the Mechanisms of Pioneer Factor Facilitated Chromatin Opening
- Grant number: 10670907
- Fiscal year: 2022
- Funding amount: $572,300
- Project category:
Regulation of the Salmonella Pathogenicity Island 1 Type III Secretion System via the hilD 3' untranslated region
- Grant number: 10625450
- Fiscal year: 2022
- Funding amount: $572,300
- Project category:
Dissecting the Mechanisms of Pioneer Factor Facilitated Chromatin Opening
- Grant number: 10501478
- Fiscal year: 2022
- Funding amount: $572,300
- Project category:
Regulation of the Salmonella Pathogenicity Island 1 Type III Secretion System via the hilD 3' untranslated region
- Grant number: 10527931
- Fiscal year: 2022
- Funding amount: $572,300
- Project category: