Gene Regulatory Sequences And Protein Binding in Genome Sequences

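Basic information

  • Grant number:
    10688917
  • Principal investigator:
  • Amount:
    $376,400
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
  • Funding country:
    United States
  • Start and end dates:
  • Project status:
    Not concluded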

Project abstract

Continuing our work on HMGN proteins, we investigated the roles of HMGN1 and HMGN2 in higher-order chromatin structure. We analyzed deep sequencing data from our collaborator, Michael Bustin's group, who performed high-resolution in situ Hi-C, Promoter Capture Hi-C (PCHC) and ChIP-seq experiments in four mouse cell types: mouse embryonic fibroblasts (MEF), resting B cells (rB), embryonic stem cells (ESC) and induced pluripotent stem cells (iPSC), in both wild-type (WT) and HMGN1/HMGN2 double knockout (DKO) cells.

We first identified the chromatin A/B compartments. The ratio of the total genomic length of compartment A to that of compartment B is close to 1:1 in all four cell types. Integrating HMGN1/2 ChIP-seq data with the Hi-C compartment analysis shows that 75-95% of HMGN1 and HMGN2 peaks are located within A compartments. Moreover, HMGN signals increase sharply across the boundaries between B and A compartments in all cell types.

We next examined differences in higher-order chromatin structure between WT and DKO cells. The stratum-adjusted correlation coefficient (SCC), a measure of similarity between Hi-C interaction matrices, ranges from 0.985 to 0.995 between any WT and DKO samples, similar to the values between replicates, which suggests that depletion of HMGN proteins does not significantly alter 3D chromatin contact matrices. A further comparison of A/B compartment scores (C-scores) between WT and DKO cells, intended to detect small differences, showed that HMGN depletion has little effect on chromatin compartmentalization in all cell types (Figure 1).

To examine how HMGN affects spatial enhancer-promoter interactions in the 3D nucleus, we identified significant promoter-interacting regions (PIRs) using the CHiCAGO pipeline on the PCHC data from MEFs, rB cells and iPSCs.
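
As a rough illustration of the compartment-overlap figure quoted above (75-95% of HMGN peaks within A compartments), the following Python sketch computes the fraction of peaks whose midpoint lies in an A-compartment interval. It is not the authors' pipeline: the function name, the use of peak midpoints, the half-open interval convention and the toy coordinates are all assumptions made for the example.

```python
from bisect import bisect_right


def fraction_in_compartment_a(peaks, a_intervals):
    """Fraction of peaks whose midpoint falls inside an A-compartment interval.

    peaks       : list of (chrom, start, end) tuples
    a_intervals : list of (chrom, start, end) tuples, assumed non-overlapping
                  and half-open, i.e. start <= position < end
    """
    # Index the A-compartment intervals per chromosome, sorted by start.
    by_chrom = {}
    for chrom, start, end in a_intervals:
        by_chrom.setdefault(chrom, []).append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()
    starts = {c: [s for s, _ in ivs] for c, ivs in by_chrom.items()}

    if not peaks:
        return 0.0

    inside = 0
    for chrom, start, end in peaks:
        mid = (start + end) // 2
        intervals = by_chrom.get(chrom)
        if not intervals:
            continue  # no A compartment annotated on this chromosome
        # Last interval that starts at or before the midpoint.
        i = bisect_right(starts[chrom], mid) - 1
        if i >= 0 and intervals[i][0] <= mid < intervals[i][1]:
            inside += 1
    return inside / len(peaks)


# Toy data (hypothetical coordinates, for illustration only).
toy_a = [("chr1", 0, 1_000_000), ("chr1", 2_000_000, 3_000_000)]
toy_peaks = [
    ("chr1", 100, 300),              # inside the first A interval
    ("chr1", 1_500_000, 1_500_200),  # in the gap, treated as B
    ("chr1", 2_500_000, 2_500_400),  # inside the second A interval
    ("chr2", 10, 50),                # chromosome without A intervals
]
print(fraction_in_compartment_a(toy_peaks, toy_a))
```

Assigning each peak by its midpoint keeps the count unambiguous for peaks that straddle a compartment boundary; an overlap-based rule would be an equally reasonable choice.
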
The enrichment analysis showed that the identified PIRs are highly enriched in cell-type-specific regulatory features, including HMGN, H3K4me3, H3K27ac, H3K4me1, p300 and CTCF signals. We applied the Chicdiff software to the PCHC data and found no statistically significant differential interactions between WT and DKO samples that could be related to gene expression or other biological functions.

Using ChIP-seq combined with mass spectrometry, we discovered protein partners that are directly associated with, or neighbors of, HMGNs on nucleosomes. In summary, we determined how HMGN chromatin architectural proteins are positioned within the 3D nuclear space and identified their binding partners in mononucleosomes. Our research indicates that HMGN proteins localize to active chromatin compartments but do not have major effects on 3D higher-order chromatin structure, and that their binding to chromatin does not depend on specific protein partners.

In another project, we examined how active promoters and enhancers are organized. Promoter-enhancer interactions are usually formed through chromatin looping within a topologically associating domain (TAD). However, recent studies have found that large-scale disruption of chromatin topology has only a modest effect on the transcriptome, and that disruption of many predicted enhancers has no effect on gene transcription. How exactly the tens of thousands of enhancers and a few thousand active promoters in a cell organize themselves into transcription-regulatory loops remains unclear. In this study, we filtered enhancers by transcription factor (TF) binding enrichment and discovered a pattern of active promoter and enhancer organization that would not be obvious if all H3K27ac peaks were counted as effective enhancers. Our results provide insights into transcriptome robustness and into how the transcriptome is in large part decoupled from TAD structures.
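
The enhancer filtering step can be sketched as follows: count the TF ChIP-seq peaks that overlap each H3K27ac peak and keep only the H3K27ac peaks above a cutoff. This is a minimal Python sketch under stated assumptions, not the method used in the project. The abstract does not give the enrichment criterion, so the `min_tf_peaks` cutoff, the function names and the toy coordinates are hypothetical.

```python
def count_overlapping(region, peaks):
    """Number of peaks overlapping a region; all intervals are (chrom, start, end)."""
    chrom, start, end = region
    return sum(1 for c, s, e in peaks if c == chrom and s < end and e > start)


def strong_enhancers(h3k27ac_peaks, tf_peaks, min_tf_peaks=5):
    """Keep H3K27ac peaks overlapped by at least `min_tf_peaks` TF peaks.

    `min_tf_peaks` is a hypothetical cutoff chosen for illustration; the
    source does not state the threshold it used.
    """
    return [p for p in h3k27ac_peaks
            if count_overlapping(p, tf_peaks) >= min_tf_peaks]


# Toy data (hypothetical coordinates, for illustration only).
toy_h3k27ac = [("chr1", 1000, 2000), ("chr1", 5000, 6000)]
toy_tf = [
    ("chr1", 1100, 1200),
    ("chr1", 1500, 1600),
    ("chr1", 1900, 2100),
    ("chr1", 5500, 5600),
    ("chr2", 1100, 1200),
]
print(strong_enhancers(toy_h3k27ac, toy_tf, min_tf_peaks=3))
```

The linear scan in `count_overlapping` is quadratic overall and is only meant for small examples; a genome-scale analysis would normally use an interval index or a dedicated interval-arithmetic tool.
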
We identified about 12,000-14,000 strong enhancers in mouse ESC and MEF cells, defined as H3K27ac peak sites associated with high TF enrichment, and used them for further analysis in this study. We examined the genes associated with enhancer clusters, or super-enhancers (SEs), and found that SE-associated genes are usually the only one or two actively transcribed genes in the flanking region (100-150 kb). In other words, they are isolated active genes, including many housekeeping genes, contrary to the widely accepted view that SEs are mainly associated with cell-type-specific genes that determine cell identity.

In contrast to isolated active genes with enhancer clusters, genomic regions with densely packed active genes usually have few distant enhancers. We identified 120 active gene cluster (AGC) regions with 6 active promoters in each cluster and adjacent promoters < 40 kb apart, which include 1050 genes that account for 10% of all active genes. In these regions, active promoters greatly outnumber distant enhancers. Remarkably, these genomic regions are enriched in housekeeping genes.

To explore the relationship between the numbers of promoters and enhancers at actively transcribed regions genome-wide, we selected 200 kb genomic windows centered at the transcription start sites (TSSs) of the top 20% most highly expressed genes. Windows with centers < 50 kb apart were combined, which resulted in 1500 regions, each 200-300 kb long, that together account for about three quarters of total mRNA expression. There is an overall inverse correlation between the numbers of active promoters and distant enhancers in these regions. When there are only one or two promoters, a large number of enhancers are often nearby. As the number of nearby active promoters increases, the number of enhancers decreases. When active promoters are densely packed, there are few enhancers.
When the number of TF peaks at each promoter or enhancer is used as a semi-quantitative measure of the strength of a regulatory element, the total strengths of promoters and enhancers in those windows also show an inverse relationship.

With Hi-C analysis, we demonstrate that interactions among regulatory elements (active promoters and enhancers) occur predominantly in clusters and in a multiway fashion among linearly close elements, and that the distance between adjacent elements shows a preference for 30 kb.

We propose a simple rule for the spatial organization of active promoters and enhancers: gene transcription and regulation mainly occur at local active transcription hubs formed dynamically by multiple elements from linearly close enhancers and/or active promoters. The hub model can be represented by a flower-shaped structure and implies an enhancer-like role for active promoters.
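
The two distance-based grouping steps in this abstract, active gene clusters with adjacent promoters < 40 kb apart and 200 kb TSS-centered windows merged when their centers are < 50 kb apart, can both be read as chaining sorted positions by a maximum gap. The Python sketch below illustrates that reading for positions on a single chromosome. It is an assumption-laden example, not the authors' code: the function names are hypothetical, chained merging is assumed, and the promoter count of 6 is treated as a minimum, which the abstract does not state explicitly (1050 genes across 120 clusters is more than 6 per cluster).

```python
def chain_by_gap(positions, max_gap):
    """Group positions so that consecutive members of a group are < max_gap apart."""
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] < max_gap:
            groups[-1].append(pos)  # extend the current chain
        else:
            groups.append([pos])    # start a new chain
    return groups


def active_gene_clusters(promoters, max_gap=40_000, min_promoters=6):
    """Chains of active promoters with neighbours < max_gap apart.

    Treating `min_promoters` as a lower bound is an assumption.
    """
    return [g for g in chain_by_gap(promoters, max_gap) if len(g) >= min_promoters]


def merged_windows(tss_positions, half_width=100_000, max_center_gap=50_000):
    """Windows of 2 * half_width around each TSS, merged when centers are close."""
    return [(max(0, g[0] - half_width), g[-1] + half_width)
            for g in chain_by_gap(tss_positions, max_center_gap)]


# Toy data (hypothetical coordinates on one chromosome).
toy_promoters = [10_000, 40_000, 70_000, 100_000, 130_000, 160_000,
                 500_000, 520_000]
toy_tss = [1_000_000, 1_030_000, 1_200_000]

print(active_gene_clusters(toy_promoters))
print(merged_windows(toy_tss))
```

In the toy data, the first six promoters are 30 kb apart and form one cluster, while the last two form a chain that is too short to qualify. The first two TSSs merge into a single 230 kb region, and the third keeps its own 200 kb window because its center is 170 kb from its neighbor.
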

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by David LANDSMAN

Analysis Of Repeated Elements In The Human Genome
  • Grant number:
    6843588
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Structural and Functional Analysis of Gene and Protein Sequence Families
  • Grant number:
    10018390
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Structural-Functional Analysis-Protein Sequence Families
  • Grant number:
    7148031
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Analysis Of Gene Regulatory Sequences From Whole Chromosomes And Genomes
  • Grant number:
    7735074
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Structural And Functional Analysis Of Protein Sequence Families
  • Grant number:
    7735069
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Structural and Functional Analysis of Gene and Protein Sequence Families
  • Grant number:
    9353157
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Gene Regulatory Sequences From Whole Chromosome /Genome
  • Grant number:
    6843578
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Structural And Functional Analysis Of Protein Sequence F
  • Grant number:
    6681342
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Gene Regulatory Sequences And Protein Binding in Genome Sequences
  • Grant number:
    8943221
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Genome Assemblies, Analyses, and Comparisons
  • Grant number:
    10927044
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:

Similar National Natural Science Foundation of China (NSFC) grants

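Mining enhancers that regulate andrographolide biosynthesis in the Andrographis paniculata genome using an ATAC-seq strategy
  • Grant number:
  • Year approved:
    2022
  • Funding amount:
    CNY 330,000
  • Project category:
    Regional Science Fund Project
Molecular mechanisms of C4 photosynthesis regulation studied with single-cell ATAC-seq
  • Grant number:
  • Year approved:
    2021
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund Project
ATAC-seq-based study of the molecular mechanism by which cross-reacting material 197 regulates TFEB-mediated autophagy to inhibit endometriosis invasion
  • Grant number:
    82001520
  • Year approved:
    2020
  • Funding amount:
    CNY 240,000
  • Project category:
    Young Scientists Fund Project
Molecular mechanisms of human placental syncytiotrophoblast formation and their association with the onset of preeclampsia
  • Grant number:
    31900602
  • Year approved:
    2019
  • Funding amount:
    CNY 240,000
  • Project category:
    Young Scientists Fund Project
Dissecting heterogeneity in muscle stem cell activation and proliferation by single-cell RNA and ATAC sequencing
  • Grant number:
    31900570
  • Year approved:
    2019
  • Funding amount:
    CNY 240,000
  • Project category:
    Young Scientists Fund Project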

Similar overseas grants

Research Project 2
  • Grant number:
    10403256
  • Fiscal year:
    2023
  • Funding amount:
    $376,400
  • Project category:
Project 2: Impact of H1/H2 haplotypes on cellular disease-associated phenotypes driven by FTD-causing MAPT mutations
  • Grant number:
    10834336
  • Fiscal year:
    2023
  • Funding amount:
    $376,400
  • Project category:
Role of POU4F1 in a Novel Form of Ataxia
  • Grant number:
    10741382
  • Fiscal year:
    2023
  • Funding amount:
    $376,400
  • Project category:
Engineering 3D Osteosarcoma Models to Elucidate Biology and Inform Drug Discovery
  • Grant number:
    10564801
  • Fiscal year:
    2023
  • Funding amount:
    $376,400
  • Project category:
Multi-modal profiling of spatially resolved cell types mediating opioid withdrawal
  • Grant number:
    10787010
  • Fiscal year:
    2023
  • Funding amount:
    $376,400
  • Project category: