Gene Regulatory Sequences And Protein Binding in Genome Sequences


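Basic Information

  • Grant number:
    10688917
  • Principal investigator:
  • Amount:
    $376,400
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
  • Funding country:
    United States
  • Project period:
  • Project status:
    Ongoing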

Project Abstract

Continuing our work on HMGN proteins, we investigated the roles of HMGN1 and HMGN2 in higher-order chromatin structure. We analyzed deep sequencing data from our collaborator, Michael Bustin's group, who performed high-resolution in situ Hi-C, Promoter Capture Hi-C (PCHC), and ChIP-seq experiments in four mouse cell types: mouse embryonic fibroblasts (MEF), resting B cells (rB), embryonic stem cells (ESC), and induced pluripotent stem cells (iPSC), each in both wild-type (WT) and HMGN1/HMGN2 double-knockout (DKO) cells. We first identified the chromatin A/B compartments. The ratio of the total genomic length of compartment A to that of compartment B is close to 1:1 in all four cell types. Integrating the HMGN1/2 ChIP-seq data with the Hi-C compartment analysis shows that 75-95% of HMGN1 and HMGN2 peaks are located within A compartments. Moreover, the HMGN signal increases sharply across the boundaries between B and A compartments in all cell types.
We next examined the differences in higher-order chromatin structure between WT and DKO cells. The stratum-adjusted correlation coefficient (SCC), a measure of similarity between Hi-C interaction matrices, ranges from 0.985 to 0.995 between any WT and DKO samples, similar to the values between replicates, which suggests that depletion of HMGN proteins does not significantly alter 3D chromatin contact matrices. A further comparison of A/B compartment scores (C-scores) between WT and DKO cells, designed to detect small differences, showed that HMGN depletion has little effect on chromatin compartmentalization in any cell type (Figure 1).
To examine how HMGN affects spatial enhancer-promoter interactions in the 3D nucleus, we identified significant promoter-interacting regions (PIRs) using the CHiCAGO pipeline on the PCHC data from MEFs, rB cells, and iPSCs.
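The compartment-overlap figure quoted above (the share of HMGN peaks that fall within A compartments) can be illustrated with a minimal sketch. It assumes that each peak is represented by a single summit coordinate and that A compartments are given as sorted, non-overlapping, half-open intervals per chromosome; the function name and the toy coordinates are hypothetical and are not taken from the project.

```python
from bisect import bisect_right


def fraction_in_compartment_a(peak_summits, a_intervals):
    """Return the fraction of peak summits that lie inside A-compartment intervals.

    peak_summits: list of (chrom, position) tuples (hypothetical input format).
    a_intervals: dict mapping chrom -> sorted, non-overlapping (start, end)
                 half-open intervals (assumed layout).
    """
    if not peak_summits:
        return 0.0
    # Pre-compute the interval start coordinates once per chromosome.
    starts = {chrom: [s for s, _ in ivs] for chrom, ivs in a_intervals.items()}
    hits = 0
    for chrom, pos in peak_summits:
        ivs = a_intervals.get(chrom, [])
        # Index of the last interval whose start is <= pos, or -1 if none.
        i = bisect_right(starts.get(chrom, []), pos) - 1
        if i >= 0 and pos < ivs[i][1]:
            hits += 1
    return hits / len(peak_summits)


# Toy data only; these coordinates are illustrative, not measured values.
toy_a = {"chr1": [(0, 1000), (5000, 8000)]}
toy_peaks = [("chr1", 500), ("chr1", 1000), ("chr1", 6000), ("chr2", 10)]
print(fraction_in_compartment_a(toy_peaks, toy_a))
```

Assigning each peak by its summit avoids double counting peaks that straddle a compartment boundary; an overlap-based rule would be an equally reasonable choice.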
The enrichment analysis showed that the identified PIRs are highly enriched in cell-type-specific regulatory features, including HMGN, H3K4me3, H3K27ac, H3K4me1, p300, and CTCF signals. We applied Chicdiff to the PCHC data and found no statistically significant differential interactions between WT and DKO samples that could be related to gene expression or other biological functions.
Using ChIP-seq combined with mass spectrometry, we identified proteins that directly associate with, or lie adjacent to, HMGNs on nucleosomes. In summary, we determined how HMGN chromatin architectural proteins are positioned within the 3D nuclear space and identified their binding partners in mononucleosomes. Our results indicate that HMGN proteins localize to active chromatin compartments but do not have major effects on 3D higher-order chromatin structure, and that their binding to chromatin does not depend on specific protein partners.
In another project, we studied the organization of promoter-enhancer interactions, which are usually thought to form through chromatin looping within a topologically associating domain (TAD). However, recent studies have found that large-scale disruption of chromatin topology has only a modest effect on the transcriptome, and that disruption of many predicted enhancers has no effect on gene transcription. How tens of thousands of enhancers and a few thousand active promoters in a cell organize themselves into transcription-regulatory loops remains unclear. In this study, we filtered enhancers by transcription factor (TF) binding enrichment and discovered a pattern of active promoter and enhancer organization that would not be obvious if all H3K27ac peaks were counted as effective enhancers. Our results provide insight into transcriptome robustness and into how the transcriptome is in large part decoupled from TAD structures.
We identified about 12,000-14,000 strong enhancers in mouse ESCs and MEFs, defined as H3K27ac peak sites with high TF enrichment, and used them for further analysis in this study. We examined the genes associated with enhancer clusters, or super-enhancers (SEs), and found that SE-associated genes are usually the only one or two actively transcribed genes in the flanking region (100-150 kb). In other words, they are isolated active genes, including many housekeeping genes, contrary to the widely accepted view that SEs are mainly associated with cell-type-specific genes that determine cell identity.
In contrast to single active genes with enhancer clusters, genomic regions with densely packed active genes usually have few distant enhancers. We identified 120 active gene cluster (AGC) regions, each with 6 active promoters and with adjacent promoters < 40 kb apart, which together include 1050 genes and account for 10% of all active genes. In these regions, active promoters greatly outnumber distant enhancers. Remarkably, these genomic regions are enriched in housekeeping genes.
To explore the relationship between the numbers of promoters and enhancers in actively transcribed regions genome-wide, we selected 200 kb genomic windows centered on the TSSs of the top 20% most highly expressed genes. Windows with centers < 50 kb apart were combined, which resulted in 1500 regions of 200-300 kb that account for about three quarters of total mRNA expression. There is an overall inverse correlation between the numbers of active promoters and distant enhancers in these regions. When there are only one or two promoters, a large number of enhancers is often nearby. As the number of nearby active promoters increases, the number of enhancers decreases. When active promoters are densely packed, there are few enhancers.
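The window-construction step described above (200 kb windows centered on TSSs, with windows whose centers are < 50 kb apart combined) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how chains of more than two nearby windows were handled, so the sketch simply links each TSS to the previous one on the same chromosome when the gap is below the threshold. The function name, parameter names, and toy coordinates are hypothetical.

```python
from collections import defaultdict


def merge_tss_windows(tss_sites, half_width=100_000, max_center_gap=50_000):
    """Build windows of 2 * half_width around each TSS and merge nearby ones.

    tss_sites: list of (chrom, position) tuples (hypothetical input format).
    Consecutive TSSs on a chromosome are merged when their centers are
    less than max_center_gap apart (assumed chaining rule).
    Returns a list of (chrom, start, end) regions.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in tss_sites:
        by_chrom[chrom].append(pos)

    regions = []
    for chrom in sorted(by_chrom):
        centers = sorted(by_chrom[chrom])
        first = last = centers[0]
        for pos in centers[1:]:
            if pos - last < max_center_gap:
                last = pos  # extend the current merged region
            else:
                regions.append((chrom, max(0, first - half_width), last + half_width))
                first = last = pos  # start a new region
        regions.append((chrom, max(0, first - half_width), last + half_width))
    return regions


# Toy coordinates for illustration only.
toy_tss = [("chr1", 500_000), ("chr1", 540_000), ("chr1", 900_000), ("chr2", 50_000)]
for region in merge_tss_windows(toy_tss):
    print(region)
```

With this chaining rule a long run of closely spaced TSSs could yield a region longer than 300 kb, so the actual analysis may have used a stricter merging criterion than the one assumed here.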
When the number of TF peaks at each promoter or enhancer is used as a semi-quantitative measure of regulatory-element strength, the total strengths of promoters and enhancers in those windows also show an inverse relationship.
With Hi-C analysis, we demonstrate that interactions among regulatory elements (active promoters and enhancers) occur predominantly in clusters and in a multiway fashion among linearly close elements, and that the distance between adjacent elements shows a preference for 30 kb.
We propose a simple rule for the spatial organization of active promoters and enhancers: gene transcription and regulation occur mainly at local active transcription hubs, to which multiple linearly close enhancers and/or active promoters contribute dynamically. The hub model can be represented as a flower-shaped structure and implies an enhancer-like role for active promoters.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by David LANDSMAN

Analysis Of Repeated Elements In The Human Genome
  • Grant number:
    6843588
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Structural and Functional Analysis of Gene and Protein Sequence Families
  • Grant number:
    10018390
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Structural-Functional Analysis-Protein Sequence Families
  • Grant number:
    7148031
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Analysis Of Gene Regulatory Sequences From Whole Chromosomes And Genomes
  • Grant number:
    7735074
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Structural And Functional Analysis Of Protein Sequence Families
  • Grant number:
    7735069
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Structural and Functional Analysis of Gene and Protein Sequence Families
  • Grant number:
    9353157
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Gene Regulatory Sequences From Whole Chromosome /Genome
  • Grant number:
    6843578
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Structural And Functional Analysis Of Protein Sequence F
  • Grant number:
    6681342
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Gene Regulatory Sequences And Protein Binding in Genome Sequences
  • Grant number:
    8943221
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:
Genome Assemblies, Analyses, and Comparisons
  • Grant number:
    10927044
  • Fiscal year:
  • Funding amount:
    $376,400
  • Project category:

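Similar NSFC Grants

Minimum-Margin Single-Cell Clustering for Graph Neural Network-Based ATAC-seq Motif Identification
  • Grant number:
    62302218
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund
ATAC-seq-Based Discovery of Enhancers Regulating Andrographolide Biosynthesis in the Andrographis paniculata Genome
  • Grant number:
    82260745
  • Approval year:
    2022
  • Funding amount:
    CNY 330,000
  • Project category:
    Regional Science Fund
ATAC-seq-Based Discovery of Enhancers Regulating Andrographolide Biosynthesis in the Andrographis paniculata Genome
  • Grant number:
  • Approval year:
    2022
  • Funding amount:
    CNY 330,000
  • Project category:
    Regional Science Fund
Molecular Mechanisms of C4 Photosynthesis Regulation Studied by Single-Cell ATAC-seq
  • Grant number:
    32100438
  • Approval year:
    2021
  • Funding amount:
    CNY 240,000
  • Project category:
    Young Scientists Fund
Molecular Mechanisms of C4 Photosynthesis Regulation Studied by Single-Cell ATAC-seq
  • Grant number:
  • Approval year:
    2021
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund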

Similar Overseas Grants

Research Project 2
  • Grant number:
    10403256
  • Fiscal year:
    2023
  • Funding amount:
    $376,400
  • Project category:
Project 2: Impact of H1/H2 haplotypes on cellular disease-associated phenotypes driven by FTD-causing MAPT mutations
  • Grant number:
    10834336
  • Fiscal year:
    2023
  • Funding amount:
    $376,400
  • Project category:
Engineering 3D Osteosarcoma Models to Elucidate Biology and Inform Drug Discovery
  • Grant number:
    10564801
  • Fiscal year:
    2023
  • Funding amount:
    $376,400
  • Project category:
Role of POU4F1 in a Novel Form of Ataxia
  • Grant number:
    10741382
  • Fiscal year:
    2023
  • Funding amount:
    $376,400
  • Project category:
Multi-modal profiling of spatially resolved cell types mediating opioid withdrawal
  • Grant number:
    10787010
  • Fiscal year:
    2023
  • Funding amount:
    $376,400
  • Project category: