Transcriptional cooperativity among several transcription factors (TFs) is believed to be the main mechanism underlying the complexity and precision of transcriptional regulatory programs. Here, we present a Bayesian network framework to reconstruct a high-confidence whole-genome map of transcriptional cooperativity in Saccharomyces cerevisiae by integrating a comprehensive set of 15 genomic features. We design a Bayesian network structure to capture the dominant correlations among the features and TF cooperativity, and introduce a supervised learning framework built on a carefully constructed gold-standard dataset. This framework allows us to assess the predictive power of each genomic feature, validate the superior performance of our Bayesian network compared with alternative methods, and integrate the genomic features for optimal prediction of TF cooperativity. Data integration reveals 159 high-confidence predicted cooperative relationships among 105 TFs, most of which are subsequently validated by literature search. The existing and predicted cooperative relationships can be grouped into three categories based on the combination patterns of the genomic features, providing further biological insight into the different types of TF cooperativity. Our methodology is the first supervised learning approach for predicting transcriptional cooperativity, compares favorably with alternative unsupervised methodologies, and can be applied to other genomic data integration tasks in which high-quality gold-standard positive data are scarce.
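
To make the idea of supervised feature integration concrete, the following is a minimal sketch and not the network structure used in this work. It assumes the simplest possible Bayesian network, a naive Bayes model in which binary genomic features are conditionally independent given the cooperativity label, and it scores a TF pair by summing per-feature log likelihood ratios learned from a gold standard. The function names, the three toy features, and the toy gold-standard pairs are all hypothetical and serve only as an illustration.

```python
import math


def fit_log_likelihood_ratios(samples, labels, n_features, pseudocount=1.0):
    """Estimate per-feature log likelihood ratios from a labeled gold standard.

    samples: list of binary feature vectors, one per TF pair (hypothetical data).
    labels:  1 for gold-standard cooperative pairs, 0 for non-cooperative pairs.
    The pseudocount smooths estimates when positives are scarce.
    """
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    llr = []
    for j in range(n_features):
        # Smoothed estimates of P(feature j = 1 | class)
        p_pos = (sum(s[j] for s in pos) + pseudocount) / (len(pos) + 2 * pseudocount)
        p_neg = (sum(s[j] for s in neg) + pseudocount) / (len(neg) + 2 * pseudocount)
        llr.append({
            1: math.log(p_pos / p_neg),
            0: math.log((1 - p_pos) / (1 - p_neg)),
        })
    return llr


def score_pair(features, llr):
    """Sum the log likelihood ratios of the observed feature values.

    Under the naive independence assumption, a higher score means stronger
    combined evidence for cooperativity.
    """
    return sum(llr[j][value] for j, value in enumerate(features))


# Toy gold standard with three hypothetical binary features per TF pair.
gold_samples = [
    [1, 1, 0], [1, 0, 0], [1, 1, 1],             # cooperative pairs
    [0, 0, 1], [0, 1, 0], [0, 0, 0], [1, 0, 1],  # non-cooperative pairs
]
gold_labels = [1, 1, 1, 0, 0, 0, 0]

llr = fit_log_likelihood_ratios(gold_samples, gold_labels, n_features=3)
candidate_score = score_pair([1, 1, 0], llr)
background_score = score_pair([0, 0, 1], llr)
```

The per-feature log likelihood ratios also give a rough view of how informative each feature is on its own, which is one simple way to compare the predictive power of individual features. A full Bayesian network would additionally model dependencies among features rather than treating them as independent.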