Detecting associations between protein complexes and human inherited diseases is of great importance for understanding disease mechanisms. Dysfunction of a protein complex is usually caused by disturbances of its member proteins and can consequently result in disease. Although individual disease proteins have been widely predicted, computational methods for systematically investigating disease-related protein complexes are still lacking.
We propose MAXCOM, a method for prioritizing candidate protein complexes. MAXCOM applies a maximum information flow algorithm to optimize the relationships between a query disease and candidate protein complexes across a heterogeneous network constructed by combining protein-protein interactions with disease phenotypic similarities. In cross-validation experiments on 539 protein complexes, MAXCOM ranks 382 (70.87%) of them at the top when tested against protein complexes constructed at random. Permutation experiments further confirm that MAXCOM is robust to the network structure and the parameters involved. We further analyze the protein complexes ranked among the top ten for breast cancer and show that the SWI/SNF complex is potentially associated with breast cancer.
MAXCOM is an effective method, based on network optimization, for discovering disease-related protein complexes. Its high performance and robustness can facilitate not only pathological studies of diseases but also the design of drugs that target multiple proteins.
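
To make the idea of scoring complexes by information flow more concrete, the following is a minimal sketch, not the MAXCOM implementation. It assumes a toy heterogeneous network in which a query disease is linked to known diseases by phenotypic similarity, known diseases are linked to their proteins, proteins are linked by protein-protein interactions, and the member proteins of a candidate complex drain into a sink. All node names, capacities, and the helper names `max_flow`, `score_complex`, and `rank_complexes` are hypothetical, and a standard Edmonds-Karp maximum flow stands in for whatever optimization MAXCOM actually performs.

```python
from collections import defaultdict, deque


def max_flow(edges, source, sink):
    """Edmonds-Karp maximum flow over directed (u, v, capacity) edges."""
    residual = defaultdict(lambda: defaultdict(float))
    for u, v, cap in edges:
        residual[u][v] += cap
        residual[v][u] += 0.0  # make sure the reverse residual edge exists
    total = 0.0
    while True:
        # Breadth-first search for the shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Walk back from the sink, find the bottleneck, and push flow.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[a][b] for a, b in path)
        for a, b in path:
            residual[a][b] -= bottleneck
            residual[b][a] += bottleneck
        total += bottleneck


def score_complex(phenotype_sim, disease_proteins, ppi, members):
    """Flow from the query disease to one candidate complex (toy model)."""
    edges = []
    for disease, sim in phenotype_sim.items():
        # Assumption: capacity equals phenotypic similarity to the query.
        edges.append(("query", ("disease", disease), sim))
        for protein in disease_proteins.get(disease, []):
            # Assumption: known disease-protein links have unit capacity.
            edges.append((("disease", disease), ("protein", protein), 1.0))
    for a, b, confidence in ppi:
        # Interactions are undirected, so add capacity in both directions.
        edges.append((("protein", a), ("protein", b), confidence))
        edges.append((("protein", b), ("protein", a), confidence))
    for protein in members:
        # Assumption: each member can deliver at most one unit to the sink.
        edges.append((("protein", protein), "sink", 1.0))
    return max_flow(edges, "query", "sink")


def rank_complexes(phenotype_sim, disease_proteins, ppi, complexes):
    """Sort candidate complexes by decreasing flow score."""
    scores = {
        name: score_complex(phenotype_sim, disease_proteins, ppi, members)
        for name, members in complexes.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data, chosen only for illustration.
phenotype_sim = {"D1": 0.8, "D2": 0.3}
disease_proteins = {"D1": ["A"], "D2": ["B"]}
ppi = [("A", "C", 0.9), ("B", "C", 0.5), ("B", "E", 0.4), ("C", "D", 0.7)]
complexes = {"complex_X": ["C", "D"], "complex_Y": ["E"]}

ranking = rank_complexes(phenotype_sim, disease_proteins, ppi, complexes)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy network the complex whose members sit closer to proteins of phenotypically similar diseases receives more flow and is ranked first. A cross-validation in the spirit of the one described above could compare the score of a known disease complex against scores of complexes assembled from randomly chosen proteins.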