Background
Technological advances in medicine have led to a rapid proliferation of high-throughput "omics" data. Tools to mine these data and discover disrupted disease networks are needed, as they hold the key to understanding the complicated interactions among genes, mutations and aberrations, and epigenetic markers.
Results
We developed an R software package, XMRF, that can be used to fit Markov Networks to various types of high-throughput genomics data. Implementing the models and estimation techniques of the recently proposed exponential family Markov Random Fields (Yang et al., 2012), our software can be used to learn genetic networks from RNA-sequencing data (counts, via Poisson graphical models), mutation and copy number variation data (categorical, via Ising models), and methylation data (continuous, via Gaussian graphical models).
Conclusions
XMRF is the only tool that allows network structure learning using the native distribution of the data instead of the standard Gaussian. Moreover, the implemented algorithms are parallelized, so large-scale biological networks can be computed efficiently. XMRF is available from CRAN and GitHub (https://github.com/zhandong/XMRF).
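As a brief illustration of what "native distribution" means here, the node-conditional form that underlies exponential family Markov Random Fields can be sketched as follows. The notation ($X_s$, $\theta_s$, $\theta_{st}$, $N(s)$, $C$, $D$) is assumed for this sketch and does not appear in the abstract; it follows the general shape of the models in Yang et al. (2012) with linear sufficient statistics, and is not a statement of the exact parameterization used inside XMRF.

```latex
% Conditional distribution of node s given all other nodes (assumed notation):
\[
  P\bigl(X_s \mid X_{V \setminus s}\bigr)
  = \exp\Bigl\{ \theta_s X_s
      + \sum_{t \in N(s)} \theta_{st}\, X_s X_t
      + C(X_s)
      - D\bigl(X_{V \setminus s}; \theta\bigr) \Bigr\}
\]
% The base measure C selects the data type:
%   counts (Poisson):               C(x) = -\log(x!),  x \in \{0, 1, 2, \dots\}
%   binary (Ising):                 C(x) = 0,          x \in \{0, 1\}
%   continuous (Gaussian, unit variance): C(x) = -x^2 / 2,  x \in \mathbb{R}
% D is the log-normalizer of the conditional; an edge (s, t) is present
% in the network when \theta_{st} \neq 0.
```

Under this view, learning the network amounts to deciding which $\theta_{st}$ are nonzero, which is typically done by fitting a sparse (for example, $\ell_1$-penalized) generalized linear model for each node on all the others.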