Motivation: The combination of liquid chromatography and mass spectrometry (LC/MS) has been widely used for large-scale comparative studies in systems biology, including proteomics, glycomics and metabolomics. In almost all experimental designs, it is necessary to compare chromatograms across biological or technical replicates and across sample groups. Central to this comparison is peak alignment, one of the most important but challenging preprocessing steps. Existing alignment tools do not take into account the structural dependencies between related peaks that coelute and are derived from the same metabolite or peptide. We propose a direct matching peak alignment method for LC/MS data that incorporates related-peak information (within each LC/MS run) and investigate its effect on alignment performance (across runs). The groupings of related peaks required by our method can be obtained from any peak clustering method and are built into a pairwise peak similarity score function. The resulting similarity score matrix is passed to an approximation algorithm for the weighted matching problem, which produces the actual alignment.
Results: We demonstrate that related peak information can improve alignment performance. The performance is evaluated on a set of benchmark datasets, where our method performs competitively compared to other popular alignment tools.
Availability: The proposed alignment method has been implemented as a stand-alone application in Python, available for download at http://github.com/joewandy/peak-grouping-alignment.
Contact: Simon.Rogers@glasgow.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
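The pipeline outlined under Motivation (pairwise similarity, related-peak grouping, approximate weighted matching) can be illustrated with a minimal sketch. It is not the released implementation: the function names, the tolerances `mz_tol` and `rt_tol`, the mixing weight `alpha`, the way group support is summed and rescaled, and the greedy matching rule are all assumptions chosen for illustration. Each run is assumed to be an array of (m/z, retention time) rows with one group label per peak.

```python
import numpy as np


def base_similarity(run_a, run_b, mz_tol=0.01, rt_tol=30.0):
    """Pairwise score in [0, 1]; zero for pairs outside either tolerance (assumed form)."""
    d_mz = np.abs(run_a[:, 0:1] - run_b[:, 0]) / mz_tol
    d_rt = np.abs(run_a[:, 1:2] - run_b[:, 1]) / rt_tol
    within = (d_mz <= 1.0) & (d_rt <= 1.0)
    return np.where(within, 1.0 - 0.5 * (d_mz + d_rt), 0.0)


def co_grouping(groups):
    """Binary matrix: entry (i, k) is 1 when peaks i and k share a group label."""
    g = np.asarray(groups)
    return (g[:, None] == g[None, :]).astype(float)


def group_aware_similarity(w, groups_a, groups_b, alpha=0.5):
    """Blend each pair's own score with the summed scores of its group mates."""
    support = co_grouping(groups_a) @ w @ co_grouping(groups_b).T
    if support.max() > 0:
        support = support / support.max()  # rescale to [0, 1] (illustrative choice)
    combined = alpha * w + (1.0 - alpha) * support
    return np.where(w > 0, combined, 0.0)  # keep only pairs within tolerance


def greedy_matching(scores):
    """Greedy heuristic for weighted bipartite matching: best remaining pair first."""
    n_cols = scores.shape[1]
    used_a, used_b, pairs = set(), set(), []
    for flat in np.argsort(scores, axis=None)[::-1]:
        i, j = divmod(int(flat), n_cols)
        if scores[i, j] <= 0:
            break  # every remaining pair is outside tolerance
        if i not in used_a and j not in used_b:
            pairs.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return sorted(pairs)


# Toy data (made up): columns are m/z and retention time in seconds.
run_a = np.array([[100.000, 50.0], [101.003, 51.0], [100.004, 65.0]])
run_b = np.array([[100.002, 60.0], [101.005, 61.0], [250.000, 200.0]])
groups_a = [0, 0, 1]  # peaks 0 and 1 coelute (for example, an isotope pair)
groups_b = [0, 0, 1]

w = base_similarity(run_a, run_b)
plain = greedy_matching(w)
grouped = greedy_matching(group_aware_similarity(w, groups_a, groups_b))
print("without grouping:", plain)
print("with grouping:   ", grouped)
```

In this toy case the ungrouped scores favour pairing the unrelated peak 2 of the first run with peak 0 of the second, because it is slightly closer in retention time. Once group support is added, the isotope pair in each run reinforces itself and both of its peaks are matched to their counterparts. An exact assignment solver could replace the greedy step; the greedy rule is used here only because the abstract mentions an approximation algorithm.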