Optimization-based pattern discovery has emerged as an important field in knowledge discovery and data mining (KDD) and has been used to improve the efficiency and accuracy of clustering, classification, association rule mining, and outlier detection. Cluster analysis, which identifies groups of similar data items in large datasets, is one of its recent beneficiaries. As datasets grow in size and complexity, data clustering has become a popular target for optimization-based techniques, and a variety of such techniques have been applied to search for optimal solutions to clustering problems. Swarm intelligence (SI) is one such technique, and its algorithms have been demonstrated successfully as solutions in different data clustering domains. In this paper we investigate the growth of the literature on SI and its algorithms, particularly Particle Swarm Optimization (PSO). The paper makes two major contributions. First, it provides a thorough literature overview focusing on some of the most cited techniques used for PSO-based data clustering. Second, it analyzes the reported results and highlights the performance of these techniques against contemporary clustering techniques. We also provide a brief overview of our PSO-based hierarchical clustering approach (HPSO-clustering) and compare its results with those of traditional hierarchical agglomerative clustering (HAC), K-means, and PSO clustering. © 2014 Elsevier B.V. All rights reserved.