Advanced botnets adopt a peer-to-peer (P2P) infrastructure for more resilient command and control (C&C). Traditional detection techniques are less effective at identifying bots that communicate over a P2P structure. In this paper, we present PeerClean, a novel system that detects P2P botnets in real time using only high-level features extracted from C&C network flow traffic. PeerClean reliably distinguishes P2P bot-infected hosts from legitimate P2P hosts by jointly considering flow-level traffic statistics and network connection patterns. Instead of working on individual connections or hosts, PeerClean clusters hosts with similar flow traffic statistics into groups. It then extracts the collective and dynamic connection patterns of each group using a novel dynamic group behavior analysis. Compared with individual host-level connection patterns, the collective group patterns are more robust and easier to tell apart. Multi-class classification models are then used to identify different types of bots based on the established patterns. To increase the detection probability, we further propose to train the model with average group behavior, but to explore the extreme group behavior for detection. We evaluate PeerClean on real-world flow records from a campus network. Our evaluation shows that PeerClean achieves high detection rates with few false positives.
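The pipeline described above (cluster hosts by flow statistics, summarize each group's connection pattern, train on average group behavior, detect on extreme group behavior) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from PeerClean itself: the greedy threshold clustering, the hypothetical `shared_peer_ratio` feature, the nearest-centroid classifier, and all host names and numbers, which are toy values rather than measured data.

```python
from math import dist
from statistics import mean


def cluster_hosts(flow_stats, radius):
    """Greedy threshold clustering (an assumed stand-in): a host joins the
    first group whose seed vector lies within `radius`, else seeds a new one."""
    groups = []  # list of (seed_vector, [host, ...])
    for host, vec in sorted(flow_stats.items()):
        for seed, members in groups:
            if dist(seed, vec) <= radius:
                members.append(host)
                break
        else:
            groups.append((vec, [host]))
    return [members for _, members in groups]


def shared_peer_ratio(host, group, peers):
    """Hypothetical connection feature: fraction of a host's peers that are
    also contacted by at least one other member of the same group."""
    others = set().union(*(peers[h] for h in group if h != host))
    own = peers[host]
    return len(own & others) / len(own) if own else 0.0


def group_features(group, peers):
    """Return (average, extreme) of the per-host feature over the group."""
    ratios = [shared_peer_ratio(h, group, peers) for h in group]
    return mean(ratios), max(ratios)


def train_centroids(labelled_averages):
    """Train on AVERAGE group behavior: one centroid per class label."""
    return {label: mean(vals) for label, vals in labelled_averages.items()}


def classify(value, centroids):
    """Assign the label whose centroid is closest to the given value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Toy inputs: per-host flow statistics and the set of peers each host contacted.
flow_stats = {
    "a": (100.0, 5.0), "b": (102.0, 5.5), "c": (98.0, 4.8),
    "x": (900.0, 60.0), "y": (905.0, 58.0),
}
peers = {
    "a": {"p1", "p2", "p3"}, "b": {"p1", "p2", "p4"}, "c": {"p2", "p5", "p6"},
    "x": {"q1", "q2"}, "y": {"q3", "q4"},
}

groups = cluster_hosts(flow_stats, radius=10.0)

# Made-up training averages for two classes, purely illustrative.
centroids = train_centroids({"bot": [0.6, 0.7], "benign": [0.05, 0.1]})

# Detect on the EXTREME (here: maximum) group behavior.
labels = {}
for group in groups:
    average, extreme = group_features(group, peers)
    labels[tuple(group)] = classify(extreme, centroids)
```

In this toy run the three hosts with similar flow statistics share many peers and form one group, while the other two hosts share none. Scoring each group by its maximum feature value rather than its mean reflects the idea of training on average behavior while detecting on extreme behavior; the paper's actual features and models may differ substantially.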