Duplicate element detection has important applications in distributed intrusion detection, public interest mining, and traffic condition estimation. Existing detection models suffer from false positives and false negatives, high communication overhead, and limited applicability, which makes them ill-suited to distributed application scenarios. To address these problems, a duplicate element detection mechanism for distributed monitoring systems is designed with the goal of minimizing communication overhead. First, the mechanism screens out a large number of irrelevant elements through multiple rounds of compressed data transmission between the monitors and the coordinator, thereby reducing the overall communication overhead. Next, the parameters are tuned on the basis of theoretical derivation so that each screening round is necessary and its effect is optimized, and techniques such as scalable Bloom filters and continuous traffic estimation are incorporated so that the mechanism performs well under different element distributions. Finally, simulation experiments verify the effectiveness of the designed mechanism.
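As a rough illustration of screening through rounds of compressed transmission, the following minimal sketch may help. It is not the mechanism designed in this work: it assumes that a "duplicate" is an element observed by at least two monitors, it uses a plain fixed-size Bloom filter rather than a scalable one, and the names `BloomFilter` and `detect_duplicates` and the parameters `num_bits` and `num_hashes` are hypothetical. In round 1 each monitor sends a Bloom filter of its elements; in round 2 each monitor reports only the elements that may also exist elsewhere, and the coordinator counts them exactly.

```python
import hashlib
from collections import Counter


class BloomFilter:
    """Plain fixed-size Bloom filter (illustrative; not a scalable Bloom filter)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored as a Python integer

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def union(self, other):
        merged = BloomFilter(self.num_bits, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged


def detect_duplicates(monitors, num_bits=1024, num_hashes=3):
    """Return the elements observed by at least two monitors.

    monitors: list of sets, one set of observed elements per monitor.
    """
    # Round 1: every monitor sends a compressed summary to the coordinator.
    filters = []
    for elements in monitors:
        bf = BloomFilter(num_bits, num_hashes)
        for element in elements:
            bf.add(element)
        filters.append(bf)

    # Round 2: the coordinator returns to each monitor the union of the
    # other monitors' filters; the monitor reports only matching elements.
    counts = Counter()
    for i, elements in enumerate(monitors):
        others = BloomFilter(num_bits, num_hashes)
        for j, bf in enumerate(filters):
            if j != i:
                others = others.union(bf)
        candidates = {e for e in elements if e in others}
        counts.update(candidates)  # exact count at the coordinator

    # A Bloom false positive is reported by a single monitor only, so
    # requiring two reports removes it.
    return {e for e, c in counts.items() if c >= 2}


monitors = [{"a", "b", "c"}, {"b", "d"}, {"c", "e", "b"}]
print(sorted(detect_duplicates(monitors)))  # ['b', 'c']
```

Because Bloom filters have no false negatives, every true duplicate survives the screening in this sketch, and the final exact count discards false positives. The communication saving comes from the second round, where elements seen by only one monitor are mostly never transmitted in full.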