THS-IDPC: A three-stage hierarchical sampling method based on improved density peaks clustering algorithm for encrypted malicious traffic detection
The Journal of Supercomputing (IF 2.5), Pub Date: 2020-06-29, DOI: 10.1007/s11227-020-03372-1
Liangchen Chen, Shu Gao, Baoxu Liu, Zhigang Lu, Zhengwei Jiang

With the rapid growth of encrypted network traffic and of malware that uses encryption to evade identification, detecting encrypted malicious traffic has become challenging. The quality of the encrypted traffic sampling method directly affects malware detection results, but most existing machine learning methods for sampling flow-based encrypted traffic data are inherently inaccurate. To address these problems, we propose a three-stage hierarchical sampling approach based on an improved density peaks clustering algorithm (THS-IDPC) to improve the accuracy and efficiency of encrypted malicious traffic detection models. First, we propose an improved density peaks clustering algorithm based on grid screening, a custom center decision value, and mutual neighbor degree (DPC-GS-MND). In DPC-GS-MND, grid screening reduces computational complexity and the mutual neighbor degree improves clustering accuracy. Then, we extract and study three categories of encrypted traffic features related to malicious activity, and adopt a three-layer hierarchical clustering algorithm based on DPC-GS-MND. Finally, we propose a three-stage sampling approach built on this three-layer hierarchical clustering algorithm (THS-IDPC) to sample encrypted traffic data for further deep detection. Experimental results show that THS-IDPC effectively filters normal traffic out of massive encrypted network traffic, and that a detection model using THS-IDPC sampling detects multiple encrypted malicious traffic families with higher accuracy and efficiency. DPC-GS-MND and THS-IDPC also have good application prospects for network intrusion detection systems in big data environments.
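
For readers unfamiliar with the base technique, the following is a minimal sketch of plain density peaks clustering, the algorithm that DPC-GS-MND improves on. It is not the authors' method: grid screening, the custom center decision value, and the mutual neighbor degree are all omitted. The function name `density_peaks`, the cutoff parameter `d_c`, the Gaussian density kernel, and the choice of centers by the largest product of density and separation distance are assumptions made for illustration.

```python
import math


def density_peaks(points, d_c, n_clusters):
    """Plain density peaks clustering (illustrative, not DPC-GS-MND).

    points: list of equal-length coordinate tuples
    d_c: cutoff distance controlling the density kernel width (assumed)
    n_clusters: number of cluster centers to pick (assumed known)
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: Gaussian-weighted count of the other points.
    rho = [
        sum(math.exp(-((dist[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Visit points from densest to sparsest.
    order = sorted(range(n), key=lambda i: -rho[i])

    # delta_i: distance to the nearest point that comes earlier in `order`.
    # The densest point has no such neighbor and gets its largest distance.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers: points that are both dense and far from any denser point.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(order, key=lambda i: -gamma[i])[:n_clusters]

    # Every other point inherits the label of its nearest denser neighbor.
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

The sketch builds the full pairwise distance matrix, which costs quadratic time and memory in the number of flows. That cost is the kind of overhead the abstract says grid screening is meant to reduce.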

Updated: 2020-06-29