Application Research of File Fingerprint Identification Detection Based on a Network Security Protection System
Wireless Communications and Mobile Computing Pub Date : 2020-12-01 , DOI: 10.1155/2020/8841417
Chunwei Wang 1, 2 , Lina Yu 1 , Huixian Chang 1 , Sheng Shen 1 , Fang Hou 1 , Yingwei Li 1

A DLP (data loss prevention) system usually places network monitors at the network boundary to perform traffic capture, file parsing, and strategy matching. Strategy matching is the key step in preventing documents that contain corporate secrets from leaking. This paper adopts a document fingerprint similarity detection method based on the SimHash principle and customizes three SimHash algorithms with different feature extraction schemes for strategy matching: the KbS (Keyword-based SimHash) fingerprint, the PbS (Paragraph-based SimHash) fingerprint, and the SoP (SimHash of Paragraph) fingerprint. The parsed unstructured data is stored as a .txt file, from which a file fingerprint is generated. Each fingerprint is matched against an established library of sensitive documents by computing the Hamming distance between fingerprints, and the Hamming distance values observed at different degrees of modification are summarized. Based on the experimental results, hybrid-algorithm strategy matching rules with different levels and accuracies are established. This paper can serve as a reference for research on preventing leakage of sensitive enterprise data.
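The fingerprint-and-compare step described in the abstract can be illustrated with a generic SimHash sketch. This is not the paper's KbS, PbS, or SoP algorithm: the whitespace tokenizer, the term-frequency weights, the use of MD5 as the per-token hash, and the function names `simhash` and `hamming_distance` are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def simhash(text: str, bits: int = 64) -> int:
    """Generic SimHash fingerprint of `text` (illustrative, not the paper's variant).

    Assumptions: tokens are lowercase whitespace-separated words, each token is
    weighted by its frequency, and `bits` is a multiple of 8 no larger than 128
    (the size of an MD5 digest).
    """
    weights = Counter(text.lower().split())
    totals = [0] * bits
    for token, weight in weights.items():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Each set bit votes +weight for that position, each clear bit -weight.
            totals[i] += weight if (token_hash >> i) & 1 else -weight
    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    original = "quarterly revenue forecast for the internal finance team"
    modified = "quarterly revenue forecast for the external finance team"
    distance = hamming_distance(simhash(original), simhash(modified))
    print(f"Hamming distance: {distance}")
```

In a matching rule of the kind the abstract describes, a document would be flagged when its distance to any fingerprint in the sensitive library falls below a chosen threshold; the threshold itself would have to come from experiments such as those the paper reports.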

Updated: 2020-12-01