Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View
arXiv - CS - Information Theory Pub Date : 2020-09-16 , DOI: arxiv-2009.07753
Erick Galinkin

The attention that deep learning has garnered from the academic community and industry continues to grow year over year, and it has been said that we are in a new golden age of artificial intelligence research. However, neural networks are still often seen as a "black box" in which learning occurs but cannot be understood in a human-interpretable way. Since these machine learning systems are increasingly being adopted in security contexts, it is important to explore these interpretations. We approach this problem using an Android malware traffic dataset. Using the information plane, we explore how homeomorphism affects the learned representation of the data and the invariance of the mutual information captured by the parameters on that data. We empirically validate these results, using accuracy as a second measure of similarity of learned representations. Our results suggest that although the details of learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same. Furthermore, our results show that since mutual information remains invariant under homeomorphism, only feature engineering methods that alter the entropy of the dataset will change the outcome of the neural network. This means that for some datasets and tasks, neural networks require meaningful, human-driven feature engineering or changes in architecture to provide enough information for the neural network to generate a sufficient statistic. Applying our results can serve to guide analysis methods for machine learning engineers, and suggests that neural networks that can exploit the convolution theorem are as accurate as standard convolutional neural networks and can be more computationally efficient.
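The invariance claim in the abstract can be illustrated with a small sketch. It is not the paper's experiment: it assumes a toy discrete feature and binary label (both made up here), uses a bijective relabeling as a discrete stand-in for a homeomorphism, and uses a simple plug-in estimator named `mutual_information`, which is a hypothetical helper and not anything taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


# Toy data (assumed for illustration): a discrete feature and a binary label.
features = [0, 0, 1, 1, 2, 2, 3, 3, 0, 1, 2, 3]
labels = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0]

# An invertible relabeling of the feature values: no information is lost.
bijection = {0: 7, 1: -2, 2: 10, 3: 4}
relabeled = [bijection[x] for x in features]

# A non-invertible map that merges feature values: information can be lost.
merged = [x // 2 for x in features]

mi_original = mutual_information(features, labels)
mi_relabeled = mutual_information(relabeled, labels)
mi_merged = mutual_information(merged, labels)

print(f"original:  {mi_original:.4f} bits")
print(f"relabeled: {mi_relabeled:.4f} bits")
print(f"merged:    {mi_merged:.4f} bits")
```

Under these assumptions, the relabeled feature carries the same estimated mutual information with the label as the original, while the merged feature carries less. This is consistent with the abstract's point that an invertible re-encoding of the inputs does not change what the network can learn, whereas a transformation that changes the information content does.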

Updated: 2020-09-17