Robust Online Learning against Malicious Manipulation and Feedback Delay With Application to Network Flow Classification, IEEE Journal on Selected Areas in Communications



IEEE Journal on Selected Areas in Communications (IF 13.8), Pub Date: 2021-06-14, DOI: 10.1109/jsac.2021.3087268
Yupeng Li, Ben Liang, Ali Tizghadam

Malicious data manipulation reduces the effectiveness of machine learning techniques, which rely on accurate knowledge of the input data. Motivated by real-world applications in network flow classification, we address the problem of robust online learning with delayed feedback in the presence of malicious data generators that attempt to gain a favorable classification outcome by manipulating the data features. For the case of static feedback delay, we propose two online algorithms, ROLC-NC and ROLC-C, for non-clairvoyant and clairvoyant malicious data generators, respectively. For the case of dynamic delay, we propose ROLC-NC-D and ROLC-C-D, again for non-clairvoyant and clairvoyant malicious data generators, respectively. We derive regret bounds for these four algorithms and show that they are sub-linear under mild conditions. We further evaluate the proposed algorithms in network flow classification via extensive experiments using real-world data traces. Our experimental results demonstrate that the proposed algorithms can approach the performance of an optimal static offline classifier that is not under attack, while outperforming the same offline classifier when tested with a mixture of normal and manipulated data.
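
To make the delayed-feedback setting concrete, the following is a minimal sketch of generic online gradient descent in which the true label of each sample only becomes available a fixed number of rounds after the prediction is made. It is not the ROLC-NC or ROLC-C algorithm from the paper, and it does not model malicious feature manipulation. The logistic loss, the decaying step size, the synthetic data, and the names `delayed_ogd`, `logistic_grad`, and `make_stream` are all assumptions introduced here for illustration.

```python
import math
import random
from collections import deque


def logistic_grad(w, x, y):
    """Gradient of log(1 + exp(-y * <w, x>)) with respect to w, for y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    # Numerically stable evaluation of sigmoid(-margin).
    if margin >= 0:
        e = math.exp(-margin)
        s = e / (1.0 + e)
    else:
        s = 1.0 / (1.0 + math.exp(margin))
    return [-y * s * xi for xi in x]


def delayed_ogd(stream, dim, delay, lr=0.5):
    """Online gradient descent where the label of round t arrives at round t + delay.

    Returns the final weights and the number of prediction mistakes.
    This is a generic illustration, not the algorithm proposed in the paper.
    """
    w = [0.0] * dim
    pending = deque()  # (features, label) pairs whose feedback has not arrived yet
    mistakes = 0
    for t, (x, y) in enumerate(stream, start=1):
        # Predict with the current model; no feedback for this round exists yet.
        score = sum(wi * xi for wi, xi in zip(w, x))
        pred = 1 if score >= 0 else -1
        mistakes += pred != y
        pending.append((x, y))
        # Feedback for round t - delay becomes available now.
        if len(pending) > delay:
            x_old, y_old = pending.popleft()
            g = logistic_grad(w, x_old, y_old)
            step = lr / math.sqrt(t)  # decaying step size (an assumed, common choice)
            w = [wi - step * gi for wi, gi in zip(w, g)]
    return w, mistakes


def make_stream(n, seed=0):
    """Synthetic linearly separable data (hypothetical, not real network traffic)."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1), 1.0]  # last entry is a bias term
        y = 1 if x[0] + 2 * x[1] > 0 else -1
        yield x, y


if __name__ == "__main__":
    w, mistakes = delayed_ogd(make_stream(2000), dim=3, delay=10)
    print("weights:", w, "mistakes:", mistakes)
```

Holding samples in a queue until their labels arrive keeps the learner causal: each update uses only feedback that would actually have been observed by that round. Setting `delay=0` recovers ordinary online gradient descent, which gives a baseline for seeing how much the delay alone costs.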


Updated: 2021-06-14
