

Augmented Data Selector to Initiate Text-Based CAPTCHA Attack
Security and Communication Networks ( IF 1.968 ) Pub Date : 2021-06-16 , DOI: 10.1155/2021/9930608
Aolin Che ₁ , Yalin Liu ₁ , Hong Xiao ₂ , Hao Wang ₃ , Ke Zhang ₄ , Hong-Ning Dai ₁


Over the past decades, text-based CAPTCHAs have been widely used in security mechanisms for user authentication because of their low design cost and easy maintenance. With recent advances in machine learning and deep learning for recognizing CAPTCHA images, a growing number of attack methods have been proposed to break text-based CAPTCHAs. These attacks often rely on training models on massive volumes of training data, and poorly constructed CAPTCHA data lead to low attack accuracy. To address this issue, we propose a simple, generic, and effective preprocessing approach that filters and enhances the original CAPTCHA data set so as to improve the accuracy of existing attack methods. The approach consists of a data selector and a data augmentor. The data selector automatically selects a training data set whose samples are significant for training. The data augmentor applies four different types of image noise to generate additional CAPTCHA images. The resulting well-constructed CAPTCHA data set trains deep learning models better and thereby improves their accuracy. Extensive experiments show that the accuracy rates of five commonly used attack methods are 2.62% to 8.31% higher when combined with our preprocessing approach than without it. We also discuss potential research directions for future work.
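The data augmentor idea can be illustrated with a minimal sketch. The abstract does not name the four noise types, so the Gaussian, salt-and-pepper, speckle, and Poisson noises below are assumptions chosen for illustration, as are all function names, parameter values, and the convention that images are float arrays scaled to [0, 1]. This is not the authors' implementation, and the data selector is not sketched because the abstract does not describe its selection criterion.

```python
import numpy as np

# Assumed noise types: the abstract only says "four different image noises".

def gaussian_noise(img, rng, sigma=0.05):
    # Additive zero-mean Gaussian noise.
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def salt_and_pepper_noise(img, rng, amount=0.02):
    # Set a random fraction of pixels to pure black or pure white.
    out = img.copy()
    mask = rng.random(img.shape)
    out[mask < amount / 2] = 0.0
    out[mask > 1 - amount / 2] = 1.0
    return out

def speckle_noise(img, rng, sigma=0.1):
    # Multiplicative noise: each pixel is scaled by (1 + Gaussian sample).
    return np.clip(img * (1.0 + rng.normal(0.0, sigma, img.shape)), 0.0, 1.0)

def poisson_noise(img, rng, peak=50.0):
    # Shot noise: sample photon counts around the scaled intensity.
    return np.clip(rng.poisson(img * peak) / peak, 0.0, 1.0)

NOISES = (gaussian_noise, salt_and_pepper_noise, speckle_noise, poisson_noise)

def augment(images, labels, seed=0):
    """Return each original image followed by one noisy copy per noise type."""
    rng = np.random.default_rng(seed)
    aug_images, aug_labels = [], []
    for img, label in zip(images, labels):
        aug_images.append(img)
        aug_labels.append(label)
        for noise in NOISES:
            aug_images.append(noise(img, rng))
            aug_labels.append(label)  # noise does not change the CAPTCHA text
    return np.stack(aug_images), aug_labels
```

Under these assumptions, `augment` turns N labeled images into 5N (the original plus four noisy variants each), which could then be passed to any of the attack models as additional training data.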


Updated: 2021-06-17
