SoK: Machine vs. Machine – A Systematic Classification of Automated Machine Learning-Based CAPTCHA Solvers

Computers & Security (IF 5.6), Pub Date: 2020-10-01, DOI: 10.1016/j.cose.2020.101947
Antreas Dionysiou, Elias Athanasopoulos

Abstract: Internet services rely heavily on CAPTCHAs to determine whether a user is a human being. Recent advances in ML and AI call into question the efficacy of CAPTCHAs in protecting Internet services against bots. In this paper, we conduct a systematic analysis and classification of state-of-the-art ML-based techniques for the automated text-based CAPTCHA breaking problem. We examine and report the current state and robustness of text-based CAPTCHAs, as used by modern Internet applications, against ML-based automated breaking tools. Our study suggests that ML can be very effective in increasing (a) accuracy, (b) speed, and (c) abstraction in CAPTCHA solving. In particular, regarding (c), ML-based techniques are easier to apply across different classes of text-based CAPTCHA schemes. To assess the importance of ML in breaking CAPTCHAs, we build our own ML-only classifiers. Surprisingly, an ML-only approach to solving CAPTCHAs is not sufficient. Overall, our study suggests that fundamentally different ways of conducting reverse Turing tests, painless for legitimate users (i.e., humans) yet challenging for automated systems (i.e., software), should be considered to ensure the healthy operation of current Internet services.

Updated: 2020-10-01
