On the efficacy of old features for the detection of new bots
Information Processing & Management (IF 8.6), Pub Date: 2021-07-27, DOI: 10.1016/j.ipm.2021.102685
Rocco De Nicola 1, 2 , Marinella Petrocchi 1, 3 , Manuel Pratelli 1

For more than a decade, academics and online platform administrators have been studying solutions to the problem of bot detection. Bots are computer algorithms whose use is far from benign: malicious bots are purposely created to distribute spam, promote public figures and, ultimately, bias public opinion. To fight the bot invasion of our online ecosystem, several approaches have been implemented, mostly based on (supervised and unsupervised) classifiers. These classifiers adopt a wide variety of account features, ranging from the simplest to the most expensive to extract from the raw data obtainable through the Twitter public APIs. In this exploratory study, using Twitter as a benchmark, we compare the performance of four state-of-the-art feature sets in detecting novel bots: one of the output scores of the popular bot detector Botometer, which considers more than 1,000 features of an account to reach a decision; two feature sets based on the account profile and timeline; and the information about the Twitter client from which the user tweets. The results of our analysis, conducted on six recently released datasets of Twitter accounts, hint at the possible use of general-purpose classifiers and cheap-to-compute account features for the detection of evolved bots.
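To make "cheap-to-compute account features" concrete, the following is a minimal sketch of profile-based feature extraction. It is not the feature set used in the paper, which the abstract does not list. The function name `profile_features` and the chosen features are hypothetical. The input keys follow the Twitter v1.1 user object (`followers_count`, `friends_count`, `statuses_count`, `screen_name`, `description`), and `created_at` is assumed to have been converted to an ISO 8601 string beforehand.

```python
from datetime import datetime, timezone


def profile_features(account, now):
    """Derive a few illustrative profile features from one account record.

    Hypothetical feature choice: these are examples of features that need
    only the profile, not the ones evaluated in the paper.
    """
    # Assumption: created_at is already an ISO 8601 string with a UTC offset.
    created = datetime.fromisoformat(account["created_at"])
    # Clamp to one day so brand-new accounts do not divide by zero.
    age_days = max((now - created).days, 1)
    return {
        "age_days": age_days,
        "tweets_per_day": account["statuses_count"] / age_days,
        # Clamp the denominator for accounts that follow nobody.
        "followers_to_friends": account["followers_count"]
        / max(account["friends_count"], 1),
        "has_description": int(bool(account["description"].strip())),
        "screen_name_digits": sum(ch.isdigit() for ch in account["screen_name"]),
    }


# Made-up record, for illustration only.
example = {
    "created_at": "2021-01-01T00:00:00+00:00",
    "followers_count": 10,
    "friends_count": 200,
    "statuses_count": 3650,
    "description": "",
    "screen_name": "user12345",
}
features = profile_features(example, datetime(2021, 1, 11, tzinfo=timezone.utc))
```

Each value comes from a single user object with no timeline download, which is what makes such features cheap. A vector of this kind could then be passed to any general-purpose classifier.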




Updated: 2021-07-28