Using data mining techniques to explore security issues in smart living environments in Twitter
Computer Communications (IF 6), Pub Date: 2021-09-02, DOI: 10.1016/j.comcom.2021.08.021
Jose Ramon Saura, Daniel Palacios-Marqués, Domingo Ribeiro-Soriano

Consumers’ homes today contain millions of Internet-connected devices that jointly make up the Internet of Things (IoT). The development of the IoT industry has led to the emergence of connected devices and home assistants that create smart living environments. However, the data continuously generated and accumulated by these connected devices create security issues and raise users’ privacy concerns. The present study aims to explore the main security issues in smart living environments using data mining techniques. To this end, we applied a three-step data mining analysis to 938,258 tweets collected from Twitter under the user-generated data (UGD) framework. First, sentiment analysis was applied using TextBlob, which was tested with a support vector classifier, multinomial naïve Bayes, logistic regression, and a random forest classifier; as a result, the analyzed tweets were divided into those expressing positive, negative, and neutral sentiment. Next, a Latent Dirichlet Allocation (LDA) algorithm was applied to divide the sample into topics related to security issues in smart living environments. Finally, insights were extracted by applying a textual analysis process in Python, validated by analyzing frequency and weighted percentage variables and by calculating the statistical measure known as mutual information (MI) for the identified n-grams (unigrams and bigrams). The research identified 10 topics, in which we found that the main security issues are malware, cybersecurity attacks, data storage vulnerabilities, the use of testing software in IoT, and possible leaks due to the lack of user experience. We discuss the circumstances and causes that may affect user security and privacy when using IoT devices and emphasize the importance of UGC in the processing of personal data of IoT device users.
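
To make the final step more concrete, the sketch below scores bigrams with pointwise mutual information (PMI), one common form of the MI measure for n-grams. The abstract does not state which formula or tooling the authors used, so this is an illustration under assumptions rather than their method: the function name `bigram_pmi`, the whitespace tokenization, and the `min_count` parameter are all hypothetical.

```python
import math
from collections import Counter


def bigram_pmi(tweets, min_count=1):
    """Return {(w1, w2): PMI} for adjacent word pairs.

    PMI(w1, w2) = log2( p(w1, w2) / (p(w1) * p(w2)) ), with probabilities
    estimated from raw counts. Tokenization is a plain lowercase whitespace
    split, which is an assumption; real tweets would need more cleaning.
    """
    unigrams = Counter()
    bigrams = Counter()
    for text in tweets:
        tokens = text.lower().split()
        unigrams.update(tokens)
        # Pair each token with its successor inside the same tweet.
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs, whose PMI estimates are unstable
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores
```

Calling `bigram_pmi(list_of_tweet_strings, min_count=5)` and sorting the result by score would surface word pairs that co-occur more often than their individual frequencies predict. Raising `min_count` is a simple guard against PMI's known tendency to overrate very rare pairs.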




Updated: 2021-09-10