Antisocial online behavior detection using deep learning

doi:10.1016/j.dss.2020.113362

Decision Support Systems

Volume 138, November 2020, 113362

https://doi.org/10.1016/j.dss.2020.113362

Highlights

• Comprehensive benchmark on deep learning regimes for AOB detection.
• Usage of transformer-based language models.
• Interpretability module to understand the model's logic.
• Interpretability module to detect unintended bias.

Abstract

Digitalization shifts human communication to online platforms, which has many benefits but also builds up a space for antisocial online behavior (AOB) such as harassment, insult and other forms of hateful textual content. Online platforms have good reasons to monitor and moderate such content. The paper examines the viability of automatic content monitoring using deep machine learning and natural language processing (NLP). More specifically, we consolidate prior work in the field of antisocial online behavior detection and compare relevant approaches to recent NLP models in an empirical study. Covering important methodological advancements in NLP including bidirectional encoding, attention, hierarchical text representations, and pre-trained transformer-based language models, and extending previous approaches by introducing a pseudo-sentence hierarchical attention network, the paper provides a comprehensive summary of the state-of-affairs in NLP-based AOB detection, clarifies the detection accuracy that is attainable with today's technology, discusses whether this degree is sufficient for deploying deep learning-based text screening systems, and approaches the interpretability topic.

Introduction

The shift of human communication to online platforms is a double-edged sword. Social benefits include the opportunity to share opinions and experiences, get immediate feedback, and discuss current topics. From an economic perspective, the data from online communications enable business organizations to learn from customer experiences, improve service offerings, and raise firm performance. Examples of corresponding advancements include Liu et al. [1], who propose a method to assess a product's competitive advantages based on social media. Similarly, Zhang et al. [2] use natural language processing (NLP) to analyze knowledge payment platforms and shed light on customer satisfaction, while Siering et al. [3] examine online reviews to identify which service aspects customers value the most. On the other hand, online communication platforms also create a space for malicious behavior such as the distribution of fake news and reviews, which distort insights gained from the data and may harm the reputation of the platform [4,5]. This work focuses on a related problem: the detection of antisocial behavior, such as insults, harassment, or threats, in online communication.

Detecting such antisocial behavior is highly important for social welfare for social, legislative, and financial reasons. According to the Cyberbullying Research Center's annual data in 2016, 33.8% of young people aged 12–17 in the US have experienced cyberbullying in their lifetime [6]. Under a German law, social media providers such as Facebook, Google, and Microsoft are obliged to remove hate speech posts within 24 h and to report on their progress every six months [7]. Legal requirements, social norms, and codes of conduct emphasize how important it is for online platforms to identify antisocial online behavior (AOB), which we use as an umbrella term for any malicious behavior that can be found in the textual content on online communication platforms, including insult, threat, personal attack, usage of harmful, rude, or offensive language, cyberbullying, and abuse.

Manual detection and monitoring of online content can be very costly, which makes automated systems that screen user-generated text for traces of AOB a priority. Machine learning-based decision support systems that pre-screen transactions and flag suspicious cases for subsequent human inspection have proven effective in fraud detection [8] and may prove a viable solution for the AOB detection problem of social media platforms.
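The pre-screening idea in the preceding paragraph can be illustrated with a small sketch. It assumes a classifier that returns a score between 0 and 1 for each post; the names `toy_score` and `triage`, the toy word list, and the threshold of 0.3 are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical stand-in for a trained AOB classifier. A real system would
# call a trained model here; this toy version only counts listed words.
TOY_OFFENSIVE = {"idiot", "stupid", "hate"}


def toy_score(post: str) -> float:
    """Return the fraction of tokens that appear in the toy word list."""
    tokens = [t.strip(".,!?") for t in post.lower().split()]
    if not tokens:
        return 0.0
    return sum(t in TOY_OFFENSIVE for t in tokens) / len(tokens)


def triage(
    posts: Iterable[str],
    score: Callable[[str], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Flag posts scoring at or above the threshold, highest score first."""
    scored = [(p, score(p)) for p in posts]
    flagged = [(p, s) for p, s in scored if s >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


posts = ["You are a stupid idiot", "Thanks for the helpful edit", "I hate this"]
# Only the flagged posts would be passed on to a human moderator.
review_queue = triage(posts, toy_score, threshold=0.3)
```

In such a setup the threshold trades moderator workload against the risk of missing harmful posts; choosing it is a policy decision rather than a property of the model.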

In the paper, we elaborate on the detection of AOB using NLP. Early academic research in the field was mostly concerned with the use of traditional machine learning (TML) methods such as logistic regression, support vector machines, and decision trees [e.g., [9]], as well as lexicon-based approaches [e.g., [10]]. These methods rely heavily on extensive feature engineering, and their performance depends strongly on the representation of the data. Deep learning (DL) methods automate the procedure of feature engineering by learning representations of the data through non-linear transformations. Such representations often achieve better performance than handcrafted features [11]. The main contribution of the paper is the following: we consolidate prior work on AOB detection and text classification and provide a comprehensive benchmark of alternative text processing regimes. We compare TML methods with DL while covering significant methodological advancements, including bidirectional encoding, attention, and techniques to exploit the hierarchical structure of text. Many of these DL techniques are new to the field of AOB detection, and systematic comparisons of their potential to raise detection accuracy are, to the best of our knowledge, not available in prior research. Further, we extend hierarchical DL models and introduce a pseudo-sentence hierarchical attention network. We investigate the potential of deep NLP transfer learning for AOB detection by considering transformer-based language models such as BERT in our analysis. Finally, we propose the usage of the LIME framework developed by Ribeiro et al. [12] as a final stage of AOB detection. This framework provides interpretability of the model's underlying logic, which might help moderators to decide whether to filter a post. Machine learning-based systems often reflect existing demographic biases [13], which might lead to “unfair” decisions. Interpretability also ensures that model predictions can be checked for possible unintended bias, which would require adjustment or revision of the detection model. All code used for the experiments is available on GitHub at https://github.com/QuantLet/AOBDL_code. Moreover, the reader can find an online appendix containing details on parameter tuning, the DL architectures used, and additional literature on AOB detection at https://github.com/QuantLet/AOBDL_code/blob/master/AOBDL_Online_Appendix.pdf.
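To give a rough intuition for what a word-level explanation of a prediction looks like, the sketch below uses leave-one-word-out occlusion. This is a deliberately simpler technique than LIME and is not the LIME library or the paper's implementation: it only measures how much a score drops when a single word is removed. The scoring function `toy_score` and its word list are hypothetical stand-ins for a trained classifier.

```python
from typing import Callable, List, Tuple


def toy_score(text: str) -> float:
    """Hypothetical classifier stand-in: score rises with each listed word."""
    offensive = {"idiot", "stupid"}
    hits = sum(tok in offensive for tok in text.lower().split())
    return 1.0 - 0.5 ** hits


def word_importance(
    text: str, score: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Return (word, score drop) pairs obtained by removing one word at a time."""
    tokens = text.split()
    base = score(text)
    result = []
    for i, tok in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        result.append((tok, base - score(reduced)))
    return result


importances = word_importance("you are stupid", toy_score)
```

A moderator could read the largest drops as the words that drive the prediction. If a neutral identity term were to receive a large drop under a real model, that might be a hint of the kind of unintended bias discussed above and would merit a closer look.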

Section snippets

Related work

In only a few years, DL methods have revolutionized the fields of computer vision and NLP, in which they can now be considered a quasi-standard [14]. Recently, a few DL-based approaches have appeared in the decision support literature. We review corresponding research below and distinguish between approaches that support decision-making based on analyzing structured versus unstructured data. This is to sketch the status-quo of DL-based decision support (DS). Thereafter, we review prior work on

Methodology

To fully appreciate the technical content, the reader might benefit from the following overview of ML technologies. In Fig. 1, we summarize our motivation on what machine learning approaches to include. The figure shows different methods and their drawbacks, which can be handled by more complex models.

We start with methods of TML, and as mentioned in the introduction, these methods heavily rely on handcrafted features, whereas DL models help to learn abstract data representations and extract

Dataset

The first data set used for the experiments comes from a Kaggle competition “Toxic Comment Classification”¹. This competition is dedicated to the identification of different levels of toxicity in the Wikipedia Talk Pages. The second data set is retrieved from Twitter and was created by Davidson et al. [53], where the authors used it for automatic hate-speech detection. The third dataset is the English and Hindi data from

TML vs. DL

As a first experiment, we compare TML methods with CNN and GRU, two basic DNN architectures, which many more sophisticated models are based on. To that end, we select TML methods that have been used frequently in the AOB literature, including support vector machines (SVM), logistic regression with l2 regularization (LR), and random forest (RF) [e.g., [9], [36]]. Moreover, we consider gradient boosting (LightGBM) due to this model's good performance in prediction benchmarks. Finally, we consider
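As an illustration of the kind of TML baseline named in this snippet, the following sketch trains a logistic regression with L2 regularization on binary bag-of-words features using plain gradient descent. It is written from scratch with the standard library only and is not the paper's code; the toy sentences, labels, learning rate, number of epochs, and regularization strength are all assumptions made for the example.

```python
import math
from typing import Dict, List, Tuple


def featurize(text: str, vocab: Dict[str, int]) -> List[float]:
    """Binary bag-of-words vector over a fixed vocabulary."""
    vec = [0.0] * len(vocab)
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] = 1.0
    return vec


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(
    X: List[List[float]], y: List[int],
    l2: float = 0.01, lr: float = 0.5, epochs: int = 200,
) -> Tuple[List[float], float]:
    """Batch gradient descent on mean log loss plus (l2 / 2) * ||w||^2."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w = [l2 * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0                    # the bias is not regularized
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Toy training data (invented for illustration): 1 = antisocial, 0 = benign.
texts = ["you are an idiot", "what a stupid idiot",
         "thanks for the help", "great edit thanks"]
labels = [1, 1, 0, 0]
vocab = {tok: i for i, tok in
         enumerate(sorted({t for s in texts for t in s.split()}))}
w, b = train_lr([featurize(s, vocab) for s in texts], labels)


def predict(text: str) -> float:
    """Predicted probability that the text is antisocial."""
    x = featurize(text, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


p_toxic = predict("stupid idiot")
p_clean = predict("thanks great help")
```

The example shows why such models depend on their feature representation: a word outside the vocabulary contributes nothing to the prediction, which is one of the limitations that learned representations aim to address.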

Conclusion and further work

Detection and prevention of AOB in online content have become an essential problem for social welfare and companies that provide platforms where user-generated content is shared. Manual monitoring of such behavior can be very costly and time-consuming. On the other hand, the absence of moderation can lead to regulatory consequences. This is why support systems that screen user-generated text content and identify cases that warrant manual inspection are of high importance. DL methods are a

Declaration of Competing Interest

None.

Acknowledgements

Financial support from the Deutsche Forschungsgemeinschaft via the IRTG 1792 “High Dimensional Nonstationary Time Series”, Humboldt-Universität zu Berlin, is gratefully acknowledged.

Elizaveta Zinovyeva is a PhD student of the International Research Training Group IRTG1792 “High dimensional nonstationary time series” at the Humboldt-Universität zu Berlin. She previously completed her Master's studies in Information Systems and Bachelor's studies in Business Administration at the Humboldt-Universität zu Berlin. Her research focuses on the application of deep neural networks to sequential data.

References (58)

Y. Liu et al., Assessing product competitive advantages from the perspective of customers by mining user-generated content on social media, Decis. Support. Syst. (2019)
J. Zhang et al., From free to paid: customer expertise and customer satisfaction on knowledge payment platforms, Decis. Support. Syst. (2019)
M. Siering et al., Disentangling consumer recommendations: explaining and predicting airline recommendations based on online reviews, Decis. Support. Syst. (2018)
C. Zhang et al., Detecting fake news for reducing misinformation risks using analytics approaches, Eur. J. Oper. Res. (2019)
A. Heydari et al., Detection of fake opinions using time series, Expert Syst. Appl. (2016)
E. Stripling et al., Isolation-based conditional anomaly detection on mixed-attribute data to uncover workers’ compensation fraud, Decis. Support. Syst. (2018)
A. Kim et al., Can deep learning predict risky retail investors? A case study in financial risk behavior forecasting, Eur. J. Oper. Res. (2020)
H. Jang, A decision support framework for robust R&D budget allocation using machine learning and optimization, Decis. Support. Syst. (2019)
Z.-L. Sun et al., Sales forecasting using extreme learning machine with applications in fashion retailing, Decis. Support. Syst. (2008)
N. Carneiro et al., A data mining based system for credit-card fraud detection in e-tail, Decis. Support. Syst. (2017)
J.W. Patchin, 2016 Cyberbullying Data (2016)

Wolfgang Karl Härdle attained his Dr. rer. nat. in Mathematics at Universität Heidelberg in 1982 and in 1988 his habilitation at Universität Bonn. He is Ladislaus von Bortkiewicz Professor of Statistics at Humboldt-Universität zu Berlin and the director of the Sino German International Research Training Group IRTG1792 “High dimensional nonstationary time series”, a joint project with WISE, Xiamen University.

His research focuses on data sciences, dimension reduction and quantitative finance. He has published over 30 books and more than 300 papers in top statistical, econometrics and finance journals. He is highly ranked and cited on Google Scholar, REPEC and SSRN. He has professional experience in financial engineering, smart (specific, measurable, achievable, relevant, timely) data analytics, machine learning and cryptocurrency markets.

Stefan Lessmann received a diploma in business administration and a PhD from the University of Hamburg in 2002 and 2007, respectively. Stefan worked as a lecturer and senior lecturer in business informatics at the Institute of Information Systems of the University of Hamburg. Since 2008, Stefan has been a guest lecturer at the School of Management of the University of Southampton, where he teaches under- and postgraduate courses on quantitative methods, electronic business, and web application development. Stefan completed his habilitation in the area of predictive analytics in 2012. In 2014, Stefan joined the Humboldt-University of Berlin, where he heads the Chair of Information Systems at the School of Business and Economics. Stefan has published several papers in leading international journals and conferences, including the European Journal of Operational Research, the IEEE Transactions on Software Engineering, and the International Conference on Information Systems. He actively participates in knowledge transfer and consulting projects with industry partners, from small start-up companies to global players.
