Towards a Framework for Acquisition and Analysis of Speeches to Identify Suspicious Contents through Machine Learning


Complexity (IF 1.7), Pub Date: 2020-11-16, DOI: 10.1155/2020/5639787
Md. Rashadur Rahman₁, Mohammad Shamsul Arefin₁, Md. Billal Hossain₁, Mohammad Ashfak Habib₁, A. S. M. Kayes₂


Speech is the most prominent form of human communication and interaction. It plays an indispensable role in expressing emotions, motivating, guiding, and cheering. An ill-intentioned speech can mislead individuals, societies, and even a nation. A misguided speech can trigger social controversy and lead to violent activities. A large number of speeches are delivered around the world every day, which makes manual inspection impractical. To prevent harmful actions resulting from misguided speech, an automatic system that can efficiently detect suspicious speech has become imperative. In this study, we present a framework for acquiring speech along with the location of the speaker and converting the speech into text. We then propose a system based on long short-term memory (LSTM), a variant of the recurrent neural network (RNN), to classify speeches as suspicious or nonsuspicious. We consider speeches in the Bangla language and developed our own dataset of about 5000 suspicious and nonsuspicious samples for training and validating our model. To evaluate the effectiveness of the system, we compare its accuracy with that of other machine learning algorithms such as logistic regression, SVM, KNN, Naive Bayes, and decision tree. The experimental results show that our proposed deep learning-based model achieves the highest accuracy among the compared algorithms.
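The abstract names an LSTM-based binary classifier but does not give its architecture, so the following is only a minimal sketch of how an LSTM can map a sequence of token vectors to a suspicious/nonsuspicious probability. Everything in it is an assumption for illustration: the function names (`lstm_step`, `init_params`, `classify`), the dimensions, the random untrained weights, and the single sigmoid output unit. It is not the authors' model.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h, c, params):
    """One LSTM time step: returns the new hidden state and cell state."""
    z = x + h  # concatenate the input vector and the previous hidden state
    gates = {}
    for name in ("i", "f", "o", "g"):  # input, forget, output gates; candidate
        W, b = params[name]
        pre = [s + bias for s, bias in zip(matvec(W, z), b)]
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    # New cell state: keep part of the old state, add part of the candidate.
    c_new = [f * cp + i * g
             for f, cp, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    h_new = [o * math.tanh(cv) for o, cv in zip(gates["o"], c_new)]
    return h_new, c_new


def init_params(input_dim, hidden_dim, seed=0):
    """Hypothetical random (untrained) weights, for illustration only."""
    rng = random.Random(seed)

    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]

    params = {name: (mat(), [0.0] * hidden_dim) for name in ("i", "f", "o", "g")}
    params["out"] = ([rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)], 0.0)
    return params


def classify(sequence, params, hidden_dim):
    """Run the LSTM over the sequence, then squash the final hidden state
    into a probability for the (assumed) 'suspicious' class."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    w, b = params["out"]
    return sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)


# Two made-up 3-dimensional token vectors standing in for an embedded text.
sequence = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]
probability = classify(sequence, init_params(3, 4), hidden_dim=4)
```

In practice the weights would be learned from labeled transcripts and the token vectors would come from an embedding of the Bangla text; with the untrained weights above, the returned probability carries no meaning beyond showing the data flow.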


Updated: 2020-11-16
