Digital media news categorization using Bernoulli document model for web content convergence
Personal and Ubiquitous Computing, Pub Date: 2020-09-17, DOI: 10.1007/s00779-020-01461-9
Pradeep Kumar Mallick, Sushruta Mishra, Gyoo-Soo Chae

News content in digital media converges from many distinct sources, and web content carries a very large number of features. Complete coverage of all kinds of news is vital for a news agency to retain customer confidence and keep a competitive edge over other agencies. Aggregating such massive news content from heterogeneous sources requires the integration of convergent computing. Classifying online news is challenging in the Internet age, when news keeps flowing in from several heterogeneous sources, and the constant rise in manipulation of web content makes accurate classification of digital news all the more necessary. Assigning each news item precisely to its respective class remains a major challenge. In this scenario, an automated predictive approach can be of great use for organizing and classifying news across a pool of web portals. This study applies the Bernoulli model to determine its effectiveness for multi-class categorization of digital news arriving in real time. The system model was evaluated using Python, and the results were demonstrated on six distinct classes of news with a 6000-feature dataset drawn from the TagMyNews dataset. The classification accuracy of the Bernoulli model was 98.4%, with a precision of 92.7%, a recall of 90.6%, and an F-score of 91.4%. The execution time for the Bernoulli model was only 12 s. Compared with results from other well-known related works, the Bernoulli model gave the best performance and substantially improved news classification efficiency.
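The abstract does not reproduce the authors' implementation. As a rough illustration of what a Bernoulli document model does, the following is a minimal sketch in plain Python: each vocabulary word is treated as a binary present/absent feature, and a document is assigned to the class with the highest log-posterior. The function names, the Laplace smoothing constant `alpha`, and the four toy headlines are assumptions made for this example; they are not taken from the paper or from the TagMyNews dataset.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(docs, labels, alpha=1.0):
    """Fit a Bernoulli document model (hypothetical helper, not the paper's code).

    Each vocabulary word is a binary feature: present or absent in a document.
    alpha is an assumed Laplace smoothing constant.
    """
    vocab = sorted({w for d in docs for w in d.split()})
    class_docs = defaultdict(int)                      # documents per class
    word_docs = defaultdict(lambda: defaultdict(int))  # docs of class c containing w
    for doc, c in zip(docs, labels):
        class_docs[c] += 1
        for w in set(doc.split()):                     # presence only, not counts
            word_docs[c][w] += 1
    n = len(docs)
    priors = {c: k / n for c, k in class_docs.items()}
    # P(word present | class), smoothed over the two outcomes present/absent
    cond = {
        c: {w: (word_docs[c][w] + alpha) / (class_docs[c] + 2 * alpha) for w in vocab}
        for c in class_docs
    }
    return vocab, priors, cond


def predict(model, doc):
    """Return the class with the highest log-posterior for one document."""
    vocab, priors, cond = model
    present = set(doc.split())
    best_class, best_score = None, -math.inf
    for c, prior in priors.items():
        score = math.log(prior)
        for w in vocab:
            p = cond[c][w]
            # Absent words also contribute, which distinguishes the Bernoulli
            # model from the multinomial document model.
            score += math.log(p if w in present else 1.0 - p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Toy headlines invented for illustration only.
train_docs = [
    "team wins final match",
    "coach praises team after match",
    "stocks fall as market slides",
    "bank raises interest rates market reacts",
]
train_labels = ["sport", "sport", "business", "business"]

model = train_bernoulli_nb(train_docs, train_labels)
print(predict(model, "team wins match"))
print(predict(model, "market rates fall"))
```

Because every vocabulary word contributes a factor whether or not it appears, prediction cost grows with vocabulary size times the number of classes. That is one reason a fixed feature budget, such as the 6000 features mentioned above, matters for this model.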




Updated: 2020-09-18