Topic Modeling Using Latent Dirichlet Allocation
ACM Computing Surveys (IF 16.6), Pub Date: 2021-09-17, DOI: 10.1145/3462478
Uttam Chauhan, Apurva Shah

A mammoth text corpus cannot be handled without summarizing it into a relatively small subset, and computational tools are essential for making sense of such a gigantic pool of text. Probabilistic topic modeling discovers and explains an enormous collection of documents by reducing it to a low-dimensional topical subspace. In this work, we study the background and advancement of topic modeling techniques. We first introduce the preliminaries of topic modeling and then review its extensions and variations, such as topic modeling over various domains, hierarchical topic modeling, word-embedded topic models, and topic models from a multilingual perspective. In addition, we explore research on topic modeling in distributed environments and on topic visualization approaches. We also briefly cover implementation and evaluation techniques for topic models. Comparison matrices are presented over the experimental results of the various categories of topic modeling, and diverse technical challenges and future directions are discussed.
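As a concrete illustration of what "reducing documents to a topical subspace" means in practice, the following is a minimal LDA sketch using scikit-learn's LatentDirichletAllocation. The toy corpus, the choice of two topics, and all parameter values are illustrative assumptions, not material from the survey itself.

```python
# A minimal sketch of LDA topic modeling (illustrative; corpus and parameters are hypothetical).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the cat sat on the mat",
    "dogs and cats are popular pets",
    "stock markets rallied as interest rates fell",
    "investors worry about inflation and rising rates",
]

# Bag-of-words representation of the corpus.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

# Fit a 2-topic LDA model; each document is reduced to a 2-dimensional topic mixture.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(X)

# Inspect the top words per topic (rows of lda.components_ are word weights per topic).
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top_words = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"Topic {k}: {top_words}")
```

With a real corpus, `doc_topic` gives each document's mixture over topics, which is the compact topical representation the abstract refers to.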
