AOBTM: Adaptive Online Biterm Topic Modeling for Version Sensitive Short-texts Analysis
arXiv - CS - Information Retrieval, Pub Date: 2020-09-13, DOI: arxiv-2009.09930
Mohammad Abdul Hadi and Fatemeh H Fard

Analysis of mobile app reviews plays an important role in requirements engineering, software maintenance, and the evolution of mobile apps. Mobile app developers check their users' reviews frequently to clarify the issues users experience or to capture new issues introduced by a recent app update. App reviews are dynamic, and the topics they discuss change over time. Changes in the topics among reviews collected for different versions of an app can reveal important issues about an app update. A main technique in this analysis is topic modeling. However, app reviews are short texts, and it is challenging to unveil their latent topics over time. Conventional topic models suffer from the sparsity of word co-occurrence patterns when inferring topics for short texts. Furthermore, these algorithms cannot capture topics over numerous consecutive time-slices. Online topic modeling algorithms speed up the inference of topic models for the texts collected in the latest time-slice by saving a fraction of the data from the previous time-slice. But these algorithms do not analyze the statistical data of all the previous time-slices, which can contribute to the topic distribution of the current time-slice. We propose the Adaptive Online Biterm Topic Model (AOBTM) to model topics in short texts adaptively. AOBTM alleviates the sparsity problem in short texts and considers the statistical data for an optimal number of previous time-slices. We also propose parallel algorithms to automatically determine the optimal number of topics and the best number of previous versions that should be considered in the topic inference phase. Automatic evaluation on collections of app reviews and real-world short text datasets confirms that AOBTM finds more coherent topics and outperforms the state-of-the-art baselines.
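
The biterm idea the abstract relies on can be illustrated with a minimal Python sketch. A biterm is an unordered pair of words that co-occur in the same short text; counting biterms over the whole collection, rather than words within each document, is how biterm topic models ease the sparsity of co-occurrence patterns. The function names, whitespace tokenization, and sample reviews below are assumptions made for illustration. This is not the authors' AOBTM implementation, and it covers neither topic inference nor the weighting of previous time-slices.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered pair of distinct words (biterm) in one short text."""
    return [
        tuple(sorted(pair))
        for pair in combinations(tokens, 2)
        if pair[0] != pair[1]  # skip pairs made of the same word twice
    ]


def count_biterms(reviews):
    """Aggregate biterm counts over a whole collection of short texts."""
    counts = Counter()
    for review in reviews:
        # Hypothetical preprocessing: lowercase and split on whitespace.
        counts.update(extract_biterms(review.lower().split()))
    return counts


# Made-up reviews, for illustration only.
reviews = [
    "app crashes after update",
    "update crashes app",
    "love new design",
]
biterm_counts = count_biterms(reviews)
```

Here `biterm_counts[("app", "crashes")]` is 2, because the pair appears in two reviews even though neither review repeats a word. Corpus-level counts like these give a model more co-occurrence evidence than any single short review provides.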

Updated: 2020-09-22