Characterization of topic-based online communities by combining network data and user generated content
Statistics and Computing (IF 2.2), Pub Date: 2020-05-22, DOI: 10.1007/s11222-020-09947-5
Mirai Igarashi, Nobuhiko Terui

This study proposes a model for characterizing online communities by combining two types of data: network data and user-generated content (UGC). Existing models for detecting the community structure of a network employ only network information. However, not all people connected in a network share the same interests. For instance, even if students belong to the same community of “school,” they may have various hobbies such as music, books, or sports. Hence, it is more realistic, and more beneficial for companies, to identify communities according to the interests that members reveal in their communications on social media. In addition, people may belong to multiple communities such as family, work, and online friends. Our model explores multiple overlapping communities according to their topics, which are identified by using the two types of data jointly. To validate the main features of the proposed model, our simulation study shows that the model correctly identifies a community structure that could not be found without considering both network data and UGC. Furthermore, an empirical analysis using Twitter data shows that our model can find realistic and meaningful community structures in large online networks and has good predictive performance.

Updated: 2020-05-22