A Survey of Sampling Method for Social Media Embeddedness Relationship
ACM Computing Surveys (IF 23.8), Pub Date: 2022-11-21, DOI: 10.1145/3524105
Yingan Cui 1, Xue Li 2, Junhuai Li 1, Huaijun Wang 1, Xiaogang Chen 3

Social media embeddedness relationships consist of online social networks formed by self-organized individual actors, and they significantly affect many aspects of our lives. Because using the population networks generated by social media embeddedness relationships to study practical issues is costly and inefficient, sampling techniques have become more important than ever. Our work consists of three parts. In the first part, we comprehensively analyze current sampling selection methods, evaluation indexes, and evaluation methods in terms of their technological evolution. In the second part, we systematically conduct sampling tests using representative large-scale social media datasets. The test results indicate that unequal-probability sampling methods can construct sample networks that are similar to the population network at the macroscale and microscale, and that they outperform equal-probability methods. However, non-negligible sampling errors at the mesoscale seriously affect sampling reliability and validity. MANOVA tests show that the direct cause of these sampling errors is nodes with low in-degree and medium-to-high betweenness located between the core and the periphery; current sampling methods cannot accurately sample such complex interconnected structures. In the third part, we summarize the pros and cons of current sampling methods and provide suggestions for future work.
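
The abstract's distinction between equal-probability and unequal-probability sampling can be illustrated with a minimal sketch. This is not the survey's own method or benchmark: the function names (`build_graph`, `uniform_node_sample`, `random_walk_sample`, `induced_edges`) are hypothetical, the graph is treated as undirected and connected for simplicity, and a simple random walk stands in as just one example of an unequal-probability design.

```python
import random
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs (assumed simplification)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def uniform_node_sample(adj, k, rng):
    """Equal-probability design: every node has the same chance of selection."""
    return set(rng.sample(sorted(adj), k))


def random_walk_sample(adj, k, rng, start=None):
    """Unequal-probability design: a simple random walk on a connected
    undirected graph visits nodes, in the long run, at a rate proportional
    to their degree, so well-connected nodes are favored.
    Assumes the graph is connected and k <= number of nodes."""
    current = start if start is not None else rng.choice(sorted(adj))
    visited = {current}
    while len(visited) < k:
        current = rng.choice(sorted(adj[current]))
        visited.add(current)
    return visited


def induced_edges(adj, nodes):
    """Edges of the sample network: population edges with both ends sampled."""
    return {(u, v) for u in nodes for v in adj[u] if v in nodes and u < v}


if __name__ == "__main__":
    # Toy population network: one hub (node 0) plus a chain of peripheral nodes.
    edges = [(0, i) for i in range(1, 6)] + [(1, 2), (2, 3), (3, 4), (4, 5)]
    adj = build_graph(edges)
    rng = random.Random(0)
    for name, sampler in [("uniform", uniform_node_sample),
                          ("random walk", random_walk_sample)]:
        nodes = sampler(adj, 4, rng)
        print(name, sorted(nodes), sorted(induced_edges(adj, nodes)))
```

Comparing the induced sample networks against the population network on degree, clustering, or betweenness statistics is one way such designs could be evaluated; the specific evaluation indexes used in the survey are not given in the abstract.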




Updated: 2022-11-21