CrawlSN: community-aware data acquisition with maximum willingness in online social networks
Data Mining and Knowledge Discovery (IF 4.8), Pub Date: 2020-09-08, DOI: 10.1007/s10618-020-00709-5
Bay-Yuan Hsu, Chia-Lin Tu, Ming-Yi Chang, Chih-Ya Shen

Real social network datasets with community structures are critical for evaluating algorithms in Online Social Networks (OSNs). However, obtaining such community data from OSNs has become increasingly challenging due to privacy issues and government regulations. In this paper, we make a first attempt to address two important factors, i.e., user willingness and the existence of community structure, in order to obtain more complete OSN data. We formulate a new research problem, namely Community-aware Data Acquisition with Maximum Willingness in Online Social Networks (CrawlSN), which identifies a group of users from an OSN such that the group is a socially tight community and the users' willingness to contribute data is maximized. We prove that CrawlSN is NP-hard and inapproximable within any factor unless P = NP, and propose an effective algorithm, named Community-aware Group Identification with Maximum Willingness (CIW), with various processing strategies. We conduct an evaluation study with 1093 volunteers to validate our problem formulation and demonstrate that CrawlSN outperforms the other alternatives. We also perform extensive experiments on 7 real datasets and show that the proposed CIW outperforms the other baselines in both solution quality and efficiency.
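
To make the shape of the problem concrete, the following is a minimal brute-force sketch in Python. It is not the paper's CIW algorithm, and the abstract does not give the formal definitions, so everything here is an assumption made for illustration: "socially tight" is modeled as every member having at least `min_inner_degree` neighbors inside the group, willingness is a per-user score that is summed, and the group size is fixed. The names `best_group`, `adj`, `willingness`, `size`, and `min_inner_degree` are hypothetical.

```python
from itertools import combinations


def best_group(adj, willingness, size, min_inner_degree):
    """Exhaustively search for a tight group with maximum total willingness.

    adj: dict mapping each user to the set of that user's neighbors
    willingness: dict mapping each user to a numeric willingness score
    size: number of users to select (assumed constraint)
    min_inner_degree: assumed tightness rule; each selected user needs at
        least this many neighbors inside the selected group
    """
    best, best_score = None, float("-inf")
    for group in combinations(sorted(adj), size):
        members = set(group)
        # Assumed tightness check: count each member's neighbors in the group.
        if all(len(adj[u] & members) >= min_inner_degree for u in group):
            score = sum(willingness[u] for u in group)
            if score > best_score:
                best, best_score = members, score
    return best, best_score


# Toy network: a triangle a-b-c with a low-willingness core,
# plus a path c-d-e whose users are willing but loosely connected.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
willingness = {"a": 0.2, "b": 0.3, "c": 0.4, "d": 0.9, "e": 0.8}

group, score = best_group(adj, willingness, size=3, min_inner_degree=2)
print(sorted(group), score)
```

In this toy network the triangle a, b, c is the only group of three that satisfies the assumed tightness rule, so it is selected even though d and e are individually more willing. The exhaustive search grows combinatorially with the network size, which is consistent with the hardness result stated above and is why a dedicated algorithm is needed in practice.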

Updated: 2020-09-08