GraphBoot: Quantifying Uncertainty in Node Feature Learning on Large Networks
IEEE Transactions on Knowledge and Data Engineering (IF 8.9), Pub Date: 2021-01-01, DOI: 10.1109/tkde.2019.2925355
Cuneyt Akcora, Yulia R. Gel, Murat Kantarcioglu, Vyacheslav Lyubchich, Bhavani Thuraisingham

In recent years, as online social networks continue to grow in size, estimating node features, such as sociodemographics, preferences, and health status, in a scalable and reliable way has become a primary research direction in social network mining. Although many techniques have been developed for estimating various node features, quantifying the uncertainty in such estimates has received little attention. Furthermore, most existing methods study networks parametrically, which limits insight into the necessary quantity of queried data, reliable feature estimation, and estimator uncertainty. Uncertainty quantification is critical for answering key questions: given limited availability of social network data, how much data should be queried from the network? Which node features can be learned reliably? And, more importantly, how can we evaluate the uncertainty of our estimators? Uncertainty quantification is not equivalent to network sampling; rather, it is a key complement to sampling and the associated reliability analysis. To our knowledge, this paper is the first work that sheds light on uncertainty quantification and uncertainty propagation in social network feature mining. We propose a novel non-parametric bootstrap method for uncertainty analysis of node features in social network mining and derive its asymptotic properties. Furthermore, we develop a new metric based on the dispersion of estimates, enabling analysts to assess, from the estimated uncertainty, how much more information is needed to increase prediction reliability. We demonstrate the effectiveness of our new uncertainty quantification methodology with extensive experiments on real-life social networks and a case study of mental health on Twitter.
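To make the idea of bootstrap-based uncertainty concrete, the following is a minimal sketch of a generic percentile bootstrap applied to a node feature observed on a sample of nodes. It is not the GraphBoot procedure itself, which the abstract does not spell out. The function name `bootstrap_mean_ci`, the toy data `sampled_degrees`, and the use of interval width as a dispersion measure are all assumptions made for illustration.

```python
import random


def bootstrap_mean_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Generic percentile bootstrap for the mean of a node feature.

    Illustrative only: resamples the observed values with replacement.
    It does not reproduce the network-aware resampling of the paper.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    boot_means = []
    for _ in range(n_boot):
        # Draw n values with replacement from the observed sample.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        boot_means.append(sum(resample) / n)
    boot_means.sort()
    # Percentile interval: cut alpha/2 of the replicates from each tail.
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    point = sum(values) / n
    # Interval width serves here as a simple stand-in dispersion measure.
    return point, lower, upper, upper - lower


# Hypothetical feature values (e.g. degrees) for 12 queried nodes.
sampled_degrees = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
point, lower, upper, width = bootstrap_mean_ci(sampled_degrees)
print(f"estimate={point:.2f}, 95% interval=({lower:.2f}, {upper:.2f}), width={width:.2f}")
```

In this sketch, a wide interval suggests that more nodes would need to be queried before the estimate can be trusted; rerunning it on a larger sample shows how the width shrinks as more data becomes available.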


