Maximum Likelihood Estimation of Power-law Degree Distributions via Friendship Paradox-based Sampling
ACM Transactions on Knowledge Discovery from Data (IF 3.6), Pub Date: 2021-05-19, DOI: 10.1145/3451166
Buddhika Nettasinghe, Vikram Krishnamurthy

This article considers the problem of estimating a power-law degree distribution of an undirected network using sampled data. Although power-law degree distributions are ubiquitous in nature, the widely used parametric methods for estimating them (e.g., linear regression on double-logarithmic axes and maximum likelihood estimation with uniformly sampled nodes) suffer from large variance because few data points come from the tail of the distribution. As a solution, we present a novel maximum likelihood estimation approach that exploits the friendship paradox to sample more efficiently from the tail of the degree distribution. We show analytically that the proposed method yields a smaller bias, a smaller variance, and a smaller Cramér–Rao lower bound than the vanilla maximum likelihood estimate obtained with uniformly sampled nodes (the most commonly used method in the literature). Detailed numerical and empirical results illustrate the performance of the proposed method under different conditions and how it compares with alternative methods. We also show that the proposed method and its desirable properties (i.e., smaller bias, variance, and Cramér–Rao lower bound than the vanilla method based on uniform samples) extend to parametric degree distributions other than the power law, such as exponential degree distributions. All the numerical and empirical results are reproducible, and the code is publicly available on GitHub.
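
The idea in the abstract can be illustrated with a small sketch. This is not the authors' code, and it rests on simplifying assumptions that the abstract does not state: degrees are treated as a continuous power law p(x) ∝ x^(-alpha) for x ≥ x_min with alpha > 2, and a "random friend" (a random end of a random edge) is assumed to have the size-biased degree density q(x) ∝ x·p(x) ∝ x^(-(alpha-1)). Under those assumptions the maximum likelihood estimate has a closed form in both cases. The function names `sample_power_law` and `mle_exponent` are hypothetical and chosen only for this illustration.

```python
import math
import random


def sample_power_law(alpha, n, x_min=1.0, rng=random):
    """Draw n values from a continuous power law p(x) ∝ x^(-alpha), x >= x_min.

    Uses inverse-transform sampling; requires alpha > 1.
    """
    # 1 - rng.random() lies in (0, 1], so the negative power is always finite.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def mle_exponent(samples, x_min=1.0, size_biased=False):
    """Closed-form MLE of alpha under the continuous approximation.

    size_biased=False: samples are degrees of uniformly sampled nodes,
                       density ∝ x^(-alpha), so alpha_hat = 1 + n / sum(log(x / x_min)).
    size_biased=True:  samples are degrees of random friends (assumed density
                       ∝ x^(-(alpha - 1))), so alpha_hat = 2 + n / sum(log(x / x_min)).
    """
    log_sum = sum(math.log(x / x_min) for x in samples)
    shift = 2.0 if size_biased else 1.0
    return shift + len(samples) / log_sum


if __name__ == "__main__":
    rng = random.Random(1)
    alpha, n, trials = 2.5, 50, 200

    uniform_estimates, friend_estimates = [], []
    for _ in range(trials):
        # Degrees of uniformly sampled nodes.
        uniform_degrees = sample_power_law(alpha, n, rng=rng)
        # Degrees of random friends: the size-biased law is a power law
        # with exponent alpha - 1 (idealized assumption, see lead-in).
        friend_degrees = sample_power_law(alpha - 1.0, n, rng=rng)
        uniform_estimates.append(mle_exponent(uniform_degrees))
        friend_estimates.append(mle_exponent(friend_degrees, size_biased=True))

    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    print("uniform sampling: variance of alpha_hat =", variance(uniform_estimates))
    print("friend sampling:  variance of alpha_hat =", variance(friend_estimates))
```

In this idealized setting the large-sample variance of the estimate is roughly (alpha - 1)^2 / n for uniform samples and (alpha - 2)^2 / n for friend samples, which is consistent with the smaller variance the abstract reports. The sketch draws friend degrees directly from the size-biased law rather than from a real network, so it says nothing about discrete degrees, finite networks, or degree correlations.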

Updated: 2021-05-19