• In evolving networks, PageRank can exhibit stronger age biases than indegree.
• We use a mean-field approximation to understand the origin of PageRank’s age bias.
• We estimate the expected PageRank score of nodes of a given indegree and age.
• The obtained estimate is more accurate than estimates based on indegree alone.
• We validate our results on empirical data and a growing network model with varying assortativity.
Abstract
Biases impair the effectiveness of algorithms. For example, the age bias of the widely used PageRank algorithm impairs its ability to rank nodes effectively in growing networks. PageRank’s temporal bias cannot be fully explained by existing analytic results, which predict a linear relation between a node’s expected PageRank score and its indegree. We show that in evolving networks, under a mean-field approximation, the expected PageRank score of a node can be expressed as the product of the node’s indegree and a previously neglected age factor that can “amplify” the indegree’s age bias. Using two well-known empirical networks, we show that our analytic results explain PageRank’s observed age bias and, when the age bias is amplified, yield estimates of a node’s PageRank score that are more accurate than estimates based solely on local structural information. Accuracy gains are larger in degree–degree correlated networks, as revealed by a growing directed network model with tunable assortativity. Our approach can be used to analytically study other kinds of ranking bias.
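To make the notion of age bias concrete, the following minimal Python sketch grows a toy directed network and compares how indegree and PageRank are distributed across node age groups. It is an illustration only, not the model or the estimator used in this work: the growth rule (each new node links to `m` older nodes with probability proportional to indegree plus one), the function names `grow_network` and `pagerank`, the damping factor `alpha = 0.85`, and the split into ten age groups are all assumptions chosen for simplicity.

```python
import numpy as np

def grow_network(n_nodes=2000, m=3, seed=0):
    """Toy growth model (an assumption, not the paper's model): each new node
    links to m distinct older nodes with probability ~ (indegree + 1)."""
    rng = np.random.default_rng(seed)
    indeg = np.zeros(n_nodes)
    edges = []
    for new in range(m, n_nodes):
        w = indeg[:new] + 1.0
        targets = rng.choice(new, size=m, replace=False, p=w / w.sum())
        for t in targets:
            edges.append((new, int(t)))  # edge points from newer to older node
            indeg[t] += 1
    return np.array(edges), indeg

def pagerank(edges, n, alpha=0.85, tol=1e-10, max_iter=200):
    """Power iteration; dangling nodes spread their score uniformly."""
    src, dst = edges[:, 0], edges[:, 1]
    outdeg = np.bincount(src, minlength=n).astype(float)
    dangling = outdeg == 0
    p = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # Score received through incoming links.
        contrib = np.bincount(dst, weights=p[src] / outdeg[src], minlength=n)
        new_p = alpha * (contrib + p[dangling].sum() / n) + (1 - alpha) / n
        if np.abs(new_p - p).sum() < tol:
            return new_p
        p = new_p
    return p

edges, indeg = grow_network()
n = len(indeg)
p = pagerank(edges, n)

# Node index equals order of arrival, so group 0 holds the oldest nodes.
groups = np.array_split(np.arange(n), 10)
print("age group | indegree / mean | PageRank / mean")
for g, idx in enumerate(groups):
    print(f"{g:9d} | {indeg[idx].mean() / indeg.mean():15.3f} | "
          f"{p[idx].mean() / p.mean():15.3f}")
```

If PageRank depended on indegree alone, the two normalized columns would be expected to follow similar profiles across age groups; a systematic divergence between them is the kind of effect that an age factor is meant to capture. How large that divergence is depends on the growth rule and parameters assumed here.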