Estimation of Subgraph Densities in Noisy Networks, Journal of the American Statistical Association

Estimation of Subgraph Densities in Noisy Networks
Journal of the American Statistical Association (IF 3.7), Pub Date: 2020-07-20, DOI: 10.1080/01621459.2020.1778482
Jinyuan Chang, Eric D. Kolaczyk, Qiwei Yao

Abstract

While it is common practice in applied network analysis to report various standard network summary statistics, these numbers are rarely accompanied by uncertainty quantification. Yet any error inherent in the measurements underlying the construction of the network, or in the network construction procedure itself, necessarily must propagate to any summary statistics reported. Here we study the problem of estimating the density of an arbitrary subgraph, given a noisy version of some underlying network as data. Under a simple model of network error, we show that consistent estimation of such densities is impossible when the rates of error are unknown and only a single network is observed. Accordingly, we develop method-of-moment estimators of network subgraph densities and error rates for the case where a minimal number of network replicates are available. These estimators are shown to be asymptotically normal as the number of vertices increases to infinity. We also provide confidence intervals for quantifying the uncertainty in these estimates based on the asymptotic normality. To construct the confidence intervals, a new and nonstandard bootstrap method is proposed to compute asymptotic variances, which is infeasible otherwise. We illustrate the proposed methods in the context of gene coexpression networks. Supplementary materials for this article are available online.
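To make the identifiability point concrete, here is a minimal Python sketch of a method-of-moments estimate for the simplest subgraph density, the edge density. It is an illustration under stated assumptions, not the estimator from the paper. The assumed error model (not spelled out in the abstract) is that each replicate reports a false edge with probability alpha and misses a true edge with probability beta, independently across vertex pairs and replicates, with alpha + beta < 1. Under that model a single network yields one moment equation in three unknowns (density, alpha, beta), whereas three replicates yield three equations that can be solved in closed form. The function names `simulate_noisy_replicates` and `mom_edge_density` are hypothetical.

```python
import numpy as np


def simulate_noisy_replicates(n, delta, alpha, beta, n_rep=3, seed=0):
    """Draw a true graph and n_rep noisy copies (assumed error model)."""
    rng = np.random.default_rng(seed)
    iu = np.triu_indices(n, k=1)              # one entry per vertex pair
    true_edges = rng.random(iu[0].size) < delta
    # P(observed edge | no true edge) = alpha
    # P(observed edge | true edge)    = 1 - beta
    p_obs = np.where(true_edges, 1.0 - beta, alpha)
    replicates = []
    for _ in range(n_rep):
        obs = rng.random(p_obs.size) < p_obs
        Y = np.zeros((n, n), dtype=int)
        Y[iu] = obs
        replicates.append(Y + Y.T)            # symmetric, zero diagonal
    return replicates, true_edges.mean()


def mom_edge_density(replicates):
    """Solve three moment equations for (delta, alpha, beta).

    With a = alpha and b = 1 - beta, the moments of the replicates are
        m_k = (1 - delta) * a**k + delta * b**k,  k = 1, 2, 3,
    so a and b are the roots of x**2 - s*x + p = 0 with
        s = (m3 - m1*m2) / (m2 - m1**2),  p = s*m1 - m2.
    Assumes the first three replicates are used and that a != b.
    """
    iu = np.triu_indices(replicates[0].shape[0], k=1)
    y1, y2, y3 = (Y[iu].astype(float) for Y in replicates[:3])
    m1 = (y1 + y2 + y3).mean() / 3
    m2 = (y1 * y2 + y1 * y3 + y2 * y3).mean() / 3
    m3 = (y1 * y2 * y3).mean()
    s = (m3 - m1 * m2) / (m2 - m1 ** 2)
    p = s * m1 - m2
    root = np.sqrt(max(s ** 2 - 4 * p, 0.0))
    a, b = (s - root) / 2, (s + root) / 2     # a < b since alpha + beta < 1
    delta = (m1 - a) / (b - a)
    return {"delta": delta, "alpha": a, "beta": 1 - b}


if __name__ == "__main__":
    reps, true_density = simulate_noisy_replicates(600, 0.1, 0.05, 0.2)
    print("true edge density:", true_density)
    print("estimates:", mom_edge_density(reps))
```

The sketch covers only edge density and point estimation. Densities of larger subgraphs and the bootstrap-based confidence intervals described in the abstract are outside its scope.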

Updated: 2020-07-20
