Why overfitting is not (usually) a problem in partial correlation networks.
Psychological Methods (IF 10.929), Pub Date: 2022-04-14, DOI: 10.1037/met0000437
Donald R. Williams, Josue E. Rodriguez

Network psychometrics is undergoing a time of methodological reflection. In part, this was spurred by the revelation that ℓ₁-regularization does not reduce spurious associations in partial correlation networks. In this work, we address another motivation for the widespread use of regularized estimation: the belief that it is needed to mitigate overfitting. We first clarify important aspects of overfitting and the bias-variance tradeoff that are especially relevant for the network literature, where the number of nodes or items in a psychometric scale is not large relative to the number of observations (i.e., a low p/n ratio). This revealed that bias, and especially variance, are most problematic at p/n ratios rarely encountered in practice. We then introduce a nonregularized method, based on classical hypothesis testing, that fulfills two desiderata: (a) reducing or controlling the false positive rate and (b) quelling concerns about overfitting by providing accurate predictions. These were the primary motivations for initially adopting the graphical lasso (glasso). In several simulation studies, our nonregularized method provided more than competitive predictive performance and, in many cases, outperformed glasso. It thus appears that nonregularized, rather than regularized, estimation best satisfies these desiderata. We then provide insights into using our methodology. Here we discuss the multiple comparisons problem in relation to prediction: stringent alpha levels, which result in a sparse network, can deteriorate predictive accuracy. We end by emphasizing key advantages of our approach that make it ideal for both inference and prediction in network analysis.
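
To make the general idea behind a test-based, nonregularized estimator concrete, the sketch below (a minimal Python illustration, not the authors' implementation, and with a toy equicorrelation data-generating model chosen purely for the example) computes partial correlations from the inverse sample covariance matrix and keeps only the edges that pass a classical Fisher-z test at level alpha; a stricter alpha yields a sparser network, which, as noted above, can come at a cost in predictive accuracy.

```python
# Minimal sketch: nonregularized partial correlation network via classical
# hypothesis testing (Fisher-z). Illustrative only; not the authors' software.
import numpy as np
from scipy import stats

def partial_correlations(X: np.ndarray) -> np.ndarray:
    """Partial correlations from the precision (inverse covariance) matrix."""
    theta = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(theta))
    pcor = -theta / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def test_based_network(X: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """Retain edges whose Fisher-z test rejects rho_ij = 0; zero out the rest."""
    n, p = X.shape
    pcor = partial_correlations(X)
    # Each partial correlation conditions on the remaining p - 2 variables, so
    # the approximate standard error of atanh(r) is 1 / sqrt(n - 3 - (p - 2)).
    z = np.arctanh(np.clip(pcor, -0.999999, 0.999999)) * np.sqrt(n - p - 1)
    pvals = 2 * stats.norm.sf(np.abs(z))
    network = np.where(pvals < alpha, pcor, 0.0)
    np.fill_diagonal(network, 0.0)
    return network

# Example with a low p/n ratio typical of psychometric scales (p = 10, n = 500).
rng = np.random.default_rng(0)
cov = np.full((10, 10), 0.3)
np.fill_diagonal(cov, 1.0)
X = rng.multivariate_normal(np.zeros(10), cov, size=500)
net = test_based_network(X, alpha=0.05)  # a stricter alpha gives a sparser network
```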
