Squaring Things Up with R²: What It Is and What It Can (and Cannot) Tell You, Journal of Analytical Toxicology


Squaring Things Up with R²: What It Is and What It Can (and Cannot) Tell You
Journal of Analytical Toxicology (IF 2.3), Pub Date: 2021-04-12, DOI: 10.1093/jat/bkab036
Félix Camirand Lemyre (1, 2, 3), Kevin Chalifoux (1, 4), Brigitte Desharnais (4), Pascal Mireault (4)


The coefficient of correlation (r) and the coefficient of determination (R² or r²) have long been used in analytical chemistry, bioanalysis and forensic toxicology as figures demonstrating linearity of the calibration data in method validation. We clarify here what these two figures are and why they should not be used for this purpose in the context of model fitting for prediction. R² evaluates whether, and to what degree, the data are better explained by the regression model used than by no model at all (i.e., a flat line of slope = 0 and intercept $\bar y$). In the context of calibration curves, the fact that a linear regression explains the data better than no model at all should hardly be a point of contention. On closer examination, a series of restrictions on the interpretation of these coefficients appears. They cannot indicate whether the dataset at hand is linear or not, because they assume that the regression model used is an adequate model for the data. For the same reason, they cannot disprove the existence of another functional relationship in the data. By definition, they are influenced by the variability of the data. The slope of the calibration curve also changes their value. Finally, when heteroscedastic data are analyzed, the coefficients are influenced by the spacing of calibration levels within the dynamic range, unless a weighted version of the equations is used. With these considerations in mind, we suggest that r and R² no longer be used as figures of merit to demonstrate linearity of calibration curves in method validations. Of course, this does not preclude their use in other contexts. Alternative paths for evaluation of linearity and calibration model validity are briefly presented.
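As a rough illustration of two of the points above, the sketch below computes R² = 1 - SS_res/SS_tot for an unweighted straight-line fit, where SS_tot is the residual sum of squares of the "no model" flat line at $\bar y$. It is not taken from the paper: the calibration levels, the curved response, the slopes and the noise values are all hypothetical numbers chosen for the example, and the helper names `fit_line` and `r_squared` are ours.

```python
def fit_line(x, y):
    """Unweighted least-squares straight line; returns (slope, intercept)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def r_squared(x, y):
    """R^2 = 1 - SS_res / SS_tot for a straight-line fit of y on x."""
    slope, intercept = fit_line(x, y)
    y_mean = sum(y) / len(y)
    # Residual sum of squares of the fitted line.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    # Residual sum of squares of the flat line at the mean ("no model").
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical calibration levels (arbitrary concentration units).
levels = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]

# Case 1: noise-free but curved response (assumed quadratic term).
# At the top level the signal is 25% below what a straight line would give.
curved = [2.0 * c - 0.005 * c ** 2 for c in levels]

# Cases 2 and 3: truly linear responses with the SAME hypothetical noise,
# differing only in slope.
noise = [3.0, -4.0, 2.0, -1.0, 5.0, -3.0, -2.0]
steep = [2.0 * c + e for c, e in zip(levels, noise)]
shallow = [0.2 * c + e for c, e in zip(levels, noise)]

r2_curved = r_squared(levels, curved)
r2_steep = r_squared(levels, steep)
r2_shallow = r_squared(levels, shallow)

print(f"curved, no noise : R^2 = {r2_curved:.4f}")
print(f"linear, slope 2.0: R^2 = {r2_steep:.4f}")
print(f"linear, slope 0.2: R^2 = {r2_shallow:.4f}")
```

With these made-up numbers, the clearly curved series still yields an R² above 0.99, while the two genuinely linear series with identical noise yield very different values purely because of their slopes. Neither result says anything about whether a straight line is the right calibration model.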


Updated: 2021-04-12
