Overrating Classifier Performance in ROC Analysis in the Absence of a Test Set: Evidence from Simulation and Italian CARATkids Validation


Methods of Information in Medicine (IF 1.3). Pub Date: 2019-11-19. DOI: 10.1055/s-0039-1693732
Giovanna Cilluffo (1, 2), Salvatore Fasola (1, 2), Giuliana Ferrante (3), Laura Montalbano (1), Ilaria Baiardini (4), Luciana Indinnimeo (5), Giovanni Viegi (1, 6), Joao A Fonseca (7), Stefania La Grutta (1)


BACKGROUND: The use of receiver operating characteristic curves, or "ROC analysis," has become common in biomedical research to support decisions. However, sensitivity, specificity, and misclassification rates are still often estimated on the training sample, overlooking the risk of overrating the test performance.

METHODS: A simulation study was performed to highlight the inferential implications of splitting (or not splitting) the dataset into a training set and a test set. The classifier was assumed to be normally distributed given the disease status, and Youden's criterion was used to detect the optimal cutoff. Then, an ROC analysis with sample split was applied to assess the discriminant validity of the Italian version of the Control of Allergic Rhinitis and Asthma Test (CARATkids) questionnaire for children with asthma and rhinitis, for which recent studies may have reported liberal performance estimates.

RESULTS: The simulation study showed that both the single split and cross-validation (CV) provided unbiased estimators of sensitivity, specificity, and misclassification rate, thereby allowing computation of confidence intervals. For the Italian CARATkids questionnaire, the misclassification rate estimated by fivefold CV was 0.22, with a 95% confidence interval of 0.14 to 0.30, indicating acceptable discriminant validity.

CONCLUSIONS: Splitting the data into a training set and a test set avoids overrating the test performance in ROC analysis. Assessed with this method, the Italian CARATkids is valid for assessing disease control in children with asthma and rhinitis.
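The simulation design described under METHODS can be illustrated with a minimal sketch. It is not the authors' code. The effect size (`delta=1.0`), the sample size (50 subjects per class), the number of replicates, and the convention that a score at or above the cutoff means "diseased" are all assumptions chosen for illustration. The sketch draws a normally distributed classifier within each disease group, picks the cutoff by Youden's criterion, and compares three misclassification estimates: no split (cutoff chosen and evaluated on the same data), a single split, and fivefold CV.

```python
import math
import numpy as np


def youden_cutoff(scores, labels):
    """Cutoff maximizing Youden's J = sensitivity + specificity - 1.

    Assumed convention: a subject is classified as diseased when score >= cutoff.
    """
    best_j, best_c = -np.inf, None
    for c in np.unique(scores):
        pred = scores >= c
        sens = pred[labels == 1].mean()
        spec = (~pred[labels == 0]).mean()
        j = sens + spec - 1.0
        if j > best_j:
            best_j, best_c = j, c
    return best_c


def error_rate(scores, labels, cutoff):
    """Misclassification rate of the rule `score >= cutoff`."""
    return float(np.mean((scores >= cutoff) != (labels == 1)))


def stratified_folds(labels, k, rng):
    """Assign each subject to one of k folds, balancing the two classes."""
    folds = np.empty(labels.size, dtype=int)
    for cls in (0, 1):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        folds[idx] = np.arange(idx.size) % k
    return folds


def one_replicate(rng, n_per_class=50, delta=1.0, k=5):
    """Simulate one dataset; return (apparent, single-split, k-fold CV) error."""
    labels = np.repeat([0, 1], n_per_class)
    # Classifier is N(0, 1) in the healthy group and N(delta, 1) in the diseased group.
    scores = rng.normal(loc=delta * labels, scale=1.0)

    # No split: the cutoff is chosen and evaluated on the same data.
    apparent = error_rate(scores, labels, youden_cutoff(scores, labels))

    folds = stratified_folds(labels, k, rng)

    # Single split: fold 0 is the test set, the remaining folds are the training set.
    test = folds == 0
    cut = youden_cutoff(scores[~test], labels[~test])
    split = error_rate(scores[test], labels[test], cut)

    # k-fold CV: every subject is classified by a cutoff it did not help choose.
    wrong = 0
    for f in range(k):
        held = folds == f
        cut = youden_cutoff(scores[~held], labels[~held])
        wrong += np.sum((scores[held] >= cut) != (labels[held] == 1))
    cv = wrong / labels.size
    return apparent, split, cv


def run(n_rep=300, seed=0):
    """Average the three estimates over n_rep simulated datasets."""
    rng = np.random.default_rng(seed)
    results = np.array([one_replicate(rng) for _ in range(n_rep)])
    return results.mean(axis=0)


def best_possible_error(delta=1.0):
    """Error of the ideal cutoff delta/2 under equal variances and class sizes."""
    return 0.5 * (1.0 + math.erf(-delta / 2.0 / math.sqrt(2.0)))


if __name__ == "__main__":
    apparent, split, cv = run()
    print(f"apparent={apparent:.3f} split={split:.3f} cv={cv:.3f} "
          f"ideal={best_possible_error():.3f}")
```

Under these assumptions, no cutoff can have a true error below `best_possible_error()`, so an average no-split estimate that falls below that floor is optimistic by construction. The split and CV estimates evaluate each cutoff on subjects that played no part in choosing it, which is the property the abstract relies on when it reports a confidence interval around the fivefold CV estimate.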


Updated: 2019-11-19
