A copula-based method of classifying individuals into binary disease categories using dependent biomarkers
Statistical Methods & Applications (IF 1), Pub Date: 2020-01-27, DOI: 10.1007/s10260-020-00507-9
Shofiqul Islam, Sonia Anand, Jemila Hamid, Lehana Thabane, Joseph Beyene

Classification of a disease often depends on more than one test, and the tests can be interrelated. Under an incorrect assumption of independence, test results based on dependent biomarkers can lead to conflicting disease classifications. We develop a copula-based classification method that takes dependency into account and leads to a unique decision. We first construct the joint probability distribution of the biomarkers using the Frank, Clayton and Gumbel copulas. We then develop the classification method and perform a comprehensive simulation. Using simulated data sets, we study the statistical properties of the joint probability distributions and determine the joint threshold with maximum classification accuracy. Our simulation results show that parameter estimates for the copula-based bivariate distributions are not biased. We observe that the thresholds for disease classification converge to a stationary distribution across different choices of copulas. We also observe that the classification accuracy decreases as the dependence parameter of the copulas increases. Finally, we illustrate our method with a real data example, in which we identify the joint threshold of the Apolipoprotein B to Apolipoprotein A1 ratio and the total cholesterol to high-density lipoprotein ratio for the classification of myocardial infarction. We conclude that the copula-based method works well in identifying the joint threshold of two dependent biomarkers for outcome classification. Our method is flexible and allows modeling broad classes of bivariate distributions that take dependency into account. The threshold may allow clinicians to uniquely classify individuals at risk of developing the disease and to plan for early intervention.
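To make the copula construction mentioned in the abstract more concrete, here is a minimal Python sketch of the standard textbook forms of the three copulas it names (Clayton, Gumbel and Frank). It is not the authors' implementation: the function names, the parameter values (`theta`) and the 0.9 marginal cut-offs are assumptions chosen only for illustration. The sketch compares the probability that both markers exceed their cut-offs under each copula with the value obtained when independence is assumed.

```python
import math


def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v); needs theta > 0 and u, v in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel_cdf(u, v, theta):
    """Gumbel copula C(u, v); needs theta >= 1 and u, v in (0, 1]."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def frank_cdf(u, v, theta):
    """Frank copula C(u, v); needs theta != 0 and u, v in [0, 1]."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def joint_exceedance(copula, u, v, theta):
    """P(U > u, V > v) = 1 - u - v + C(u, v) for uniform margins U, V."""
    return 1.0 - u - v + copula(u, v, theta)


if __name__ == "__main__":
    # Hypothetical cut-offs on the probability scale (90th percentiles).
    u = v = 0.9
    print("independence:", (1.0 - u) * (1.0 - v))
    # Illustrative dependence parameters, not estimates from the paper.
    for name, copula, theta in [
        ("Clayton", clayton_cdf, 2.0),
        ("Gumbel", gumbel_cdf, 2.0),
        ("Frank", frank_cdf, 5.0),
    ]:
        print(name, joint_exceedance(copula, u, v, theta))
```

With positive dependence, each copula assigns a higher probability to both markers being elevated together than the independence product does, which is one way to see why ignoring dependence can give conflicting classifications.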




Updated: 2020-01-27