On Optimal Correlation-Based Prediction
The American Statistician (IF 1.8), Pub Date: 2022-04-22, DOI: 10.1080/00031305.2022.2051604
Matteo Bottai, Taeho Kim, Benjamin Lieberman, George Luta, Edsel Peña

Abstract

This note examines, at the population level, the approach of obtaining predictors h̃(X) of a random variable Y, given the joint distribution of (Y, X), by maximizing the mapping h ↦ κ(Y, h(X)) for a given correlation function κ(·,·). Commencing with Pearson's correlation function, the class of such predictors is uncountably infinite. The least-squares predictor h* is the element of this class obtained by equating the expectations of Y and h(X) and equating the variances of h(X) and E(Y|X). On the other hand, replacing the second condition by the equality of the variances of Y and h(X), a natural requirement for some calibration problems, yields a unique predictor h** that has the maximum value of Lin's (1989) concordance correlation coefficient (CCC) with Y among all predictors. Since the CCC measures the degree of agreement, the new predictor h** is called the maximal agreement predictor. These predictors are illustrated for three special distributions: the multivariate normal distribution; the exponential distribution, conditional on covariates; and the Dirichlet distribution. The exponential distribution is relevant in survival analysis or in reliability settings, while the Dirichlet distribution is relevant for compositional data.

Reference: Lin, L. (1989), "A Concordance Correlation Coefficient to Evaluate Reproducibility," Biometrics, 45, 255–268. DOI: 10.2307/2532051.
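The two conditions in the abstract can be checked numerically. The following is a minimal simulation sketch, not the authors' code. It assumes a bivariate normal setting in which E(Y|X) is known in closed form. The function names (`ccc`, `simulate`) and all parameter values are hypothetical choices for illustration. The rescaling used for h** follows from the stated conditions: matching the mean and variance of Y within the class of positive affine transformations of E(Y|X) gives h**(X) = E(Y) + (sd(Y) / sd(h*(X))) (h*(X) − E(Y)).

```python
import numpy as np


def ccc(y, p):
    """Sample version of Lin's concordance correlation coefficient."""
    my, mp = y.mean(), p.mean()
    cov = np.mean((y - my) * (p - mp))
    return 2.0 * cov / (y.var() + p.var() + (my - mp) ** 2)


def simulate(n=200_000, rho=0.6, mu_y=1.0, sigma_y=2.0, seed=0):
    """Draw (Y, X) from an assumed bivariate normal model and build both predictors.

    X is standard normal and corr(Y, X) = rho > 0 (illustrative values only).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    eps = rng.standard_normal(n)
    y = mu_y + sigma_y * (rho * x + np.sqrt(1.0 - rho**2) * eps)

    # Least-squares predictor h*(X) = E(Y|X); its variance is (sigma_y * rho)^2.
    h_star = mu_y + sigma_y * rho * x

    # Rescale h* around the mean of Y so that its variance equals Var(Y).
    # Here sd(Y) / sd(h*) = 1 / rho, so h**(X) = mu_y + sigma_y * X.
    h_2star = mu_y + (h_star - mu_y) / rho
    return y, h_star, h_2star


if __name__ == "__main__":
    y, h_star, h_2star = simulate()
    for name, h in [("h*", h_star), ("h**", h_2star)]:
        pearson = np.corrcoef(y, h)[0, 1]
        mse = np.mean((y - h) ** 2)
        print(f"{name}: Pearson={pearson:.3f}  CCC={ccc(y, h):.3f}  "
              f"Var={h.var():.3f}  MSE={mse:.3f}")
```

Under these assumptions both predictors have the same Pearson correlation with Y, since one is a positive affine transformation of the other. The rescaled predictor trades a larger mean squared error for a larger CCC, in line with the abstract's description of h** as the maximal agreement predictor.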




Updated: 2022-04-22