Monte Carlo ensemble correlation coefficient for association detection
Communications in Statistics - Simulation and Computation (IF 0.8), Pub Date: 2020-10-04, DOI: 10.1080/03610918.2020.1823413
Wejdan Deebani 1 , Nezamoddin N. Kachouie 2

Abstract

Data characteristics are often summarized and represented by a set of variables. Identifying the relationships between these variables is crucial for prediction, hypothesis testing, and decision making. The relationship between two variables is often quantified using a correlation factor. Once the correlation between a response and an independent variable has been quantified, it can be used to make predictions about the response variable from the observed factor. That is, if two variables are correlated, observing one allows us to make predictions about the other, and the stronger the relationship between the variables, the more accurate the prediction. Several correlation factors have been introduced. Among them, Pearson’s Correlation Coefficient has been commonly used, while Distance Correlation and the Maximal Information Coefficient have been introduced more recently to address the shortcomings of Pearson’s Correlation Coefficient. These correlation coefficients were developed to measure associations that follow different trends. For example, Pearson’s correlation is used for linear trends, while Spearman’s correlation is used for monotonic trends. However, in many applications the underlying relationship is not obvious, which makes it difficult to choose the appropriate correlation coefficient. In this paper, we compare these factors through a series of simulations and propose a single generic factor, obtained by aggregating them, for general applications.
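
To make the contrast between these coefficients concrete, the following is a minimal sketch (not the authors' code) that evaluates Pearson's correlation, Spearman's correlation, and the sample distance correlation on three noise-free toy relationships. The function names and the toy data are assumptions chosen for illustration. The Maximal Information Coefficient and the paper's aggregated factor are left out, since the abstract does not specify how they are computed.

```python
import numpy as np


def pearson(x, y):
    """Pearson's correlation coefficient (linear association)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def average_ranks(v):
    """1-based ranks, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(v, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    return (last_rank - (counts - 1) / 2.0)[inverse]


def spearman(x, y):
    """Spearman's correlation: Pearson's correlation of the ranks."""
    return pearson(average_ranks(np.asarray(x)), average_ranks(np.asarray(y)))


def double_centered_distances(v):
    """Pairwise |v_i - v_j| matrix with row, column and grand means removed."""
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation for two 1-D samples (O(n^2) memory)."""
    a = double_centered_distances(np.asarray(x, dtype=float))
    b = double_centered_distances(np.asarray(y, dtype=float))
    dcov2 = max(float((a * b).mean()), 0.0)  # guard against rounding below zero
    dvar_x = float((a * a).mean())
    dvar_y = float((b * b).mean())
    return float(np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y)))


# Toy data (an assumption for illustration): a symmetric grid, no noise.
x = np.linspace(-1.0, 1.0, 201)
relations = {
    "linear    y = 2x + 1": 2.0 * x + 1.0,
    "monotonic y = exp(3x)": np.exp(3.0 * x),
    "quadratic y = x^2": x ** 2,
}

for name, y in relations.items():
    print(f"{name:24s} pearson={pearson(x, y):+.3f} "
          f"spearman={spearman(x, y):+.3f} "
          f"dcor={distance_correlation(x, y):.3f}")
```

On these toy relationships, all three coefficients agree on the linear trend. Pearson's correlation falls below 1 on the monotonic but nonlinear trend, while Spearman's stays at 1. On the symmetric quadratic trend, both Pearson's and Spearman's are near zero, and only the distance correlation remains clearly positive. This illustrates why the choice of coefficient depends on a relationship that is often unknown in advance.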




Updated: 2020-10-04