A non-parametric effect-size measure capturing changes in central tendency and data distribution shape
PLOS ONE (IF 3.7), Pub Date: 2020-09-24, DOI: 10.1371/journal.pone.0239623
Jörn Lötsch, Alfred Ultsch

Motivation

Calculating the magnitude of treatment effects or of differences between two groups is a common task in quantitative science. Standard difference-based effect size measures, such as the commonly used Cohen's d, fail to capture treatment-related effects on the data when those effects are not reflected in the central tendency. The present work aims at (i) developing a non-parametric alternative to Cohen's d that (ii) circumvents some of its numerical limitations and (iii) accounts for obvious changes in the data that do not affect the group means and are therefore not captured by Cohen's d.
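The limitation described above can be illustrated with a minimal sketch. It uses the standard pooled-standard-deviation form of Cohen's d; the two small samples are hypothetical and chosen only so that the group means coincide while the spread and shape differ clearly.

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled sample standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = a.size, b.size
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (b.mean() - a.mean()) / np.sqrt(pooled_var)

# Hypothetical data: identical means, visibly different distributions.
control = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])   # compact, evenly spread
treated = np.array([-3.0, -3.0, 0.0, 3.0, 3.0])   # mass pushed to the extremes

# The mean difference is zero, so d reports no effect despite the change in shape.
print(cohens_d(control, treated))
```

Note also that this form divides by the pooled standard deviation, so it is undefined when both groups have zero variance, which is one kind of numerical limitation a replacement measure would need to avoid.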

Results

We propose "Impact" as a novel non-parametric measure of effect size, obtained as the sum of two separate components: (i) a difference-based effect size measure, implemented as the change in the central tendency of the group-specific data normalized to the pooled variability, and (ii) a data distribution shape-based effect size measure, implemented as the difference in probability density of the group-specific data. Results obtained on artificial and empirical data showed that "Impact" is superior to Cohen's d because its additional second component detects clearly visible effects that are not reflected in central tendencies. The proposed effect size measure is invariant to the scaling of the data and reflects changes in the central tendency when differences in the shape of the probability distributions between subgroups are negligible. It nevertheless captures changes in probability distributions as effects, and it remains numerically stable even if the variances of the data set or its subgroups vanish.
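The two-component structure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the estimators, so the median as central tendency, Gini's mean difference of the pooled data as variability, a shared-bin histogram as density estimate, the centering on the median, and the way the two parts are combined are all assumptions, and every function name is hypothetical.

```python
import numpy as np

def gini_mean_difference(x):
    """Mean absolute difference over all pairs of distinct observations."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        return 0.0
    return np.abs(x[:, None] - x[None, :]).sum() / (n * (n - 1))

def central_tendency_component(a, b):
    """Median shift normalized by the spread of the pooled data (assumed form)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    spread = gini_mean_difference(np.concatenate([a, b]))
    if spread == 0.0:
        return 0.0  # all pooled values are identical, so there is no shift
    return (np.median(b) - np.median(a)) / spread

def shape_component(a, b, bins=20):
    """Half the summed absolute difference of two histogram densities (assumed form).

    Both groups are centered on their median first (an assumption) so that
    a pure shift is not counted a second time as a change in shape.
    """
    a = np.asarray(a, dtype=float) - np.median(a)
    b = np.asarray(b, dtype=float) - np.median(b)
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    if lo == hi:
        return 0.0  # both groups are constant, so the shapes are identical
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(a, bins=edges)
    q, _ = np.histogram(b, bins=edges)
    return 0.5 * np.abs(p / p.sum() - q / q.sum()).sum()

def impact_sketch(a, b):
    """Sum of both components; adding magnitudes is an assumption of this sketch."""
    return abs(central_tendency_component(a, b)) + shape_component(a, b)

# Hypothetical data with equal medians but different shapes.
group_a = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
group_b = np.array([-3.0, -3.0, 0.0, 3.0, 3.0])
print(impact_sketch(group_a, group_b))
```

Because the spread is taken from the pooled data, the first component stays finite even when each group is constant, and because the histogram bins span the observed range, multiplying both groups by a positive constant leaves the result unchanged in this sketch.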

Conclusions

The proposed effect size measure shares with machine learning algorithms the ability to observe effects that are not reflected in the central tendency. It is therefore particularly well suited for data science and artificial intelligence-based knowledge discovery from big and heterogeneous data.

Updated: 2020-09-24