TATTER: A hypothesis testing tool for multi-dimensional data
Astronomy and Computing (IF 2.5), Pub Date: 2021-01-05, DOI: 10.1016/j.ascom.2020.100445
A. Farahi, Y. Chen

The two-sample hypothesis test quantifies whether two distributions p and q are different, given a finite sample drawn from each. This problem appears in a wide range of applications in astronomy, from data mining to data analysis and inference. For decades, the Kolmogorov–Smirnov test has been astronomers' first choice for answering this question, but it has a major drawback: generalizing it to multi-dimensional data sets is not straightforward. To fill this gap, we present a nonparametric estimator for comparing multi-dimensional distributions, given finite samples drawn from them. The method employs a kernel function to construct an unbiased estimator of the Maximum Mean Discrepancy (MMD) distance between the two distributions that generated the observed data. We perform controlled numerical experiments in Gaussian, non-Gaussian, and multi-dimensional finite-sample settings and test the performance of the MMD estimator in each experiment. We then discuss some applications of this method in astronomical data analysis.
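
As a rough illustration of the kind of estimator the abstract describes, the sketch below computes an unbiased estimate of the squared MMD with a Gaussian kernel and turns it into a p-value with a permutation test. It is not the TATTER API. The function names (`gaussian_kernel`, `mmd2_unbiased`, `permutation_p_value`), the choice of a Gaussian kernel, the fixed `bandwidth`, and the permutation-based null distribution are all assumptions made for this example.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased estimate of the squared MMD between samples x (m, d) and y (n, d)."""
    m, n = len(x), len(y)
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    # Within-sample terms exclude the diagonal (i == j), which removes the bias.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * k_xy.mean()


def permutation_p_value(x, y, bandwidth=1.0, n_permutations=200, seed=0):
    """Permutation p-value for the null hypothesis that x and y share a distribution."""
    rng = np.random.default_rng(seed)
    observed = mmd2_unbiased(x, y, bandwidth)
    pooled = np.vstack([x, y])
    m = len(x)
    count = 0
    for _ in range(n_permutations):
        # Shuffle the pooled sample and split it into two pseudo-samples.
        perm = rng.permutation(len(pooled))
        stat = mmd2_unbiased(pooled[perm[:m]], pooled[perm[m:]], bandwidth)
        count += stat >= observed
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(0.0, 1.0, size=(100, 3))  # sample from p
    y = rng.normal(0.5, 1.0, size=(100, 3))  # sample from q (shifted mean)
    print("MMD^2 estimate:", mmd2_unbiased(x, y))
    print("p-value:", permutation_p_value(x, y))
```

Because the diagonal terms are dropped, the estimate can be slightly negative when the two samples come from the same distribution. In practice the kernel bandwidth matters; a common heuristic, not taken from the paper, is to set it to the median pairwise distance of the pooled sample.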




Updated: 2021-02-03