Geometric consistency of principal component scores for high‐dimensional mixture models and its application
Scandinavian Journal of Statistics (IF 0.8), Pub Date: 2019-12-23, DOI: 10.1111/sjos.12432
Kazuyoshi Yata, Makoto Aoshima

In this article, we consider clustering based on principal component analysis (PCA) for high‐dimensional mixture models. We present theoretical reasons why PCA is effective for clustering high‐dimensional data. First, we derive a geometric representation of high‐dimension, low‐sample‐size (HDLSS) data taken from a two‐class mixture model. With the help of the geometric representation, we give geometric consistency properties of sample principal component scores in the HDLSS context. We develop ideas of the geometric representation and provide geometric consistency properties for multiclass mixture models. We show that PCA can cluster HDLSS data under certain conditions in a surprisingly explicit way. Finally, we demonstrate the performance of the clustering using gene expression datasets.
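As a rough illustration of the idea in the abstract (not the authors' procedure), the following sketch simulates HDLSS data from a two-class Gaussian mixture and clusters the samples by the sign of the first sample principal component score. The function names, the identity covariance, the mean shift, and the sample sizes are all assumptions chosen for illustration; the conditions under which the paper proves consistency are not reproduced here.

```python
import numpy as np


def simulate_two_class(n1=10, n2=10, d=2000, shift=1.0, seed=0):
    """Draw n1 + n2 samples in d dimensions from a two-class Gaussian mixture.

    Assumption for illustration: identity covariance in both classes, and the
    second class mean is shifted by `shift` in every coordinate.
    """
    rng = np.random.default_rng(seed)
    labels = np.r_[np.zeros(n1, dtype=int), np.ones(n2, dtype=int)]
    X = rng.standard_normal((n1 + n2, d))
    X[labels == 1] += shift
    return X, labels


def first_pc_scores(X):
    """Return the first sample principal component score of each row of X."""
    Xc = X - X.mean(axis=0)  # center each variable
    # Thin SVD: the scores on the first component are U[:, 0] * s[0].
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, 0] * s[0]


def cluster_by_sign(scores):
    """Assign each sample to one of two clusters by the sign of its score."""
    return (scores > 0).astype(int)


def agreement(pred, labels):
    """Fraction of matching labels, up to swapping the two cluster names."""
    acc = np.mean(pred == labels)
    return max(acc, 1.0 - acc)


X, labels = simulate_two_class()
scores = first_pc_scores(X)
pred = cluster_by_sign(scores)
print("agreement with true classes:", agreement(pred, labels))
```

In this toy setting the dimension (2000) is far larger than the sample size (20), and the first-component scores tend to fall into two well-separated groups, one per class. Thresholding at zero is a simplification that suits the balanced, centered case simulated here.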

Updated: 2019-12-23