D-CCA: A Decomposition-based Canonical Correlation Analysis for High-Dimensional Datasets
Journal of the American Statistical Association (IF 3.7), Pub Date: 2019-04-11, DOI: 10.1080/01621459.2018.1543599
Hai Shu 1 , Xiao Wang 2 , Hongtu Zhu 1, 3

ABSTRACT A typical approach to the joint analysis of two high-dimensional datasets is to decompose each data matrix into three parts: a low-rank common matrix that captures the shared information across datasets, a low-rank distinctive matrix that characterizes the individual information within a single dataset, and an additive noise matrix. Existing decomposition methods often focus on the orthogonality between the common and distinctive matrices, but inadequately consider the more necessary orthogonal relationship between the two distinctive matrices. The latter guarantees that no more shared information is extractable from the distinctive matrices. We propose decomposition-based canonical correlation analysis (D-CCA), a novel decomposition method that defines the common and distinctive matrices from the space of random variables rather than the conventionally used Euclidean space, with a careful construction of the orthogonal relationship between distinctive matrices. D-CCA represents a natural generalization of the traditional canonical correlation analysis. The proposed estimators of common and distinctive matrices are shown to be consistent and have reasonably better performance than some state-of-the-art methods in both simulated data and the real data analysis of breast cancer data obtained from The Cancer Genome Atlas. Supplementary materials for this article are available online.
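
For orientation, the traditional canonical correlation analysis that D-CCA is said to generalize can be sketched in a few lines. This is a minimal illustration of classical CCA only, not the D-CCA estimator from the paper: the function name `canonical_correlations` and the small ridge term `reg` are assumptions introduced here, and the sketch assumes many more samples than variables, which is not the high-dimensional setting the abstract targets.

```python
import numpy as np


def canonical_correlations(X, Y, reg=1e-8):
    """Classical CCA sketch (hypothetical helper, not the paper's method).

    X is (n, p) and Y is (n, q), with rows as samples.
    Returns the min(p, q) canonical correlations in descending order.
    """
    n = X.shape[0]
    # Center each variable (column).
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Sample covariance blocks; `reg` is an assumed ridge term for stability.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    K = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(K, compute_uv=False)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shared = rng.normal(size=(1000, 1))            # signal common to both datasets
    X = shared + 0.5 * rng.normal(size=(1000, 3))  # dataset 1: shared + individual noise
    Y = shared + 0.5 * rng.normal(size=(1000, 2))  # dataset 2: shared + individual noise
    print(canonical_correlations(X, Y))
```

In the toy usage, the shared signal should show up as one large leading canonical correlation, with the remaining one small. Classical CCA reports only these correlations and directions; the split of each data matrix into common, distinctive, and noise parts described in the abstract is what D-CCA adds on top.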

Updated: 2019-04-11