Integrative sparse partial least squares
Statistics in Medicine (IF 2), Pub Date: 2021-02-08, DOI: 10.1002/sim.8900
Weijuan Liang 1 , Shuangge Ma 2 , Qingzhao Zhang 3 , Tingyu Zhu 4

Partial least squares (PLS), a dimension reduction technique, has become increasingly important for its ability to handle problems with a large number of variables. Because noisy variables may weaken estimation performance, the sparse partial least squares (SPLS) technique has been proposed to identify important variables and generate more interpretable results. However, the small sample size of a single dataset limits the performance of conventional methods. An effective solution is to gather information from multiple comparable studies. Integrative analysis plays an essential role in the analysis of multiple datasets; its main idea is to improve performance by assembling raw data from multiple independent datasets and analyzing them jointly. In this article, we develop an integrative SPLS (iSPLS) method that applies penalization within the SPLS framework. The proposed approach uses two penalties. The first conducts variable selection in the context of integrative analysis. The second, a contrasted penalty, encourages similarity of estimates across datasets and generates more sensible and accurate results. Computational algorithms are developed. Simulation experiments compare iSPLS with alternative approaches. The practical utility of iSPLS is shown in the analysis of two TCGA gene expression datasets.
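
The two-penalty idea can be illustrated with a small sketch. The abstract does not give the exact penalty forms or the iSPLS algorithm, so everything below is an assumption made for illustration: sparsity is obtained by soft-thresholding the covariance vector of each dataset separately (a common single-dataset SPLS-style step), and the contrasted penalty is taken to be the sum of squared differences between the direction vectors of each pair of datasets. The function names, the simulated data, and the tuning value lam=0.5 are all hypothetical.

```python
import numpy as np


def soft_threshold(z, lam):
    """Elementwise soft-thresholding: shrink each entry toward zero by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def sparse_direction(X, y, lam):
    """One sparse PLS-style direction vector for a single dataset.

    Assumed form: soft-threshold the covariance vector X^T y / n at a
    fraction lam (0 <= lam < 1) of its largest absolute entry, then
    rescale the result to unit length.
    """
    z = X.T @ y / len(y)
    w = soft_threshold(z, lam * np.max(np.abs(z)))
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w


def contrast_penalty(W):
    """Assumed contrasted penalty: squared differences over all dataset pairs."""
    M = len(W)
    return sum(
        np.sum((W[a] - W[b]) ** 2) for a in range(M) for b in range(a + 1, M)
    )


# Simulate three small datasets that share the same three signal variables.
rng = np.random.default_rng(0)
p, n = 20, 40
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]

datasets = []
for _ in range(3):
    X = rng.standard_normal((n, p))
    y = X @ beta + 0.5 * rng.standard_normal(n)
    # Center predictors and response, as is usual before PLS.
    datasets.append((X - X.mean(axis=0), y - y.mean()))

# Per-dataset sparse directions (no information sharing yet).
W = [sparse_direction(X, y, lam=0.5) for X, y in datasets]

for m, w in enumerate(W):
    print(f"dataset {m}: selected variables {np.flatnonzero(w).tolist()}")
print("contrast penalty across datasets:", contrast_penalty(W))
```

In this sketch the contrast term is only evaluated, not optimized. An integrative method of the kind the abstract describes would instead estimate the directions jointly, so that the sparsity and contrast terms both shape the solution.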

Updated: 2021-04-06