Sparse linear discriminant analysis for multiview structured data
Biometrics (IF 1.4), Pub Date: 2021-03-19, DOI: 10.1111/biom.13458
Sandra E Safo, Eun Jeong Min, Lillian Haine

Classification methods that leverage the strengths of data from multiple sources (multiview data) simultaneously have enormous potential to yield more powerful findings than two-step methods: association followed by classification. We propose two methods, sparse integrative discriminant analysis (SIDA), and SIDA with incorporation of network information (SIDANet), for joint association and classification studies. The methods consider the overall association between multiview data, and the separation within each view in choosing discriminant vectors that are associated and optimally separate subjects into different classes. SIDANet is among the first methods to incorporate prior structural information in joint association and classification studies. It uses the normalized Laplacian of a graph to smooth coefficients of predictor variables, thus encouraging selection of predictors that are connected. We demonstrate the effectiveness of our methods on a set of synthetic datasets and explore their use in identifying potential nontraditional risk factors that discriminate healthy patients at low versus high risk for developing atherosclerosis cardiovascular disease in 10 years. Our findings underscore the benefit of joint association and classification methods if the goal is to correlate multiview data and to perform classification.
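
To illustrate the Laplacian smoothing idea mentioned in the abstract, here is a minimal sketch. It is not the authors' implementation: the abstract states only that the normalized Laplacian of a graph is used to smooth predictor coefficients. The sketch assumes an undirected, unweighted graph without self-loops, and the function names `normalized_laplacian` and `smoothness_penalty` are hypothetical. It shows that the quadratic form of a coefficient vector with the normalized Laplacian is small when connected predictors carry similar coefficients and large when they differ.

```python
import math


def normalized_laplacian(adj):
    """Return L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix.

    Assumption: no self-loops, so every non-isolated node has a diagonal of 1.
    Isolated nodes (degree 0) get an all-zero row and column.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                lap[i][j] = 1.0 if deg[i] > 0 else 0.0
            elif adj[i][j] != 0 and deg[i] > 0 and deg[j] > 0:
                lap[i][j] = -adj[i][j] / math.sqrt(deg[i] * deg[j])
    return lap


def smoothness_penalty(lap, beta):
    """Quadratic form beta^T L beta, used here as a smoothness measure."""
    n = len(beta)
    return sum(beta[i] * lap[i][j] * beta[j] for i in range(n) for j in range(n))


# Toy graph: four predictors connected in a cycle 0-1-2-3-0.
adjacency = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
lap = normalized_laplacian(adjacency)

# Connected predictors share the same coefficient: no penalty.
smooth_penalty = smoothness_penalty(lap, [1.0, 1.0, 1.0, 1.0])

# Neighbouring predictors have opposite signs: large penalty.
rough_penalty = smoothness_penalty(lap, [1.0, -1.0, 1.0, -1.0])

print(smooth_penalty, rough_penalty)
```

On this toy graph the constant coefficient vector incurs a penalty of 0 and the alternating vector a penalty of 8. Adding such a term to a sparse discriminant objective would therefore favour coefficient vectors that vary little across graph edges, which is one way to read the abstract's statement that the method encourages selection of connected predictors.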

Updated: 2021-03-19