Multiobjective semisupervised learning with a right-censored endpoint adapted to the multiple imputation framework
Biometrical Journal (IF 1.7), Pub Date: 2021-06-27, DOI: 10.1002/bimj.202000365
Lilith Faucheux 1, 2 , Vassili Soumelis 2, 3 , Sylvie Chevret 1, 4

Semisupervised learning aims to use additional knowledge in the search for data structure. In clinical applications, including predictive information in the construction of a data-driven classification is of major importance. This work was motivated by a study that aimed to identify different patterns of immune parameters that would be associated with relapse-free survival in a cohort of breast cancer patients. Supervised and unsupervised objectives can be concomitantly optimized using multiobjective optimization. We propose such a procedure that addresses two challenges in the semisupervised approach, that is, missing data and additional knowledge based on survival time. The former was handled by using multiple imputation and consensus clustering. Survival information was incorporated in the supervised objective through the estimation of a cross-validation error of a Cox regression. A simulation study was performed to assess the performance of the proposed procedure. On complete datasets, the performances were compared to those of an existing modified multiobjective semisupervised learning method. The added value of including the survival data in the learning process was assessed by comparing the procedure to unsupervised learning. The proposed procedure showed better performance than the existing method, notably in the selection of the number of clusters. On incomplete datasets, the procedure showed little sensitivity to most of its parameters, even though a high number of imputations and partition initialization seeds improved the performance. The performance was degraded with a high proportion of missing data (40%) and with more ambiguous data structures. Simulation results and application on real data support the conclusion that our procedure enables the construction of a classification associated with a right-censored endpoint on a possibly incomplete dataset.
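As an illustration of how partitions obtained on several imputed datasets might be combined, the sketch below builds a co-association matrix (the fraction of partitions in which two patients share a cluster) and derives a consensus partition from it. The abstract does not say which consensus method the authors used, so the co-association approach, the 0.5 threshold, the function names `coassociation_matrix` and `consensus_labels`, and the toy partitions are all assumptions made for illustration only.

```python
import numpy as np


def coassociation_matrix(partitions):
    """Fraction of partitions in which each pair of samples shares a cluster.

    Comparing pairs rather than raw labels makes the result insensitive to
    label permutations between partitions.
    """
    partitions = np.asarray(partitions)  # shape (n_partitions, n_samples)
    same_cluster = partitions[:, :, None] == partitions[:, None, :]
    return same_cluster.mean(axis=0)


def consensus_labels(coassoc, threshold=0.5):
    """Group samples connected by co-association above `threshold`.

    This is a simple connected-components rule (an assumption, not the
    method from the paper).
    """
    n_samples = coassoc.shape[0]
    labels = np.full(n_samples, -1)
    current = 0
    for start in range(n_samples):
        if labels[start] != -1:
            continue  # already assigned to a consensus cluster
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            linked = np.flatnonzero((coassoc[i] > threshold) & (labels == -1))
            for j in linked:
                labels[j] = current
                stack.append(j)
        current += 1
    return labels


# Hypothetical partitions of 6 patients, one per imputed dataset.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping, labels permuted
    [0, 0, 1, 1, 1, 1],  # one patient assigned differently
]
coassoc = coassociation_matrix(partitions)
labels = consensus_labels(coassoc)
print(labels)
```

In this toy case the third patient is grouped with the first two in two of the three partitions, so the consensus keeps that patient in the first cluster. A real analysis would need many more imputations and a more careful consensus rule.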

Updated: 2021-06-27