Robust clustering of multiply censored data via mixtures of t factor analyzers
TEST (IF 1.3), Pub Date: 2021-04-08, DOI: 10.1007/s11749-021-00766-y
Wan-Lun Wang, Tsung-I Lin

Mixtures of t factor analyzers (MtFA) have been well recognized as a prominent tool for modeling and clustering multivariate data contaminated with heterogeneity and outliers. In certain practical situations, however, data are likely to be censored, so that the standard methodology becomes computationally complicated or even infeasible. This paper presents an extended framework of MtFA that can accommodate censored data, referred to as MtFAC for short. For maximum likelihood estimation, we construct an alternating expectation conditional maximization algorithm in which the E-step relies on the first two moments of truncated multivariate-t distributions and the CM-steps offer tractable solutions for the updated estimators. Asymptotic standard errors of the mixing proportions and component mean vectors are derived by means of the missing information principle, the so-called Louis' method. Several numerical experiments are conducted to examine the finite-sample properties of the estimators and the ability of the proposed model to downweight the impact of censoring and outlying effects. Further, the efficacy and usefulness of the proposed method are demonstrated by analyzing a real dataset with genuinely censored observations.
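
To make the E-step ingredient concrete, the following is a minimal univariate sketch of the first two moments of a truncated t distribution. It is an illustration only, not the authors' method: the paper works with truncated multivariate-t distributions, whereas this sketch uses a single dimension and plain numerical quadrature (composite Simpson's rule). The function names `t_kernel`, `simpson` and `truncated_t_moments`, the default parameter values, and the finite truncation interval are all assumptions made for the example.

```python
def t_kernel(x, mu, sigma, nu):
    """Unnormalized density of a location-scale Student-t distribution.

    The normalizing constant is omitted because it cancels in the
    moment ratios computed below.
    """
    z = (x - mu) / sigma
    return (1.0 + z * z / nu) ** (-(nu + 1.0) / 2.0)


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def truncated_t_moments(a, b, mu=0.0, sigma=1.0, nu=5.0):
    """Return (E[X | a < X < b], E[X^2 | a < X < b]) for X ~ t_nu(mu, sigma^2).

    Hypothetical helper: a finite interval (a, b) stands in for the
    censoring region of a single interval-censored coordinate.
    """
    mass = simpson(lambda x: t_kernel(x, mu, sigma, nu), a, b)
    m1 = simpson(lambda x: x * t_kernel(x, mu, sigma, nu), a, b) / mass
    m2 = simpson(lambda x: x * x * t_kernel(x, mu, sigma, nu), a, b) / mass
    return m1, m2


# Example: an observation known only to lie between 0 and 2.
first_moment, second_moment = truncated_t_moments(0.0, 2.0)
```

In an EM-type scheme, conditional moments of this kind would replace the unobserved values of censored coordinates; a practical multivariate implementation would need the corresponding multivariate quantities rather than this one-dimensional quadrature.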




Updated: 2021-04-08