Machine Learning (IF 4.3), Pub Date: 2021-04-14, DOI: 10.1007/s10994-021-05952-5. Authors: Aristeidis Panos, Petros Dellaportas, Michalis K. Titsias
We introduce a Gaussian process latent factor model for multi-label classification that captures correlations among class labels using a small set of latent Gaussian process functions. To address the computational challenges that arise when the number of training instances is very large, we introduce several techniques based on variational sparse Gaussian process approximations and stochastic optimization. Specifically, we apply doubly stochastic variational inference that sub-samples both data instances and classes, which allows us to scale to very large datasets. Furthermore, we show that it is possible and beneficial to optimize over inducing points using gradient-based methods, even in very high-dimensional input spaces involving up to hundreds of thousands of dimensions. We demonstrate the usefulness of our approach on several real-world large-scale multi-label learning problems.
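The two ideas in the abstract, a small set of shared latent functions and sub-sampling over both instances and classes, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a linear mixing of Q latent function values into C class logits and a Bernoulli (sigmoid) likelihood, and it covers only the data-fit term of a variational objective. The names `U`, `W`, `full_loglik`, and `minibatch_loglik` are hypothetical, and `U` is a random stand-in for values that would come from a sparse Gaussian process posterior.

```python
import numpy as np

def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)).
    return -np.logaddexp(0.0, -z)

def bernoulli_loglik(Y, F):
    # Elementwise log p(y | f) for labels Y in {0, 1} and logits F.
    return Y * log_sigmoid(F) + (1.0 - Y) * log_sigmoid(-F)

def full_loglik(Y, U, W):
    # U: (N, Q) latent function values at the N inputs (assumed given).
    # W: (C, Q) mixing weights mapping Q latent functions to C class logits.
    F = U @ W.T  # (N, C) logits; label correlations arise from the shared U
    return bernoulli_loglik(Y, F).sum()

def minibatch_loglik(Y, U, W, rows, cols):
    # Doubly stochastic estimate: evaluate only a sub-block of instances (rows)
    # and classes (cols), then rescale so the estimate is unbiased when rows
    # and cols are drawn uniformly without replacement.
    N, C = Y.shape
    F = U[rows] @ W[cols].T
    scale = (N / len(rows)) * (C / len(cols))
    return scale * bernoulli_loglik(Y[np.ix_(rows, cols)], F).sum()

# Toy usage with synthetic data (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
N, C, Q = 200, 30, 4
U = rng.normal(size=(N, Q))
W = rng.normal(size=(C, Q))
Y = (rng.random((N, C)) < 0.5).astype(float)

rows = rng.choice(N, size=20, replace=False)
cols = rng.choice(C, size=5, replace=False)
print("full:", full_loglik(Y, U, W))
print("estimate:", minibatch_loglik(Y, U, W, rows, cols))
```

Each estimate costs work proportional to the sub-block size rather than N times C, which is what makes stochastic optimization practical when both the number of instances and the number of labels are large.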
Title: Large-scale multi-label learning using Gaussian processes