Eigen-stratified models, Optimization and Engineering

Eigen-stratified models
Optimization and Engineering (IF 2.0), Pub Date: 2021-01-06, DOI: 10.1007/s11081-020-09592-x
Jonathan Tuck, Stephen Boyd

Stratified models depend in an arbitrary way on a selected categorical feature that takes K values, and depend linearly on the other n features. Laplacian regularization with respect to a graph on the feature values can greatly improve the performance of a stratified model, especially in the low-data regime. A significant issue with Laplacian-regularized stratified models is that the model is K times the size of the base model, which can be quite large. We address this issue by formulating eigen-stratified models, which are stratified models with an additional constraint that the model parameters are linear combinations of some modest number m of bottom eigenvectors of the graph Laplacian, i.e., those associated with the m smallest eigenvalues. With eigen-stratified models, we only need to store the m bottom eigenvectors and the corresponding coefficients as the stratified model parameters. This leads to a reduction, sometimes large, of model size when \(m\le n\) and \(m \ll K\). In some cases, the additional regularization implicit in eigen-stratified models can improve out-of-sample performance over standard Laplacian-regularized stratified models.
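The parameterization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the path graph, the sizes K, n, and m, the random coefficient matrix Z, and the helper names `path_laplacian` and `bottom_eigenvectors` are all assumptions chosen for illustration. The sketch only shows how a stratified parameter matrix (one column of n parameters per category) can be reconstructed from the m bottom eigenvectors of a graph Laplacian and an n-by-m coefficient matrix, and how the stored sizes compare.

```python
import numpy as np


def path_laplacian(K):
    """Laplacian of a path graph on K nodes (an illustrative graph choice)."""
    A = np.zeros((K, K))
    idx = np.arange(K - 1)
    A[idx, idx + 1] = 1.0  # edge between category i and category i + 1
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A


def bottom_eigenvectors(L, m):
    """Return the m smallest eigenvalues of L and their eigenvectors."""
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    w, V = np.linalg.eigh(L)
    return w[:m], V[:, :m]


# Assumed sizes: K categories, n base-model parameters, m eigenvectors
K, n, m = 50, 10, 5

L = path_laplacian(K)
eigvals, Q = bottom_eigenvectors(L, m)  # Q has shape (K, m)

# Hypothetical coefficients; in practice these would be fit to data
rng = np.random.default_rng(0)
Z = rng.standard_normal((n, m))

# Column k of Theta holds the n parameters of the model for category k;
# every row of Theta is a linear combination of the m bottom eigenvectors
Theta = Z @ Q.T  # shape (n, K)

# Numbers stored: full stratified model vs. eigenvectors plus coefficients
full_size = n * K
eigen_size = m * K + n * m

# Under this constraint the Laplacian penalty trace(Theta L Theta^T)
# reduces to a weighted sum of squared coefficient-column norms
penalty_direct = np.trace(Theta @ L @ Theta.T)
penalty_eigen = np.sum(eigvals * np.sum(Z**2, axis=0))
```

With these assumed sizes the eigen-stratified representation stores m(K + n) numbers instead of nK, and the saving grows as m becomes small relative to both n and K.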

Updated: 2021-01-06
