Dropout Regularization in Hierarchical Mixture of Experts
Neurocomputing (IF 5.5), Pub Date: 2021-01-01, DOI: 10.1016/j.neucom.2020.08.052
Ozan İrsoy, Ethem Alpaydın

Dropout is a very effective method for preventing overfitting and has become the go-to regularizer for multi-layer neural networks in recent years. A hierarchical mixture of experts is a hierarchically gated model that defines a soft decision tree: leaves correspond to experts, and decision nodes correspond to gating models that softly choose between their children. As such, the model defines a soft hierarchical partitioning of the input space. In this work, we propose a variant of dropout for hierarchical mixtures of experts that is faithful to the tree hierarchy defined by the model, as opposed to the flat, unitwise independent application of dropout used with multi-layer perceptrons. We show that on a synthetic regression dataset and on the MNIST and CIFAR-10 datasets, our proposed dropout mechanism prevents overfitting on trees with many levels, improving generalization and providing smoother fits.
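To make the distinction between unitwise dropout and tree-structured dropout concrete, here is a minimal NumPy sketch of a soft binary decision tree with sigmoid gates and linear experts at the leaves. The `Node` class, `forward` method, and `drop_p` parameter are illustrative names, and the subtree-dropping rule shown (with probability `drop_p`, drop one child of a gating node and route all responsibility to the survivor) is one plausible hierarchy-respecting scheme, not necessarily the exact mechanism of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

class Node:
    """A node in a soft binary decision tree (hierarchical mixture of experts)."""
    def __init__(self, dim, depth):
        if depth == 0:                      # leaf: a linear expert
            self.expert_w = rng.normal(scale=0.1, size=dim)
            self.left = self.right = None
        else:                               # internal node: a sigmoid gate
            self.gate_w = rng.normal(scale=0.1, size=dim)
            self.left = Node(dim, depth - 1)
            self.right = Node(dim, depth - 1)

    def forward(self, x, drop_p=0.0, training=False):
        if self.left is None:               # expert prediction at a leaf
            return float(self.expert_w @ x)
        g = 1.0 / (1.0 + np.exp(-self.gate_w @ x))   # soft gating value in (0, 1)
        if training and drop_p > 0.0 and rng.random() < drop_p:
            # Hypothetical tree-structured dropout: drop one child subtree
            # and give the surviving child full responsibility, so entire
            # root-to-leaf paths are dropped together rather than single units.
            if rng.random() < 0.5:
                return self.left.forward(x, drop_p, training)
            return self.right.forward(x, drop_p, training)
        # No dropout: the usual soft mixture of the two subtrees.
        return g * self.left.forward(x, drop_p, training) + \
               (1 - g) * self.right.forward(x, drop_p, training)

# Usage: a depth-3 tree over 5-dimensional inputs.
tree = Node(dim=5, depth=3)
x = rng.normal(size=5)
print(tree.forward(x, drop_p=0.3, training=True))   # stochastic (dropout on)
print(tree.forward(x))                              # deterministic (inference)
```

At inference the full soft mixture is used, so, as with standard dropout, the stochastic masking applies only during training.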

Updated: 2021-01-01