Dropout Regularization in Hierarchical Mixture of Experts
Neurocomputing (IF 5.5), Pub Date: 2021-01-01, DOI: 10.1016/j.neucom.2020.08.052
Ozan İrsoy, Ethem Alpaydın

Dropout is a very effective method for preventing overfitting and has become the go-to regularizer for multi-layer neural networks in recent years. A hierarchical mixture of experts is a hierarchically gated model that defines a soft decision tree: leaves correspond to experts, and decision nodes correspond to gating models that softly choose between their children. As such, the model defines a soft hierarchical partitioning of the input space. In this work, we propose a variant of dropout for hierarchical mixtures of experts that is faithful to the tree hierarchy defined by the model, as opposed to the flat, unitwise independent application of dropout used with multi-layer perceptrons. We show that on a synthetic regression dataset and on the MNIST and CIFAR-10 datasets, our proposed dropout mechanism prevents overfitting on trees with many levels, improving generalization and providing smoother fits.
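To make the distinction between unitwise dropout and tree-structured dropout concrete, here is a minimal NumPy sketch of a soft binary decision tree with sigmoid gates and linear experts at the leaves. The `Node` class, `forward` method, and `drop_p` parameter are illustrative names, and the subtree-dropping rule shown (with probability `drop_p`, drop one child of a gating node and route all responsibility to the survivor) is one plausible hierarchy-respecting scheme, not necessarily the exact mechanism of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

class Node:
    """A node in a soft binary decision tree (hierarchical mixture of experts)."""
    def __init__(self, dim, depth):
        if depth == 0:                      # leaf: a linear expert
            self.expert_w = rng.normal(scale=0.1, size=dim)
            self.left = self.right = None
        else:                               # internal node: a sigmoid gate
            self.gate_w = rng.normal(scale=0.1, size=dim)
            self.left = Node(dim, depth - 1)
            self.right = Node(dim, depth - 1)

    def forward(self, x, drop_p=0.0, training=False):
        if self.left is None:               # expert prediction at a leaf
            return float(self.expert_w @ x)
        g = 1.0 / (1.0 + np.exp(-self.gate_w @ x))   # soft gating value in (0, 1)
        if training and drop_p > 0.0 and rng.random() < drop_p:
            # Hypothetical tree-structured dropout: drop one child subtree
            # and give the surviving child full responsibility, so entire
            # root-to-leaf paths are dropped together rather than single units.
            if rng.random() < 0.5:
                return self.left.forward(x, drop_p, training)
            return self.right.forward(x, drop_p, training)
        # No dropout: the usual soft mixture of the two subtrees.
        return g * self.left.forward(x, drop_p, training) + \
               (1 - g) * self.right.forward(x, drop_p, training)

# Usage: a depth-3 tree over 5-dimensional inputs.
tree = Node(dim=5, depth=3)
x = rng.normal(size=5)
print(tree.forward(x, drop_p=0.3, training=True))   # stochastic (dropout on)
print(tree.forward(x))                              # deterministic (inference)
```

At inference the full soft mixture is used, so, as with standard dropout, the stochastic masking applies only during training.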

Updated: 2021-01-01