Weight and bias initialization routines for Sigmoidal Feedforward Network
Applied Intelligence ( IF 3.4 ) Pub Date : 2020-11-07 , DOI: 10.1007/s10489-020-01960-5
Apeksha Mittal , Amit Prakash Singh , Pravin Chandra

The success of Sigmoidal Feedforward Networks on complex learning tasks can be attributed to their Universal Approximation Property. These networks are trained with non-linear iterative optimization methods (of first or second order) to solve a learning task. The convergence rate of Sigmoidal Feedforward Network training depends on the initial choice of weights; in this paper, we therefore propose two new weight initialization routines (Routine-1 and Routine-2) that use characteristics of the input and output data together with properties of the activation function. Routine-1 exploits the linear dependence of the weight-update step size on the derivative of the activation function: it initializes the weights and biases so that each node operates in the activation-function region near zero (input), where the derivative is maximal, thereby increasing the weight-update step size and hence the convergence speed. The same principle underlies Routine-2, which initializes the weights and biases so that each node is activated at a distinct point within the significant range of the activation function (the significant range being its non-saturated region); as a result, the nodes evolve independently of one another and act as distinct feature identifiers. Initializing weights in the significant range reduces the chance of (hidden) nodes getting stuck in a saturated state. Networks initialized with the proposed routines converge faster and have a higher probability of reaching deeper minima. The efficiency of the proposed routines is evaluated by comparing them, on several benchmark problems, against the conventional random weight initialization routine and 11 weight initialization routines from the literature (4 well-established routines and 7 recently proposed ones). The proposed routines are also tested on larger network sizes and larger datasets such as MNIST.
The results show that the proposed routines perform better than the conventional random weight initialization routine and the 11 established weight initialization routines.
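The abstract does not give the routines' exact formulas, but the principle behind Routine-1 can be sketched: the sigmoid derivative peaks at a pre-activation of zero, so drawing weights from a narrow interval (the scaling constant and function names below are illustrative assumptions, not the paper's actual routine) keeps each hidden node's net input in the high-derivative region, enlarging early weight-update steps.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)); maximal (0.25) at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)

def init_near_zero(n_in, n_hidden, x_max=1.0, seed=None):
    """Illustrative sketch (not the paper's exact Routine-1): scale weights so
    the net input w.x + b of each hidden node stays near zero, where the
    sigmoid derivative, and hence the weight-update step size, is largest."""
    rng = np.random.default_rng(seed)
    # With inputs bounded by x_max, weights in [-limit, limit] keep the
    # pre-activation of an n_in-dimensional input within [-0.5, 0.5],
    # well inside the non-saturated region of the sigmoid.
    limit = 0.5 / (n_in * x_max)
    W = rng.uniform(-limit, limit, size=(n_hidden, n_in))
    b = np.zeros(n_hidden)
    return W, b
```

Routine-2's idea of activating distinct points in the non-saturated range would instead spread the nodes' operating points apart (for example via the biases), so that hidden nodes start as distinct feature identifiers rather than near-duplicates.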




Updated: 2020-11-09