Path-finding in real and simulated rats: assessing the influence of path characteristics on navigation learning.
Journal of Computational Neuroscience (IF 1.5), Pub Date: 2008-04-30, DOI: 10.1007/s10827-008-0094-6
Minija Tamosiunaite, James Ainge, Tomas Kulvicius, Bernd Porr, Paul Dudchenko, Florentin Wörgötter

A large body of experimental evidence suggests that the hippocampal place-field system is involved in reward-based navigation learning in rodents. Reinforcement learning (RL) mechanisms have been used to model this, associating the state space of an RL algorithm with the place-field map of a rat. The convergence properties of RL algorithms are affected by the exploration pattern of the learner. Therefore, we first analyzed the path characteristics of freely exploring rats in a test arena. We found that straight path segments, with a mean length of 23 cm and a maximum length of 80 cm, make up a significant proportion of the total paths. Thus, rat paths are biased as compared to random exploration. Next, we designed an RL system that reproduces these specific path characteristics. Our model arena is covered by overlapping, probabilistically firing place fields (PF) of realistic size and coverage. Because the convergence of RL algorithms is also influenced by the characteristics of the state space, different PF sizes and densities, leading to different degrees of overlap, were also investigated. The model rat learns to find a reward located opposite its starting point. We observed that the combination of biased straight exploration, overlapping coverage, and probabilistic firing strongly impairs the convergence of learning. When the degree of randomness in the exploration is increased, convergence improves, but the distribution of straight path segments becomes unrealistic and the paths become 'wiggly'. To remedy this without affecting the path characteristics, two additional mechanisms are implemented: a gradual drop of the learned weights (weight decay) and a path-length limitation, which prevents learning if the reward is not found within some expected time. Both mechanisms limit the memory of the system and thereby counteract the effects of getting trapped on a wrong path. When these strategies are used individually, the number of divergent cases is substantially reduced, and for some parameter settings no divergence is found at all. When weight decay and path-length limitation are used at the same time, convergence is not much improved; instead, the time to convergence increases because the memory-limiting effect becomes too strong. The degree of improvement also depends on the size and degree of overlap (coverage density) of the place-field system, and the combination of these two parameters leads to a trade-off between convergence and speed of convergence. Thus, this study suggests that the role of the PF system in navigation learning cannot be considered independently of the animal's exploration pattern.
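The setup described in the abstract can be illustrated with a minimal, self-contained sketch (not the authors' implementation): an agent moves through a square arena whose state is encoded by overlapping, probabilistically firing Gaussian place fields, action weights are reinforced only when the reward is reached within a path-length limit, and a gradual weight decay is applied after every trial. All names and parameter values here (arena size, field width, learning rate, decay) are illustrative assumptions, and the simple end-of-trial reinforcement stands in for the paper's actual RL update rule.

```python
import numpy as np

rng = np.random.default_rng(0)

ARENA = 100.0          # square arena side length, cm (assumed)
N_PF = 64              # number of place fields
SIGMA = 15.0           # place-field width, cm (controls degree of overlap)
N_ACTIONS = 8          # discrete movement directions
STEP = 5.0             # step length, cm
START = np.array([10.0, 10.0])
GOAL = np.array([90.0, 90.0])   # reward located opposite the starting point

centers = rng.uniform(0.0, ARENA, size=(N_PF, 2))   # place-field centres
W = np.zeros((N_ACTIONS, N_PF))                      # action weights on PF activity

def pf_activity(pos):
    """Probabilistic (Bernoulli) firing of overlapping Gaussian place fields."""
    p = np.exp(-np.sum((centers - pos) ** 2, axis=1) / (2.0 * SIGMA ** 2))
    return (rng.random(N_PF) < p).astype(float)

def run_trial(eps=0.2, alpha=0.05, decay=1e-3, max_steps=400):
    """One trial with epsilon-greedy action choice, a path-length limit
    (no learning if the reward is not found within max_steps) and weight decay."""
    pos, trace, rewarded = START.copy(), [], False
    for _ in range(max_steps):
        x = pf_activity(pos)
        q = W @ x                                   # action values for this state
        a = rng.integers(N_ACTIONS) if rng.random() < eps else int(np.argmax(q))
        trace.append((x, a))
        ang = 2.0 * np.pi * a / N_ACTIONS
        pos = np.clip(pos + STEP * np.array([np.cos(ang), np.sin(ang)]), 0.0, ARENA)
        if np.linalg.norm(pos - GOAL) < 10.0:       # reward found
            for x_t, a_t in trace:                  # reinforce the path that led here
                W[a_t] += alpha * x_t
            rewarded = True
            break
    W *= (1.0 - decay)                              # gradual weight decay, every trial
    return rewarded

if __name__ == "__main__":
    hits = sum(run_trial() for _ in range(200))
    print(f"rewarded trials: {hits}/200")
```

In this sketch the coverage density discussed in the abstract is set by N_PF and SIGMA together: larger fields or more of them increase the overlap between simultaneously active place fields, which is the state-space property the paper varies.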

Updated: 2019-11-01