Data Augmentation using Geometric, Frequency, and Beta Modeling approaches for Improving Multi-lingual Online Handwriting Recognition, International Journal on Document Analysis and Recognition


International Journal on Document Analysis and Recognition (IF 1.8), Pub Date: 2021-06-14, DOI: 10.1007/s10032-021-00376-2
Yahia Hamdi, Houcine Boubaker, Adel M. Alimi

The scarcity of large training datasets is a serious and widely studied obstacle for deep learning applications. In this paper, we introduce new data augmentation methods that generate additional shape and dynamic variations to improve the performance of recognition systems trained on small datasets. Four data augmentation strategies are employed in our work. The first strategy uses geometric methods: changes to the italicity angle, the magnitude ratio, and the baseline inclination angle. The second strategy applies a frequency treatment that attenuates or amplifies the high harmonics of the trajectory to generate modified handwriting styles. The third strategy uses the beta-elliptic model to extract a combined static and dynamic representation of the handwritten trajectory, whose parameters then undergo limited random changes to generate further modified samples. The fourth, hybrid strategy combines the previous three to maximize the variation of the online handwriting trajectory (OHT). We evaluated our data augmentation approach on multi-lingual online handwriting recognition (OHR) tasks using an end-to-end CNN architecture. Four databases are used to validate the proposed strategies: ADAB, ALTEC-OnDB, and Online_KHATT for Arabic script, and UNIPEN for Latin characters. The results show the effectiveness and advantage of the adopted strategies compared with the results obtained before database extension and with those reported for state-of-the-art systems.
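
The geometric and frequency strategies named in the abstract can be illustrated with a short NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the sketch assumes that the italicity angle acts as a horizontal shear, the magnitude ratio as per-axis scaling, the baseline inclination as a rotation, and the frequency treatment as a gain applied to the high Fourier harmonics of the trajectory treated as a complex signal. All function names, the harmonic cutoff, and the random ranges are hypothetical. The beta-elliptic strategy is not sketched because it depends on model details that the abstract does not state.

```python
import numpy as np

def shear(traj, angle_deg):
    """Assumed italicity change: shift x in proportion to y (horizontal shear)."""
    t = np.tan(np.radians(angle_deg))
    return np.column_stack([traj[:, 0] + t * traj[:, 1], traj[:, 1]])

def scale(traj, ratio_x, ratio_y):
    """Assumed magnitude-ratio change: stretch each axis independently."""
    return traj * np.array([ratio_x, ratio_y])

def rotate(traj, angle_deg):
    """Assumed baseline inclination: rotate the trajectory about the origin."""
    a = np.radians(angle_deg)
    rot = np.array([[np.cos(a), -np.sin(a)],
                    [np.sin(a),  np.cos(a)]])
    return traj @ rot.T

def scale_high_harmonics(traj, cutoff, gain):
    """Multiply harmonics above `cutoff` by `gain`.

    gain < 1 attenuates them (smoother strokes); gain > 1 amplifies them.
    The trajectory is treated as the complex signal x + iy. The FFT assumes
    a periodic signal, so open strokes get some distortion near the ends.
    """
    z = traj[:, 0] + 1j * traj[:, 1]
    spectrum = np.fft.fft(z)
    # Harmonic index of each FFT bin (positive and negative frequencies).
    harmonic = np.abs(np.fft.fftfreq(len(z)) * len(z))
    spectrum[harmonic > cutoff] *= gain
    out = np.fft.ifft(spectrum)
    return np.column_stack([out.real, out.imag])

def augment(traj, rng):
    """Hybrid sketch: chain the transforms with hypothetical random ranges."""
    out = shear(traj, rng.uniform(-15.0, 15.0))
    out = scale(out, rng.uniform(0.8, 1.2), rng.uniform(0.8, 1.2))
    out = rotate(out, rng.uniform(-10.0, 10.0))
    return scale_high_harmonics(out, cutoff=8, gain=rng.uniform(0.5, 1.5))

# Usage: a synthetic looped stroke stands in for a real pen trajectory.
rng = np.random.default_rng(0)
s = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
stroke = np.column_stack([np.cos(s) + 0.1 * np.cos(12 * s), np.sin(s)])
variants = [augment(stroke, rng) for _ in range(5)]
```

Each call to `augment` returns a trajectory with the same number of points as the input, so the variants can be fed to the same preprocessing as the original samples. In practice the ranges would need tuning per script so that augmented samples stay legible.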


Updated: 2021-06-14
