Joint stroke classification and text line grouping in online handwritten documents with edge pooling attention networks, Pattern Recognition



Pattern Recognition (IF 7.5), Pub Date: 2021-02-02, DOI: 10.1016/j.patcog.2021.107859
Jun-Yu Ye, Yan-Ming Zhang, Qing Yang, Cheng-Lin Liu

Stroke classification and text line grouping are important tasks in online handwritten document segmentation. In the past, the two tasks were usually performed by separate models that are trained independently and applied sequentially. Such a pipeline cannot optimize the integration of contextual information, and the system may suffer from error accumulation from the stroke classification stage. In this paper, we propose a method for joint text/non-text stroke classification and text line grouping in online handwritten documents using an attention-based graph neural network. In our framework, the stroke classification and text line grouping problems are formulated as node classification and node clustering problems in a relational graph, which is constructed from the temporal and spatial relationships between strokes. We propose a new graph network architecture, called the edge pooling attention network (EPAT), to efficiently aggregate information between the features of neighboring nodes and edges. The proposed model is trained by multi-task learning with a cross-entropy loss for node classification and a distance metric loss for node clustering. In experiments on two online handwritten document datasets, IAMOnDo and Kondate, the proposed method proves effective, yielding superior performance in both stroke classification and text line grouping.
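
As a rough illustration of the formulation in the abstract, the sketch below builds a relational graph over strokes and evaluates a joint objective on it. It is not the authors' EPAT implementation. The abstract does not specify the graph construction rule, the features, or the exact form of the distance metric loss, so everything here is an assumption made for illustration: the centroid-distance rule for spatial edges, the `radius` threshold, the contrastive form of the metric loss with its `margin`, the loss `weight`, and all function names.

```python
import math


def centroid(stroke):
    """Mean (x, y) of a stroke given as a list of (x, y) points."""
    xs, ys = zip(*stroke)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def build_graph(strokes, radius=50.0):
    """Return a sorted list of undirected edges (i, j) with i < j.

    Temporal edges link consecutive strokes in writing order.
    Spatial edges link strokes whose centroids lie within `radius`
    (an assumed rule; the paper may define neighbors differently).
    """
    centers = [centroid(s) for s in strokes]
    edges = set()
    for i in range(len(strokes) - 1):
        edges.add((i, i + 1))
    for i in range(len(strokes)):
        for j in range(i + 1, len(strokes)):
            if math.dist(centers[i], centers[j]) <= radius:
                edges.add((i, j))
    return sorted(edges)


def cross_entropy(probs, labels):
    """Mean negative log-probability of the true class over all nodes."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def metric_loss(embeddings, line_ids, edges, margin=1.0):
    """Contrastive loss over graph edges (an assumed form of metric loss).

    Strokes on the same text line are pulled together; strokes on
    different lines are pushed at least `margin` apart. Edges touching
    a stroke with line id None (non-text) are skipped.
    """
    total, count = 0.0, 0
    for i, j in edges:
        if line_ids[i] is None or line_ids[j] is None:
            continue
        d = math.dist(embeddings[i], embeddings[j])
        if line_ids[i] == line_ids[j]:
            total += d ** 2
        else:
            total += max(0.0, margin - d) ** 2
        count += 1
    return total / count if count else 0.0


def joint_loss(probs, labels, embeddings, line_ids, edges, weight=1.0):
    """Multi-task objective: classification loss plus weighted metric loss."""
    return cross_entropy(probs, labels) + weight * metric_loss(
        embeddings, line_ids, edges
    )
```

In a trained system the class probabilities and node embeddings would come from the graph network, and text lines would then be recovered by clustering the embeddings of strokes classified as text. Here they are plain lists so the objective can be inspected in isolation.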


Updated: 2021-02-09
