Information Extraction from Text Intensive and Visually Rich Banking Documents,Information Processing & Management



Information Extraction from Text Intensive and Visually Rich Banking Documents
Information Processing & Management (IF 7.4), Pub Date: 2020-09-17, DOI: 10.1016/j.ipm.2020.102361
Berke Oral, Erdem Emekligil, Seçil Arslan, Gülşen Eryiğit

Document types in which both visual and textual information play an important role in analysis and understanding are a new and attractive area for information extraction (IE) research. Although cheques, invoices, and receipts have been covered in previous multi-modal studies, banking documents remain an unexplored area because they combine natural-language text with visual richness. This article presents the first study that uses visual and textual information for deep-learning-based information extraction on text-intensive and visually rich scanned documents, in this instance unstructured banking documents or, more precisely, money transfer orders. It investigates the impact of different neural word representations (FastText, ELMo, and BERT) on the IE subtasks (named entity recognition and relation extraction), of positional features of words on document images, and of auxiliary learning with other tasks. The article proposes a new relation extraction algorithm based on graph factorization to solve the complex relation extraction problem in which the relations within documents are n-ary, nested, document-level, and of a number that is not known in advance. Our experiments revealed that deep learning algorithms yielded an improvement of around 10 percentage points on the IE subtasks. Including word positional features yielded around 3 percentage points of improvement in some specific information fields. Similarly, our auxiliary learning experiments yielded around 2 percentage points of improvement on some information fields associated with the specific transaction type detected by our auxiliary task. Integrating the information extraction system into a real banking environment reduced cycle times substantially: compared to the manual workflow, the document processing pipeline shortened book-to-book money transfers to 10 minutes (from 29 minutes) and electronic fund transfers (EFT) to 17 minutes (from 41 minutes).
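
The abstract gives the cycle times only in absolute minutes. As a small worked example, the sketch below derives the minutes saved and the relative reduction for each transfer type. The four input values come from the abstract; the dictionary layout and the helper name `relative_reduction` are illustrative assumptions and not part of the paper.

```python
# Cycle times in minutes as reported in the abstract: (manual, automated).
# The data layout and function name are illustrative, not from the paper.
cycle_times = {
    "book-to-book transfer": (29, 10),
    "electronic fund transfer (EFT)": (41, 17),
}


def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction when going from `before` to `after`."""
    return (before - after) / before


for task, (manual, automated) in cycle_times.items():
    saved = manual - automated
    pct = 100 * relative_reduction(manual, automated)
    print(f"{task}: {manual} -> {automated} min "
          f"({saved} min saved, {pct:.1f}% shorter)")
```

With these inputs, book-to-book transfers come out about 65.5% shorter (19 minutes saved) and EFTs about 58.5% shorter (24 minutes saved).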


Updated: 2020-09-18
