Self-supervised deep metric learning for ancient papyrus fragments retrieval
International Journal on Document Analysis and Recognition (IF 2.3), Pub Date: 2021-06-08, DOI: 10.1007/s10032-021-00369-1
Antoine Pirrone, Marie Beurton-Aimar, Nicholas Journet

This work focuses on associating document fragments using deep metric learning methods. More precisely, we are interested in ancient papyrus fragments that need to be reconstructed before papyrologists can analyze them. This task is challenging to automate with machine learning algorithms because labeled data is rare, often incomplete, imbalanced, and drawn from fragments in inconsistent states of conservation. However, there is a real need for such software in the papyrology community, as reconstructing papyri by hand is extremely time-consuming and tedious. In this paper, we explore ways in which papyrologists can obtain useful matching suggestions on new data using deep convolutional Siamese networks. We emphasize low-to-no human intervention for annotating images. We show that the from-scratch self-supervised approach we propose is more effective than knowledge transfer from a large dataset, the former achieving a top-1 accuracy of 0.73 on a retrieval task involving 800 fragments.
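
To make the reported metric concrete, the sketch below shows one common way to compute top-1 accuracy for a fragment retrieval task: each fragment is used as a query, and the query counts as a hit if its nearest other fragment in embedding space comes from the same papyrus. The abstract does not describe the authors' evaluation protocol, so the Euclidean distance, the function name `top1_accuracy`, and the toy data are all assumptions made for illustration. In practice the embeddings would come from the trained Siamese network.

```python
import math


def top1_accuracy(embeddings, papyrus_ids):
    """Fraction of fragments whose nearest other fragment shares their papyrus id.

    Assumption: similarity is measured with Euclidean distance between
    embedding vectors; the paper may use a different distance or protocol.
    """
    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_dist = None, math.inf
        for j, candidate in enumerate(embeddings):
            if j == i:
                continue  # a fragment must not retrieve itself
            dist = math.dist(query, candidate)
            if dist < best_dist:
                best_j, best_dist = j, dist
        if papyrus_ids[best_j] == papyrus_ids[i]:
            hits += 1
    return hits / len(embeddings)


# Hypothetical 2-D embeddings for five fragments from two papyri ("A" and "B").
embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.2, 0.1)]
papyrus_ids = ["A", "A", "B", "B", "B"]

score = top1_accuracy(embeddings, papyrus_ids)
print(f"top-1 accuracy: {score:.2f}")
```

In this toy data the last fragment belongs to papyrus "B" but sits closest to a fragment of papyrus "A", so it is the only query that misses.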




Updated: 2021-06-09