KERTAS: dataset for automatic dating of ancient Arabic manuscripts
International Journal on Document Analysis and Recognition (IF 1.8), Pub Date: 2018-09-08, DOI: 10.1007/s10032-018-0312-3
Kalthoum Adam, Asim Baig, Somaya Al-Maadeed, Ahmed Bouridane, Sherine El-Menshawy

The age of a historical manuscript can be an invaluable source of information for paleographers and historians. The process of automatic manuscript age detection has inherent complexities, which are compounded by the lack of suitable datasets for algorithm testing. This paper presents a dataset of historical handwritten Arabic manuscripts designed specifically to test state-of-the-art authorship and age detection algorithms. Qatar National Library has been the main source of manuscripts for this dataset, while the remaining manuscripts are open source. The dataset consists of over 2000 images taken from various handwritten Arabic manuscripts spanning fourteen centuries. A sparse representation-based approach for dating historical Arabic manuscripts is also proposed. There is a lack of existing datasets that provide reliable writing dates and author identities as metadata. KERTAS is a new dataset of historical documents that can help researchers, historians and paleographers to date Arabic manuscripts automatically, more accurately and more efficiently.

Updated: 2018-09-08