Self-supervised deep metric learning for ancient papyrus fragments retrieval

Pirrone, Antoine; Beurton-Aimar, Marie; Journet, Nicholas

doi:10.1007/s10032-021-00369-1

Self-supervised deep metric learning for ancient papyrus fragments retrieval

Special Issue Paper
Published: 08 June 2021

Volume 24, pages 219–234, (2021)
Cite this article

International Journal on Document Analysis and Recognition (IJDAR) Aims and scope Submit manuscript

660 Accesses
8 Citations
Explore all metrics

Abstract

This work focuses on document fragments association using deep metric learning methods. More precisely, we are interested in ancient papyri fragments that need to be reconstructed prior to their analysis by papyrologists. This is a challenging task to automatize using machine learning algorithms because labeled data is rare, often incomplete, imbalanced and of inconsistent conservation states. However, there is a real need for such software in the papyrology community as the process of reconstructing the papyri by hand is extremely time-consuming and tedious. In this paper, we explore ways in which papyrologists can obtain useful matching suggestion on new data using Deep Convolutional Siamese-Networks. We emphasize on low-to-no human intervention for annotating images. We show that the from-scratch self-supervised approach we propose is more effective than using knowledge transfer from a large dataset, the former achieving a top-1 accuracy score of 0.73 on a retrieval task involving 800 fragments.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

Article Open access 06 February 2017

Learning with Noisy Correspondence

Article 13 April 2024

OCR with Tesseract, Amazon Textract, and Google Document AI: a benchmarking experiment

Article Open access 22 November 2021

Notes

Found here: https://quod.lib.umich.edu/a/apis (accessed October 28, 2020).
https://lme.tf.fau.de/competitions/hisfragir20-icfhr-2020-competition-on-image-retrieval-for-historical-handwritten-fragments.
https://www.unibas.ch/.
Using OpenCV: https://opencv.org/, accessed November 3, 2020.
http://www.papyrologie.paris-sorbonne.fr/menu1/jouguet.htm.

References

Bondi, L., Güera, D., Baroffio, L., Bestagini, P., Delp, E.J., Tubaro, S.: A preliminary study on convolutional neural networks for camera model identification. Electron. Imaging 2017(7), 67–76 (2017)
Article Google Scholar
Bromley, J., Bentz, J.W., Bottou, L., Guyon, I., LeCun, Y., Moore, C., Säckinger, E., Shah, R.: Signature verification using a “siamese” time delay neural network. Int. J. Pattern Recognit. Artif. Intell. 7(04), 669–688 (1993)
Article Google Scholar
Cao, Z., Ma, L., Long, M., Wang, J.: Partial adversarial domain adaptation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 135–150 (2018)
Christlein, V., Gropp, M., Fiel, S., Maier, A.: Unsupervised feature learning for writer identification and writer retrieval. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 991–997. IEEE (2017)
Vincent Christlein and Anguelos Nicolaou and Mathias Seuret and Dominique Stutzmann and Andreas Maier ICDAR 2019 Competition on Image Retrieval for Historical Handwritten Documents (2019). arXiv:1912.03713
Cloppet, F., Eglin, V., Helias-Baron, M., Kieu, C., Vincent, N., Stutzmann, D.: ICDAR2017 competition on the classification of medieval handwritings in Latin script. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1371–1376. IEEE (2017)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR09 (2009)
Doersch, C., Gupta, A., Efros, A.A.: Unsupervised visual representation learning by context prediction. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1422–1430 (2015)
Dosovitskiy, A., Fischer, P., Springenberg, J.T., Riedmiller, M., Brox, T.: Discriminative unsupervised feature learning with exemplar convolutional neural networks. IEEE Trans. Pattern Anal. Mach. Intell. 38(9), 1734–1747 (2015)
Article Google Scholar
Fiel, S., Kleber, F., Diem, M., Christlein, V., Louloudis, G., Nikos, S., Gatos, B.: ICDAR2017 competition on historical document writer identification (historical-wi). In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1377–1382. IEEE (2017)
Hadsell, R., Chopra, S., LeCun, Y.: Dimensionality reduction by learning an invariant mapping. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 2, pp. 1735–1742. IEEE (2006)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. CoRR (2015). arXiv:1512.03385
Hoffer, E., Ailon, N.: Deep metric learning using triplet network. In: International Workshop on Similarity-Based Pattern Recognition, pp. 84–92. Springer (2015)
Huh, M., Agrawal, P., Efros, A.A.: What makes imagenet good for transfer learning? arXiv preprint arXiv:1608.08614 (2016)
Japkowicz, N., Stephen, S.: The class imbalance problem: a systematic study. Intell. Data Anal. 6(5), 429–449 (2002)
Article Google Scholar
Jing, L., Tian, Y.: Self-supervised visual feature learning with deep neural networks: a survey. In: IEEE Transactions on Pattern Analysis and Machine Intelligence (2020)
Karpinski, R., Belaid, A.: Semi-supervised learning through adversary networks for baseline detection. In: 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), vol. 5, pp. 128–133. IEEE (2019)
Kaya, M., Bílge, H.: Deep metric learning: a survey. Symmetry (2019). https://doi.org/10.3390/sym11091066
Article Google Scholar
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Kolesnikov, A., Zhai, X., Beyer, L.: Revisiting self-supervised visual representation learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Korbar, B., Tran, D., Torresani, L.: Cooperative learning of audio and video models from self-supervised synchronization. arXiv preprint arXiv:1807.00230 (2018)
Larsson, G., Maire, M., Shakhnarovich, G.: Learning representations for automatic colorization. In: European Conference on Computer Vision, pp. 577–593. Springer (2016)
Liu, S., Deng, W.: Very deep convolutional neural network based image classification using small training sample size. In: 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR), pp. 730–734 (2015). https://doi.org/10.1109/ACPR.2015.7486599
Lombardi, F., Marinai, S.: Deep learning for historical document analysis and recognition—a survey. J. Imaging 6(10), 110 (2020)
Article Google Scholar
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
Article Google Scholar
Mahendran, A., Thewlis, J., Vedaldi, A.: Cross pixel optical-flow similarity for self-supervised learning. In: Asian Conference on Computer Vision, pp. 99–116. Springer (2018)
Manning, C.D., Raghavan, P., Schütze, H.: Introduction to information retrieval. Nat. Lang. Eng. 16(1), 100–103 (2008)
MATH Google Scholar
Misra, I., Maaten, L.V.D.: Self-supervised learning of pretext-invariant representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6707–6717 (2020)
Misra, I., Zitnick, C.L., Hebert, M.: Shuffle and learn: unsupervised learning using temporal order verification. In: European Conference on Computer Vision, pp. 527–544. Springer (2016)
Noroozi, M., Vinjimoor, A., Favaro, P., Pirsiavash, H.: Boosting self-supervised learning via knowledge transfer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9359–9367 (2018)
Ostertag, C., Beurton-Aimar, M.: Matching ostraca fragments using a siamese neural network. Pattern Recognit. Lett. 131, 336–340 (2020)
Article Google Scholar
Owens, A., Efros, A.A.: Audio-visual scene analysis with self-supervised multisensory features. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 631–648 (2018)
Paixão, T.M., Berriel, R.F., Boeres, M.C., Koerich, A.L., Badue, C., De Souza, A.F., Oliveira-Santos, T.: Self-supervised deep reconstruction of mixed strip-shredded text documents. Pattern Recognit. 107, 107535 (2020). https://doi.org/10.1016/j.patcog.2020.107535
Article Google Scholar
Pal, S., Datta, A., Majumder, D.D.: Computer recognition of vowel sounds using a self-supervised learning algorithm. J. Anat. Soc. India 6, 117–123 (1978)
Google Scholar
Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2009)
Article Google Scholar
Pirrone, A., Beurton-Aimar, M., Journet, N.: Papy-s-net: a siamese network to match papyrus fragments. In: Proceedings of the 5th International Workshop on Historical Document Imaging and Processing, pp. 78–83 (2019)
Ren, Z., Lee, Y.J.: Cross-domain self-supervised multi-task feature learning using synthetic imagery. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 762–771 (2018)
Saito, T., Rehmsmeier, M.: The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One (2015). https://doi.org/10.1371/journal.pone.0118432
Article Google Scholar
Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: a unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Seuret, M., Nicolaou, A., Stutzmann, D., Maier, A., Christlein, V.: ICFHR 2020 competition on image retrieval for historical handwritten fragments. Int. Conf. Front. Handwrit. Recognit. (2020). https://doi.org/10.1109/ICFHR2020.2020.00048
Studer, L., Alberti, M., Pondenkandath, V., Goktepe, P., Kolonko, T., Fischer, A., Liwicki, M., Ingold, R.: A comprehensive study of imagenet pre-training for historical document image analysis. CoRR (2019). arXiv:1905.09113
Tang, H., Zhao, Y., Lu, H.: Unsupervised person re-identification with iterative self-supervised domain adaptation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Tang, Y., Peng, L., Xu, Q., Wang, Y., Furuhata, A.: CNN based transfer learning for historical Chinese character recognition. In: 2016 12th IAPR Workshop on Document Analysis Systems (DAS), pp. 25–29. IEEE (2016)
Wiggers, K.L., Junior, A.d.S.B., Koerich, A.L., Heutte, L., de Oliveira, L.E.S.: Deep learning approaches for image retrieval and pattern spotting in ancient documents. arXiv preprint arXiv:1907.09404 (2019)
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: European Conference on Computer Vision, pp. 818–833. Springer (2014)
Zhang, R., Isola, P., Efros, A.A.: Colorful image colorization. In: European Conference on Computer Vision, pp. 649–666. Springer (2016)

Download references

Author information

Authors and Affiliations

LaBRI: Laboratoire Bordelais de Recherche en Informatique, Talence, France
Antoine Pirrone, Marie Beurton-Aimar & Nicholas Journet

Authors

Antoine Pirrone
View author publications
You can also search for this author in PubMed Google Scholar
Marie Beurton-Aimar
View author publications
You can also search for this author in PubMed Google Scholar
Nicholas Journet
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Antoine Pirrone.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The research leading to this results has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program under Grant agreement No. 758907 and is part of the GESHAEM Project, hosted by the Ausonius Institute. The source code (upon request) and data used in this article are available at https://morphoboid.labri.fr/self-supervised-papyrus.html.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Pirrone, A., Beurton-Aimar, M. & Journet, N. Self-supervised deep metric learning for ancient papyrus fragments retrieval. IJDAR 24, 219–234 (2021). https://doi.org/10.1007/s10032-021-00369-1

Download citation

Received: 18 November 2020
Revised: 11 March 2021
Accepted: 26 April 2021
Published: 08 June 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s10032-021-00369-1

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Self-supervised deep metric learning for ancient papyrus fragments retrieval

Abstract

Access this article

Similar content being viewed by others

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

Learning with Noisy Correspondence

OCR with Tesseract, Amazon Textract, and Google Document AI: a benchmarking experiment

Notes

References

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Self-supervised deep metric learning for ancient papyrus fragments retrieval

Abstract

Access this article

Similar content being viewed by others

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

Learning with Noisy Correspondence

OCR with Tesseract, Amazon Textract, and Google Document AI: a benchmarking experiment

Notes

References

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation