Abstract
Word spotting is an important recognition task in large-scale retrieval of document collections. In most cases, methods are developed and evaluated assuming perfect word segmentation. In this paper, we propose an experimental framework to quantify the effect that word segmentation quality has on the performance achieved by word spotting methods under identical, unbiased conditions. The framework consists of generating systematic distortions of the segmentation and retrieving the original queries from the distorted dataset. We have tested our framework on several established and state-of-the-art methods using the George Washington and Barcelona Marriage datasets. The experiments allow for an estimate of the end-to-end performance of word spotting methods.
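The idea of distorting the segmentation systematically and then scoring retrieval can be illustrated with a minimal sketch. It is not the authors' implementation: the `Box` type, the symmetric grow/shrink distortion in `distort`, and the use of average precision as the score are all assumptions made for illustration, since the abstract does not specify the distortion model or the metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical word bounding box in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int


def distort(box, frac_x, frac_y, page_w, page_h):
    """Grow (positive fraction) or shrink (negative fraction) a box.

    The same margin is applied on both sides of each axis, expressed as a
    fraction of the box width or height, and the result is clipped to the page.
    This is an assumed distortion model, chosen only for illustration.
    """
    if frac_x <= -0.5 or frac_y <= -0.5:
        raise ValueError("shrinking by half or more on each side empties the box")
    dx = round((box.right - box.left) * frac_x)
    dy = round((box.bottom - box.top) * frac_y)
    return Box(
        max(0, box.left - dx),
        max(0, box.top - dy),
        min(page_w, box.right + dx),
        min(page_h, box.bottom + dy),
    )


def average_precision(relevance):
    """Average precision of one ranked list of booleans (True = relevant).

    Assumes the list ranks the whole dataset, so every relevant item appears.
    """
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


if __name__ == "__main__":
    word = Box(10, 10, 110, 30)
    # Sweep a few distortion levels; a real study would re-run retrieval on
    # the distorted crops and record the score at each level.
    for level in (-0.25, 0.0, 0.25):
        print(level, distort(word, level, level, page_w=200, page_h=100))
    print(average_precision([True, False, True]))
```

In such a setup, each distortion level yields one distorted copy of the dataset, and the clean query words are retrieved from it, so the score at each level can be compared across methods under the same conditions.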
References
Almazán, J., Gordo, A., Fornés, A., Valveny, E.: Efficient exemplar word spotting. In: BMVC, vol. 1, p. 3 (2012)
Almazán, J., Gordo, A., Fornés, A., Valveny, E.: Handwritten word spotting with corrected attributes. In: 2013 IEEE International Conference on Computer Vision, pp. 1017–1024 (2013). https://doi.org/10.1109/ICCV.2013.130
Atrey, P.K., Hossain, M.A., El Saddik, A., Kankanhalli, M.S.: Multimodal fusion for multimedia analysis: a survey. Multimed. Syst. 16(6), 345–379 (2010)
Balasubramanian, A., Meshesha, M., Jawahar, C.: Retrieval from document image collections. In: Document Analysis Systems, vol. 3872, pp. 1–12. Springer, Berlin (2006)
Bhardwaj, A., Jose, D., Govindaraju, V.: Script independent word spotting in multilingual documents. In: IJCNLP, pp. 48–54 (2008)
Csurka, G., Dance, C., Fan, L., Willamowski, J., Bray, C.: Visual categorization with bags of keypoints. In: Workshop on Statistical Learning in Computer Vision, ECCV, Prague, vol. 1, pp. 1–2 (2004)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 1, pp. 886–893. IEEE (2005)
Dey, S., Nicolaou, A., Llados, J., Pal, U.: Local Binary Pattern for Word Spotting in Handwritten Historical Document, pp. 574–583. Springer, Berlin (2016)
Diaconis, P., Graham, R.L.: Spearman’s footrule as a measure of disarray. J. R. Stat. Soc. Ser. B (Methodol.) 39, 262–268 (1977)
Fernández-Mota, D., Almazán, J., Cirera, N., Fornés, A., Lladós, J.: BH2M: the Barcelona historical, handwritten marriages database. In: 2014 22nd International Conference on Pattern Recognition (ICPR), pp. 256–261. IEEE (2014)
Fischer, A., Keller, A., Frinken, V., Bunke, H.: Lexicon-free handwritten word spotting using character HMMs. Pattern Recognit. Lett. 33(7), 934–942 (2012)
Frinken, V., Fischer, A., Manmatha, R., Bunke, H.: A novel word spotting method based on recurrent neural networks. IEEE Trans. Pattern Anal. Mach. Intell. 34(2), 211–224 (2012)
Garz, A., Fischer, A., Sablatnig, R., Bunke, H.: Binarization-free text line segmentation for historical documents based on interest point clustering. In: 2012 10th IAPR International Workshop on Document Analysis Systems, pp. 95–99 (2012). https://doi.org/10.1109/DAS.2012.23
Gatos, B., Pratikakis, I.: Segmentation-free word spotting in historical printed documents. In: 2009 10th International Conference on Document Analysis and Recognition, ICDAR’09, pp. 271–275. IEEE (2009)
Ghosh, S., Valveny, E.: R-PHOC: segmentation-free word spotting using CNN (2017). arXiv preprint arXiv:1707.01294
Ghosh, S., Valveny, E.: Text box proposals for handwritten word spotting from documents. Int. J. Doc. Anal. Recognit. (IJDAR) 21(1–2), 91–108 (2018)
Howe, N.R., Rath, T.M., Manmatha, R.: Boosted decision trees for word recognition in handwritten document retrieval. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 377–383. ACM (2005)
Karatzas, D., Gomez-Bigorda, L., Nicolaou, A., Ghosh, S., Bagdanov, A., Iwamura, M., Matas, J., Neumann, L., Chandrasekhar, V.R., Lu, S., et al.: ICDAR 2015 competition on robust reading. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp. 1156–1160. IEEE (2015)
Kim, G., Govindaraju, V.: A lexicon driven approach to handwritten word recognition for real-time applications. IEEE Trans. Pattern Anal. Mach. Intell. 19(4), 366–379 (1997)
Konidaris, T., Gatos, B., Ntzios, K., Pratikakis, I., Theodoridis, S., Perantonis, S.J.: Keyword-guided word spotting in historical printed documents using synthetic data and user feedback. Int. J. Doc. Anal. Recognit. (IJDAR) 9(2–4), 167–177 (2007)
Lee, J.J., Lee, P.H., Lee, S.W., Yuille, A., Koch, C.: Adaboost for text detection in natural scene. In: 2011 International Conference on Document Analysis and Recognition, pp. 429–434. IEEE (2011)
Liang, Y., Fairhurst, M.C., Guest, R.M.: A synthesised word approach to word retrieval in handwritten documents. Pattern Recognit. 45(12), 4225–4236 (2012)
Likforman-Sulem, L., Zahour, A., Taconet, B.: Text line segmentation of historical documents: a survey. Int. J. Doc. Anal. Recognit. (IJDAR) 9(2–4), 123–138 (2007)
Louloudis, G., Gatos, B., Pratikakis, I., Halatsis, C.: Text line and word segmentation of handwritten documents. Pattern Recognit. 42(12), 3169–3183 (2009)
Manmatha, R., Rothfeder, J.L.: A scale space approach for automatically segmenting words from historical handwritten documents. IEEE Trans. Pattern Anal. Mach. Intell. 27(8), 1212–1225 (2005)
Rath, T.M., Manmatha, R.: Features for word spotting in historical manuscripts. In: 2003 Proceedings. Seventh International Conference on Document Analysis and Recognition, pp. 218–222. IEEE (2003)
Rath, T.M., Manmatha, R.: Word image matching using dynamic time warping. In: Proceedings. 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II–521. IEEE (2003)
Rodriguez-Serrano, J., Perronnin, F., et al.: A model-based sequence similarity with application to handwritten word spotting. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2108–2120 (2012)
Rothacker, L., Sudholt, S., Rusakov, E., Kasperidus, M., Fink, G.A.: Word hypotheses for segmentation-free word spotting in historic document images. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1174–1179. IEEE (2017)
Rothfeder, J.L., Feng, S., Rath, T.M.: Using corner feature correspondences to rank word images by similarity. In: 2003 Conference on Computer Vision and Pattern Recognition Workshop, CVPRW’03, vol. 3, pp. 30–30. IEEE (2003)
Rusinol, M., Aldavert, D., Toledo, R., Lladós, J.: Browsing heterogeneous document collections by a segmentation-free word spotting method. In: 2011 International Conference on Document Analysis and Recognition (ICDAR), pp. 63–67. IEEE (2011)
Rusinol, M., Aldavert, D., Toledo, R., Lladós, J.: Efficient segmentation-free keyword spotting in historical document collections. Pattern Recognit. 48(2), 545–555 (2015)
Sidiropoulos, P., Vrochidis, S., Kompatsiaris, I.: Content-based binary image retrieval using the adaptive hierarchical density histogram. Pattern Recognit. 44(4), 739–750 (2011)
Srihari, S., Srinivasan, H., Babu, P., Bhole, C.: Spotting words in handwritten Arabic documents. In: Electronic Imaging 2006, pp. 606–702. International Society for Optics and Photonics (2006)
Stamatopoulos, N., Gatos, B., Louloudis, G., Pal, U., Alaei, A.: ICDAR 2013 handwriting segmentation contest. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1402–1406. IEEE (2013)
Sudholt, S., Fink, G.A.: PHOCNet: a deep convolutional neural network for word spotting in handwritten documents (2016). arXiv preprint arXiv:1604.00187
Terasawa, K., Tanaka, Y.: Slit style HOG feature for document image word spotting. In: 2009 10th International Conference on Document Analysis and Recognition, ICDAR’09, pp. 116–120. IEEE (2009)
Vamvakas, G., Gatos, B., Perantonis, S.J.: Handwritten character recognition through two-stage foreground sub-sampling. Pattern Recognit. 43(8), 2807–2816 (2010)
Viola, P., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vis. 57(2), 137–154 (2004)
Wang, K., Babenko, B., Belongie, S.: End-to-end scene text recognition. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 1457–1464. IEEE (2011)
Acknowledgements
The authors would like to thank Marcal Rusiñol, Sebastian Sudholt, and Suman Ghosh for helping with the BoVW, PHOCNET, and FisherCCA experiments, as well as David Fernàndez for providing a tuned implementation of [25]. The authors would also like to acknowledge Spanish Project Grant CONCORDIA TIN2015-70924-C2-2-R, the Project Grant “RAW: Reading in the Wild” (TIN2014-52072P), and the CERCA programme / Generalitat de Catalunya.
Cite this article
Dey, S., Nicolaou, A., Lladós, J. et al. Evaluation of word spotting under improper segmentation scenario. IJDAR 22, 361–374 (2019). https://doi.org/10.1007/s10032-019-00338-9
DOI: https://doi.org/10.1007/s10032-019-00338-9