Hyperkernel-based intuitionistic fuzzy c-means for denoising color archival document images

  • Special Issue Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

This article addresses the denoising and enhancement of color archival handwritten document images by separating noise from text and background. Archival document images obtained by scanning or photographing paper documents are mostly digitized in full color, so an enhancement or denoising method should preserve and exploit color information. Our work therefore models a color image as a hyperspace, formed by the image pixels and built from both the topological and color spaces. The novelty lies in using this hyperspace to cluster the extracted low-level features (topological and color) and then to separate noise from text and background. By combining the hyperspace with an adapted kernel-based intuitionistic fuzzy c-means (KIFCM) algorithm, we propose a novel hyper-KIFCM (HKIFCM) method for denoising color historical document images. To illustrate the effectiveness of HKIFCM, we first conducted a thorough experimental study, with qualitative and quantitative observations, on color archival handwritten document images collected from the Tunisian national archives and from two datasets provided for open competitions at the ICDAR and ICFHR conferences. We then compared the results with those obtained using state-of-the-art methods.
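
The clustering idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' HKIFCM implementation: the abstract does not give the formulation, so the feature layout (normalized pixel position plus RGB), the Gaussian kernel, the Yager-type hesitation degree, the maximin initialization, and every name and parameter (`pixel_features`, `kifcm_sketch`, `pos_weight`, `sigma`, `alpha`) are assumptions chosen for illustration only.

```python
import numpy as np

def pixel_features(image, pos_weight=1.0):
    """One row per pixel: [row, col, R, G, B], each scaled to [0, 1].

    Joining position and color in one vector is an assumed stand-in for the
    'hyperspace' of topological and color features named in the abstract.
    """
    h, w, _ = image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    pos = np.stack([rows / max(h - 1, 1), cols / max(w - 1, 1)], axis=-1)
    color = image.astype(float) / 255.0
    return np.concatenate([pos_weight * pos, color], axis=-1).reshape(-1, 5)

def kifcm_sketch(X, c=2, m=2.0, sigma=0.5, alpha=0.85, n_iter=100, tol=1e-6):
    """Generic kernel intuitionistic fuzzy c-means (illustrative assumptions)."""
    X = np.asarray(X, dtype=float)
    # Deterministic maximin initialization (assumption): start from the first
    # sample, then repeatedly add the sample farthest from the chosen centers.
    centers = [X[0]]
    for _ in range(1, c):
        d = np.min([((X - ctr) ** 2).sum(axis=1) for ctr in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        # Gaussian kernel between every sample and every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        K = np.exp(-d2 / (2.0 * sigma ** 2))
        # Kernel-induced distance is proportional to 1 - K.
        dist = np.maximum(1.0 - K, 1e-12)
        inv = dist ** (-1.0 / (m - 1.0))
        u = np.clip(inv / inv.sum(axis=1, keepdims=True), 0.0, 1.0)
        # Yager-type hesitation degree (assumed generator), added to membership.
        pi = 1.0 - u - (1.0 - u ** alpha) ** (1.0 / alpha)
        u_star = u + pi
        # Kernel-weighted center update.
        wgt = (u_star ** m) * K
        new_centers = (wgt.T @ X) / wgt.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return u_star.argmax(axis=1), centers
```

For an `(h, w, 3)` 8-bit image, `labels, _ = kifcm_sketch(pixel_features(image), c=3)` followed by `labels.reshape(h, w)` would give a per-pixel cluster map; deciding which cluster corresponds to noise, text, or background is left open here, since the abstract does not specify that step.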

Notes

  1. Some of the image samples used in our experiments are temporarily available at https://drive.google.com/open?id=1X-SDB2CmT3cfkB8dTdS3mEWR5-KLkwsa and on request, subject to agreement from the Tunisian national archives (ANT).

Acknowledgements

The authors would like to acknowledge the Tunisian national archives for providing access to their digital collections.

Author information

Corresponding author

Correspondence to Maroua Mehri.

About this article

Cite this article

Elhedda, W., Mehri, M. & Mahjoub, M.A. Hyperkernel-based intuitionistic fuzzy c-means for denoising color archival document images. IJDAR 23, 161–181 (2020). https://doi.org/10.1007/s10032-020-00352-2

  • DOI: https://doi.org/10.1007/s10032-020-00352-2
