Analysis of cancer in histological images: employing an approach based on genetic algorithm

  • Original Article
  • Pattern Analysis and Applications

Abstract

The analysis of histological images relies on visual assessment of tissues by specialists using optical microscopy. This task can be time-consuming and challenging, mainly because of the complexity of the structures and diseases under investigation. These difficulties have motivated the development of computational methods to support specialists in research and decision-making. Despite the different computational strategies available in the literature, solutions based on genetic algorithms have not been fully explored as a means of finding the best combination of features, selection algorithms and classifiers. In this paper, we describe an approach based on a genetic algorithm that evaluates a significant number of features, selection methods and classifiers in order to provide an acceptable association for the diagnosis and pattern recognition of non-Hodgkin lymphomas and colorectal cancer. The chromosomal structure was represented with four genes. The evaluation and selection of individuals, as well as the crossover and mutation processes, were defined to distinguish the groups under investigation with the highest AUC value and the smallest number of features. The tests were performed considering 1512 features from histological images, different population sizes and different numbers of iterations. For the colorectal histological images, an initial population of 50 individuals and 50 iterations provided the best result (AUC value of 0.984). For the non-Hodgkin lymphoma images, the best result (AUC value of 0.947) was obtained with a population of 500 individuals and 50 iterations. The proposed methodology, together with detailed information regarding the methods, features and best associations, is a relevant contribution for the community interested in pattern recognition of colorectal cancer and lymphomas.
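
The search loop summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states only that a chromosome has four genes and that fitness favors a high AUC with few features, so the gene layout used here (feature group, selector, classifier, feature count), the gene value names, the penalty weight, the selection scheme, and the `mock_auc` placeholder are all assumptions made for illustration. In a real pipeline `mock_auc` would be replaced by a cross-validated AUC computed on the image features.

```python
import random

# Hypothetical gene domains: the abstract only says a chromosome has four genes.
FEATURE_GROUPS = ["fractal", "haralick", "curvelet"]
SELECTORS = ["relief", "info_gain", "none"]
CLASSIFIERS = ["svm", "random_forest", "knn"]
MAX_FEATURES = 1512  # total number of features reported in the abstract


def random_individual(rng):
    """Draw one chromosome: (feature group, selector, classifier, feature count)."""
    return (
        rng.choice(FEATURE_GROUPS),
        rng.choice(SELECTORS),
        rng.choice(CLASSIFIERS),
        rng.randint(1, MAX_FEATURES),
    )


def mock_auc(ind):
    """Placeholder for a cross-validated AUC. The values are invented toy numbers."""
    group, selector, classifier, n_features = ind
    score = 0.70
    score += {"fractal": 0.05, "haralick": 0.03, "curvelet": 0.08}[group]
    score += {"relief": 0.04, "info_gain": 0.02, "none": 0.0}[selector]
    score += {"svm": 0.05, "random_forest": 0.04, "knn": 0.01}[classifier]
    score += 0.05 * min(n_features, 100) / 100  # diminishing returns past 100 features
    return score


def fitness(ind, penalty=0.05):
    """Reward a high AUC and penalize large feature subsets (assumed weighting)."""
    return mock_auc(ind) - penalty * ind[3] / MAX_FEATURES


def crossover(a, b, rng):
    """One-point crossover on the four-gene tuple."""
    point = rng.randint(1, 3)
    return a[:point] + b[point:]


def mutate(ind, rng, rate=0.1):
    """Resample each gene independently with probability `rate`."""
    fresh = random_individual(rng)
    return tuple(f if rng.random() < rate else g for g, f in zip(ind, fresh))


def evolve(pop_size=50, iterations=50, seed=0):
    """Run the loop and return the fittest chromosome found."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(iterations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection, parents survive
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve(pop_size=50, iterations=50)
    print(best, round(fitness(best), 3))
```

Because the parents are carried over unchanged, the best fitness in the population never decreases from one iteration to the next. The population sizes and iteration counts reported in the abstract (50 or 500 individuals, 50 iterations) map directly onto the `pop_size` and `iterations` arguments.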

Availability of data and material

Not applicable.

Funding

This study was financed in part by the National Council for Scientific and Technological Development, CNPq (Grant Nos. #427114/2016-0, #304848/2018-2, #430965/2018-4 and #313365/2018-0), and the State of Minas Gerais Research Foundation, FAPEMIG (Grant No. #APQ-00578-18).

Author information

Corresponding author

Correspondence to Daniela F. Taino.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Code availability

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Taino, D.F., Ribeiro, M.G., Roberto, G.F. et al. Analysis of cancer in histological images: employing an approach based on genetic algorithm. Pattern Anal Applic 24, 483–496 (2021). https://doi.org/10.1007/s10044-020-00931-3

  • DOI: https://doi.org/10.1007/s10044-020-00931-3
