Dynamic recursive tree-based partitioning for malignant melanoma identification in skin lesion dermoscopic images

Aria, Massimo; D’Ambrosio, Antonio; Iorio, Carmela; Siciliano, Roberta; Cozza, Valentina

doi:10.1007/s00362-018-0997-x

Regular Article
Published: 27 March 2018

Volume 61, pages 1645–1661, (2020)

Statistical Papers

Massimo Aria¹,
Antonio D’Ambrosio ORCID: orcid.org/0000-0002-1905-037X¹,
Carmela Iorio²,
Roberta Siciliano² &
…
Valentina Cozza³

Abstract

This paper defines multivalued data, or multiple-valued variables. Such data are typical when data production involves some intrinsic uncertainty, for example because of imprecise measuring instruments (as in image recognition) or human judgments. So far, contributions in the symbolic data analysis literature have provided data preprocessing criteria that allow the use of standard methods such as factorial analysis, clustering, discriminant analysis and tree-based methods. As an alternative, this paper introduces a methodology for supervised classification, the so-called Dynamic CLASSification TREE (D-CLASS TREE), which deals simultaneously with both standard and multivalued data. To this end, an innovative partitioning criterion and a tree-growing algorithm are defined. The main result is a dynamic tree structure characterized by the simultaneous presence of binary and ternary partitions. A real-world case study shows the advantages of the proposed methodology and the main issues in interpreting the final results. A comparative study with other approaches dealing with the same types of data is also presented. The comparison highlights that, even though the results are quite similar in terms of error rates, the proposed D-CLASS TREE returns a more interpretable tree-based structure.
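
The abstract does not spell out the partitioning criterion, so the following is only an illustrative sketch of how a single node could produce either a binary or a ternary partition when a predictor may be interval-valued. The rule used here (observations whose interval straddles the threshold go to a middle child) is an assumption made for illustration, not the D-CLASS TREE criterion, and the names `Interval` and `split_node` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a multivalued observation is an interval
# (lower, upper); a standard single-valued observation x is stored as (x, x).
Interval = Tuple[float, float]


def split_node(values: List[Interval], threshold: float) -> Dict[str, List[int]]:
    """Assign observation indices to child nodes for one candidate threshold.

    Assumed rule (not taken from the paper):
      left   : the whole interval lies at or below the threshold
      right  : the whole interval lies above the threshold
      middle : the interval straddles the threshold
    Empty children are dropped, so purely single-valued data can only
    give a binary partition, while straddling intervals give a ternary one.
    """
    children: Dict[str, List[int]] = {"left": [], "middle": [], "right": []}
    for i, (lower, upper) in enumerate(values):
        if lower > upper:
            raise ValueError(f"observation {i}: lower bound exceeds upper bound")
        if upper <= threshold:
            children["left"].append(i)
        elif lower > threshold:
            children["right"].append(i)
        else:
            children["middle"].append(i)
    # Keep only the non-empty children.
    return {name: idx for name, idx in children.items() if idx}


# Mixed data: the second observation straddles the threshold.
mixed = [(1.0, 2.0), (2.5, 4.0), (5.0, 5.0), (3.0, 3.0)]
ternary = split_node(mixed, threshold=3.0)

# Standard (single-valued) data only.
standard = [(1.0, 1.0), (4.0, 4.0)]
binary = split_node(standard, threshold=3.0)
```

Under these assumptions, `ternary` has three children and `binary` has two, which mirrors the coexistence of binary and ternary partitions that the abstract describes. How the threshold is chosen and how the children are scored is left out, since the abstract gives no detail on it.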

Acknowledgements

The authors would like to thank Prof. A. Baroni of the Campania University “Luigi Vanvitelli” (Italy) for kindly providing the skin lesions data set, and two anonymous reviewers whose comments greatly contributed to improving the quality of the manuscript.

Author information

Authors and Affiliations

Department of Economics and Statistics, University of Naples Federico II, Naples, Italy
Massimo Aria & Antonio D’Ambrosio
Department of Industrial Engineering, University of Naples Federico II, Naples, Italy
Carmela Iorio & Roberta Siciliano
Department of Law, Parthenope University of Naples, Naples, Italy
Valentina Cozza

Corresponding author

Correspondence to Antonio D’Ambrosio.

About this article

Cite this article

Aria, M., D’Ambrosio, A., Iorio, C. et al. Dynamic recursive tree-based partitioning for malignant melanoma identification in skin lesion dermoscopic images. Stat Papers 61, 1645–1661 (2020). https://doi.org/10.1007/s00362-018-0997-x

Received: 21 April 2017
Revised: 14 March 2018
Published: 27 March 2018
Issue Date: August 2020
DOI: https://doi.org/10.1007/s00362-018-0997-x
