Machine learning applied to rock geochemistry for predictive outcomes: The Neapolitan volcanic history case

https://doi.org/10.1016/j.jvolgeores.2021.107254

Highlights

  • We explore machine learning in the analysis of rock geochemistry from the Neapolitan volcanoes.

  • We use a data set of 9800 volcanic rocks assembled from the literature.

  • Predictive ability has been evaluated for several machine learning methods.

  • Machine learning prediction can be useful in petrology and tephrostratigraphy.

  • Machine learning is an interesting tool to improve volcanological studies.

Abstract

In this paper we explore the efficiency of various machine learning techniques in determining the volcano source, the eruptive formation and the eruption period of volcanic rocks whose chemical composition is known. To this end, we assembled a data set of 9800 volcanic rocks from the open-access literature. The rocks belong to eruptive formations from the Somma-Vesuvius, Campi Flegrei, Ischia and Procida volcanoes, in the Neapolitan region of Italy. The data set includes the contents of major oxides and trace elements, as well as Sr and Nd isotope ratios, eruptive periods, eruption formations and volcano source. Some discrete numerical variables are missing in certain samples, resulting in data exclusion and measurement inhomogeneity. Our results indicate that, despite such issues, some machine learning algorithms have a very high prediction ability, i.e., >70%. These results can facilitate the handling of new data for volcanological reconstruction and tephrostratigraphic studies.

Introduction

Volcanic rocks are continuously analyzed in order to understand the origin, evolution, and behavior of magmatic and volcanic systems, and to assess environments and natural hazards. The literature provides data characterizing the rocks produced by each eruption, and abundant data often exist for individual recognized eruptive deposits. Rock geochemistry studies provide the basic characterization of eruptive deposits and of the related frozen-in magmatic processes. An example of international interest, and one we know particularly well, comes from the Neapolitan volcanoes (Fig. 1), where rock geochemistry studies are abundant (Arienzo et al., 2009, Arienzo et al., 2016; Barberi et al., 1978; Beccaluva et al., 1991; Belkin et al., 1993; Brown et al., 2008, Brown et al., 2014; Cannatelli et al., 2007; Caprarelli et al., 1993; Carbone et al., 1984; Casalini et al., 2017; Cecchetti et al., 2003; Civetta et al., 1991; Civetta and Santacroce, 1992; Civetta et al., 1997; Cioni et al., 1995; Cortini and Hermes, 1981; Crisci et al., 1989; D'Antonio, 1991; D'Antonio and Di Girolamo, 1994; De Astis et al., 2004; De Vita et al., 1999; D'Antonio et al., 1999; Di Renzo et al., 2007; Fedele et al., 2008, Fedele et al., 2016; Fourmentraux et al., 2012; Forni et al., 2016, Forni et al., 2018; Gebauer et al., 2014; Hawkesworth and Vollmer, 1979; Insinga et al., 2006; Iovine et al., 2017; Landi et al., 1999; Marianelli et al., 1999, Marianelli et al., 2006; Mastrolorenzo et al., 1993; Melluso et al., 2012, Melluso et al., 2014; Pabst et al., 2008; Orsi et al., 1992, Orsi et al., 1995; Pappalardo, 1994; Pappalardo et al., 1999, Pappalardo et al., 2002a, Pappalardo et al., 2002b, Pappalardo et al., 2008; Piochi, 1994; Piochi et al., 1999, Piochi et al., 2005b, Piochi et al., 2006, Piochi et al., 2008; Rosi et al., 1993; Santacroce, 1987; Santacroce et al., 1993, Santacroce et al., 2008; Signorelli et al., 1999a, Signorelli et al., 1999b; Smith et al., 2011; Somma et al., 2001; Stock et al., 2018; Tomlinson et al., 2012; Tonarini et al., 2009; Vezzoli, 1988; Villemant et al., 1993; Webster et al., 2003).

Traditionally, these studies use bivariate plots, sometimes triplots, to recognize evolution patterns and to interpret data by visual inspection (Harker, 1909; Pearce, 1968; Le Bas et al., 1986; Rock and Carroll, 1989). However, abundant data can produce ambiguous data clouds, large numbers of samples cannot be compared efficiently, and many variables cannot be handled at once. Traditional methods are therefore impractical for large databases and depend on a somewhat arbitrary selection of discriminant variables by the operator. Statistical approaches (discriminant analysis, clustering, etc.) are used in rock geochemistry (e.g. Buccianti et al., 2006; Verma et al., 2005), although they are not fully adequate for volcanic rocks showing compositional variability and give more satisfactory results in other branches of the Earth sciences, for example fluid geochemistry. In any case, to our knowledge, statistical approaches have not been used for the geochemistry of Neapolitan volcanic rocks, the subject of this study. Machine learning (ML) methods provide effective tools for big data analytics. These methods are increasingly applied in the Earth sciences, e.g., in petroleum studies, geochemical prospecting, mineral resources and geochemistry (Agoubi et al., 2016; Bolandi et al., 2015; Zuo, 2017; Chen et al., 2019). This work presents the results of several machine learning methods applied to the geochemistry of rocks produced by volcanic eruptions in the Neapolitan area of Italy (Fig. 1).

The main benefits of using machine learning techniques are:

  • 1)

    Machine learning algorithms are very general and, without any specific change except to the input data, can be applied to a wide variety of problems. There is therefore no need to develop a topic-specific mathematical or statistical approach or to search for specific patterns in the data (in other words, machine learning methods find such approaches or patterns themselves). This is why such techniques are becoming very popular in many different areas, such as medical diagnosis, economic decision-making, face authentication, fraud detection and many others;

  • 2)

    The only effort required is to provide enough data for the machine to learn how to solve the problem; the system will then be able to solve the same problem autonomously in different situations, assuming that such situations are similar to those provided. Although "similar" is not an objective notion, the ability of a machine learning method to generalize from the provided data can be measured after learning;

  • 3)

    They often achieve good results even when the available data are heterogeneous;

  • 4)

    Although machines cannot replace humans and these algorithms should be regarded as a support for scientists, the machines' lack of emotions can help to avoid statistical bias. It can sometimes be difficult for scientists to remain unbiased; this is why, for example, medical statistical procedures require double-blind protocols.

There are some applications in rock geochemistry (Malmgren and Nordlund, 1996; Petrelli et al., 2016, Petrelli et al., 2017; Jiao et al., 2018; Zhang et al., 2019); unsupervised and self-organizing techniques have also been applied to the Neapolitan area (Esposito et al., 2020a, Esposito et al., 2020b). Unlike previous works, we use a large dataset of 9800 unselected samples related to 384 eruption formations (or units) and 54 variables covering both whole-rock and glass chemistry. Furthermore, rather than using a single machine learning technique, we applied several methods to find the one with the best accuracy. These techniques “learn” from the input data, so they should be able to extract information even in the presence of data disproportion (non-uniform distribution among series), inhomogeneity (absence of certain parameters or predictors in some series), and heterogeneous composition of eruptive products and single rock units. Our aim is to explore the efficiency of machine learning approaches in automatically processing rock geochemistry data. More specifically, our main goal was to answer the following question: “can machine learning approaches be used to predict the volcano source, the eruptive formation and the eruption period of an unknown deposit on the basis of rock geochemistry?”
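As a rough illustration of what "applying several methods to find the one showing the best accuracy" can look like, the sketch below scores two deliberately simple classifiers on held-out samples and keeps the better one. The two classifiers (a majority-class baseline and a one-nearest-neighbour rule), the function names and the toy data are assumptions made for this example only; they are not the methods or data used in the study.

```python
from collections import Counter
from math import dist

def majority_baseline(X_train, y_train):
    """Return a classifier that always predicts the most frequent training label."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return lambda row: most_common

def one_nearest_neighbour(X_train, y_train):
    """Return a classifier that copies the label of the closest training sample."""
    def classify(row):
        i = min(range(len(X_train)), key=lambda k: dist(X_train[k], row))
        return y_train[i]
    return classify

def accuracy(classifier, X_test, y_test):
    """Fraction of held-out samples whose label is predicted correctly."""
    hits = sum(classifier(row) == label for row, label in zip(X_test, y_test))
    return hits / len(y_test)

# Invented toy data: two predictors per sample, two hypothetical classes.
X_train = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [5.0, 5.0], [5.1, 4.8]]
y_train = ["A", "A", "A", "B", "B"]
X_test = [[1.1, 1.0], [4.9, 5.2], [5.2, 5.1], [0.8, 0.9]]
y_test = ["A", "B", "B", "A"]

# Fit every candidate on the training set and score it on the held-out set.
methods = {"majority": majority_baseline, "1-NN": one_nearest_neighbour}
scores = {name: accuracy(fit(X_train, y_train), X_test, y_test)
          for name, fit in methods.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Scoring on samples that were not used for fitting is what makes the comparison a measure of generalization rather than of memorization.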

Section snippets

Machine learning (ML) methods

As is well known, machine learning algorithms are divided into two groups: supervised and unsupervised methods (Lantz, 2012).

The most general problem addressed by a supervised method is predicting one variable (the answer or label) when others (the predictors) are known. For example, in medical diagnosis the predictors could be clinical parameters such as temperature, blood pressure, cholesterol, etc., and the answer could be whether the patient is sick or not.
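The predictor-to-label idea can be sketched with a very small supervised method. The example below uses a nearest-centroid rule, chosen here only because it fits in a few lines; it is not claimed to be one of the methods evaluated in the paper, and the function names, labels and numerical values are all hypothetical.

```python
from collections import defaultdict
from math import dist

def fit_centroids(X, y):
    """Return {label: mean predictor vector} computed from the training samples."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }

def predict(centroids, row):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], row))

# Hypothetical training data: two predictors per sample (invented values).
X_train = [[48.0, 7.5], [49.0, 8.0], [60.0, 3.0], [61.0, 2.5]]
y_train = ["volcano_A", "volcano_A", "volcano_B", "volcano_B"]

centroids = fit_centroids(X_train, y_train)   # learning step
prediction = predict(centroids, [59.0, 3.2])  # label for a new, unlabeled sample
print(prediction)
```

The learning step only sees samples with known labels; the prediction step then uses nothing but the predictors of the new sample.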

The first step to

Study area and data

The Neapolitan area (Fig. 1) belongs to the Italian volcanic belt, which developed in close conjunction with seismicity from north to south along the peninsula due to the collision between the African and Eurasian plates (e.g., Doglioni, 2008). The area includes Somma-Vesuvius and the Phlegraean Volcanic District, which extends from Campi Flegrei through Procida to Ischia (Piochi et al., 2005a). Somma-Vesuvius is a classical stratovolcano famous for the Plinian eruption of 79 CE that destroyed the

Results

Before applying machine learning techniques, some well-known pre-processing steps are required to prepare the data (Lantz, 2012). For example, as several values were missing, we used the “data imputation” technique to avoid dropping columns, replacing missing values with their column averages. This is common practice when no additional information is available to replace missing values with more specific ones based on other parameters. In this way, we used all the discrete variables, i.e., the
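Column-average imputation of the kind described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: missing values are represented as `None`, the function name `impute_column_means` is hypothetical, and the sample values are invented.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # A column with no observed values cannot be imputed; it stays None.
        means.append(sum(observed) / len(observed) if observed else None)
    return [
        [means[j] if value is None else value for j, value in enumerate(r)]
        for r in rows
    ]

# Hypothetical samples with two measured variables each (invented values).
samples = [
    [50.0, 800.0],
    [None, 600.0],
    [60.0, None],
]
filled = impute_column_means(samples)
print(filled)
```

Every sample is retained this way, at the cost of pulling imputed values toward the column mean, which can blur differences between classes when a variable is missing for many samples.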

Discussion and conclusions

Our analysis shows that, even though data inhomogeneity and/or disproportion pose a major problem, machine learning techniques may provide an efficient tool to classify rocks.

The dataset is not balanced across volcanoes: Campi Flegrei has an extraordinary overabundance of data, followed by Somma-Vesuvius and Ischia, and lastly Procida (Fig. 3a). This overabundance derives from the preference for studying eruptive units having more impact on volcanological and archeological

Declaration of Competing Interest

None.

Acknowledgements

We are thankful to the two anonymous reviewers for their comments, which greatly helped to improve the article. We are also grateful to the editor, Prof. Alessandro Aiuppa, for handling the manuscript.

References (98)

  • M.A. Di Vito et al.

    Volcanism and deformation since 12,000 years at the Campi Flegrei caldera (Italy)

    J. Volcanol. Geotherm. Res.

    (1999)
  • M.A. Di Vito et al.

    The late Pleistocene pyroclastic deposits of the Campanian Plain: new insights into the explosive activity of Neapolitan volcanoes

    J. Volcanol. Geotherm. Res.

    (2008)
  • F.G. Fedele et al.

    Timescales and cultural process at 40,000 BP in the light of the Campanian Ignimbrite eruption, Western Eurasia

    J. Hum. Evol.

    (2008)
  • L. Fedele et al.

    A chemostratigraphic study of the Campanian Ignimbrite eruption (Campi Flegrei, Italy): insights on magma chamber withdrawal and deposit accumulation as revealed by compositionally zoned stratigraphic and facies framework

    J. Volcanol. Geotherm. Res.

    (2016)
  • R.V. Fisher et al.

    Mobility of a large-volume pyroclastic flow—emplacement of the Campanian Ignimbrite, Italy

    J. Volcanol. Geotherm. Res.

    (1993)
  • F. Forni et al.

    The origin of a zoned ignimbrite: insights into the Campanian Ignimbrite magma chamber (Campi Flegrei, Italy)

    Earth Planet. Sci. Lett.

    (2016)
  • D. Insinga et al.

    The late-Holocene evolution of the Miseno area (southwestern Campi Flegrei) as inferred by stratigraphy, petrochemistry and 40Ar/39Ar geochronology

  • R.S. Iovine et al.

    Source and magmatic evolution inferred from geochemical and Sr-O-isotope data on hybrid lavas of Arso, the last eruption at Ischia island (Italy; 1302 AD)

    J. Volcanol. Geotherm. Res.

    (2017)
  • S. Jiao et al.

    Progress and challenges of big data research on petrology and geochemistry

    Solid Earth Sci.

    (2018)
  • G. Mastrolorenzo et al.

    Vesuvius 1906: a case study of a paroxysmal eruption and its relation to eruption cycles

    J. Volcanol. Geotherm. Res.

    (1993)
  • L. Melluso et al.

    The crystallization of shoshonitic to peralkaline trachy-phonolitic magmas in a H2O–Cl–F-rich environment at Ischia (Italy), with implications for the feeder system of the Campania Plain volcanoes

    Lithos

    (2014)
  • G. Orsi et al.

    A comprehensive study of pumice formation and dispersal: the Cretaio tephra of Ischia (Italy)

    J. Volcanol. Geotherm. Res.

    (1992)
  • G. Orsi et al.

    Step-filling and development of a three-layer magma chamber: the Neapolitan Yellow Tuff case history

    J. Volcanol. Geotherm. Res.

    (1995)
  • L. Pappalardo et al.

    Chemical and Sr-isotopical evolution of the Phlegraean magmatic system before the Campanian Ignimbrite and the Neapolitan Yellow Tuff eruptions

    J. Volcanol. Geotherm. Res.

    (1999)
  • L. Pappalardo et al.

    Timing of magma extraction during the Campanian Ignimbrite eruption (Campi Flegrei caldera)

    J. Volcanol. Geotherm. Res.

    (2002)
  • M. Petrelli et al.

    Combining machine learning techniques, microanalyses and large geochemical datasets for tephrochronological studies in complex volcanic areas: New age constraints for the Pleistocene magmatism of Central Italy

    Quat. Geochronol.

    (2017)
  • M. Piochi et al.

    Crustal contamination and crystal entrapment during polybaric magma evolution at Mt. Somma–Vesuvius volcano, Italy: Geochemical and Sr isotope evidence

    Lithos

    (2006)
  • M. Rosi et al.

    The 1631 Vesuvius eruption. A reconstruction based on historical and stratigraphical data

    J. Volcanol. Geotherm. Res.

    (1993)
  • R. Santacroce et al.

    Age and whole rock–glass compositions of proximal pyroclastics from the major explosive eruptions of Somma-Vesuvius: A review as a tool for distal tephrostratigraphy

    J. Volcanol. Geotherm. Res.

    (2008)
  • S. Signorelli et al.

    Origin of magmas feeding the Plinian phase of the Campanian Ignimbrite eruption, Phlegrean Fields (Italy): constraints based on matrix-glass and glass-inclusion compositions

    J. Volcanol. Geotherm. Res.

    (1999)
  • S. Signorelli et al.

    Pre-eruptive volatile (H2O, F, Cl and S) contents of phonolitic magmas feeding the 3550-year old Avellino eruption from Vesuvius, southern Italy

    J. Volcanol. Geotherm. Res.

    (1999)
  • V.C. Smith et al.

    Tephrostratigraphy and glass compositions of post-15 kyr Campi Flegrei eruptions: implications for eruption history and chronostratigraphic markers

    Quat. Sci. Rev.

    (2011)
  • E.L. Tomlinson et al.

    Geochemistry of the Phlegraean Fields (Italy) proximal sources for major Mediterranean tephras: Implications for the dispersal of Plinian and co-ignimbritic components of explosive eruptions

    Geochim. Cosmochim. Acta

    (2012)
  • S. Tonarini et al.

    Geochemical and B–Sr–Nd isotopic evidence for mingling and mixing processes in the magmatic system that fed the Astroni volcano (4.1–3.8 ka) within the Campi Flegrei caldera (southern Italy)

    Lithos

    (2009)
  • B. Villemant et al.

    Geochemistry of Vesuvius volcanics during 1631–1944 period

    J. Volcanol. Geotherm. Res.

    (1993)
  • B. Agoubi et al.

    Assessment of hot groundwater in an arid area in Tunisia using geochemical and fuzzy logic approaches

    Environ. Earth Sci.

    (2016)
  • D. Andronico et al.

    Geological map of Somma-Vesuvius volcano

    Periodico Mineral.

    (1995)
  • I. Arienzo et al.

    Isotopic evidence for open system processes within the Campanian Ignimbrite (Campi Flegrei–Italy) magma chamber

    Bull. Volcanol.

    (2009)
  • D. Barber

    Bayesian Reasoning and Machine Learning

    (2012)
  • F. Barberi et al.

    The Campanian Ignimbrite: a major prehistoric eruption in the Neapolitan area (Italy)

    Bull. Volcanol.

    (1978)
  • L. Beccaluva et al.

    Petrogenesis and tectonic setting of the Roman Volcanic Province, Italy

    Lithos

    (1991)
  • R.J. Brown et al.

    New insights into late Pleistocene explosive volcanic activity and caldera formation on Ischia (southern Italy)

    Bull. Volcanol.

    (2008)
  • R.J. Brown et al.

    Geochemical and isotopic insights into the assembly, evolution and disruption of a magmatic plumbing system before and after a cataclysmic caldera-collapse eruption at Ischia volcano (Italy)

    Contrib. Mineral. Petrol.

    (2014)
  • A. Buccianti et al.

    Compositional Data Analysis in the Geosciences: From Theory to Practice

    (2006)
  • A. Carbone et al.

    Caratteri petrografici dei livelli piroclastici rinvenuti in alcuni gravity cores nel golfo di Pozzuoli e nel golfo di Napoli

    Mem. Soc. Geol. Ital.

    (1984)
  • M. Casalini et al.

    Geochemical and radiogenic isotope probes of Ischia volcano, southern Italy: Constraints on magma chamber dynamics and residence time

    Am. Mineral.

    (2017)
  • A. Cecchetti et al.

    L’eruzione di Astroni (caldera dei Campi Flegrei): dati preliminari dallo studio di inclusioni silicatiche

    Atti della Società Toscana di Scienze Naturali, Memorie, Serie A

    (2003)
  • C. Chen et al.

    Graph networks as a universal machine learning framework for molecules and crystals

    Chem. Mater.

    (2019)
  • R. Cioni et al.

    Compositional layering and syn-eruptive mixing of a periodically refilled shallow magma chamber: the AD 79 Plinian eruption of Vesuvius

    J. Petrol.

    (1995)