Sedimentary phosphate classification based on spectral analysis and machine learning

https://doi.org/10.1016/j.cageo.2021.104696

Highlights

  • A new spectral dataset of a sedimentary phosphate deposit is created (250–2500 nm).

  • An approach to select relevant features for better classification is proposed.

  • Informative subsets of the features are analytically selected.

  • Phosphate classification exceeded 90% accuracy with 3 selected features only.

  • The obtained features can be used for an on-site multispectral embedded system.

Abstract

The process of phosphate extraction can benefit significantly from advances in spectral analysis and Artificial Intelligence to reduce the cost of the drilling operation. The ambiguities caused by the apparent similarities between different layers and by the existing mineralogical alterations complicate the delineation of phosphate layers with conventional vision systems. In this paper, we established a spectral signature database of representative samples collected from the Ben Guerir deposit in Morocco, over the 250–2500 nm spectral range (mid-ultraviolet to mid-infrared). The aim is to build a spectral database of the samples and select an optimal waveband set capable of good discrimination, which can be used in a multispectral inspection system to delimit the soil layers accurately during the drilling operation. First, the reflectance signatures of the extracted soil samples were collected. Second, principal component analysis (PCA) loadings were investigated to determine the most informative feature subspaces. A Bhattacharyya distance (B-distance) separability test was then implemented to select the most discriminating combinations of 3 bands from these subspaces. Finally, a machine learning classification test was used to evaluate the capacity of the selected features to discriminate phosphate samples. The results demonstrate the impact of selecting an informative reduced feature set and show good discrimination rates based on the combined information of wavelengths from the Ultraviolet (UV) or the Near-infrared (NIR) spectral ranges.
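For readers unfamiliar with the B-distance separability test mentioned above, a common closed form is recalled here. It assumes that each class is modeled as a multivariate Gaussian over the selected bands; this excerpt does not state which formulation the authors used, so the expression and the symbols (class means \mu_1, \mu_2 and covariances \Sigma_1, \Sigma_2 of the two classes) are illustrative assumptions.

```latex
\[
D_B = \frac{1}{8}\,(\mu_1-\mu_2)^{\top}\,\Sigma^{-1}\,(\mu_1-\mu_2)
      + \frac{1}{2}\,\ln\frac{\det\Sigma}{\sqrt{\det\Sigma_1\,\det\Sigma_2}},
\qquad
\Sigma = \frac{\Sigma_1+\Sigma_2}{2}
\]
```

The first term grows with the gap between the class means and the second with the mismatch between the class covariances, so a larger D_B indicates a band combination on which the two classes overlap less.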

Introduction

Phosphate deposits represent a major asset for Morocco. With approximately 70% of the total world reserve (Ober, 2018), Morocco is the world's leading exporter of phosphate. It is also the second-largest phosphate producer worldwide, a ranking achieved through a relentless extraction strategy implemented by the OCP group, which relies on a continuous process exploiting the rich phosphate layers. The fuzzy limits between the layers introduce impurities during the extraction process, which requires additional preprocessing steps along the production line to obtain consumable phosphate. Multispectral and hyperspectral remote sensing could prove useful in this process by reducing the impact of this fuzziness.

Spectral analysis and imaging, particularly hyperspectral imaging (HSI), have been widely adopted in both agricultural and mineralogical studies. The resulting high-dimensional data represent the responses of the studied elements over narrow spectral bandwidths. Each element gives a distinct spectral response according to its physicochemical composition and its reaction to electromagnetic energy. As a result, the features that best differentiate the elements can be identified (Gupta, 2017). Studies have been conducted to identify spectral signatures of pure minerals by combining X-ray diffraction and spectrophotometry. Reliable reference data sets were constructed in this way, namely the United States Geological Survey (USGS) and the ECOSTRESS spectral libraries (Baldridge et al., 2009; Kokaly et al., 2017; Sucich et al., 2018).

For sedimentary phosphate, the difficulty resides in the presence of complex mixtures. In addition to the existing horizontal and vertical variabilities, the discontinuity of the geological repository and the occurrence of infiltration phenomena increase the heterogeneity of the deposits. Moreover, the color shade similarities between the phosphate class and other mineral types referred to as sterile, such as limestone (Fig. 1), restrain the use of supervised classification methods (Nash et al., 2004; Maloy and Treiman, 2007). These observations raise the question of whether spectral analysis can be used to identify the ambiguous limits between the phosphate and the sterile layers and to select adequate features for their discrimination.

In this context, we undertook this study to:

  • 1. Explore the use of spectral information, without any further experimental analysis, to algorithmically select major discriminating features between phosphate rocks and sterile.

  • 2. Pave the way for the inspection of spectral imaging techniques in sedimentary phosphate classification and mapping, a field poorly investigated so far.

Essentially, the aim is to assess the feasibility of identifying spectral wavelengths, using spectroscopic analysis, over which accurate phosphate classification is possible. The results should guide the selection of an adequate multispectral inspection sensor to use on-site for an optimized phosphate extraction process.

From this perspective, we propose a new classification approach that focuses on spectral analysis and feature selection prior to using a machine learning classifier. First, representative samples are collected from a major Moroccan phosphate deposit. The samples are then prepared for a spectral analysis in which reflectance responses between 250 and 2500 nm are collected. The spectral library contains representative spectra of both phosphate and sterile classes. Second, PCA is applied to define descriptive spaces with the characterizing wavelengths of each class. The union of both spaces is then subjected to a separability evaluation using the Bhattacharyya distance to find the features that best differentiate the two classes. Finally, the K-Nearest Neighbors (KNN) algorithm is used to evaluate whether our feature reduction approach can improve the phosphate classification accuracy. Other classifiers are tested for comparison as well.
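The steps described above can be illustrated with a minimal sketch. It is not the authors' implementation (their code is in the repository cited under "Computer code availability"); it assumes NumPy, a Gaussian form of the Bhattacharyya distance, and purely synthetic spectra in which three bands are built to carry both the class difference and most of the within-class variance. The function names (pca_top_loading_bands, bhattacharyya, knn_loo_accuracy) and all numeric settings are hypothetical.

```python
import itertools
import numpy as np


def pca_top_loading_bands(X, n_components=3, n_bands=5):
    """Indices of the bands with the largest absolute loadings on the leading PCs."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    score = np.abs(Vt[:n_components]).max(axis=0)
    return {int(b) for b in np.argsort(score)[::-1][:n_bands]}


def bhattacharyya(Xa, Xb):
    """Bhattacharyya distance between two classes (Gaussian assumption)."""
    diff = Xa.mean(axis=0) - Xb.mean(axis=0)
    cov_a = np.cov(Xa, rowvar=False)
    cov_b = np.cov(Xb, rowvar=False)
    cov = (cov_a + cov_b) / 2.0
    mean_term = diff @ np.linalg.solve(cov, diff) / 8.0
    _, logdet = np.linalg.slogdet(cov)
    _, logdet_a = np.linalg.slogdet(cov_a)
    _, logdet_b = np.linalg.slogdet(cov_b)
    cov_term = 0.5 * (logdet - 0.5 * (logdet_a + logdet_b))
    return float(mean_term + cov_term)


def knn_loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy of a Euclidean k-NN classifier for labels 0/1."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample never votes for itself
    neighbours = np.argsort(dist, axis=1)[:, :k]
    predicted = y[neighbours].mean(axis=1) > 0.5  # majority vote (k is odd)
    return float((predicted == y.astype(bool)).mean())


# Synthetic stand-in for the spectral library (assumption, not real data):
# 20 bands, 40 samples per class, bands 3, 10 and 15 are the informative ones.
rng = np.random.default_rng(0)
n_samples, n_total_bands = 40, 20
informative = [3, 10, 15]


def make_class(offset):
    X = 0.5 + rng.normal(0.0, 0.01, size=(n_samples, n_total_bands))
    X[:, informative] += offset + rng.normal(0.0, 0.05, size=(n_samples, 3))
    return X


X_phosphate, X_sterile = make_class(0.0), make_class(0.3)

# Step 1: per-class PCA loadings, then the union of both candidate sets.
cands = sorted(pca_top_loading_bands(X_phosphate) | pca_top_loading_bands(X_sterile))

# Step 2: exhaustive search for the 3-band combination with the largest B-distance.
best = max(
    itertools.combinations(cands, 3),
    key=lambda t: bhattacharyya(X_phosphate[:, list(t)], X_sterile[:, list(t)]),
)

# Step 3: evaluate the selected bands with a k-NN classifier.
X_sel = np.vstack([X_phosphate, X_sterile])[:, list(best)]
y = np.array([0] * n_samples + [1] * n_samples)
acc = knn_loo_accuracy(X_sel, y, k=3)
print("candidate bands:", cands)
print("selected bands:", best, "leave-one-out accuracy:", acc)
```

The exhaustive search over triples is affordable only because PCA first reduces the candidate set to a handful of bands; on real reflectance spectra the candidate count, the number of components, and k would all need tuning.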

The rest of this paper is organized as follows: Section 2 provides a brief overview of conducted studies in spectral analysis. Section 3 describes the working environment and the preprocessing steps applied to the collected samples. Section 4 details the proposed analytical approach for this classification problem. Experimental results and discussion are the subjects of Section 5, and the main conclusions are drawn in the last section.

Section snippets

Related work

Studying the behavior of earth and soil elements over a growing spectral range is a trending research field. The spectral response contains meaningful information that can be used for several purposes, especially with the growing accessibility of various equipment such as airborne, satellite-based, or laboratory instruments like spectroscopes.

Many studies of minerals and rocks enabled the construction of the USGS library, the largest, though still not exhaustive, spectral database available (Kokaly

Study area and samples collection

Samples were retrieved from the Ben Guerir deposit, a major phosphate extraction site in Morocco. Situated in the Gantour basin, it extends 25–30 km from east to west and 10–20 km from north to south, with an average altitude of 430 m, and contains phosphate series from the Lutetian age to the Maastrichtian age (Daafi et al., 2014). From a cutting face of this deposit, alterations of phosphate beds (layers, furrows, and bundles) and sterile to slightly phosphatic levels or interlayers can be

Proposed approach

In our study, we faced two main challenges. First, the studied site had not previously been explored for spectral analysis purposes, and no reference data for the deposit were available. Second, the complex composition of the soil samples is likely to result in similar reflectance behavior over some particular wavelengths and negatively impact the discrimination outcome, leading to errors in the classification model. In response to these challenges, we established an approach with three main

Spectral data set construction

Spectral signatures were collected for the gathered samples. The latter were selected in a way to ensure the representativeness of the deposit. The binary ground truth labeling of the samples was performed by experts, in conformity with the standards and protocols of the field.

Representative signatures from the constructed spectral library are displayed (Fig. 5 to Fig. 8). The spectral reflectances are grouped according to the geological ages of the samples and following the vertical layering

Conclusion

In this paper, we constructed a spectral database and investigated the possibility of analytically identifying spectral features to discriminate between phosphate and sterile samples from the Moroccan Ben Guerir deposit. The obtained spectral signatures were analyzed to highlight the difficulty of distinguishing between the spectral responses and the challenges raised in developing an accurate discrimination system. The proposed approach enabled the identification and use of highly informative

Computer code availability

The Python version used in this article is 3.6.4 and is freely available for download (https://www.python.org/downloads/release/python-364/). All source code developed is open source and available on GitHub: https://github.com/Crajaa/spec_phos_clf.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The authors would like to acknowledge the support through the R&D Initiative -Appel à projets autour des phosphates APPHOS- sponsored by OCP (OCP Foundation, R&D OCP, Mohammed VI Polytechnic University, National Center of Scientific and Technical Research CNRST, Ministry of Higher Education, Scientific Research and Professional Training of Morocco MESRSFC) under the project entitled *Feasibility Study of a system discriminating between phosphate and sterile during the mining extraction process

References (55)

  • X. Li

    Discrimination of soft tissues using laser-induced breakdown spectroscopy in combination with k nearest neighbors (kNN) and support vector machine (SVM) classifiers

    Optic Laser. Technol.

    (2018)
  • Y. Liu

    A self-trained semisupervised SVM approach to the remote sensing land cover classification

    Comput. Geosci.

    (2013)
  • M. Morchid

    Feature selection using principal component analysis for massive retweet detection

    Pattern Recogn. Lett.

    (2014)
  • G.D. Nash et al.

    Hyperspectral detection of geothermal system-related soil mineralogy anomalies in Dixie Valley, Nevada: a tool for exploration

    Geothermics

    (2004)
  • J.B. Percival et al.

    Mineralogy and spectral signature of reactive gossans, Victoria Island, NT, Canada

    Appl. Clay Sci.

    (2016)
  • R. Vašát et al.

    Ensemble predictive model for more accurate soil organic carbon spectroscopic estimation

    Comput. Geosci.

    (2017)
  • H. Yao et al.

    Spectral preprocessing and calibration techniques

  • J.M. Amigo et al.

    Preprocessing of hyperspectral and multispectral images

    Data Handling Sci. Technol.

    (2020)
  • M.S. Ansari et al.

    Determining wavelength for nitrogen and phosphorus nutrients through hyperspectral remote sensing in wheat (Triticum aestivum L.) plant

    Int. J. Bio-res. Stress Manag.

    (2016)
  • L. Armi et al.
  • I. Bogrekci et al.

    Improving phosphorus sensing by eliminating soil particle size effect in spectral measurement

    Trans. ASAE

    (2005)
  • A. Boujo
  • R.B. Cattell

    The scree test for the number of factors

    Multivariate Behav. Res.

    (1966)
  • H. Cen

    Hyperspectral imaging-based classification and wavebands selection for internal defect detection of pickling cucumbers

  • R.N. Clark

    Spectroscopy of rocks and minerals, and principles of spectroscopy

    Man. Rem. Sens.

    (1999)
  • M. Cui et al.

    Class-dependent sparse representation classifier for robust hyperspectral image classification

    IEEE Trans. Geosci. Rem. Sens.

    (2014)
  • X. Cui

    Analysis and classification of kidney stones based on Raman spectroscopy

    Biomed. Optic Express

    (2018)