Abstract

One of the most important challenges in the authentication of olive oil is the determination of the geographical origin of virgin olive oil. In this work, we evaluated the capacity of two spectroscopic techniques, UV-Visible and ATR-FTMIR, coupled with chemometric tools to determine the geographical origin of olive oils. Both techniques were applied to samples collected during the olive oil production period in the Moroccan region of Beni Mellal-Khenifra. To develop a rapid analysis tool capable of authenticating virgin olive oils from five geographical areas of this region, the UV-Visible and ATR-FTMIR spectral data were processed by chemometric algorithms. PCA was first applied to the spectral data to represent them in a low-dimensional space, and discrimination methods were then applied to the resulting principal components. PCA-LDA classified the olive oils according to their geographical origin with correct classification rates of 90.24% for UV-Visible and 85.37% for ATR-FTMIR, while PCA-SVM differentiated the five groups of olive oils with correct classification rates of 100% and 97.56%, respectively. This study demonstrates the feasibility of UV-Visible and ATR-FTMIR fingerprinting, both routine techniques, for the geographical classification of olive oils in the Moroccan region of Beni Mellal-Khenifra. The methods developed here can be proposed as alternative and complementary methods to authenticate the geographical origin of virgin olive oil.

1. Introduction

Virgin olive oil is increasingly sought after around the world thanks to its taste and its medicinal and nutritional virtues. Its consumption is increasing every year, and it has become an essential element in the diet of Mediterranean countries [1, 2]. There are four quality categories of olive oil: extra virgin, virgin, common virgin, and lampante olive oil [3]. They are determined by physicochemical and sensory analyses as described in the guide of the International Olive Council.

Generally, the quality of olive oil depends on several factors such as variety, edaphic factors, and climatic factors [4, 5]. Edaphic and climatic factors play an important role in the chemical composition of olive oils in terms of fatty acids, vitamin E, sterols, and polyphenols. For this reason, geographical origin has become an important parameter for judging the quality of olive oils, since it is one of the factors causing significant differences in organoleptic properties and chemical composition [4]. Consumers are also increasingly interested in the origin of the food they consume, olive oils in particular. To satisfy this demand, the control authorities have established a food traceability system. Determining the geographical or varietal origin of olive oils is therefore essential to their traceability [6, 7]. Traceability must provide answers in terms of identification, localization, authentication, and security of olive oils.

Numerous analytical techniques have been developed to determine the geographical origin in some countries, such as high-performance liquid chromatography (HPLC), gas chromatography (GC) [8], infrared (IR) spectroscopy [9, 10], Raman spectroscopy [11], mass spectrometry (MS) [12], and nuclear magnetic resonance (NMR) [13, 14].

Several studies have used chromatography as the basic method, coupled to chemometric methods (principal component analysis (PCA), discriminant factor analysis, and other discrimination algorithms). In these cases, the discrimination of olive oils coming from various regions may be based on the composition of fatty acids and triglycerides [15, 16], phenolic compounds [15, 17, 18], and pigments. However, these methods are long and laborious, and they require expensive solvents and reagents that can be harmful to the environment and toxic for the analysts. Hence, approaches that couple spectroscopic techniques with chemometric algorithms are emerging and are now commonly used; the chemometric step handles data processing and discrimination and yields more precise and complementary information.

Nowadays, IR and UV-Visible spectroscopy are widely used to study and reveal information about the molecular properties of foods [19–21]. These two spectroscopic techniques are ideal for rapid, accurate evaluation of raw foods without the use of reagents and solvents [20, 22, 23]. They have been used to develop simple and highly effective methods to evaluate the quality parameters of olive oils, and they are frequently coupled to multivariate analysis methods for exploration, classification, discrimination, or calibration [20]. To discriminate olive oils according to their geographical region, IR spectroscopy has been coupled to chemometric treatments, sometimes on selected spectral zones, in previous studies [6, 24]. However, the results remain difficult to compare because the varieties, geographical areas, spectral ranges, and chemometric treatments differ from one study to another.

To the best of our knowledge, studies devoted to the geographical determination, by ATR-FTMIR and UV-Visible spectroscopy, of virgin olive oils coming from different provinces of the Moroccan region of Beni Mellal-Khenifra are still needed.

The objective of the present work is to develop rapid spectroscopic methods able to classify virgin olive oils according to their geographical origin. More specifically, we aimed to evaluate the capability of mid-infrared (MIR) and UV-Visible spectroscopy, coupled to supervised and unsupervised multivariate analysis methods, for the discrimination and classification of virgin olive oils that come from different provinces in the Moroccan region of Beni Mellal-Khenifra.

2. Materials and Methods

2.1. Sampling

Fifty-six samples of virgin olive oil were collected at different industrial mills of Beni Mellal-Khenifra. The mills are distributed over different provinces of this region. Table 1 indicates the geographical origins of the collected samples. The oils were produced during the November-December 2018 harvest.

The collected samples were stored at a temperature not exceeding 4°C to avoid alteration of the virgin olive oils. The samples were divided into 41 samples for calibration and 15 samples for external validation.
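The split described above can be sketched as a per-class hold-out. This is a minimal illustration only: the function name `holdout_per_class`, the random seed, and the placeholder class labels and class sizes are assumptions made for the example (the actual distribution of samples is the one given in Table 1), and the paper does not state how the validation samples were selected.

```python
import random
from collections import defaultdict


def holdout_per_class(labels, n_holdout=3, seed=0):
    """Return (calibration_idx, validation_idx) with n_holdout samples held out per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    validation = []
    for label in sorted(by_class):
        members = by_class[label]
        if len(members) <= n_holdout:
            raise ValueError(f"class {label!r} has too few samples to hold out {n_holdout}")
        # Draw the validation samples of this class at random, without replacement.
        validation.extend(rng.sample(members, n_holdout))

    held_out = set(validation)
    calibration = [i for i in range(len(labels)) if i not in held_out]
    return calibration, sorted(validation)


# Hypothetical labels: 56 samples spread over five placeholder classes.
labels = ["A"] * 12 + ["B"] * 11 + ["C"] * 11 + ["D"] * 11 + ["E"] * 11
calibration_idx, validation_idx = holdout_per_class(labels)
print(len(calibration_idx), len(validation_idx))  # 41 15
```

Holding out the same number of samples in every class keeps the validation set balanced, which makes per-class sensitivity and specificity directly comparable.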

2.2. ATR-FTMIR Spectroscopy

The samples were analyzed with a JASCO 460 plus ATR-FTMIR spectrometer equipped with a horizontal ATR accessory, at a fixed temperature of 21°C. Each sample was deposited on the surface of the ATR crystal using a micropipette. The spectra were collected between 4000 cm−1 and 600 cm−1, averaging 130 scans at a resolution of 4 cm−1. After each analysis, the ATR accessory was cleaned with acetone and left to dry.

The spectra have been treated using Spectrum Manager to eliminate the effect of carbon dioxide and then transformed to a JCAMP format.

2.3. UV-Visible Spectroscopy

The olive oil samples were analyzed with a Perkin Elmer UV-Visible spectrophotometer over the 350 to 800 nm range. The samples were measured without prior centrifugation, using a quartz cell with a 1 cm optical path, and the spectra were saved directly in Excel table format.

2.4. Multivariate Analysis

In this study, different statistical techniques were applied to process and evaluate the spectral data obtained from ATR-FTMIR and UV-Visible spectroscopy. To explore and represent the data set, we started with PCA, which searches for the directions of greatest dispersion to build new synthetic variables and represents the data in a reduced dimensional space. In addition to reducing the dimensionality of the data set, this method is often used for data cleaning by identifying outliers. It also serves as an effective tool for identifying groups of individuals that behave similarly with respect to the measured variables.
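The idea of searching for the direction of greatest dispersion can be illustrated with a small pure-Python sketch. It is not the implementation used in the study (which relied on commercial software): the toy data matrix, the function names, and the use of power iteration to obtain only the first component are assumptions chosen to keep the example short.

```python
import math


def center(rows):
    """Subtract the column means so that every variable has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (variables x variables) of centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(cov, iterations=500):
    """Leading eigenvector and eigenvalue of cov, obtained by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue


# Toy data (hypothetical): 5 samples x 3 variables, the first two strongly correlated.
X = [[1.0, 2.0, 0.5], [2.0, 4.1, 0.4], [3.0, 6.0, 0.6], [4.0, 7.9, 0.5], [5.0, 10.0, 0.5]]
Xc = center(X)
cov = covariance(Xc)
loading, eigenvalue = first_component(cov)

# Share of the total variance carried by PC1 (the trace equals the total variance).
explained_ratio = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
# Scores: coordinates of each sample along PC1.
scores = [sum(x * l for x, l in zip(row, loading)) for row in Xc]
print(f"PC1 explains {100 * explained_ratio:.1f}% of the variance")
```

Further components would be obtained the same way after removing the contribution of the components already found; in practice a singular value decomposition from a numerical library is the usual choice.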

These principal components can also be used in turn for many different applications that are, in our case, support vector machine (SVM) and linear discriminant analysis (LDA).

The SVM method is one of the most commonly used classification methods. Its objective is to find a hyperplane in the N-dimensional space (N being the number of input variables, here the retained principal components) that distinctly separates the data points of different classes. Many hyperplanes can separate the classes; the algorithm selects the one with the maximum margin, that is, the largest distance to the nearest data points of each class. Maximizing the margin makes the model more robust, so that future data points can be classified with more confidence [25].
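For reference, the maximum-margin idea is usually written as the following optimization problem for two classes with labels y_i = ±1. This is the standard soft-margin formulation, given here as a sketch; the penalty parameter C and the slack variables ξ_i are not discussed in the text above, and the way the five-class problem was decomposed into binary problems is not stated in this paper.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;\; \frac{1}{2}\,\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\qquad \text{subject to} \qquad
y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, n
```

The separating hyperplane is w^T x + b = 0 and the margin width is 2/||w||, so minimizing ||w|| maximizes the margin, while C sets how strongly margin violations are penalized.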

The LDA method is also among the effective methods used to discriminate groups or classes of individuals. It consists of finding linear combinations of the variables of the data matrix X. These new variables give representations of the K groups that are as compact as possible but also as far away from each other as possible. The separation is provided by hyperplanes: the total variability is decomposed into intergroup and intragroup variability, and the groups presenting high intergroup variability relative to their intragroup variability are those that are well separated and well discriminated [26].
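The decomposition of the total variability mentioned above can be written explicitly. The notation below is an assumption introduced for illustration (m_k and n_k are the mean and size of group k, m is the overall mean); it corresponds to the classical Fisher criterion rather than to any formula given in this paper.

```latex
S_T = S_W + S_B, \qquad
S_W = \sum_{k=1}^{K} \sum_{i \in \text{group } k} (\mathbf{x}_i - \mathbf{m}_k)(\mathbf{x}_i - \mathbf{m}_k)^{\top}, \qquad
S_B = \sum_{k=1}^{K} n_k \, (\mathbf{m}_k - \mathbf{m})(\mathbf{m}_k - \mathbf{m})^{\top}

J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B \, \mathbf{w}}{\mathbf{w}^{\top} S_W \, \mathbf{w}}
```

Each discriminant direction w maximizes J(w), the ratio of intergroup to intragroup variability along that direction. With K groups there are at most K − 1 such directions, that is, four for the five origins considered here.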

The performance parameters of the models built by PCA-SVM and PCA-LDA, such as sensitivity, specificity, correct classification rate (CCR), and accuracy, are used to characterize the classification performance of the analytical method. The best classification performance is obtained by minimizing false positives (FP), the number of negative samples that are misclassified as positive samples, and false negatives (FN), the number of positive samples that are misclassified as negative samples. Evaluation criteria for a classifier can be obtained from statistical measures [27].
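As a hedged illustration of how these parameters follow from the counts of true and false positives and negatives, the sketch below computes them for one class treated as "positive" against all the others. The function names and the two-class toy labels are assumptions made for the example; they are not taken from the study.

```python
def one_vs_rest_counts(true_labels, predicted_labels, positive_class):
    """Count TP, FP, TN, FN when one class is treated as positive and all others as negative."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive_class and predicted == positive_class:
            tp += 1
        elif truth != positive_class and predicted == positive_class:
            fp += 1  # negative sample wrongly assigned to the positive class
        elif truth == positive_class:
            fn += 1  # positive sample wrongly assigned to another class
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of positive samples that are recognized as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of negative samples that are recognized as negative."""
    return tn / (tn + fp)


def accuracy(tp, fp, tn, fn):
    """Fraction of all one-vs-rest decisions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def correct_classification_rate(true_labels, predicted_labels):
    """Percentage of samples assigned to their true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * hits / len(true_labels)


# Toy example (hypothetical labels): one sample of class "A" is predicted as "B".
truth = ["A", "A", "A", "B", "B", "B"]
predicted = ["A", "A", "B", "B", "B", "B"]
tp, fp, tn, fn = one_vs_rest_counts(truth, predicted, positive_class="A")
print(sensitivity(tp, fn), specificity(tn, fp))
print(round(correct_classification_rate(truth, predicted), 2))  # 83.33
```

For a five-class problem, sensitivity and specificity are reported once per class, whereas the CCR summarizes all classes in a single percentage.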

The statistical processing of the data was carried out using The Unscrambler X version 10.4 software.

3. Results and Discussion

The Fourier-transform MIR absorption spectra of the samples are shown in Figure 1. The MIR spectra show intense bands, with absorbance up to 1.7, due to the ATR accessory. The intensity of the bands gives information on the concentration of the functional groups that characterize olive oil compounds.

In the ATR-FTMIR spectra, different bands characterize the usual functional groups of olive oil at wavenumbers 720, 968, 1159, 1377, 1464, 1743, 2852, 2920, and 3004 cm−1, which correspond respectively to the functional groups: CH2 (CH sp3); C=C-H trans (CH sp2); C-OH; CH3 (CH sp3); CH2 (CH sp3); C=O; CH2, CH3 (CH sp3); CH2, CH3 (CH sp3); C=C-H cis (CH sp2) [28].

The visual observation of the ATR-FTMIR spectra of these 41 samples cannot be used to determine the similarities between individuals to differentiate between olive oils coming from different geographical origins.

The UV-Visible absorption spectra of olive oils contain bands that carry reliable information on the compounds of olive oils, in particular the pigments, which control their coloration. The raw UV-Visible spectra (Figure 2) show an additive effect due to suspended particles, since the oils were not centrifuged beforehand: the particles scatter light, which introduces an additive offset in the spectral data. To remove this effect, the spectra were baseline-corrected using the weighted least squares method, which is commonly used in spectroscopic applications. It iteratively fits a baseline to each spectrum and determines which variables lie clearly above the baseline (i.e., the signal) and which lie below it. The points below the baseline are assumed to be more significant in fitting the baseline to the spectrum. This method is also referred to as the asymmetric weighted least squares method. The net effect is the automatic removal of the background while avoiding the creation of strongly negative spectral peaks [29].
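One widely used formulation of asymmetric least squares baseline fitting is given below as a sketch. It is an assumption that this variant resembles the one implemented in the software used here, which may instead fit a low-order polynomial with iteratively updated weights. In this notation, y is the measured spectrum, z the fitted baseline, λ a smoothness parameter, and p an asymmetry parameter; none of these symbols appear in the original text.

```latex
\min_{\mathbf{z}} \;\; \sum_{i=1}^{n} w_i \,(y_i - z_i)^2 \;+\; \lambda \sum_{i=3}^{n} \left( z_i - 2 z_{i-1} + z_{i-2} \right)^2,
\qquad
w_i =
\begin{cases}
p, & y_i > z_i \\
1 - p, & y_i \le z_i
\end{cases}
\qquad (0 < p \ll 1)
```

The weights are recomputed from the current baseline and the fit is repeated until they stop changing. Points above the baseline (the signal) receive the small weight p and points below it the large weight 1 − p, which is why the baseline hugs the bottom of the spectrum; the corrected spectrum is then y − z.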

After baseline correction of the UV-Visible spectra, Figure 3 shows that the oils differ in terms of their pigments. The spectra show an intense absorption between 400 nm and 500 nm, a spectral range that corresponds to the absorption of blue light. In addition, absorptions at 530, 615, and 670 nm are observed; together, these bands are attributed to the following pigments: lutein, β-carotene, pheophytins a and b, chlorophylls a and b, and other pigments [30, 31].

3.1. Principal Component Analysis

A preliminary examination of the spectral data collected was carried out by principal component analysis using the spectral range of 600–4000 cm−1 of FT-MIR, in order to represent the data in a reduced dimensional space.

The PCA results (Figure 4) show that the first two principal components account for 81% of the total variance of the raw spectral data, with 53% and 28%, respectively. The MIR spectra of the virgin olive oil samples clearly contain information on geographical origin, because the score plot shows a clustering of the virgin olive oils according to their geographical origin. The PCA also shows that the five groups of virgin olive oils have small intergroup variability, because these oils have the same varietal origin (Moroccan Picholine) and come from origins inside the same geographical area. In addition, climatic and edaphic conditions do not differ significantly within the Beni Mellal-Khenifra area.

The first principal component (PC1), which explains 53% of the total variability, allows differentiating between BJD oils and other oils, while the second principal component (PC2) allows differentiating between BM and FKB oils on the one hand and BM and KHN oils on the other hand.

Analysis of the UV-Visible spectra by PCA shows that 98% of the total variability of the data is explained by the first two components, PC1 and PC2, which represent 86% and 12%, respectively (Figure 5). In the PC1-PC2 plane of the score plot, there is a clustering of the five groups of olive oils according to their geographical origin. This clustering is mainly carried by the first principal component, which contains 86% of the information available in the UV-Visible spectral database.

Mathematical processing of the spectral data by the baseline correction algorithm can remove the additive effects caused by suspended particles in olive oil samples and improve the clustering between the groups. The PCA of the UV-Visible spectra also shows that olive oils of BM, FKB, and BJD have high intragroup variability as shown in Figure 6.

3.2. Linear Discriminant Analysis and Support Vector Machine

PCA is used to reduce the dimensionality of the data, especially spectral data, because they contain a large number of variables. Thanks to PCA, we generated uncorrelated synthetic variables that can then be used as inputs for LDA and SVM classification.

To evaluate the discriminatory capability of olive oils that come from the Beni Mellal-Khenifra area using UV-Visible and ATR-FTMIR spectroscopy, LDA and SVM methods were applied to synthetic variables that have been generated by the PCA.

The use of the PCA-LDA method on the spectral data of the UV-Visible and FT-MIR shows a good discrimination capacity of the five groups of olive oil according to their geographical origin. This discrimination capacity is represented by the CCR coefficient of the training data which represents 90.24% in the case of the UV-Visible and 85.37% for the results of FT-MIR.

In order to evaluate the predictive performance of these classification models, an external validation was performed using external samples (three samples for each class). This validation procedure shows that 86.67% of the samples were correctly classified using the UV-Visible spectroscopic technique and 86.67% using ATR-FTMIR spectroscopy.

Then, the sensitivity and specificity of the training and validation datasets were calculated to evaluate the classification performance of these algorithms, and the results are presented in Tables 2 and 3.

The ATR-FTMIR and UV-Visible spectral databases were processed by the SVM method using a radial basis function kernel. The method was applied to the first synthetic variables generated by the PCA.
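The radial basis function kernel measures the similarity of two samples from the squared distance between their score vectors. The sketch below is illustrative only: the function names, the example score vectors, and the value of the width parameter `gamma` are assumptions, since the kernel settings used in the study are not reported.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and y)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(samples, gamma):
    """Matrix of pairwise kernel values, which is what the SVM actually works with."""
    return [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]


# Hypothetical PCA scores of three samples on two components.
scores = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
K = gram_matrix(scores, gamma=0.5)
for row in K:
    print([round(value, 4) for value in row])
```

Identical samples give a kernel value of 1, and the value decays toward 0 as samples move apart. A larger `gamma` makes the decay faster, which lets the decision boundary follow the training data more closely at the risk of overfitting.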

The application of the PCA-SVM method on the two spectroscopic techniques UV-Visible and FT-MIR shows a good classification capacity of the five groups of olive oil according to their geographical origins. The percentage of correct classification calculated by the CCR coefficient reaches 100% and 97.56% for the training data using the two spectroscopic techniques UV-Visible and FT-MIR, respectively.

The evaluation of the predictive performance of these classification models shows a high predictive capacity of the five groups of virgin olive oil. This classification capacity was represented by the CCR coefficient which reaches 100% and 93.33% using UV-Visible and FT-MIR, respectively.
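The CCR values reported in this section can be cross-checked against the sample sizes (41 calibration samples and 15 validation samples). The sketch below infers, for each reported percentage, the number of correctly classified samples that would produce it. These counts are an inference from the percentages and are not stated in the paper; the function name is hypothetical.

```python
def implied_correct_count(ccr_percent, n_samples):
    """Integer number of correctly classified samples closest to the reported CCR."""
    return round(ccr_percent * n_samples / 100)


# (model, technique, data set, reported CCR in %, number of samples)
reported = [
    ("PCA-LDA", "UV-Visible", "calibration", 90.24, 41),
    ("PCA-LDA", "FT-MIR", "calibration", 85.37, 41),
    ("PCA-LDA", "UV-Visible", "validation", 86.67, 15),
    ("PCA-LDA", "FT-MIR", "validation", 86.67, 15),
    ("PCA-SVM", "UV-Visible", "calibration", 100.0, 41),
    ("PCA-SVM", "FT-MIR", "calibration", 97.56, 41),
    ("PCA-SVM", "UV-Visible", "validation", 100.0, 15),
    ("PCA-SVM", "FT-MIR", "validation", 93.33, 15),
]

checks = []
for model, technique, dataset, ccr, n in reported:
    k = implied_correct_count(ccr, n)
    recomputed = 100 * k / n
    checks.append((k, n, abs(recomputed - ccr)))
    print(f"{model:8s} {technique:10s} {dataset:11s} {k:2d}/{n} = {recomputed:.2f}% (reported {ccr}%)")
```

Every reported percentage matches a whole number of samples to within rounding, for example 35 of 41 for PCA-LDA on FT-MIR and 14 of 15 for the PCA-SVM validation on FT-MIR. With only 15 validation samples, a single misclassified sample changes the CCR by about 6.7 percentage points.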

The sensitivity and specificity of the training and validation datasets were calculated to evaluate the classification performance of these algorithms, and the results are presented in Tables 4 and 5.

The statistical parameters (CCR, specificity, and sensitivity) of the two classification methods, PCA-SVM and PCA-LDA, applied to the two spectroscopic techniques show an efficient classification of the five groups of virgin olive oils. These results also show that PCA-SVM gives better results than PCA-LDA, because PCA-SVM classifies the five groups with a higher percentage of correct classification on both the training data and the external validation.

These results can be compared with two studies that were performed to determine the geographical origin of Moroccan olive oils. The first one, in the region of Fes-Meknes, found a percentage of correct classification of 100% for the training data and 94.23% for the cross-validation using chromatography coupled with mass spectrometry [17], which is an efficient but expensive, laborious, and time-consuming method. The other study demonstrated 100% correct classification using a combination of electronic nose and electronic tongue coupled with SVM [32].

Compared to the other studies, this study has an important advantage because it allows differentiating between the five groups of olive oils using UV-Visible spectroscopy and ATR-FTMIR which are fast methods and do not require the use of reagents.

4. Conclusion

This work demonstrates the capability of UV-Visible and FT-MIR spectroscopy combined with PCA-LDA and PCA-SVM classification techniques for the rapid determination of the geographical origin of Moroccan virgin olive oils from the Beni Mellal-Khenifra region. This study was carried out on 56 samples of the Picholine virgin olive oil variety taken from five different geographical areas, 41 used for calibration and 15 for external validation. The application of the two methods to the UV-Visible and FT-MIR spectral data shows that the PCA-SVM method allows a better classification of the five olive oil groups, with higher sensitivity, specificity, and correct classification rate. In view of the results obtained by the external validation, this approach provides reliable information to predict the geographical origin of Moroccan olive oils. Finally, the proposed method is environmentally friendly, fast, and easy to use.

For a rapid and reliable process of assessment and authentication of virgin olive oils on the basis of their geographical origin, the development of robust spectral databases is encouraged as far as possible.

Data Availability

The data are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors would like to express their gratitude to all of the people who helped in achieving this work. Thanks are due to National Centre for Scientific and Technical Research (CNRST), Morocco.