Original Research Article
Untargeted identification of adulterated Sanqi powder by near-infrared spectroscopy and one-class model

https://doi.org/10.1016/j.jfca.2020.103450

Highlights

  • A fast NIR-based method for analyzing Sanqi powder.

  • A one-class model is used for untargeted identification.

  • Such a strategy has potential for product quality supervision.

Abstract

Sanqi is a widely used traditional Chinese medicine (TCM) valued for its outstanding efficacy. In the Chinese market, Sanqi powder has long been a target of counterfeiting. Investigating Sanqi authenticity is therefore very important in both economic and public health terms. The present work studies the feasibility of combining near-infrared (NIR) spectroscopy with Relief-based variable selection and class-modeling for identifying adulterated Sanqi powder. A total of 209 samples, including pure and mixed samples, were prepared. Principal component analysis (PCA) was applied for exploratory analysis. The Relief algorithm was used to rank all variables, and only the 100 most informative variables were retained for subsequent class-modeling. By optimizing parameters such as the number of components and the type I and type II errors, the final one-class models were constructed on the training set and evaluated on the test set. Such a procedure is simple and more in line with practical needs. The performance of the models is acceptable. The results indicate that NIR spectroscopy combined with class-modeling and Relief-based variable selection is feasible for identifying the adulteration of Sanqi powder.

Introduction

Notoginseng, Sanqi in Chinese, is a widely used traditional Chinese medicine (TCM) known for its efficacy in relieving swelling, alleviating pain, inducing blood clotting, promoting blood circulation and removing blood stasis (Liu et al., 2003; Ng, 2006). Saponins are the main bioactive compounds of Sanqi, which also contains other constituents such as amino acids, flavonoids and polysaccharides (Wan et al., 2008; Liu et al., 2007). Sanqi is characterized as a functional food in China, and Sanqi powder is sold as an over-the-counter dietary supplement in the health food market. Some dietary forms of Sanqi are easily made at home. For example, Sanqi wine can be prepared by mixing Sanqi and Chinese spirits, which can tonify qi, soothe the nerves and promote blood circulation. Sanqi powder enjoys an excellent reputation as a health food and is also of great economic importance. However, because of rising demand and a shortage of resources, the price of Sanqi keeps rising. Driven by profit, Sanqi powder has long been a target of counterfeiting in China. Sanqi powder has been found to be adulterated with corn flour and Sophora flavescens powder because of their similar appearance and taste and much lower cost (Nie et al., 2013). These adulterants have no curative effect and may even be hazardous to human health. Analytical methods for verifying Sanqi authenticity are therefore very important in both economic and public health terms.

Precisely because of the similar appearance, it is difficult for consumers to distinguish adulterated from pure Sanqi powder by the naked eye alone (Wu et al., 2011). At present, the identification of TCM including Sanqi, especially in powder form, is very challenging: it follows complex empirical rules and depends mainly on the expert’s subjective senses (Guo et al., 2011). Modern instrumental methods such as capillary electrophoresis (CE) (Huck et al., 2005), high performance liquid chromatography (HPLC) (Cai et al., 2016) and gas chromatography coupled with mass spectrometry (GC–MS) (Liu et al., 2017) have been used for this task, but they are often laborious and time-consuming, which makes rapid analysis difficult.

Nowadays, spectroscopy-based techniques offer many advantages, and their combination with chemometric techniques has proved to be a powerful tool for adulteration testing (Ferreiro-González et al., 2018; Zhang and Xue, 2016; Moncayo et al., 2017; Chen et al., 2018; Isabel López et al., 2014). Examples using ultraviolet (UV) (Di Anibal et al., 2009), mid-infrared (MIR) (Zhang et al., 2012), Raman (López-Díez et al., 2003), fluorescence (Kunz et al., 2011) and nuclear magnetic resonance (NMR) (Consonni et al., 2012) techniques can be found in the literature. Among these techniques, the most widely used is perhaps near-infrared (NIR) spectroscopy, which has led to many successful applications in the field of adulteration testing (Chen et al., 2017). Note that NIR spectral features arise from molecular absorptions of the overtones and combination bands that originate from fundamental vibrational bands generally found in the MIR region. The most often observed bands in the NIR spectrum include the combination bands and the first, second, or third overtones of O–H, N–H, and C–H fundamentals (Porep et al., 2015). Precise band assignments are very difficult in the NIR region because a single band may be attributable to several possible combinations of fundamental and overtone vibrations, all severely overlapped. A NIR spectrum contains information on bond strengths, chemical species, hydrogen bonding, etc. For solid samples, in addition to the complex composition, physical information such as scattering and reflection is superimposed on the NIR spectrum. Since a NIR spectrum is characterized by overlapping, broad and weak peaks or bands, chemometrics is indispensable for extracting information and building prediction models, whose performance is a determining factor. Also, since NIR spectra often involve hundreds or even thousands of variables with a relatively small number of samples, variable selection is necessary in most cases.

As far as identifying the authenticity of a sample is concerned, the task is essentially one of pattern recognition. Both classification and class-modeling can be applied, but they provide very distinct solutions (Forina et al., 2008). Classification is more popular, perhaps because much software is available. Typically, a classification algorithm determines to which class, among several predefined classes, a sample most probably belongs. It works by seeking a delimiter that partitions the sample space into subspaces, one for each class. A sample is always assigned to one of the predefined classes, whether or not it actually comes from one of the classes represented in the training set. The delimiter can be of varying complexity, and different algorithms can be defined according to its nature. Classification is appropriate only when all classes are clearly defined (Oliveri, 2017; Irigoien et al., 2014), and at least two classes must be defined in order to seek the delimiter between them. For authenticity detection, the first (target) class contains the authentic samples and the second (non-target) class contains the adulterated samples. However, there are two difficulties. Firstly, the second class is not really a meaningful class, so seeking a boundary between the two classes is not well founded. Secondly, the classification rule, represented geometrically by the delimiter, depends on the samples of both classes. In contrast, class-modeling focuses on the question of whether or not a sample comes from a given target class. More specifically, is a sample claimed to be from class A really from class A? A basic concept of class-modeling is that it builds a description of the target class rather than seeking a boundary between classes (Brereton, 2011). Often, this results in a closed boundary surrounding the target class, so that a new sample can be identified according to whether it lies inside or outside the boundary. Thus, class-modeling is more suitable than classification for authenticity detection, although it is less widely known.
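
The idea of describing only the target class and accepting or rejecting new samples against a closed boundary can be illustrated with a minimal sketch. It is not the model used in this work: it assumes a generic PCA-based one-class description, in which a sample is accepted when both its score distance (T²) and its residual distance (Q) fall below empirical quantiles computed on the target class alone. The class name OneClassPCA, the parameter names and the 0.95 quantile (a nominal type I error of about 5% per criterion) are all illustrative assumptions.

```python
import numpy as np


class OneClassPCA:
    """Illustrative one-class model: PCA description of the target class only."""

    def __init__(self, n_components=2, quantile=0.95):
        self.n_components = n_components  # number of principal components kept
        self.quantile = quantile          # acceptance quantile for each distance

    def _distances(self, X):
        Xc = np.asarray(X, dtype=float) - self.mean_
        scores = Xc @ self.P_
        # Score distance inside the PCA subspace (Hotelling-type T^2).
        t2 = np.sum(scores**2 / self.var_, axis=1)
        # Squared residual distance to the PCA subspace (Q).
        resid = Xc - scores @ self.P_.T
        q = np.sum(resid**2, axis=1)
        return t2, q

    def fit(self, X_target):
        # Only target-class (authentic) samples are used to build the model.
        X = np.asarray(X_target, dtype=float)
        self.mean_ = X.mean(axis=0)
        _, s, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        k = self.n_components
        self.P_ = Vt[:k].T
        self.var_ = s[:k] ** 2 / (X.shape[0] - 1)
        t2, q = self._distances(X)
        # Empirical limits define the closed boundary around the target class.
        self.t2_lim_ = np.quantile(t2, self.quantile)
        self.q_lim_ = np.quantile(q, self.quantile)
        return self

    def predict(self, X):
        # True means "accepted as target class", False means "rejected".
        t2, q = self._distances(X)
        return (t2 <= self.t2_lim_) & (q <= self.q_lim_)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    loadings = rng.normal(size=(2, 10))
    # Synthetic stand-ins for spectra; not real NIR data.
    authentic = rng.normal(size=(80, 2)) @ loadings + 0.05 * rng.normal(size=(80, 10))
    suspect = authentic[:5] + 5.0  # shifted samples playing the role of adulterated ones
    model = OneClassPCA(n_components=2).fit(authentic)
    print("accepted suspects:", int(model.predict(suspect).sum()))
```

Because the limits are derived from the target class alone, no adulterated samples are needed to build the model; they are only needed afterwards to estimate the type II error.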

As far as Sanqi is concerned, there are several examples of applying spectroscopy for analytical purposes. Lu et al. (2008) used two-dimensional correlation IR (2DCOS-MIR) spectroscopy to identify Asian ginseng, American ginseng and Notoginseng. Ma et al. (2016) used MIR and 2DCOS-MIR to identify the grade of Sanqi. Liu et al. (2019) used NIR and several algorithms to identify and quantify the adulterants of Sanqi. Yang et al. (2018) used MIR and support vector machine (SVM) to discriminate the grade of Sanqi powder. Even so, to the best of our knowledge, class-modeling has not yet been applied to adulteration detection of Sanqi powder.

The present work studies the feasibility of combining NIR spectroscopy with Relief-based variable selection and class-modeling for identifying adulterated Sanqi powder. A total of 209 samples were prepared. Principal component analysis (PCA) was applied for exploratory analysis. The Relief algorithm was used to reduce the number of variables. A class-modeling technique was used to construct the one-class model. The results indicate that NIR spectroscopy combined with class-modeling and Relief-based variable selection is feasible for identifying the adulteration of Sanqi powder.

Relief-based feature selection

The Relief algorithm is a classical filter algorithm for feature/variable selection (Deng et al., 2010). It assigns a “relevance” weight to each feature. Relief is a randomized algorithm: it draws samples at random from the training set and updates the weight values based on the differences between each selected sample and its two nearest neighbors, one from the same class and one from the opposite class (the “near-hit” and “near-miss”). So, the weight of a given feature indicates how well it helps to
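
The weight update described above can be sketched in a few lines of NumPy. This is a minimal illustration of the basic two-class Relief scheme under stated assumptions, not the implementation used in this work: the function names relief_weights and top_k_variables, the min-max scaling, the Manhattan distance and the number of iterations are all illustrative choices.

```python
import numpy as np


def relief_weights(X, y, n_iter=100, seed=0):
    """Basic two-class Relief: one relevance weight per variable."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = X.shape

    # Min-max scale each variable so that all differences lie in [0, 1].
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span

    w = np.zeros(n_features)
    for _ in range(n_iter):
        i = rng.integers(n_samples)             # randomly drawn sample
        dist = np.abs(Xs - Xs[i]).sum(axis=1)   # Manhattan distance to all samples
        dist[i] = np.inf                        # exclude the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # near-hit (same class)
        miss = np.argmin(np.where(~same, dist, np.inf))  # near-miss (other class)
        # Reward variables that differ from the near-miss,
        # penalize variables that differ from the near-hit.
        w += (np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])) / n_iter
    return w


def top_k_variables(w, k):
    """Indices of the k variables with the largest Relief weights."""
    return np.argsort(w)[::-1][:k]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = np.repeat([0, 1], 30)
    X = rng.normal(size=(60, 5))   # synthetic data, not real spectra
    X[:, 0] += 5.0 * y             # only variable 0 separates the classes
    w = relief_weights(X, y, n_iter=200)
    print("selected variables:", top_k_variables(w, 2))
```

Ranking the weights and keeping the first k indices corresponds to the step in which only the most informative variables (100 in the abstract) are passed on to class-modeling.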

Sample preparation

A total of 209 samples were prepared in powder form. Sanqi specimens were collected from four provinces of China, i.e., Yunnan, Guizhou, Xizang and Guangxi (Sichuan Shengjie Pharmaceutical Co., Ltd, Yibin, China). Those originating from Yunnan have the highest market share and are usually the target of adulteration. Two main adulterants including Sophora flavescens powder (Tiantiankang Pharmaceutical Co., Ltd, Yibin, China) and corn flour (Henan Xinxiang Liangrunquan Grain Food Co., Ltd, Xinxiang,

Preliminary spectral analysis

The NIR spectra of all experimental samples are shown in Fig. 1. Raw spectra of samples contain background information and noise in addition to the expected sample information. It is difficult to find differences in the raw NIR spectra because of the overlapping of spectral bands. Also, these spectra display apparent baseline drift originating from differences in particle size, which is harmful for spectral interpretation and subsequent modeling of NIR diffuse reflectance spectra. It results in

Conclusions

The overall results demonstrate that NIR spectroscopy coupled with class-modeling can successfully identify the adulteration of Sanqi powder. This procedure is a promising alternative to laborious, time-consuming and expensive chemical analysis methods. In the framework of class-modeling, the model can be built for the target class without in-depth information on other classes or samples. All optimization of model parameters and validation are based on only using the information from the target

CRediT authorship contribution statement

Hui Chen: Validation, Writing - original draft, Methodology. Chao Tan: Conceptualization, Funding acquisition, Writing - review & editing. Hongjin Li: Resources, Investigation.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (21375118, J1310041), Scientific Research Foundation of Sichuan Provincial Education Department of China (17TD0048), Scientific Research Foundation of Yibin University (2017ZD05), Sichuan Science and Technology Program of China (2018JY0504) and Opening Fund of Key Lab of Process Analysis and Control of Sichuan Universities of China (2018005).
