The model updating based on near infrared spectroscopy for the sex identification of silkworm pupae from different varieties by a semi-supervised learning with pre-labeling method
Spectroscopy Letters ( IF 1.7 ) Pub Date : 2019-11-08 , DOI: 10.1080/00387010.2019.1681463
Xie Lin 1 , Yang Zhuang 1 , Tao Dan 1 , Li Guanglin 1 , Yang Xiaodong 1 , Song Jie 1 , Liu Xuwen 1

Abstract Near infrared spectroscopy can accurately discriminate the sex of silkworm pupae of the same variety. However, when the model is extended to classify new varieties of silkworm pupae, its performance degrades because of changes in cultivation environment and variety. To improve the generalization ability and accuracy of the model, this paper proposes a model updating strategy based on semi-supervised learning. First, a support vector machine identification model was built after the original spectra were pretreated by Savitzky-Golay convolution smoothing, which effectively reduces spectral noise. Then, the support vector machine model pre-labeled the unlabeled silkworm pupae in the updated set, dividing them into male and female samples. Based on similarity measured by the Pearson correlation coefficient or the Euclidean distance, a total of 8 reliable samples were selected from the male and female samples, respectively. The reliable samples were added to the original training set to update the original model. Finally, the updated model was evaluated on test sets from the same varieties of silkworm pupae as the updated sets. The results showed that the accuracy of the non-updated model on silkworm pupae from the three new varieties was only 54.55%, 68.52%, and 86.84%, respectively. The support vector machine model updated using the Pearson correlation coefficient improved the accuracy to 100%, 96.30%, and 97.37%, and the model updated using the Euclidean distance increased the identification accuracy for the three varieties not involved in the modeling to 100%, 75.93%, and 92.10%, respectively. The model updated using the Pearson correlation coefficient therefore performed better than the one updated using the Euclidean distance. These results indicate that the method based on semi-supervised learning can effectively address the poor universality of the sex identification model.
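The reliable-sample selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that each pre-labeled spectrum is compared with the mean spectrum of the same class in the original training set, which the abstract does not specify. The function names (`pearson`, `mean_spectrum`, `select_reliable`) and the toy spectra are hypothetical. The Savitzky-Golay smoothing and support vector machine pre-labeling steps are omitted.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length spectra."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_spectrum(spectra):
    """Point-by-point mean of a list of spectra (assumed class reference)."""
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def select_reliable(candidates, reference, k, metric="pearson"):
    """Return the indices of the k candidates most similar to the reference.

    For "pearson", a higher coefficient means more similar.
    For "euclidean", a smaller distance means more similar.
    """
    indices = range(len(candidates))
    if metric == "pearson":
        ranked = sorted(indices,
                        key=lambda i: pearson(candidates[i], reference),
                        reverse=True)
    elif metric == "euclidean":
        ranked = sorted(indices,
                        key=lambda i: math.dist(candidates[i], reference))
    else:
        raise ValueError(f"unknown metric: {metric}")
    return ranked[:k]


# Toy data (made up for illustration): labeled male spectra from the
# original training set, and new-variety spectra pre-labeled as male.
train_male = [[1.0, 2.0, 3.0, 4.0], [1.2, 2.1, 3.1, 4.2]]
prelabeled_male = [
    [2.0, 4.0, 6.0, 8.0],  # same shape as the reference, larger intensity
    [1.0, 2.2, 2.9, 4.0],  # close to the reference in absolute values
    [4.0, 3.0, 2.0, 1.0],  # opposite trend
]

reference = mean_spectrum(train_male)
by_pearson = select_reliable(prelabeled_male, reference, k=1, metric="pearson")
by_euclid = select_reliable(prelabeled_male, reference, k=1, metric="euclidean")

# The selected samples would then be added, with their pre-labels,
# to the original training set before refitting the model.
updated_train_male = train_male + [prelabeled_male[i] for i in by_pearson]
```

In this toy case the two criteria pick different samples. The Pearson coefficient ignores overall intensity scale and rewards matching spectral shape, while the Euclidean distance penalizes any offset in absolute values. That difference is one plausible reason the two update strategies could yield different accuracies, although the abstract itself does not give an explanation.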

Updated: 2019-11-08