Deep Learning Model for Pathogen Classification Using Feature Fusion and Data Augmentation
Current Bioinformatics (IF 4), Pub Date: 2021-02-28, DOI: 10.2174/1574893615999200707143535
Fareed Ahmad 1 , Amjad Farooq 1 , Muhammad Usman Ghani Khan 1

Background: Bacterial pathogens are deadly to animals and humans. The ease with which they spread, coupled with their high capacity to cause illness and death in infected individuals, makes them a threat to society.

Objective: Because genera and species of pathogens are highly similar, microbiologists sometimes find it difficult to differentiate between them. Automatic classification using deep-learning models can help produce reliable and accurate outcomes.

Methods: Four deep-learning models (AlexNet, GoogleNet, ResNet101, and InceptionV3) are evaluated in several configurations: training from scratch, fine-tuning without pre-trained weights, fine-tuning with the weights of the initial layers frozen, fine-tuning with the weights of all layers adjusted, and augmenting the dataset by random translation and reflection. Because the dataset is small, fine-tuning and data augmentation are applied to avoid overfitting and produce a generalized model. The features of the two best-performing models are merged into a single feature vector, and classification accuracy on that vector is computed with the xgboost algorithm under cross-validation.
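
As an illustration of augmentation by random translation and reflection, the following is a minimal sketch in plain Python. It is not the authors' code: the abstract does not state the shift range, the fill value for pixels shifted in from outside the frame, or the reflection axis, so `max_shift`, the zero fill, and the horizontal flip are all assumptions, as are the function names.

```python
import random


def translate(image, dy, dx, fill=0):
    """Shift a 2-D image (list of rows) down by dy and right by dx.

    Pixels that move in from outside the frame are set to `fill`
    (zero fill is an assumption, not taken from the paper).
    """
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                out[y][x] = image[src_y][src_x]
    return out


def reflect_horizontal(image):
    """Mirror a 2-D image left to right."""
    return [row[::-1] for row in image]


def augment(image, max_shift=2, rng=None):
    """Apply a random translation, then a horizontal flip with probability 0.5.

    `max_shift` is a hypothetical parameter chosen for illustration.
    """
    rng = rng or random.Random()
    dy = rng.randint(-max_shift, max_shift)
    dx = rng.randint(-max_shift, max_shift)
    out = translate(image, dy, dx)
    if rng.random() < 0.5:
        out = reflect_horizontal(out)
    return out
```

Calling `augment` once per training image per epoch yields a slightly different copy each time while leaving the image size unchanged, which is the property that lets augmentation enlarge a small dataset without changing the network input shape.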

Results: Fine-tuned models trained with augmentation produce the best results. Of these, the two best-performing deep models, ResNet101 and InceptionV3, were selected for feature fusion. Both reached a validation accuracy of 95.83%, with losses of 0.0213 and 0.1066 and testing accuracies of 97.92% and 93.75%, respectively. The proposed model, which applies xgboost to the fused features, attains a classification accuracy of 98.17% under 35-fold cross-validation.
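
The fusion and cross-validation steps can be sketched as follows, in plain Python. This is an illustration under assumptions rather than the authors' pipeline: fusion is shown as simple concatenation of per-sample feature vectors, the folds are shuffled but not stratified, and the classifier itself (xgboost in the paper) is left out. The function names and the `seed` parameter are hypothetical.

```python
import random


def fuse_features(features_a, features_b):
    """Concatenate the per-sample feature vectors produced by two models."""
    if len(features_a) != len(features_b):
        raise ValueError("both models must describe the same samples")
    return [list(a) + list(b) for a, b in zip(features_a, features_b)]


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Each sample appears in exactly one test fold. Shuffling with a fixed
    seed is an assumption made here for reproducibility.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx
```

With `k=35`, a classifier would be trained 35 times, each time on 34 folds of fused vectors and scored on the remaining fold, and the reported accuracy would be the mean over the folds.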

Conclusion: The automatic classification using these models can help experts in the correct identification of pathogens. Consequently, they can help in controlling epidemics and thereby minimizing the socio-economic impact on the community.




Updated: 2021-02-28