A Study on Host Tropism Determinants of Influenza Virus Using Machine Learning
Current Bioinformatics (IF 2.4), Pub Date: 2020-01-31, DOI: 10.2174/1574893614666191104160927
Eunmi Kwon 1, Myeongji Cho 1, Hayeon Kim 2, Hyeon S. Son 1

Background: The host tropism determinants of influenza virus, which cause changes in the host range and increase the likelihood of interaction with specific hosts, are critical for understanding the infection and propagation of the virus in diverse host species.

Methods: Six types of protein sequences of influenza viral strains isolated from three classes of hosts (avian, human, and swine) were obtained. Random forest, naïve Bayes, and k-nearest neighbor algorithms were used for host classification. Sequence analysis and the identification of host-specific position markers were implemented in Java.
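The abstract does not say how the position markers were identified, so the following is only a minimal Java sketch of one plausible reading: a position counts as a marker when its residue is conserved within every host group but the conserved residue differs between at least two groups. The class name `PositionMarkerSketch`, the `minShare` threshold, and the toy aligned sequences are all hypothetical and are not taken from the study.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the authors' implementation.
class PositionMarkerSketch {

    /**
     * Returns the most frequent residue in one alignment column of a host group,
     * or '?' when that residue's share of the group is below minShare.
     * Assumes all sequences are aligned and at least pos + 1 characters long.
     */
    static char conservedResidue(List<String> seqs, int pos, double minShare) {
        Map<Character, Integer> counts = new HashMap<>();
        for (String s : seqs) {
            counts.merge(s.charAt(pos), 1, Integer::sum);
        }
        char best = '?';
        int bestCount = 0;
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return bestCount >= minShare * seqs.size() ? best : '?';
    }

    /**
     * Returns the 0-based positions that are conserved within every host group
     * but whose conserved residue differs between at least two groups.
     */
    static List<Integer> findMarkers(Map<String, List<String>> byHost, int length, double minShare) {
        List<Integer> markers = new ArrayList<>();
        for (int pos = 0; pos < length; pos++) {
            boolean conservedEverywhere = true;
            boolean differs = false;
            char first = 0; // 0 means "no group seen yet"
            for (List<String> seqs : byHost.values()) {
                char c = conservedResidue(seqs, pos, minShare);
                if (c == '?') {
                    conservedEverywhere = false;
                    break;
                }
                if (first == 0) {
                    first = c;
                } else if (c != first) {
                    differs = true;
                }
            }
            if (conservedEverywhere && differs) {
                markers.add(pos);
            }
        }
        return markers;
    }

    public static void main(String[] args) {
        // Made-up five-residue alignments, for illustration only.
        Map<String, List<String>> byHost = new LinkedHashMap<>();
        byHost.put("avian", List.of("MKTQA", "MKTQA", "MKTEA"));
        byHost.put("human", List.of("MKSQA", "MKSQG", "MKSQA"));
        byHost.put("swine", List.of("MKTQA", "MKTQA", "MKTQA"));
        System.out.println(findMarkers(byHost, 5, 1.0)); // prints [2]
    }
}
```

In the toy data only index 2 qualifies: it is T in every avian and swine sequence and S in every human sequence. Lowering `minShare` tolerates some within-host variation. A real pipeline would also need to decide how to treat alignment gaps, which this sketch counts like any other character.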

Results: A machine learning technique was explored to derive the physicochemical properties of amino acids used in host classification and prediction. The HA protein was found to play the most important role in determining the host tropism of the influenza virus, and the random forest method yielded the highest accuracy in host prediction. Conserved amino acids that exhibited host-specific differences were also selected and verified, and they were found to be useful position markers for host classification. Finally, ANOVA and post-hoc testing revealed that the physicochemical properties of the amino acids making up the protein sequences, combined with the position markers, differed significantly among hosts.
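To make the ANOVA step concrete, here is a minimal Java sketch of the one-way ANOVA F-statistic, computed over one numeric property value per sequence and grouped by host. The abstract does not specify which properties, software, or post-hoc test were used, so the class name `OneWayAnovaSketch` and the numbers below are purely hypothetical.

```java
// Illustrative sketch only; not the authors' implementation.
class OneWayAnovaSketch {

    /**
     * One-way ANOVA F-statistic: between-group mean square divided by
     * within-group mean square. Assumes at least two non-empty groups
     * and more observations than groups.
     */
    static double fStatistic(double[][] groups) {
        int k = groups.length; // number of groups (hosts)
        int n = 0;             // total number of observations
        double total = 0.0;
        for (double[] g : groups) {
            for (double v : g) {
                total += v;
                n++;
            }
        }
        double grandMean = total / n;

        double ssBetween = 0.0;
        double ssWithin = 0.0;
        for (double[] g : groups) {
            double sum = 0.0;
            for (double v : g) {
                sum += v;
            }
            double mean = sum / g.length;
            ssBetween += g.length * (mean - grandMean) * (mean - grandMean);
            for (double v : g) {
                ssWithin += (v - mean) * (v - mean);
            }
        }
        double msBetween = ssBetween / (k - 1);
        double msWithin = ssWithin / (n - k);
        return msBetween / msWithin;
    }

    public static void main(String[] args) {
        // Made-up property values for three host groups, for illustration only.
        double[][] groups = {
            {1.0, 2.0, 3.0}, // "avian"
            {2.0, 3.0, 4.0}, // "human"
            {5.0, 6.0, 7.0}  // "swine"
        };
        System.out.println(fStatistic(groups));
    }
}
```

For the toy values the between-group sum of squares is 26 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, so F is 13 up to floating-point rounding. Turning F into a p-value requires the F distribution, and a post-hoc test is still needed to tell which hosts differ. Neither is covered here.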

Conclusion: The host tropism determinants and position markers described in this study can be used in related research to classify, identify, and predict the hosts that are currently susceptible to influenza viruses or likely to be infected in the future.


