Genomic Island Prediction via Chi-Square Test and Random Forest Algorithm
Computational and Mathematical Methods in Medicine (IF 2.809), Pub Date: 2021-05-25, DOI: 10.1155/2021/9969751
Mbulayi Onesime 1 , Zhenyu Yang 1 , Qi Dai 1

Genomic islands are related to microbial adaptation and carry genomic characteristics that differ from those of the host. Many methods have therefore been proposed to distinguish genomic islands from the rest of the genome by evaluating their sequence composition. Many sequence features have been proposed, but a large share of them have not yet been applied to the identification of genomic islands. In this paper, we present a scheme to predict genomic islands using the chi-square test and the random forest algorithm. We extract seven kinds of sequence features and select the important ones with the chi-square test. The selected features are then input into a random forest to predict genomic islands. Three experiments and comparisons show that the proposed method achieves the best performance among the methods compared. This understanding may be useful for designing more powerful methods for genomic island prediction.
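
The feature-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the seven feature types, so k-mer counts are assumed here as one plausible sequence feature, and the toy windows, the labels, and the function names `kmer_counts`, `chi2_scores`, and `select_top` are all hypothetical. The chi-square statistic compares the observed per-class total of each count feature with the total expected if the feature were independent of the class.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in a DNA string, in a fixed A/C/G/T order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts[m] for m in kmers]


def chi2_scores(X, y):
    """Chi-square statistic of each non-negative count feature against the labels."""
    classes = sorted(set(y))
    n = len(y)
    scores = []
    for j in range(len(X[0])):
        feature_total = sum(row[j] for row in X)
        score = 0.0
        for c in classes:
            # Observed: total count of feature j within class c.
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            # Expected: the feature total split in proportion to class size.
            expected = feature_total * sum(1 for label in y if label == c) / n
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top(scores, n_keep):
    """Return the indices of the n_keep highest-scoring features, in index order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])


# Hypothetical toy data: 1 = island-like window, 0 = host-like window.
windows = ["ATATATATAT", "AATTAATTAA", "GCGCGCGCGC", "GGCCGGCCGG"]
labels = [1, 1, 0, 0]

X = [kmer_counts(w, k=2) for w in windows]
scores = chi2_scores(X, labels)
selected = select_top(scores, n_keep=4)
X_selected = [[row[j] for j in selected] for row in X]
```

The reduced matrix `X_selected` would then be passed to a random forest classifier, for example scikit-learn's `RandomForestClassifier`; that step is omitted here to keep the sketch free of dependencies.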

Updated: 2021-05-25