Applied Soft Computing (IF 5.472), Pub Date: 2021-02-20, DOI: 10.1016/j.asoc.2021.107197, Mohamed Issa
COVID-19 is a global pandemic that has spurred scientists to look for ways to prevent it and to design drugs against it. Low-cost, intelligent tools for biological data analysis are therefore important for analyzing the biological structure of COVID-19. Global alignment is one of the key bioinformatics tools: it gives the most accurate similarity measure between a pair of biological sequences. The main limitation of the standard global alignment algorithm is its large time consumption, especially for very long sequences. This work proposes a fast global alignment tool (G-Aligner) based on meta-heuristic algorithms, which estimates similarity measurements close to the exact ones in reasonable time and at low cost. With very long sequences, G-Aligner based on the standard Sine-Cosine optimization algorithm (SCA) becomes trapped in local minima. An improved version of SCA, based on integration with PSO, is therefore presented. In addition, mutation and opposition operators are applied to enhance exploration and to avoid trapping in local minima. The performance of the improved SCA algorithm (SP-MO) was evaluated on a set of IEEE CEC functions. G-Aligner based on the SP-MO algorithm was also tested on measuring the similarity of real biological sequences, and it was used to measure the similarity of the COVID-19 virus to 13 other viruses to validate its performance. The tests indicate that the SP-MO algorithm outperforms the related algorithms in the literature and produces the highest average similarity measurements, 75% of the exact ones.
Title: A fast COVID-19 similarity measurement tool based on a consolidated SCA algorithm with mutation and opposition operators
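For context on the cost the abstract refers to, the exact similarity that G-Aligner approximates is usually computed with Needleman-Wunsch dynamic programming, which fills a table with one cell per pair of positions and so takes time proportional to the product of the two sequence lengths. The sketch below is a minimal illustration of that standard baseline, not the paper's G-Aligner or SP-MO method; the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty).

    The scoring values are illustrative assumptions, not taken from the paper.
    Time is O(len(a) * len(b)); memory is O(len(b)) because only two rows
    of the dynamic-programming table are kept.
    """
    # Row 0: aligning the empty prefix of `a` with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of `b` costs i gaps.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap, or b[j-1] with a gap.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    print(global_alignment_score("GCATGCG", "GATTACA"))
```

Each cell is touched once, so the running time grows with the product of the two lengths; for genome-length sequences this is the time consumption the abstract cites as the motivation for a meta-heuristic estimate.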