COSMO: A dynamic programming algorithm for multicriteria codon optimization.
Computational and Structural Biotechnology Journal ( IF 6 ) Pub Date : 2020-06-30 , DOI: 10.1016/j.csbj.2020.06.035
Akito Taneda 1 , Kiyoshi Asai 2

Codon optimization of protein-coding sequences (CDSs) is a widely used technique for promoting the heterologous expression of target genes. Codon optimization explores the combinatorial space of nucleotide sequences that encode a given amino acid sequence, while respecting user-prescribed forbidden sequence motifs, in order to optimize multiple criteria. Although evolutionary algorithms have been used to tackle such complex codon optimization problems, evolutionary codon optimization tools cannot guarantee that they find the optimal solutions to these multicriteria problems.
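To make the size of this search space concrete, the following minimal sketch enumerates every coding sequence for a four-residue peptide and filters out those containing a forbidden motif. The function name `enumerate_cds`, the example peptide, and the choice of the HindIII recognition site `AAGCTT` as the forbidden motif are illustrative assumptions and are not taken from the paper. Only the codons needed for the example are listed.

```python
from itertools import product

# Partial standard genetic code: only the amino acids used in this example.
SYNONYMOUS_CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "F": ["TTT", "TTC"],
}

def enumerate_cds(protein, forbidden_motifs=()):
    """Yield every coding sequence for `protein` that contains no forbidden motif."""
    choices = [SYNONYMOUS_CODONS[aa] for aa in protein]
    for codons in product(*choices):
        seq = "".join(codons)
        if not any(motif in seq for motif in forbidden_motifs):
            yield seq

# 1 * 2 * 6 * 2 synonymous encodings of the peptide MKLF.
all_seqs = list(enumerate_cds("MKLF"))
# Hypothetical constraint: exclude the HindIII site AAGCTT.
allowed = list(enumerate_cds("MKLF", forbidden_motifs=("AAGCTT",)))
print(len(all_seqs), len(allowed))
```

The number of candidates is the product of the per-residue codon counts, so it grows exponentially with protein length. Brute-force enumeration like this is only feasible for very short peptides.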

We have developed a novel multicriteria dynamic programming algorithm, COSMO. With this algorithm, we can obtain all Pareto-optimal solutions for multiple features of a CDS, including codon usage, codon context, and the number of hidden stop codons. User-prescribed forbidden sequence motifs are rigorously excluded from the Pareto-optimal solutions. To accelerate CDS design by COSMO, we introduced constraints that reduce, in a branch-and-bound manner, the number of Pareto-optimal solutions to be processed. We benchmarked COSMO for run-time and the number of generated solutions by adapting selected human genes to yeast codon usage frequencies, and found that the constraints effectively reduce the run-time. We also benchmarked a multi-objective genetic algorithm (MOGA) for CDS design on the same two aspects and compared its performance with that of COSMO. In this comparison, (i) MOGA identified significantly fewer Pareto-optimal solutions than COSMO, and (ii) the MOGA solutions did not achieve the same mean hypervolume values as those provided by COSMO. These results suggest that generating the complete set of Pareto-optimal solutions for codon optimization problems is a difficult task for MOGA.
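The general idea of a multicriteria dynamic program can be sketched as follows. This is not the COSMO implementation, only a toy illustration under several assumptions: the usage scores in `USAGE` are invented integers rather than real yeast frequencies, only two objectives are used (total usage score and number of hidden stop codons), both are treated as quantities to maximize (the abstract does not state the direction for hidden stop codons), forbidden motifs are limited to length 4 so that the last codon is a sufficient DP state, and only one representative sequence is kept per objective vector. The branch-and-bound constraints and codon context are omitted.

```python
STOPS = {"TAA", "TAG", "TGA"}

# Hypothetical integer usage scores (higher = preferred); not real yeast data.
USAGE = {
    "M": {"ATG": 0},
    "L": {"TTA": 2, "TTG": 3, "CTG": 1},
    "I": {"ATC": 3, "ATT": 2, "ATA": 1},
    "D": {"GAT": 3, "GAC": 2},
}

def junction_stops(prev, cur):
    """Count stop codons in the +1 and +2 frames across a codon junction."""
    return sum(t in STOPS for t in (prev[1:] + cur[0], prev[2] + cur[:2]))

def pareto_filter(items):
    """Keep (score, seq) items whose score is not dominated (all maximized)."""
    front = []
    # In descending order, any dominating item is seen before the dominated one.
    for score, seq in sorted(items, reverse=True):
        if not any(all(f >= s for f, s in zip(fs, score)) for fs, _ in front):
            front.append((score, seq))
    return front

def pareto_dp(protein, forbidden=()):
    """Return the Pareto front of (usage, hidden_stops) over coding sequences.

    Assumes every forbidden motif has length <= 4, so a motif spans at most
    two adjacent codons and the last codon is a sufficient DP state.
    """
    assert all(len(m) <= 4 for m in forbidden)
    # fronts[c]: non-dominated partial solutions that end in codon c
    fronts = {c: [((w, 0), c)] for c, w in USAGE[protein[0]].items()
              if not any(m in c for m in forbidden)}
    for aa in protein[1:]:
        nxt = {}
        for cur, w in USAGE[aa].items():
            cands = []
            for prev, front in fronts.items():
                if any(m in prev + cur for m in forbidden):
                    continue  # this junction would create a forbidden motif
                h = junction_stops(prev, cur)
                cands += [((u + w, n + h), seq + cur) for (u, n), seq in front]
            if cands:
                nxt[cur] = pareto_filter(cands)
        fronts = nxt
    return pareto_filter([it for front in fronts.values() for it in front])

for score, seq in pareto_dp("MLID"):
    print(score, seq)
```

Pruning dominated partial solutions per end codon is safe here because both objectives are additive and every future contribution depends only on the last codon chosen. A genetic algorithm, by contrast, samples the space and offers no such completeness guarantee.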




Updated: 2020-06-30