LP-based heuristics for the distinguishing string and substring selection problems

Torres, Jean P. Tremeschin; Hoshino, Edna A.

doi:10.1007/s10479-021-04138-5

S.I.: CLAIO 2018
Published: 04 June 2021

Volume 316, pages 1205–1234, (2022)
Annals of Operations Research

Abstract

This work proposes and evaluates matheuristics for the Distinguishing String Selection Problem (DSSP) and the Distinguishing Substring Selection Problem (DSSSP). Heuristics based on mathematical programming have already been proposed in the literature for string selection problems, and we are interested in adopting and testing different approaches for these problems. We propose two matheuristics for both the DSSP and the DSSSP that combine the Variable Neighbourhood Search (VNS) metaheuristic with mathematical programming. We compare the linear relaxation, the lower bounds found by the branch-and-bound technique, and the matheuristics on three different groups of instances. Computational experiments show that the Basic Core Problem Algorithm (BCPA) finds better results overall for the DSSP. However, it was unable to provide any solution for some hard DSSSP instances within a reasonable time limit. Each of the two VNS-based matheuristics has its own niche among the groups of instances, and both found good solutions for the DSSSP where the BCPA failed. All the obtained data are available in our repository.
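To make the VNS idea concrete, the following is a minimal sketch of a plain VNS over candidate strings. It is not the authors' matheuristic: the mathematical-programming component is replaced here by random shaking and a single-position local search. The objective (largest Hamming distance to a "close" set minus smallest Hamming distance to a "far" set) is an assumption chosen for illustration, and all function names and parameters are hypothetical.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def objective(t, close, far):
    # ASSUMED objective (not taken from the paper): worst distance to the
    # close set minus best distance to the far set; lower is better.
    return max(hamming(t, s) for s in close) - min(hamming(t, s) for s in far)


def local_search(t, close, far, alphabet):
    """Repeatedly apply improving single-position changes until none is left."""
    t = list(t)
    best = objective(t, close, far)
    improved = True
    while improved:
        improved = False
        for i in range(len(t)):
            keep = t[i]  # best symbol seen so far at position i
            for c in alphabet:
                if c == keep:
                    continue
                t[i] = c
                val = objective(t, close, far)
                if val < best:
                    best, keep, improved = val, c, True
            t[i] = keep
    return "".join(t), best


def shake(t, k, alphabet, rng):
    """Change k distinct random positions to a different symbol."""
    t = list(t)
    for i in rng.sample(range(len(t)), min(k, len(t))):
        t[i] = rng.choice([c for c in alphabet if c != t[i]])
    return "".join(t)


def vns(close, far, alphabet, k_max=3, iters=50, seed=0):
    """Basic VNS: grow the neighbourhood on failure, reset it on success."""
    rng = random.Random(seed)
    best, best_val = local_search(close[0], close, far, alphabet)
    for _ in range(iters):
        k = 1
        while k <= k_max:
            cand, val = local_search(shake(best, k, alphabet, rng),
                                     close, far, alphabet)
            if val < best_val:
                best, best_val, k = cand, val, 1
            else:
                k += 1
    return best, best_val
```

A call such as `vns(["ACGT", "ACGA"], ["TTTT", "GGGG"], "ACGT")` returns a candidate string together with its objective value. The sketch assumes all strings have the same length and that the alphabet has at least two symbols.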

Acknowledgements

We would like to thank the SCIP development team and IBM for the SCIP and CPLEX academic licenses. We also thank the anonymous reviewers, whose work and contributions helped us improve the quality of the paper.

Author information

Authors and Affiliations

Faculty of Computing, Federal University of Mato Grosso do Sul, Campo Grande, MS, Brazil
Jean P. Tremeschin Torres & Edna A. Hoshino

Corresponding author

Correspondence to Edna A. Hoshino.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

Appendix: Detailed experiments results

Tables 5, 6, 7, 8, 9, and 10 show the results for each instance. Column frac presents the percentage of variables whose values in the optimal solution of the linear relaxation were fractional. Column root refers to the value of the optimal solution of the linear relaxation at the root node, whilst columns lb and ub represent the final lower and upper bounds found by the exact algorithm, respectively. Columns \(H_1\), \(H_2\), \(H_3\), and \(H_4\) show the value of the solution found by RA, BCPA, ILPBN-VNS, and ILPBS-VNS, respectively. Instances are named r-XX-YY-ZZ-N, where XX indicates \(|\varSigma |\), YY refers to \(|S^c|=|S^f|\), ZZ to \(|t|\), and N distinguishes different test cases with similar structures. The best solutions found by the heuristics are highlighted in boldface. We also highlight the root lower bound and the final upper bound found by the exact approach when they are optimal. The symbol \(*\) indicates that the information was not available after the time limit. We use the symbol − in place of the value of the solution found by a heuristic when no solution could be found within the time limit.
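The naming scheme above can be decoded mechanically. The sketch below parses a name of the form r-XX-YY-ZZ-N into its four fields. It assumes that every field is a non-negative decimal integer, which the text does not state explicitly; the class name, the function name, and the example name `r-4-10-300-1` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceInfo:
    alphabet_size: int   # XX: size of the alphabet
    set_size: int        # YY: number of strings in each of S^c and S^f
    target_length: int   # ZZ: length of the target string t
    case: int            # N: index separating similar test cases


# ASSUMPTION: all four fields are plain decimal integers.
_NAME = re.compile(r"^r-(\d+)-(\d+)-(\d+)-(\d+)$")


def parse_instance_name(name: str) -> InstanceInfo:
    """Split an instance name r-XX-YY-ZZ-N into its numeric fields."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not an r-XX-YY-ZZ-N instance name: {name!r}")
    return InstanceInfo(*map(int, match.groups()))
```

For example, `parse_instance_name("r-4-10-300-1")` yields an alphabet of size 4, sets of 10 strings each, a target of length 300, and test case 1.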

Table 5 DSSP Group 1

Table 6 DSSP Group 2

Table 7 DSSP Group 3

Table 8 DSSSP Group 1

Table 9 DSSSP Group 2

Table 10 DSSSP Group 3

About this article

Cite this article

Torres, J.P.T., Hoshino, E.A. LP-based heuristics for the distinguishing string and substring selection problems. Ann Oper Res 316, 1205–1234 (2022). https://doi.org/10.1007/s10479-021-04138-5

Accepted: 26 May 2021
Published: 04 June 2021
Issue Date: September 2022
DOI: https://doi.org/10.1007/s10479-021-04138-5
