LP-based heuristics for the distinguishing string and substring selection problems

  • S.I.: CLAIO 2018
Annals of Operations Research

Abstract

This work proposes and evaluates matheuristics for the Distinguishing String Selection Problem (DSSP) and the Distinguishing Substring Selection Problem (DSSSP). Heuristics based on mathematical programming have already been proposed in the literature for string selection problems, and we are interested in adopting and testing different approaches for these two problems. We propose two matheuristics for both the DSSP and the DSSSP that combine the Variable Neighbourhood Search (VNS) metaheuristic with mathematical programming. We compare the linear relaxation, the lower bounds found through the branch-and-bound technique, and the matheuristics on three different groups of instances. Computational experiments show that the Basic Core Problem Algorithm (BCPA) finds better results overall for the DSSP. However, it was unable to provide any solution for some hard DSSSP instances within a reasonable time limit. The two VNS-based matheuristics each have their own niche among the different groups of instances, and they found good solutions for the DSSSP where the BCPA failed. All the obtained data are available in our repository.

References

  • Chimani, M., Woste, M., & Böcker, S. (2011). A closer look at the closest string and closest substring problem. In Proceedings of the Meeting on Algorithm Engineering & Experiments (pp. 13–24).

  • Della Croce, F., & Salassa, F. (2012). Improved LP-based algorithms for the closest string problem. Computers & Operations Research, 39(3), 746–749.

  • Deng, X., Li, G., Li, Z., Ma, B., & Wang, L. (2003). Genetic design of drugs without side-effects. SIAM Journal on Computing, 32(4), 1073–1090.

  • Faro, S., & Pappalardo, E. (2010). Ant-CSP: An ant colony optimization algorithm for the closest string problem. In Proceedings of the 36th Conference on Current Trends in Theory and Practice of Computer Science, Lecture Notes in Computer Science (Vol. 5901, pp. 370–381).

  • Gamrath, G., Fischer, T., Gally, T., Gleixner, A.M., Hendel, G., Koch, T., Maher, S.J., Miltenberger, M., Müller, B., Pfetsch, M.E., Puchert, C., Rehfeldt, D., Schenker, S., Schwarz, R., Serrano, F., Shinano, Y., Vigerske, S., Weninger, D., Winkler, M., Witt, J.T., & Witzig, J. (2016). The SCIP Optimization Suite 3.2. Tech. Rep. 15-60, ZIB, Berlin.

  • Gramm, J., Guo, J., & Niedermeier, R. (2006). Parameterized intractability of distinguishing substring selection. Theory of Computing Systems, 39(4), 545–560.

  • Hansen, P., Mladenović, N., & Moreno Pérez, J. A. (2008). Variable neighbourhood search: methods and applications. 4OR: A Quarterly Journal of Operations Research, 6(4), 319–360.

  • IBM (2013). IBM ILOG CPLEX v12.6 optimization studio CPLEX user’s manual.

  • Jean, T. (2018). DSSP. https://github.com/jeanpttorres/dssp. Accessed December 28, 2020.

  • Lanctot, J.K. (2000). Some string problems in computational biology. Ph.D. thesis, University of Waterloo, Ontario, Canada.

  • Lanctot, J. K., Li, M., Ma, B., Wang, S., & Zhang, L. (2003). Distinguishing string selection problems. Information and Computation, 185(1), 41–55. https://doi.org/10.1016/S0890-5401(03)00057-9.

  • Liu, X., Holger, M., Hao, Z., & Wu, G. (2008). A compounded genetic and simulated annealing algorithm for the closest string problem. In Proceedings of the 2nd International Conference on Bioinformatics and Biomedical Engineering (pp. 702–705). https://doi.org/10.1109/ICBBE.2008.171

  • Liu, X., Liu, S., Hao, Z., & Mauch, H. (2011). Exact algorithm and heuristic for the closest string problem. Computers & Operations Research, 38, 1513–1520.

  • Mauch, H., Melzer, M.J., & Hu, J.S. (2003). Genetic algorithm approach for the closest string problem. In Proceedings of the IEEE Computer Society Conference on Bioinformatics (p. 560). https://doi.org/10.1109/CSB.2003.1227407

  • Meneses, C.N. (2005). Combinatorial approaches for problems in bioinformatics. Ph.D. thesis, University of Florida, Florida, USA.

  • Meneses, C.N., Pardalos, P.M., Resende, M.G.C., & Vazacopoulos, A. (2005). Modeling and solving string selection problems. In Proceedings of the Second International Symposium on Mathematical and Computational Biology (pp. 54–64).

  • Proutski, V., & Holmes, E. C. (1996). Primer Master: a new program for the design and analysis of PCR primers. Bioinformatics, 12(3), 253–255. https://doi.org/10.1093/bioinformatics/12.3.253.

  • Stormo, G. D., Hartzell, G. W., & Hertz, G. Z. (1990). Identification of consensus patterns in unaligned DNA sequences known to be functionally related. Computer Applications in the Biosciences, 6(2), 81–92.

  • Torres, J., Silva, E., & Hoshino, E.A. (2018). Heuristic approaches to the distinguishing substring selection problem. In Proceedings of the 5th International Conference on Variable Neighborhood Search, Electronic Notes in Discrete Mathematics (Vol. 66, pp. 151–158). https://doi.org/10.1016/j.endm.2018.03.020

  • Torres, J.P., & Hoshino, E.A. (2018). Abordagens heurísticas para problemas de seleção de strings. In Proceedings of Simpósio Brasileiro de Matemática Aplicada e Computacional, Proceeding Series of the Brazilian Society of Computational and Applied Mathematics (Vol. 6).

Acknowledgements

We would like to thank the SCIP development team and IBM for the SCIP and CPLEX academic licenses. We also thank the anonymous reviewers, whose work and contributions helped us to improve the quality of the paper.

Author information

Corresponding author

Correspondence to Edna A. Hoshino.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

Appendix: Detailed experiments results

Tables 5, 6, 7, 8, 9, and 10 show the results for each instance. Column frac presents the percentage of variables whose values in the optimal solution of the linear relaxation were fractional. Column root refers to the value of the optimal solution of the linear relaxation at the root node, whilst columns lb and ub represent the final lower and upper bounds found by the exact algorithm, respectively. Columns \(H_1\), \(H_2\), \(H_3\), and \(H_4\) show the value of the solution found by RA, BCPA, ILPBN-VNS, and ILPBS-VNS, respectively. Instances are named r-XX-YY-ZZ-N, where XX indicates \(|\varSigma |\), YY refers to \(|S^c|=|S^f|\), ZZ to \(|t|\), and N distinguishes different test cases with similar structures. The best solutions found by the heuristics are highlighted in boldface. We also highlight the root lower bound and the final upper bound found by the exact approach when they are optimal. The symbol \(*\) indicates that the information was not available after the time limit. We use the symbol − instead of the value of the solution found by a heuristic when no solution could be found within the time limit.

Table 5 DSSP Group 1
Table 6 DSSP Group 2
Table 7 DSSP Group 3
Table 8 DSSSP Group 1
Table 9 DSSSP Group 2
Table 10 DSSSP Group 3
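
For readers who want to process these tables programmatically, the instance naming scheme and the frac column can be illustrated with a minimal Python sketch. It is not taken from the authors' code: the function names, the assumption that every field of r-XX-YY-ZZ-N is a decimal integer, the example name r-4-10-20-1, and the tolerance used to decide whether a value is fractional are all hypothetical choices made for illustration.

```python
import re
from typing import NamedTuple, Sequence


class Instance(NamedTuple):
    """Fields of an instance name r-XX-YY-ZZ-N (names are illustrative)."""
    alphabet_size: int   # XX: |Sigma|
    num_strings: int     # YY: |S^c| = |S^f|
    target_length: int   # ZZ: |t|
    case: int            # N: index of the test case


# Assumption: every field is written as a decimal integer.
_NAME_PATTERN = re.compile(r"r-(\d+)-(\d+)-(\d+)-(\d+)")


def parse_instance_name(name: str) -> Instance:
    """Split a name such as 'r-4-10-20-1' into its four numeric fields."""
    match = _NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not an r-XX-YY-ZZ-N instance name: {name!r}")
    return Instance(*(int(group) for group in match.groups()))


def fractional_percentage(values: Sequence[float], tol: float = 1e-6) -> float:
    """Percentage of LP variable values that are not within tol of an integer.

    The tolerance is an assumption; the text does not state how
    fractionality was tested.
    """
    if not values:
        return 0.0
    fractional = sum(1 for v in values if abs(v - round(v)) > tol)
    return 100.0 * fractional / len(values)
```

For example, under these assumptions `parse_instance_name("r-4-10-20-1")` yields an alphabet of size 4, 10 strings in each of the two sets, a target length of 20, and test case 1, while `fractional_percentage([0.0, 1.0, 0.5, 0.25])` counts two fractional values out of four.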

About this article

Cite this article

Torres, J.P.T., Hoshino, E.A. LP-based heuristics for the distinguishing string and substring selection problems. Ann Oper Res 316, 1205–1234 (2022). https://doi.org/10.1007/s10479-021-04138-5

  • DOI: https://doi.org/10.1007/s10479-021-04138-5
