Mining Time-constrained Sequential Patterns with Constraint Programming

Published in Constraints

Abstract

Constraint Programming (CP) has proven to be an effective platform for constraint-based sequence mining. Previous work has focused on standard frequent sequence mining, as well as frequent sequence mining with a maximum 'gap' between two matching events in a sequence. The main challenge in the latter is that the gap constraint cannot be imposed independently of the omnipresent frequency constraint: it changes whether a subsequence is included in a sequence, and hence its frequency. In this work, we go further and investigate sequences of timed events with constraints on the minimum/maximum gap as well as the minimum/maximum span, where the span is the time between the first and last matching events of a pattern. We show how the three are interrelated, and what changes to the frequency constraint are required. Key to our approach is the concept of an extension window defined by the gap and span constraints; we develop techniques to avoid scanning the sequences needlessly, and we use a backtracking-aware data structure. Experiments demonstrate that the proposed approach outperforms both specialized and CP-based approaches in almost all cases, and that the advantage increases as the minimum frequency threshold decreases. This paper is an extension of the original manuscript presented at CPAIOR’17 [5].
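To make the interaction between gap, span and frequency concrete, the following is a minimal Python sketch. It is not the propagator described in the paper, only a naive backtracking check written for illustration. The function names (`has_embedding`, `support`), the representation of a sequence as a time-sorted list of `(timestamp, symbol)` pairs, and the reading of "gap" as the time difference between two consecutive matching events are all assumptions made here. The interval `[lo, hi]` computed at each step plays the role of an extension window: the next matching event must fall inside it.

```python
from math import inf

def has_embedding(seq, pattern, min_gap=0, max_gap=inf, min_span=0, max_span=inf):
    """Return True if `pattern` can be embedded in `seq` under the constraints.

    `seq` is assumed to be a list of (timestamp, symbol) pairs sorted by time.
    """
    def extend(k, prev_i, first_t):
        # All pattern symbols matched: only the minimum span is left to check,
        # since the maximum span is already enforced by the window below.
        if k == len(pattern):
            return seq[prev_i][0] - first_t >= min_span
        prev_t = seq[prev_i][0]
        # Extension window for the next matching event.
        lo = prev_t + min_gap
        hi = min(prev_t + max_gap, first_t + max_span)
        for i in range(prev_i + 1, len(seq)):
            t, sym = seq[i]
            if t > hi:
                break  # seq is time-sorted, so later events are outside the window too
            if t >= lo and sym == pattern[k] and extend(k + 1, i, first_t):
                return True
        return False

    if not pattern:
        return True
    # Try every occurrence of the first symbol as a starting point.
    return any(sym == pattern[0] and extend(1, i, t)
               for i, (t, sym) in enumerate(seq))

def support(db, pattern, **constraints):
    """Number of sequences in `db` that contain `pattern` under the constraints."""
    return sum(has_embedding(seq, pattern, **constraints) for seq in db)

# A tiny hypothetical database of three timed sequences.
db = [
    [(0, "A"), (2, "B"), (3, "C")],
    [(0, "A"), (1, "B"), (9, "C")],
    [(0, "A"), (4, "A"), (5, "B"), (6, "C")],
]
pattern = ("A", "B", "C")

unconstrained = support(db, pattern)                       # 3
with_max_gap = support(db, pattern, max_gap=3)             # 2
with_gap_and_span = support(db, pattern, max_gap=3, max_span=2)  # 1
```

In this toy database the support of the same pattern drops from 3 to 2 to 1 as the constraints tighten, which is why the time constraints cannot be checked separately from frequency. The third sequence also shows why backtracking over starting points is needed: matching from the first `A` fails under `max_gap=3`, while matching from the second `A` succeeds.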

Notes

  1. Except some CP-Solvers such as Gecode, Oz/Mozart and Figaro.

  2. It is neither post-processing nor hard-coded.

  3. http://sites.uclouvain.be/cp4dm/spm/ppict/

  4. https://sites.google.com/site/cp4spm/

  5. http://www.cs.rpi.edu/~zaki/www-new/pmwiki.php/Software

  6. http://sites.uclouvain.be/cp4dm/spm/

References

  1. Aggarwal, C.C., & Han, J. (2014). Frequent pattern mining. Springer.

  2. Agrawal, R., & Srikant, R. (1995). Mining sequential patterns. In Proceedings of the eleventh international conference on data engineering (pp. 3–14).

  3. Antunes, C., & Oliveira, A.L. (2003). Generalization of pattern-growth methods for sequential pattern mining with gap constraints. In Perner, P., & Rosenfeld, A. (Eds.), Machine learning and data mining in pattern recognition: 3rd international conference, MLDM 2003, Leipzig, Germany, July 5–7, 2003, Proceedings (pp. 239–251). Berlin: Springer.

  4. Aoga, J.O.R., Guns, T., & Schaus, P. (2016). An efficient algorithm for mining frequent sequence with constraint programming. In Frasconi, P., Landwehr, N., Manco, G., & Vreeken, J. (Eds.), Machine learning and knowledge discovery in databases: European conference, ECML PKDD 2016, Riva del Garda, Italy, September 19–23, 2016, Proceedings, Part II (pp. 315–330). Cham: Springer International Publishing.

  5. Aoga, J.O.R., Guns, T., & Schaus, P. (2017). Mining time-constrained sequential patterns with constraint programming. In Salvagnin, D., & Lombardi, M. (Eds.), Integration of AI and OR techniques in constraint programming - 14th international conference, CPAIOR 2017, Padova, Italy, June 5–8, 2017, Proceedings, Lecture Notes in Computer Science. Springer.

  6. Ayres, J., Flannick, J., Gehrke, J., & Yiu, T. (2002). Sequential pattern mining using a bitmap representation. In Proceedings of the 8th ACM SIGKDD international conference on knowledge discovery and data mining, July 23–26, 2002, Edmonton, Alberta, Canada (pp. 429–435).

  7. Batal, I., Fradkin, D., Harrison, J., Moerchen, F., & Hauskrecht, M. (2012). Mining recent temporal patterns for event detection in multivariate time series data. In Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 280–288).

  8. Beldiceanu, N., & Contejean, E. (1994). Introducing global constraints in CHIP. Mathematical and Computer Modelling, 20(12), 97–123.

  9. Coquery, E., Jabbour, S., Saïs, L., & Salhi, Y. (2012). A SAT-based approach for discovering frequent, closed and maximal patterns in a sequence. In De Raedt, L., Bessière, C., Dubois, D., Doherty, P., Frasconi, P., Heintz, F., & Lucas, P.J.F. (Eds.), ECAI 2012 - 20th European Conference on Artificial Intelligence, Montpellier, France, August 27–31, 2012, Frontiers in Artificial Intelligence and Applications (Vol. 242, pp. 258–263). IOS Press.

  10. Desai, N.A.K., & Ganatra, A. (2015). Efficient constraint-based sequential pattern mining (SPM) algorithm to understand customers buying behaviour from time stamp-based sequence dataset. Cogent Engineering, 2(1), 1072292.

  11. Fournier-Viger, P., Wu, C.W., & Tseng, V.S. (2013). Mining maximal sequential patterns without candidate maintenance. In Advanced data mining and applications (pp. 169–180): Springer.

  12. Guns, T., Nijssen, S., & De Raedt, L. (2013). k-pattern set mining under constraints. IEEE Transactions on Knowledge and Data Engineering, 25(2), 402–418.

  13. Han, J., Pei, J., Yin, Y., & Mao, R. (2004). Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data mining and knowledge discovery, 8(1), 53–87.

  14. He, J., Flener, P., Pearson, J., & Zhang, W.M. (2013). Solving string constraints: The case for constraint programming. In International conference on principles and practice of constraint programming (pp. 381–397): Springer.

  15. Henriques, R., Antunes, C., & Madeira, S.C. (2014). Methods for the efficient discovery of large item-indexable sequential patterns. In Appice, A., Ceci, M., Loglisci, C., Manco, G., Masciari, E., & Ras, Z.W. (Eds.), New frontiers in mining complex patterns: Second international workshop, NFMCP 2013, held in conjunction with ECML-PKDD 2013, Prague, Czech Republic, September 27, 2013, Revised Selected Papers (pp. 100–116). Cham: Springer International Publishing.

  16. Henriques, R., & Madeira, S.C. (2014). BicSPAM: flexible biclustering using sequential patterns. BMC Bioinformatics, 15(1), 130.

  17. Kadioglu, S., & Sellmann, M. (2010). Grammar constraints. Constraints, 15(1), 117–144.

  18. Kemmar, A., Lebbah, Y., Loudni, S., Boizumault, P., & Charnois, T. (2017). Prefix-projection global constraint and top-k approach for sequential pattern mining. Constraints, 22(2), 265–306.

  19. Kemmar, A., Loudni, S., Lebbah, Y., Boizumault, P., & Charnois, T. (2015). Prefix-projection global constraint for sequential pattern mining. In Pesant, G. (Ed.), Principles and practice of constraint programming: 21st international conference, CP 2015, Cork, Ireland, August 31 – September 4, 2015, Proceedings (pp. 226–243). Cham: Springer International Publishing.

  20. Kemmar, A., Loudni, S., Lebbah, Y., Boizumault, P., & Charnois, T. (2016). A global constraint for mining sequential patterns with GAP constraint. In Quimper, C. (Ed.), Integration of AI and OR techniques in constraint programming - 13th international conference, CPAIOR 2016, Banff, AB, Canada, May 29 – June 1, 2016, Proceedings, Lecture Notes in Computer Science (Vol. 9676, pp. 198–215): Springer.

  21. Li, C., & Wang, J. (2008). Efficiently mining closed subsequences with gap constraints. In Proceedings of the SIAM international conference on data mining, SDM 2008, April 24–26, 2008, Atlanta, Georgia, USA (pp. 313–322).

  22. Lu, S., & Li, C. (2004). Aprioriadjust: an efficient algorithm for discovering the maximum sequential patterns. In Proc. Intern. Workshop knowl. Grid and grid intell.

  23. Mannila, H., Toivonen, H., & Verkamo, A.I. (1997). Discovery of frequent episodes in event sequences. Data mining and knowledge discovery, 1(3), 259–289.

  24. Metivier, J., Boizumault, P., Crémilleux, B., Khiari, M., & Loudni, S. (2011). A constraint-based language for declarative pattern discovery. In Data mining workshops (ICDMW), 2011 IEEE 11th international conference on (pp. 1112–1119).

  25. Négrevergne, B., & Guns, T. (2015). Constraint-based sequence mining using constraint programming. In Michel, L. (Ed.), Integration of AI and OR techniques in constraint programming - 12th international conference, CPAIOR 2015, Barcelona, Spain, May 18–22, 2015, Proceedings, Lecture Notes in Computer Science (Vol. 9075, pp. 288–305): Springer.

  26. OscaR Team (2012). OscaR: Scala in OR. Available from https://bitbucket.org/oscarlib/oscar.

  27. Parthasarathy, S., Zaki, M.J., Ogihara, M., & Dwarkadas, S. (1999). Incremental and interactive sequence mining. In Proceedings of the 8th international conference on information and knowledge management (pp. 251–258).

  28. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., & Hsu, M.C. (2001). PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In Proceedings of the 17th international conference on data engineering (pp. 215–224).

  29. Pei, J., Han, J., & Wang, W. (2007). Constraint-based sequential pattern mining: the pattern-growth methods. Journal of Intelligent Information Systems, 28(2), 133–160.

  30. Pesant, G. (2004). A regular language membership constraint for finite sequences of variables. In International conference on principles and practice of constraint programming (pp. 482–495): Springer.

  31. Pinto, H., Han, J., Pei, J., Wang, K., Chen, Q., & Dayal, U. (2001). Multi-dimensional sequential pattern mining. In Proceedings of the tenth international conference on information and knowledge management (pp. 81–88).

  32. Quimper, C.G., & Walsh, T. (2006). Global grammar constraints. In International conference on principles and practice of constraint programming (pp. 751–755): Springer.

  33. Régin, J. C. (1996). Generalized arc consistency for global cardinality constraint. In Proceedings of the thirteenth national conference on artificial intelligence-volume 1 (pp. 209–215): AAAI press.

  34. Rossi, F., Van Beek, P., & Walsh, T. (2006). Handbook of constraint programming. Elsevier.

  35. Srikant, R., & Agrawal, R. (1996). Mining sequential patterns: Generalizations and performance improvements. Springer.

  36. Tatti, N., & Cule, B. (2011). Mining closed episodes with simultaneous events. In Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’11 (pp. 1172–1180). New York: ACM.

  37. Wang, J., Han, J., & Li, C. (2007). Frequent closed sequence mining without candidate maintenance. IEEE Transactions on Knowledge and Data Engineering, 19(8), 1042–1056.

  38. Yan, X., Han, J., & Afshar, R. (2003). CloSpan: Mining closed sequential patterns in large datasets. In Proceedings of the 2003 SIAM international conference on data mining (pp. 166–177): SIAM.

  39. Zaki, M.J. (1998). Efficient enumeration of frequent sequences. In Proceedings of the seventh international conference on information and knowledge management (pp. 68–75): ACM.

  40. Zaki, M.J. (2000). Sequence mining in categorical domains: incorporating constraints. In Proceedings of the ninth international conference on information and knowledge management (pp. 422–429): ACM.

  41. Zhao, Q., & Bhowmick, S.S. (2003). Sequential pattern mining: a survey. Technical Report, CAIS, Nanyang Technological University, Singapore, pp. 1–26.

Acknowledgements

The research is supported by the FRIA-FNRS (Fonds pour la Formation à la Recherche dans l’Industrie et dans l’Agriculture, Belgium) and FWO (Research Foundation – Flanders).

Author information

Corresponding author

Correspondence to John O. R. Aoga.

Additional information

This article belongs to the Topical Collection: Integration of Artificial Intelligence and Operations Research Techniques in Constraint Programming

Guest Editors: Michele Lombardi and Domenico Salvagnin

About this article

Cite this article

Aoga, J.O.R., Guns, T. & Schaus, P. Mining Time-constrained Sequential Patterns with Constraint Programming. Constraints 22, 548–570 (2017). https://doi.org/10.1007/s10601-017-9272-3

  • DOI: https://doi.org/10.1007/s10601-017-9272-3
