research-article

Learning Bayesian Networks with the Saiyan Algorithm

Published: 22 June 2020

Abstract

Some structure learning algorithms have proven effective at reconstructing hypothetical Bayesian Network graphs from synthetic data. However, in their mission to maximise a scoring function, many become conservative and minimise the number of edges they discover. While simplicity is desirable, the output is often a graph that consists of multiple independent subgraphs, which do not enable full propagation of evidence. While this is not a problem in theory, it can be a problem in practice. This article examines a novel, unconventional associational heuristic called Saiyan, which returns a directed acyclic graph that enables full propagation of evidence. Associational heuristics are not expected to perform well relative to sophisticated constraint-based and score-based learning approaches. Moreover, forcing the algorithm to connect all data variables implies that the forced edges will be correct at a lower rate than those identified without this restriction. Still, synthetic and real-world experiments suggest that such a heuristic can be competitive with some of the well-established constraint-based, score-based and hybrid learning algorithms.
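
As an illustration of the problem the abstract describes, the sketch below checks whether a learned graph splits into multiple independent subgraphs. It is not the Saiyan algorithm, whose steps are not given here. It assumes that "full propagation of evidence" can be read as "all variables sit in one connected graph once edge direction is ignored". The function names and the toy variables A to D are hypothetical.

```python
from collections import defaultdict


def weakly_connected_components(nodes, edges):
    """Group nodes into components, ignoring edge direction."""
    neighbours = defaultdict(set)
    for parent, child in edges:
        neighbours[parent].add(child)
        neighbours[child].add(parent)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Iterative depth-first search from a node not yet visited.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


def is_single_component(nodes, edges):
    """True when the graph has no independent subgraphs (assumed reading)."""
    return len(weakly_connected_components(nodes, edges)) == 1


# Hypothetical toy graphs: a conservative output with two subgraphs,
# and the same graph with one extra edge joining them.
variables = ["A", "B", "C", "D"]
sparse_edges = [("A", "B"), ("C", "D")]
joined_edges = sparse_edges + [("B", "C")]

print(is_single_component(variables, sparse_edges))
print(is_single_component(variables, joined_edges))
```

In the first toy graph, evidence entered on A can never reach C or D because no path links the two subgraphs. Adding a single edge between them removes that barrier, at the cost that the added edge may be wrong.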

    • Published in

      ACM Transactions on Knowledge Discovery from Data, Volume 14, Issue 4
      August 2020
      316 pages
      ISSN: 1556-4681
      EISSN: 1556-472X
      DOI: 10.1145/3403605

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 22 June 2020
      • Online AM: 7 May 2020
      • Revised: 1 February 2020
      • Accepted: 1 February 2020
      • Received: 1 June 2019

      Qualifiers

      • research-article
      • Research
      • Refereed
