research-article

Learning Bayesian Networks with the Saiyan Algorithm

Author:
Anthony C. Constantinou

Queen Mary University of London, London, UK


ACM Transactions on Knowledge Discovery from Data, Volume 14, Issue 4, Article No. 44, pp. 1–21. https://doi.org/10.1145/3385655

Published: 22 June 2020

Abstract

Some structure learning algorithms have proven to be effective in reconstructing hypothetical Bayesian Network graphs from synthetic data. However, in their mission to maximise a scoring function, many become conservative and minimise edges discovered. While simplicity is desired, the output is often a graph that consists of multiple independent subgraphs that do not enable full propagation of evidence. While this is not a problem in theory, it can be a problem in practice. This article examines a novel unconventional associational heuristic called Saiyan, which returns a directed acyclic graph that enables full propagation of evidence. Associational heuristics are not expected to perform well relative to sophisticated constraint-based and score-based learning approaches. Moreover, forcing the algorithm to connect all data variables implies that the forced edges will not be correct at the rate of those identified unrestrictedly. Still, synthetic and real-world experiments suggest that such a heuristic can be competitive relative to some of the well-established constraint-based, score-based and hybrid learning algorithms.
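The property the abstract emphasises, a single graph rather than several independent subgraphs, can be checked mechanically. The sketch below is not the Saiyan algorithm; it only counts the weakly connected components of a learned graph, on the assumption that variables in different components cannot exchange evidence. The function name, variable names, and example edges are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def weakly_connected_components(nodes, edges):
    """Group nodes into components of the undirected skeleton of a directed graph."""
    neighbours = defaultdict(set)
    for parent, child in edges:
        # Ignore edge direction: only the skeleton matters for connectivity.
        neighbours[parent].add(child)
        neighbours[child].add(parent)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Iterative depth-first traversal from an unvisited node.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical five-variable example (names are illustrative only).
nodes = ["A", "B", "C", "D", "E"]
fragmented = [("A", "B"), ("C", "D")]  # E is left isolated
connected = fragmented + [("B", "C"), ("D", "E")]

print(len(weakly_connected_components(nodes, fragmented)))
print(len(weakly_connected_components(nodes, connected)))
```

A graph with more than one component is the kind of output the abstract describes as problematic in practice: evidence entered on a variable in one component leaves the posteriors in every other component unchanged. A single component is a necessary condition for the full propagation of evidence that the abstract attributes to Saiyan.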

Index Terms

1. Mathematics of computing
  1. Probability and statistics
    1. Probabilistic representations

Published in
ACM Transactions on Knowledge Discovery from Data Volume 14, Issue 4
August 2020
316 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3403605
Editors: Charu Aggarwal (IBM T. J. Watson Research, USA), Xindong Wu (Mininglamp Academy of Sciences, China)
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 June 2020
- Online AM: 7 May 2020
- Revised: 1 February 2020
- Accepted: 1 February 2020
- Received: 1 June 2019
Author Tags
Bayesian networks
directed acyclic graphs
graphical models
structure learning
Qualifiers
- research-article
- Research
- Refereed