Abstract
Some structure learning algorithms have proven effective at reconstructing hypothetical Bayesian Network graphs from synthetic data. However, in seeking to maximise a scoring function, many become conservative and minimise the number of edges discovered. While simplicity is desirable, the output is often a graph that consists of multiple independent subgraphs, which do not enable full propagation of evidence. While this is not a problem in theory, it can be a problem in practice. This article examines a novel, unconventional associational heuristic called Saiyan, which returns a directed acyclic graph that enables full propagation of evidence. Associational heuristics are not expected to perform well relative to sophisticated constraint-based and score-based learning approaches. Moreover, forcing the algorithm to connect all data variables implies that the forced edges will be correct at a lower rate than those identified without this restriction. Still, synthetic and real-world experiments suggest that such a heuristic can be competitive with some of the well-established constraint-based, score-based and hybrid learning algorithms.
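To make the point about independent subgraphs concrete, the following is a minimal sketch of how one might check whether a learned graph forms a single connected structure. It is not the Saiyan algorithm and is not taken from the article; the function names and the toy graph are hypothetical. It treats edges as undirected and counts connected components. A single component is only a necessary condition for evidence on one variable to be able to reach every other variable.

```python
from collections import defaultdict, deque


def weakly_connected_components(nodes, edges):
    """Group nodes into components, ignoring edge direction.

    `nodes` is an iterable of variable names and `edges` is an iterable of
    (parent, child) pairs. Both names are illustrative assumptions.
    """
    neighbours = defaultdict(set)
    for parent, child in edges:
        neighbours[parent].add(child)
        neighbours[child].add(parent)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search over the undirected version of the graph.
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        components.append(component)
    return components


def is_single_component(nodes, edges):
    """True when every variable belongs to one connected subgraph."""
    return len(weakly_connected_components(nodes, edges)) == 1


# Toy example (hypothetical): a sparse learned graph with two subgraphs.
nodes = ["A", "B", "C", "D", "E"]
sparse_edges = [("A", "B"), ("B", "C"), ("D", "E")]
connected_edges = sparse_edges + [("C", "D")]

print(weakly_connected_components(nodes, sparse_edges))
print(is_single_component(nodes, sparse_edges))
print(is_single_component(nodes, connected_edges))
```

In the sparse graph, evidence entered on `A` cannot influence `D` or `E`, because they sit in a separate subgraph. Adding the single edge `C -> D` joins the two subgraphs into one.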