Learning MAX-SAT from contextual examples for combinatorial optimisation
Introduction
Combinatorial optimisation is an effective and popular class of techniques for solving real-life problems like scheduling [1], routing [2], and planning [3]. However, encoding the underlying models often proves to be time-consuming and complicated, as it requires substantial domain and modelling expertise. Therefore, the question arises as to whether such models can be learned from data. This question is studied in constraint learning [4], [5], where several algorithms have been developed that automatically acquire theories or mathematical models from examples of past working (positive) and non-working (negative) solutions or analogous forms of supervision.
Combinatorial optimisation models have two components: a set of hard constraints ϕ defining the feasible region, and an objective function f that measures the quality of candidate solutions, sometimes defined as a set of soft constraints. The task of the solver is then to complete a (potentially empty) partial assignment x into a complete assignment xy that is both feasible and optimal, i.e., xy ⊨ ϕ and xy ∈ argmax_{y′ : xy′ ⊨ ϕ} f(xy′). We use the term context to refer both to partial assignments x and to more general temporary constraints that restrict the outcome of optimisation.
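To make this setup concrete, here is a minimal brute-force sketch (with toy constraints and a toy objective of our own choosing, not the paper's formalism) of completing a partial assignment, i.e. a context, into a feasible and optimal full assignment:

```python
from itertools import product

# Toy model over three Boolean variables a, b, c (our own choice).
def hard(a, b, c):
    return a or b                  # feasible region: a OR b must hold

def obj(a, b, c):
    return 2 * a + b + 3 * c       # objective to maximise

def solve(context):
    """Complete the partial assignment `context` (e.g. {'a': False})
    into a feasible, optimal full assignment; None if infeasible."""
    best, best_val = None, float("-inf")
    for a, b, c in product([False, True], repeat=3):
        full = {"a": a, "b": b, "c": c}
        if any(full[k] != v for k, v in context.items()):
            continue               # must respect the context
        if not hard(a, b, c):
            continue               # must satisfy the hard constraints
        if obj(a, b, c) > best_val:
            best, best_val = full, obj(a, b, c)
    return best

print(solve({}))            # -> {'a': True, 'b': True, 'c': True}
print(solve({"a": False}))  # -> {'a': False, 'b': True, 'c': True}
```

Note how fixing a = False changes the optimum: this is precisely the contextual effect discussed below.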
Current learning approaches suffer from two limitations. First, to the best of our knowledge, they do not learn from contextual examples. By doing so, they ignore the fact that the optima can be affected drastically by the context. For instance, argmax_{x : xy ⊨ ϕ} f(xy), where the context fixes y, can be very different from argmax_{y : xy ⊨ ϕ} f(xy), where the context fixes x. Ignoring context also makes the setting less realistic, as in practice examples of good and bad solutions are always relative to a context. The reader may notice a resemblance with structured output prediction (e.g. [6]), where one learns a function f that computes a structured output for a given input x. The difference is that in structured prediction the choice of input and output variables is fixed, while in optimisation it is not.
Second, existing approaches do not jointly learn the hard constraints and the objective function: they either learn one or the other, or else learn them sequentially or independently. But this may break down in applications like personnel rostering. Here, past schedules are often stored in a data set, but the reasons why a schedule was found to be unacceptable are usually not tracked. In cases like this, a negative example may be either infeasible (because of the hard constraints) or sub-optimal (because of the objective function). This induces a credit-assignment problem that can only be solved by learning the constraints and objective function jointly. See the related work section for a more in-depth discussion.
The key contribution of this paper is that we develop a more realistic setting for learning combinatorial optimisation models from contextual examples that does not suffer from these limitations. Furthermore, we provide foundational results within this setting for one of the simplest but most fundamental models for combinatorial optimisation, maximum satisfiability (MAX-SAT for short). Our theoretical results show that MAX-SAT models can be probably approximately correctly (PAC) and agnostically learned from contextual data using empirical risk minimisation (ERM) as long as the contextual examples are “representative enough”, and that if enough data is available the acquired model is guaranteed to output high-quality feasible solutions.
Motivated by this, we introduce two implementations of ERM for MAX-SAT learning, hassle-milp and hassle-sls. hassle-milp relies on ideas from syntax-guided synthesis [7], in that the learning task is encoded as an optimisation problem – namely, a mixed-integer linear programming (MILP) problem – and solved using an efficient solver. hassle-milp acquires a MAX-SAT model that is guaranteed to fit the examples (almost) exactly, if one exists. This accuracy, however, comes at the expense of run-time. The second implementation, hassle-sls, uses stochastic local search (SLS) to look for a high-quality MAX-SAT model, and in addition integrates a heuristic to prune the neighbourhood of the current candidate model and focus on the most promising neighbours. hassle-sls is not guaranteed to return an (almost) optimal model, but it offers enhanced efficiency. Our experiments show that, on the one hand, hassle-milp successfully recovers both synthetic and benchmark MAX-SAT models from contextual examples, and on the other, that hassle-sls matches the model quality of hassle-milp in a fraction of the time and scales to learning problems beyond the reach of the more exact implementation.
A preliminary version of this work appeared as a conference paper [8]. The present work contributes the following major improvements:
- The theoretical results in the conference paper hold for the realisable, noiseless case only, and only guarantee that, given enough examples, the learned model performs well at solving time under the empty context. Here, we show that, under mild assumptions, these results extend to the agnostic and noisy cases. In addition, we show that the learned model performs well in arbitrary contexts at solving time, and we introduce a tighter regret bound (see Theorem 3).
- The only implementation available in the conference paper is hassle-milp. Here we introduce a new implementation based on stochastic local search [9], named hassle-sls. In particular, we study four versions of hassle-sls, based on some of the most widely used SLS techniques [10]: WalkSAT, Novelty, Novelty+ and Adaptive Novelty+. Compared to the original implementation, hassle-sls scales to larger learning tasks while acquiring models of comparable or better quality in practice. The implementation is non-trivial, as evaluating the score of each neighbour requires solving a MAX-SAT model. To keep the run-time low, we designed techniques to restrict the neighbourhood to those models that show some promise of being better than the current candidate.
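As a rough illustration of the SLS family these variants belong to, the following is a generic WalkSAT-style search skeleton (a sketch under our own simplifications; `neighbours` and `score` stand in for the paper's model-neighbourhood and model-evaluation routines):

```python
import random

def sls_search(initial, neighbours, score, max_steps=1000, walk_prob=0.1, seed=0):
    """Generic WalkSAT-style stochastic local search: mostly greedy moves
    to the best-scoring neighbour, with occasional random-walk steps to
    escape local optima.  `neighbours` and `score` are problem-specific;
    in hassle-sls they would enumerate and evaluate candidate MAX-SAT
    models (evaluation requiring a MAX-SAT solver call, which is why
    pruning the candidate list before scoring matters)."""
    rng = random.Random(seed)
    current = initial
    best, best_score = current, score(current)
    for _ in range(max_steps):
        cand = neighbours(current)     # a pruning heuristic would filter here
        if not cand:
            break
        if rng.random() < walk_prob:
            current = rng.choice(cand)       # random-walk step
        else:
            current = max(cand, key=score)   # greedy step
        if score(current) > best_score:
            best, best_score = current, score(current)
    return best
```

For example, `sls_search(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2)` climbs to 3 on a toy 1-D problem; the Novelty-style variants differ only in how the next neighbour is selected.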
This paper is structured as follows: Section 2 provides notations and definitions for various terms used throughout the text and Section 3 provides a formal definition of MAX-SAT learning. In Section 4, we first prove PAC learnability of MAX-SAT learning and that this provides guarantees on the quality of the assignment output by the learned model (relative to those output by the ground-truth model). In Section 5, we present the two implementations and evaluate them in Section 6. The related work is discussed in Section 7 and the paper concludes in Section 8. For ease of exposition, all the proofs are deferred to the Appendices. A summary of all the notations is provided in Table 1.
Maximum satisfiability
Let X = {x_1, …, x_n} be a set of Boolean variables and Φ a class of Boolean formulas of interest on X, e.g., the set of conjunctions or disjunctions of up to k literals (i.e., variables or their negations). An assignment x fixes the value of each variable in X to true or false. In practical implementations, we use 1 and 0 to encode true and false, respectively.
A partial maximum satisfiability (abbreviated MAX-SAT) model is a collection of hard and soft constraints taken from Φ [11], [12]:
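Concretely, in a partial weighted MAX-SAT model the hard clauses must all be satisfied, while each satisfied soft clause contributes its weight to the value of the assignment. A minimal sketch of evaluating such a model (with an ad hoc clause encoding of our own):

```python
# A clause is a list of literals; a literal is (variable index, polarity).
def sat(clause, x):
    """True iff assignment x (a tuple of booleans) satisfies the clause."""
    return any(x[i] == pol for i, pol in clause)

def value(hard, soft, x):
    """-inf if any hard clause is violated, otherwise the total
    weight of the satisfied soft clauses."""
    if not all(sat(c, x) for c in hard):
        return float("-inf")       # infeasible
    return sum(w for w, c in soft if sat(c, x))

# Example: hard clause x0 OR x1; soft clauses (NOT x0, weight 1) and (x1, weight 2).
hard_clauses = [[(0, True), (1, True)]]
soft_clauses = [(1.0, [(0, False)]), (2.0, [(1, True)])]
print(value(hard_clauses, soft_clauses, (False, True)))   # -> 3.0
print(value(hard_clauses, soft_clauses, (False, False)))  # -> -inf
```

Solving the model then amounts to maximising `value` over all assignments.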
Problem statement
In contrast to existing constraint learning methods, we consider a more realistic setting where example solutions and non-solutions are context-specific. Here and below, Ψ indicates the set of possible contexts. More specifically, we assume each example to be generated as follows:
- 1. A context ψ ∈ Ψ is observed;
- 2. An assignment x that satisfies context ψ (i.e., x ⊨ ψ) is chosen according to some policy, e.g., by asking a domain expert to provide either a high-quality solution or a non-solution;
- 3. x is
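Under the assumed labelling rule that positives are exactly the optimal completions of their context (the labelling step is truncated above, so this is our own reading), the generative process can be sketched on a toy ground-truth model:

```python
import random
from itertools import product

# Hypothetical ground truth: hard constraint a OR b, objective a + 2b.
def optimal(ctx):
    """Best feasible completion of context ctx (brute force)."""
    best, best_val = None, float("-inf")
    for a, b in product([False, True], repeat=2):
        x = {"a": a, "b": b}
        if any(x[k] != v for k, v in ctx.items()):
            continue                  # x must satisfy the context
        if not (a or b):
            continue                  # x must satisfy the hard constraint
        if a + 2 * b > best_val:
            best, best_val = x, a + 2 * b
    return best

def sample_example(ctx, rng):
    """One contextual example: pick a context-satisfying assignment and
    label it positive iff it is the optimal completion (assumed rule)."""
    if rng.random() < 0.5:
        x = optimal(ctx)                               # expert-style solution
    else:
        x = {"a": rng.random() < 0.5, "b": rng.random() < 0.5}
        x.update(ctx)                                  # still satisfies the context
    return ctx, x, x == optimal(ctx)
```

Note that a negative example generated this way may be infeasible or merely sub-optimal, which is exactly the credit-assignment problem discussed in the introduction.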
Learnability of MAX-SAT models
In this section, we study the learnability of MAX-SAT models from contextual examples from the perspective of statistical learning theory.4 Basic knowledge of statistical learning theory is assumed; the necessary material, including the definitions of Vapnik-Chervonenkis dimension and the theorems that link it to generalisation outside of
Two implementations
We are finally ready to present our implementations of MAX-SAT learning. Both are based on Empirical Risk Minimisation, which, given X, Φ, and a context-specific data set S encompassing contexts Ψ, amounts to searching for a MAX-SAT model that minimises the empirical risk on S. This equates to solving the following optimisation problem: Intuitively, Eq. (7) searches for two vectors that together encode a MAX-SAT model (which constraints are hard, and with what weights), which is constrained by Eq. (8) to classify
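Stripped of the MILP and SLS machinery, the empirical risk being minimised can be sketched as a misclassification count over the contextual examples (a toy sketch; the `predicts_positive` interface is hypothetical and stands in for checking feasibility and contextual optimality under a candidate model):

```python
def empirical_risk(predicts_positive, data):
    """Fraction of contextual examples a candidate model misclassifies.
    `predicts_positive(ctx, x)` should hold iff x is feasible and optimal
    in context ctx under the candidate model; `data` holds
    (context, assignment, label) triples."""
    errors = sum(predicts_positive(ctx, x) != y for ctx, x, y in data)
    return errors / len(data)

def erm(candidates, data):
    """Empirical risk minimisation over a finite pool of candidates."""
    return min(candidates, key=lambda m: empirical_risk(m, data))
```

hassle-milp and hassle-sls search the (exponentially large) model space with a MILP solver and stochastic local search, respectively, rather than enumerating candidates as this sketch does.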
Experiments
In this section, we empirically answer the following research questions:
- Q1: Does ERM succeed in acquiring good quality MAX-SAT models from contextual examples?
- Q2: Among the four implementations of hassle-sls, which SLS strategy performs the best and how does it compare to hassle-milp?
- Q3: How does hassle-sls scale as the complexity of the ground-truth model increases?
- Q4: How good is our strategy of neighbourhood pruning compared to a naive implementation where each neighbour is evaluated?
- Q5: Is having both
Related work
Our approach is closely related to constraint learning and acquisition [4], [5]. There the goal is to acquire a constraint satisfaction problem (aka constraint network), usually from examples of feasible and infeasible assignments. However, the issue of learning from contextual examples, which are pervasive in real-world decision making, is usually ignored. One exception is QuAcq [23], which acquires hard constraints from membership queries about partial assignments. QuAcq, however, is allowed
Conclusion and future work
We introduced the novel learning task of acquiring combinatorial optimisation models from contextual data, focusing specifically on MAX-SAT models. Our analysis shows that ERM provably learns low-regret MAX-SAT models from context-specific examples in both the realisable and agnostic settings. It works even in the noisy setting, as long as the noise is random and occurs with probability less than 50%. These results justify our ERM-based implementations, hassle-milp and hassle-sls. The first
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
We thank the reviewers for helping to improve the quality of the manuscript. This work has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. [694980] SYNTH: Synthesising Inductive Data Models). The research of ST was partially supported by TAILOR, a project funded by EU Horizon 2020 research and innovation programme under GA No 952215.
References (43)
- et al., Automatic synthesis of constraints from examples using mixed integer linear programming, Eur. J. Oper. Res. (2017)
- et al., Structured learning modulo theories, Artif. Intell. (2017)
- et al., Modeling and solving staff scheduling with partial weighted MaxSAT, Ann. Oper. Res. (2019)
- et al., Guided local search for solving SAT and weighted MAX-SAT problems, J. Autom. Reason. (2000)
- et al., Partial weighted MaxSAT for optimal planning
- et al., New approaches to constraint acquisition
- et al., Learning constraints from examples
- et al., Large margin methods for structured and interdependent output variables, J. Mach. Learn. Res. (2005)
- et al., Search-based program synthesis, Commun. ACM (2018)
- et al., Learning MAX-SAT from contextual examples for combinatorial optimisation
- Stochastic Local Search: Foundations and Applications
- Stochastic Local Search Algorithms: An Overview
- Database queries as combinatorial optimization problems
- MaxSAT, hard and soft constraints
- Prediction, Learning, and Games
- Understanding Machine Learning: From Theory to Algorithms
- Principles of risk minimization for learning theory
- Learning from noisy examples, Mach. Learn.
- Local search methods
- The SAT phase transition
- SATLIB: an online resource for research on SAT