1 Introduction

The canonical graph representation problem is pertinent to a wide range of scientific applications. It is closely related to the graph isomorphism problem, as two graphs are isomorphic if and only if they have the same canonical representation. Examples of applications where graphs (and their representations) have been used include data-mining [1], mathematical chemistry [2, 3], computer vision [4], and bioinformatics [5]. The two problems are poly-time equivalent, and are among the few that are known to be in NP but are known neither to be solvable in polynomial time nor to be NP-complete. Recently László Babai has claimed to have produced a quasi-polynomial time algorithm for graph isomorphism. This result is yet to be published.

There are a variety of software tools devoted to solving the two problems “in practice,” one of which is nauty (No AUTomorphisms, Yes?), due to McKay [6, 7]. Nauty is sometimes referred to as the world’s fastest isomorphism testing program. It is also able to produce a canonically-labeled isomorph of a graph to assist in isomorphism testing. The nauty package includes a suite of programs called gtools, which are useful for processing files of graphs stored in one of two compressed formats. In this paper we make use of the shortg tool to remove isomorphs from a file of graphs stored in the graph6 format.

This paper is about constraint problems which involve the search for either a single graph that satisfies certain properties, or all graphs that do so. For example, consider the problem to determine if there exists an undirected graph with 31 vertices, 81 edges, and which does not contain cycles of length 4 or less. This question arises in “extremal graph theory” [8], and its answer was unknown prior to the work described in this paper [9]. The search space for problems of this type is enormous, and search may be optimized by restricting it to focus on canonical representations, or to avoid isomorphic graphs as much as possible. The general idea is to “break” symmetries in the search space that derive from the fact that the actual names of the vertices in a solution do not matter. When searching for a graph coloring where the edges of the graph are associated with colors, on top of graph isomorphism, solutions are typically closed under permutations of the colors: the actual names of the colors do not matter. However, it is not clear how to apply this idea when searching for a graph. In this type of problem the graph is a variable, so graph algorithms for canonical representation and isomorphism, as well as tools such as nauty, all of which operate on given graphs, do not apply. This paper provides a solution to this problem.

We assume a setting where testing for the existence of a graph G satisfying a property P is posed as a Boolean constraint P(AG) on the variables of the Boolean adjacency matrix AG of G. We follow the approach advocated by Crawford et al. [10], where a predicate, sb(AG), is introduced to break symmetries in the search space. In this way the satisfiability of P(AG) is equivalent to that of P(AG) ∧sb(AG). Ideally, sb(AG) is satisfied by a single member of each equivalence class of AG under graph isomorphism, thus drastically restricting the search space for P(AG) ∧sb(AG). However, this is not realistically possible as such a predicate also determines a canonical representation. In practice, it is sufficient that sb(AG) is satisfied by at least one member of the equivalence class of AG under isomorphism (typically by more than one) and in this case we say that sb is a symmetry breaking predicate. Shlyakhter [11] notes that the difficulty is to identify a symmetry-breaking predicate which is both effective (rules out a large portion of the search space) and compact (so that checking the additional constraints does not prohibitively slow down the search).

The presentations in [10, 11] consider symmetry breaking in terms of isomorphism, but focus on different structures such as acyclic digraphs, relations, permutations and functions. We introduce a novel, effective and compact predicate to break symmetries on graph representation. We consider two different scenarios where an (unknown) graph G is represented by its adjacency matrix AG and the elements at positions i, j in AG are: (1) Boolean variables indicating the presence of an edge between vertices i and j; and (2) integer variables from a finite domain C = {0,1,…, k} indicating the presence of a colored edge between vertices i and j, where a value c > 0 indicates a c-colored edge.

This paper extends preliminary results presented in [12], in which our symmetry breaking constraints were introduced and applied to solve several open instances of a problem in extremal graph theory regarding the maximal number of edges in a graph with v vertices and no cycles of length k or less (i.e. graphs with girth at least k + 1), for k = 4 and k = 5. We also use our approach to determine the number of non-isomorphic extremal graphs of girth at least 5. Here, we exploit some properties of extremal graphs (presented and proved in [13]) which then enable us to apply our approach to solve additional open instances of maximal graphs with girth 5. We also extend the approach to apply to graph coloring problems and demonstrate its impact on the search for Ramsey numbers and colorings [14]. Note that in [15] we apply the results of this paper to determine the precise value of R(4,3,3). This was an open problem for over 30 years.

2 Graphs and their canonical representation

Throughout this paper we consider undirected simple graphs without loops or multiple edges. We focus on finite graphs and typically name the n vertices of a graph in the set {1,…, n}. We denote the Boolean values true and false by 1 and 0 respectively.

Definition 1

(Graph) A graph G = (V, E) has vertices V = {1,…, n} and edges E ⊆ V × V where (x, y) ∈ E ⇒ (y, x) ∈ E. The Boolean adjacency matrix, AG of G, is the n × n symmetric matrix where AG[x, y] ⇔ (x, y) ∈ E. The ith row of matrix A is denoted by A[i], and A[i, j] denotes the jth element of A[i]. The degree of vertex u ∈ V is degree(u) = |{(u, v) | (u, v) ∈ E}|. We denote the minimum and maximum degrees of the vertices in G as δ(G) and Δ(G), or δ and Δ when the context is clear.

In Section 5, we extend Definition 1 to colored graphs where the edges are associated with colors from a given finite domain of positive integers.

Example 1

Figure 1 illustrates three graphs with corresponding adjacency matrices.

Fig. 1 Three example graphs and their adjacency matrices

We use cycle notation to represent permutations. For example, the permutation (1,2,6)(3,4,5) on set {1,…,7} maps 1 to 2, 2 to 6, 6 to 1, and 3 to 4, 4 to 5, and 5 to 3 (cycles of length 1 are not included in the representation).

Definition 2

(permuting vertices) Let G = (V, E) be a graph with n vertices, AG the adjacency matrix for G, and π a permutation on {1,…, n}. Then π(G) is the graph obtained by permuting the vertices of G using π. Formally, π(G) = (V, E′) where E′ = {(π(x), π(y)) | (x, y) ∈ E}, and π(AG) is the adjacency matrix of π(G).

Definition 3

(graph isomorphism) G and G′ are isomorphic if there exists a permutation π such that \(A_{G} =\pi (A_{G^{\prime }})\).

Example 2

The graphs in Fig. 1 are isomorphic. We can permute G1 to G2 using π1 = (2,8,5,9,4,7,3) and G1 to G3 using π2 = (2,9,4,8,6,7,3).

Because of the way they are presented, the graphs in Example 2 (Fig. 1) are very obviously isomorphic. However isomorphism (particularly for large graphs) is not usually so easy to detect. For example, the graphs in Fig. 2 are isomorphic (using π = (1,2,3)(4,5,6)(7,10,11)) but, because of the placement of the vertices, are not obviously so. The fact that the two graphs are isomorphic can be ascertained from their adjacency matrices using a graph-theoretic tool such as nauty [6, 7].

Fig. 2 Isomorphic graphs presented with isomorphism concealed

Definition 4

(sequences, lexicographic order) Let A be a matrix and A[i]A[j] the concatenation of rows i and j (viewed as sequences). The length of a sequence s is denoted |s|. We use ≼ to denote the usual lexicographic order on sequences. We extend this notation in the obvious way: for matrices A and B, with n and m rows respectively, A ≼ B if and only if A[1]A[2]⋯A[n] ≼ B[1]B[2]⋯B[m]; and for graphs, G ≼ G′ if and only if \(A_{G}\preceq A_{G^{\prime }}\).

One way to define a canonical representation of a graph is to take the smallest graph (i.e. in the lexicographic order) which is isomorphic to G [16]. This is the definition which we adopt throughout the paper.

Definition 5

(canonical form of a graph) The canonical form of a graph G is the graph can(G) = min≼{π(G) | π is a permutation}. We say that G is canonical if G = can(G).
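Definition 5 can be made concrete for very small graphs by brute force. The following Python sketch is an illustration only: it enumerates all n! relabelings, so it is not the method used by nauty or by the constraint models of this paper, and the function names `permute`, `flatten` and `canonical` are hypothetical.

```python
from itertools import permutations

def permute(adj, perm):
    """Adjacency matrix of pi(G), where perm[x] is the image of vertex x (0-based)."""
    n = len(adj)
    out = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            out[perm[x]][perm[y]] = adj[x][y]
    return out

def flatten(adj):
    """Concatenate the rows of a matrix, as in Definition 4."""
    return [cell for row in adj for cell in row]

def canonical(adj):
    """Smallest relabeling in the row-concatenation order (exponential time)."""
    n = len(adj)
    return min((permute(adj, p) for p in permutations(range(n))), key=flatten)

# Path on three vertices whose middle vertex is labeled 0 (edges 0-1 and 0-2).
path = [[0, 1, 1],
        [1, 0, 0],
        [1, 0, 0]]
print(canonical(path))  # the middle vertex moves to the last position
```

In the canonical form of this path the middle vertex receives the largest label, which pushes the 1-entries as far towards the end of the concatenated rows as possible.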

Example 3

Consider the graphs of Fig. 1. The graph G3 is the canonical representation of G1, G2 and G3.

Note that the canonical representation of a graph does not necessarily order the vertices by degree. In Fig. 1, the vertices of G2 are ordered by degree: vertices {1,2,3,4} are of degree 2, vertices {5,6,7,8} are of degree 3 and vertex 9 is degree 4. But this is not the case for the canonical form, G3.

3 Symmetry breaking on representation

We first consider a symmetry breaking predicate, introduced without proof in [17], which constrains the rows of an adjacency matrix to be sorted lexicographically in non-decreasing order.

Definition 6

(lexicographic symmetry break) Let A be an n × n adjacency matrix. We define

$$\textsf{sb}_{\ell}(A) = \bigwedge_{i = 1}^{n-1} A[i]\preceq A[i + 1]$$

Observe the graphs in Fig. 1. We have \(\textsf {sb}_{\ell }(A_{G_{1}})=\mathit {false}\), \(\textsf {sb}_{\ell }(A_{G_{2}})=\mathit {false}\), and \(\textsf {sb}_{\ell }(A_{G_{3}})=\mathit {true}\).

Definition 6 is more subtle than might first appear. It defines a symmetry breaking predicate only because for every adjacency matrix A, \(\textsf {sb}_{\ell }(A^{\prime })\) is true for at least one of the matrices A′ isomorphic to A. Reversing the order, i.e. insisting that A[i] ≽ A[i + 1], would not define a symmetry breaking constraint. Consider for example any representation of the graph G with 2 vertices and a single edge. Then AG[1] ⋡ AG[2]. The subtlety arises because, in contrast to the case of breaking symmetries in matrix problems where rows and columns can be reordered, such as in [18,19,20], here we need to reorder rows and columns simultaneously, both in the same way. To prove the correctness of Definition 6 it is sufficient to show that \(\textsf {sb}_{\ell }(\mathit {can}(A))\) holds.

Theorem 1

Let G be a graph. Then \(\textsf {sb}_{\ell }(\mathit {can}(A_{G}))\) holds.

Proof

Let A be canonical and assume to the contrary that A does not satisfy \(\textsf {sb}_{\ell }(A)\). Let i be such that A[i] ⋠ A[i + 1]. It follows that there is a j such that for every 1 ≤ j′ < j, A[i, j′] = A[i + 1, j′], and A[i, j] > A[i + 1, j]. Let B be the matrix obtained by swapping rows i, i + 1 as well as columns i, i + 1. We show that B ≺ A, in contradiction to A being canonical. Since A[i, j] > 0, j ≠ i and there are two cases to consider.

  (a) j < i: Since the prefixes of length j − 1 of A[i] and A[i + 1] are equal (the diagonally striped regions in Fig. 3), A[i,1]⋯A[i, j − 1] = B[i,1]⋯B[i, j − 1]. Note also that A[i′] = B[i′] for 1 ≤ i′ < j. This is because the only elements to be swapped in these rows are those in columns i and i + 1 (the dotted regions in Fig. 3). These elements are equal because A is symmetric and A[i, i′] = A[i + 1, i′] for i′ < j. Hence the first cell to differ in A and B is at position [j, i], and, since A[j, i] > A[j, i + 1], A[j, i] > B[j, i]. So B ≺ A. Contradiction.

  (b) j > i: By a similar argument to the above, A[i,1]⋯A[i, j − 1] = B[i,1]⋯B[i, j − 1] and A[i′] = B[i′] for 1 ≤ i′ < i (so the similarly shaded sections of rows i and i + 1 and columns i and i + 1 in Fig. 4 are identical). It follows that the first cell to differ in A and B is at position [i, j] and that A[i, j] > B[i, j]. So B ≺ A. Contradiction. □

Fig. 3 Graph A, Theorem 1(a), j < i. Similarly shaded sections of rows/columns i and i + 1 are identical

Fig. 4 Graph A, Theorem 1(b), j > i. Similarly shaded sections of rows/columns i and i + 1 are identical

We now proceed to strengthen this notion of symmetry breaking. The following example illustrates a symmetry not captured by \(\textsf {sb}_{\ell }(A)\).

Example 4

Consider the adjacency matrix A1 depicted in Fig. 5, for which \(\textsf {sb}_{\ell }(A_{1})\) = true as the rows are ordered lexicographically. Observe that A1[2] ≼ A1[3] independent of whether we swap the vertices (rows and columns) 2 and 3, or not. Adjacency matrix A2 depicted in Fig. 5 is the result of this swap and it too satisfies \(\textsf {sb}_{\ell }(A_{2})\) = true. However, it is “closer” to canonical as A2 ≺ A1. Indeed A2 is the canonical representative of this graph. Figure 5 highlights that the first 3 elements of rows 2 and 3 are invariant under vertex swap.

Fig. 5 Graphs and adjacency matrices for Example 4

In view of Example 4 we introduce the following definition and then introduce a stronger symmetry breaking constraint.

Definition 7

(extended lexicographic order)

Let s be a sequence and I ⊆{1,…,|s|}. We denote by \((s\upharpoonright I)\) the sequence obtained from s by simultaneously omitting the elements at positions I. For a set of natural numbers I we denote by ≼I the order on sequences of length at least max(I) defined by: \(s_{1}\preceq _{I} s_{2} \Leftrightarrow (s_{1}\upharpoonright I) \preceq (s_{2}\upharpoonright I) \).

Definition 8

(improved lexicographic symmetry break) Let A be an n × n adjacency matrix. We define

$$\textsf{sb}^{*}_{\ell}(A) = \bigwedge_{i<j} A[i]\preceq_{\{i,j\}} A[j] $$
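Definitions 7 and 8 translate almost literally into code. The following Python sketch is illustrative only: the function names are hypothetical, and the 4 × 4 matrix is our own small example (a path on four vertices), not one of the matrices shown in the figures.

```python
def restrict(seq, omit):
    """Drop the entries of seq at the 1-based positions in omit (Definition 7)."""
    return [value for pos, value in enumerate(seq, start=1) if pos not in omit]

def leq_ext(s1, s2, omit):
    """The extended lexicographic order: compare after omitting positions."""
    return restrict(s1, omit) <= restrict(s2, omit)

def sb_lex(adj):
    """Definition 6: consecutive rows are lexicographically non-decreasing."""
    return all(adj[i] <= adj[i + 1] for i in range(len(adj) - 1))

def sb_star(adj):
    """Definition 8: A[i] precedes A[j] once positions i and j are ignored."""
    n = len(adj)
    return all(leq_ext(adj[i - 1], adj[j - 1], {i, j})
               for i in range(1, n + 1) for j in range(i + 1, n + 1))

# The path 1-4-2-3: rows are sorted, but rows 2 and 3 violate Definition 8.
a1 = [[0, 0, 0, 1],
      [0, 0, 1, 1],
      [0, 1, 0, 0],
      [1, 1, 0, 0]]
# The same graph after swapping vertices 2 and 3 (the path 1-4-3-2).
a2 = [[0, 0, 0, 1],
      [0, 0, 1, 0],
      [0, 1, 0, 1],
      [1, 0, 1, 0]]

print(sb_lex(a1), sb_star(a1))  # True False
print(sb_lex(a2), sb_star(a2))  # True True
```

Both matrices pass the row-sorting test, but only the lexicographically smaller one passes the stronger test, so the stronger predicate discards one of the two isomorphic candidates.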

Theorem 2

If A is an n × n adjacency matrix and \(\textsf {sb}^{*}_{\ell }(A)\) then \(\textsf {sb}_{\ell }(A)\).

Proof

Suppose that \(\textsf {sb}^{*}_{\ell }(A)\) and that for some row i, A[i] ⋠ A[i + 1]. Since A[i] ≼{i,i+1} A[i + 1], we must have that A[i,1]⋯A[i, i − 1] = A[i + 1,1]⋯A[i + 1, i − 1] and A[i, i + 2]⋯A[i, n] ≼ A[i + 1, i + 2]⋯A[i + 1, n]. Since A[i, i] = 0 and A[i, i + 1] = A[i + 1, i], either A[i, i] < A[i + 1, i] or A[i, i + 1] = A[i + 1, i] = 0. In both cases A[i] ≼ A[i + 1], and we have a contradiction. □

Observe that Definition 8 introduces O(n²) constraints on lexicographic order whereas Definition 6 introduces only O(n). This is needed because we lack a “transitivity”-like property stating that if s1 ≼{i,j} s2 and s2 ≼{j,k} s3 then also s1 ≼{i,k} s3. The fact that no such property holds is illustrated by the following example.

Example 5

Consider the adjacency matrix A shown in Fig. 6. While clearly A[1] ≼{1,2}A[2] and A[2] ≼{2,4}A[4], it is not the case that A[1] ≼{1,4}A[4].

Fig. 6 Extended lexicographic comparisons for Example 5 where only the highlighted entries are compared

Interestingly, transitivity does hold for rows at a distance of two apart.

Theorem 3

A[i] ≼{i, i+ 1}A[i + 1] ∧ A[i + 1] ≼{i+ 1, i+ 2}A[i + 2] ⇒ A[i] ≼{i, i+ 2}A[i + 2]

Proof

Assume the premise and adopt the following representation where the boxed elements are at positions i, i + 1 and i + 2 in the sequences.

$$\begin{array}{rcl} A[i] &=& S_{1}\;\boxed{0}\;\boxed{x}\;\boxed{y}\;T_{1}\\ A[i+1] &=& S_{2}\;\boxed{x}\;\boxed{0}\;\boxed{z}\;T_{2}\\ A[i+2] &=& S_{3}\;\boxed{y}\;\boxed{z}\;\boxed{0}\;T_{3} \end{array}$$

From the premise and by definition of ≼I, we have S1yT1 ≼ S2zT2 and S2xT2 ≼ S3yT3. We prove that S1xT1 ≼ S3zT3, which gives the result. There are two cases: either (a) S1 ≺ S2, and since S2 ≼ S3 we have S1 ≺ S3 and the result holds; or (b) S1 = S2, then either S2 ≺ S3, and the result follows, or S2 = S3, and it remains to show that xT1 ≼ zT3. Suppose y < z; then since x ≤ y we have that x < z and the result holds. Otherwise, y = z. Suppose x < y; then clearly x < z and the result holds. So assume x = y = z. Then we have that T1 ≼ T2 and T2 ≼ T3 and hence the result holds. □

Intuitively, transitivity fails for the general case (for rows i, j and k) due to the possibility that the position of the first element in row i that is not equal to the corresponding element in row j may be between i and j or between j and k (or a similar scenario for rows j and k or rows i and k), which is clearly not possible for the case in Theorem 3. Using a representation similar to the above, where the boxed elements are in positions i, j and k in the sequences:

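$$\begin{array}{rcl} A[i] &=& S_{1}\;\boxed{0}\;T_{1}\;\boxed{x}\;U_{1}\;\boxed{y}\;V_{1}\\ A[j] &=& S_{2}\;\boxed{x}\;T_{2}\;\boxed{0}\;U_{2}\;\boxed{z}\;V_{2}\\ A[k] &=& S_{3}\;\boxed{y}\;T_{3}\;\boxed{z}\;U_{3}\;\boxed{0}\;V_{3} \end{array}$$

If S1 = S2 = S3, T3 ≺ T1 ≺ T2, x = 0 and y = 1 then S1T1U1yV1 ≼ S2T2U2zV2 and S2xT2U2V2 ≼ S3yT3U3V3, i.e. A[i] ≼{i,j} A[j] and A[j] ≼{j,k} A[k], but S1T1xU1V1 ⋠ S3T3zU3V3, i.e. A[i] ⋠{i,k} A[k].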

By removing the redundancy implied by Theorem 3 we can remove O(n) constraints and refine Definition 8 thus:

Corollary 1

$$\textsf{sb}^{*}_{\ell}(A) = \bigwedge_{\tiny \begin{array}{c} i<j\\j-i\neq 2 \end{array}} A[i]\preceq_{\{i,j\}} A[j] $$
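The redundancy stated by Theorem 3 can be sanity-checked exhaustively for small n. The sketch below is illustrative (hypothetical function names, 0-based indices); it compares the full conjunction of Definition 8 with the reduced one on every symmetric 0/1 matrix with a zero diagonal.

```python
from itertools import product

def all_graphs(n):
    """Yield every symmetric 0/1 matrix of order n with a zero diagonal."""
    cells = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for bits in product((0, 1), repeat=len(cells)):
        adj = [[0] * n for _ in range(n)]
        for (i, j), bit in zip(cells, bits):
            adj[i][j] = adj[j][i] = bit
        yield adj

def leq_ext(adj, i, j):
    """Row i precedes row j once columns i and j are ignored (0-based)."""
    keep = [c for c in range(len(adj)) if c != i and c != j]
    return [adj[i][c] for c in keep] <= [adj[j][c] for c in keep]

def sb_star_full(adj):
    n = len(adj)
    return all(leq_ext(adj, i, j) for i in range(n) for j in range(i + 1, n))

def sb_star_reduced(adj):
    """As above, but without the pairs at distance exactly two."""
    n = len(adj)
    return all(leq_ext(adj, i, j)
               for i in range(n) for j in range(i + 1, n) if j - i != 2)

def corollary_holds(n):
    return all(sb_star_full(a) == sb_star_reduced(a) for a in all_graphs(n))

print(all(corollary_holds(n) for n in range(2, 6)))
```

Such a check is no substitute for the proof, but it is a cheap guard against indexing mistakes when the reduced predicate is encoded in a solver.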

The following proves that \(\textsf {sb}^{*}_{\ell }\) is a symmetry-breaking predicate.

Theorem 4

Let A be a canonical adjacency matrix. Then \(\textsf {sb}^{*}_{\ell }(A)\) holds.

Proof

Let A be the canonical adjacency matrix for a graph G and assume to the contrary that A does not satisfy \(\textsf {sb}^{*}_{\ell }(A)\). That is, there exist i and j such that i < j and A[i] ⋠{i,j} A[j]. Let π denote the permutation which swaps vertices i and j in G. We show that \(B = A_{\pi (G)} \prec A\). Let k ≠ i, j be the first column at which A[i] and A[j] differ (except possibly for columns i and/or j). It follows that for every 1 ≤ k′ < k with k′ ≠ i, j, A[i, k′] = A[j, k′], and A[i, k] > A[j, k]. There are 3 cases to consider: k < i, i < k < j and k > j. We consider these cases below and illustrate them using Figs. 7, 8 and 9. In each case similarly shaded sections of rows/columns i and j are identical, and the grey squares denote the unknown (but identical) values A[i, j] and A[j, i].

  (a) k < i: Since the prefixes of length k − 1 of A[i] and A[j] are equal (the diagonally striped regions in Fig. 7), A[i,1]⋯A[i, k − 1] = B[j,1]⋯B[j, k − 1]. Note also that A[i′] = B[i′] for 1 ≤ i′ < k (the dotted regions in Fig. 7). The i and j elements in A[i′] are equal because A is symmetric and A[i, i′] = A[j, i′] for i′ < k. Hence the first cell to differ in A and B is at position [k, i], and A[k, i] > B[k, i]. So B ≺ A. Contradiction.

  (b) i < k < j: By a similar argument to the above, A[i,1]⋯A[i, k − 1] = A[j,1]⋯A[j, k − 1], except possibly in column i (the diagonally striped regions in Fig. 8). By symmetry A[1, i]⋯A[k − 1, i] = B[j,1]⋯B[j, k − 1], except possibly in row i (the dotted regions in Fig. 8) and so, in particular, for i′ < i, A[i′] = B[i′]. Since A[i, i] = 0 and A is canonical, A[j, i] must be 0. Hence, for k′ < k, A[i, k′] = B[i, k′]. It follows that the first cell to differ in A and B is at position [i, k] and that A[i, k] > B[i, k].

  (c) k > j: By a similar argument to the above, A[i,1]⋯A[i, k − 1] = A[j,1]⋯A[j, k − 1], except possibly in columns i and j (the diagonally striped regions in Fig. 9) and A[1, i]⋯A[k − 1, i] = B[1, j]⋯B[k − 1, j], except possibly in rows i and j (the dotted regions in Fig. 9). In particular, for i′ < i, A[i′] = B[i′]. By a similar argument to part (b), as A[i, i] = 0 and A is canonical, A[j, i] must be 0. Since A[j, i] = A[i, j] (by symmetry) and A[j, j] = 0, and we already know that rows i and j are otherwise identical up to column k − 1, it follows that the first cell to differ in A and B is at position [i, k] and that A[i, k] > B[i, k]. □

Fig. 7 Graph A, Theorem 4(a), k < i. Similarly shaded sections of rows/columns i and j are identical and grey shaded squares are unknown but identical values

Fig. 8 Graph A, Theorem 4(b), i < k < j. Similarly shaded sections of rows/columns i and j are identical and grey shaded squares are unknown but identical values

Fig. 9 Graph A, Theorem 4(c), j < k. Similarly shaded sections of rows/columns i and j are identical and grey shaded squares are unknown but identical values

Fig. 10 Graphs A and B, Theorem 5. Shaded areas in (a) denote the possible position of the first element to differ in (a) and (b)

Another way to think about \(\textsf {sb}^{*}_{\ell }(A)\) is that it prevents us from creating a form of the graph where swapping any two rows will lead to a lexicographically smaller graph.

Theorem 5

Let A be an adjacency matrix for graph G where \(\textsf {sb}^{*}_{\ell }(A)\) holds. Let π be a permutation that swaps vertices i and j in G. Then \(A \preceq A_{\pi (G)}\).

Proof

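Let \(B = A_{\pi (G)}\). Assume w.l.o.g. that i < j. Suppose to the contrary that B ≺ A; we show that \(\textsf {sb}^{*}_{\ell }(A)\) does not hold.

Let \(A[i] = S_{1}\,0\,S_{2}\,x\,S_{3}\) and \(A[j] = T_{1}\,x\,T_{2}\,0\,T_{3}\), where the entries 0 and x = A[i, j] = A[j, i] occur at positions i and j in the sequences. Now we know that \(B[i] = T_{1}\,0\,T_{2}\,x\,T_{3}\) and \(B[j] = S_{1}\,x\,S_{2}\,0\,S_{3}\) by the definition of B.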

Arrays A and B are shown in Fig. 10. We also label the regions of columns A and B that are (by symmetry) transpose of one of S1,⋯ , S3 or T1,⋯ , T3.

Suppose the first position where B and A differ is row k, column l. Then B[k] ≺ A[k]. By the nature of B, either k or l is i or j. Suppose that l = i or l = j; then, since A[k, i] ≠ B[k, i] if and only if A[k, j] ≠ B[k, j], we can assume that l = i. Similarly, if k is i or j we can assume that k = i. Also, by symmetry, if elements in row i and column j differ in A and B, so do the elements in row j and column i. We can therefore assume that the first element in A that differs from the corresponding element in B is one of: k < i, l = i; k = i, i < l < j; or k = i, l > j, i.e. in one of the shaded areas in Fig. 10.

If k < i, l = i, then since B[k] ≺ A[k], B[k, i] < A[k, i]. As B[k, i] = A[k, j] we have A[k, j] < A[k, i] and, by symmetry, A[j, k] < A[i, k]. Since k is the first position in which A[i] and A[j] differ, A[i]⋠{i, j}A[j].

If k = i, and i < l < j or l > j then since B[k] ≺ A[k], B[i, l] < A[i, l]. As B[i, l] = A[j, l] we have A[j, l] < A[i, l]. Since l is the first position in which A[i] and A[j] differ, A[i]⋠{i, j}A[j]. □
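Theorems 4 and 5 can both be sanity-checked by exhaustive enumeration on very small graphs. The Python sketch below is illustrative only (hypothetical function names, 0-based indices, brute-force canonical form); it is not the machinery used in our experiments.

```python
from itertools import permutations, product

def all_graphs(n):
    """Yield every symmetric 0/1 matrix of order n with a zero diagonal."""
    cells = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for bits in product((0, 1), repeat=len(cells)):
        adj = [[0] * n for _ in range(n)]
        for (i, j), bit in zip(cells, bits):
            adj[i][j] = adj[j][i] = bit
        yield adj

def relabel(adj, perm):
    """Apply a vertex relabeling to rows and columns simultaneously."""
    n = len(adj)
    return [[adj[perm[x]][perm[y]] for y in range(n)] for x in range(n)]

def sb_star(adj):
    """Definition 8 with 0-based indices."""
    n = len(adj)
    def drop(row, i, j):
        return [v for pos, v in enumerate(row) if pos != i and pos != j]
    return all(drop(adj[i], i, j) <= drop(adj[j], i, j)
               for i in range(n) for j in range(i + 1, n))

def theorem4_holds(n):
    """Every canonical matrix (found by brute force) satisfies sb_star."""
    return all(sb_star(min(relabel(a, p) for p in permutations(range(n))))
               for a in all_graphs(n))

def theorem5_holds(n):
    """If sb_star(A) holds, no single vertex swap yields a smaller matrix."""
    for a in all_graphs(n):
        if not sb_star(a):
            continue
        for i in range(n):
            for j in range(i + 1, n):
                perm = list(range(n))
                perm[i], perm[j] = j, i
                if relabel(a, perm) < a:  # nested lists compare row by row
                    return False
    return True

print(theorem4_holds(4), theorem5_holds(4))
```

Comparing nested lists row by row agrees with comparing the concatenated rows of Definition 4, because all rows have the same length.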

Note that often we may wish to separate vertices of the graph into equivalence classes a priori, and generate a graph that satisfies those equivalence classes. We can still use (extended) lexicographic ordering to help constrain the resulting adjacency matrices, since we can extend Theorem 4 to this case.

Definition 9

(ordered partition) Let G be a graph. Then P = {P1,…, Pp} is an ordered partition of the vertices of G if ∀ 1 ≤ i < j ≤ p, vi ∈ Pi ∧ vj ∈ Pj ⇒ vi < vj.

Definition 10

(partition preserving permutation) Let P = {P1,…, Pp} be an ordered partition on the vertices of G. A permutation π on the vertices of G is partition preserving for P if ∀ 1 ≤ i ≤ p, ∀ vi ∈ Pi, π(vi) ∈ Pi.

Example 6

Consider the graph G2 from Fig. 1 and the ordered partition P = {{1,2,3,4},{5,6,7,8},{9}}, which partitions vertices by degree. Then the permutation π = (2,3,4) is partition preserving for P. It maps elements in P1 to other elements in P1 and fixes elements in the other parts.
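The conditions of Definitions 9 and 10 are easy to test mechanically. The following sketch (hypothetical helper names; permutations are given in cycle notation on {1,…, n}) checks the permutation and partition of Example 6.

```python
def cycles_to_map(cycles, n):
    """Turn cycle notation into a dict on {1..n}; unlisted points are fixed."""
    image = {x: x for x in range(1, n + 1)}
    for cycle in cycles:
        for pos, x in enumerate(cycle):
            image[x] = cycle[(pos + 1) % len(cycle)]
    return image

def is_ordered_partition(parts):
    """Definition 9: every vertex of a part precedes every vertex of later parts."""
    return all(max(parts[a]) < min(parts[a + 1]) for a in range(len(parts) - 1))

def is_partition_preserving(image, parts):
    """Definition 10: each vertex is mapped into its own part."""
    return all(image[x] in part for part in parts for x in part)

parts = [{1, 2, 3, 4}, {5, 6, 7, 8}, {9}]
pi = cycles_to_map([(2, 3, 4)], 9)

print(is_ordered_partition(parts))                                # True
print(is_partition_preserving(pi, parts))                         # True
print(is_partition_preserving(cycles_to_map([(4, 5)], 9), parts)) # False
```

The transposition (4,5) fails because it moves a vertex of P1 into P2.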

Definition 11

(canonical partitioned adjacency matrix) The canonical form of a graph G with respect to an ordered partition P is the graph can(G, P) = min≼{π(G)|π is a partition preserving permutation for P}. We say that G is canonical for P if G = can(G, P).

We can define a symmetry breaking predicate for partitioned graphs as follows:

Definition 12

(partitioned lexicographic symmetry break) Let A be an n × n adjacency matrix and P = {P1, P2,…, Pp} be an ordered partition. We define

$$\textsf{sb}^{*}_{\ell}(A,P) = \bigwedge_{k = 1}^{p} \bigwedge_{\tiny \begin{array}{c} \{i,j\} \subseteq P_{k}, i<j\\j-i\neq 2 \end{array}}{} A[i]\preceq_{\{i,j\}} A[j]$$
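A direct transcription of Definition 12 is sketched below. The function names and the 4 × 4 example (a path on four vertices) are our own illustrative choices; vertices and positions are 1-based as in the definitions.

```python
def drop(row, i, j):
    """Remove the entries at the 1-based positions i and j."""
    return [v for pos, v in enumerate(row, start=1) if pos != i and pos != j]

def sb_star_partitioned(adj, parts):
    """Definition 12: compare only rows whose vertices lie in the same part."""
    return all(drop(adj[i - 1], i, j) <= drop(adj[j - 1], i, j)
               for part in parts
               for i in part for j in part
               if i < j and j - i != 2)

# The path 1-4-2-3.
a = [[0, 0, 0, 1],
     [0, 0, 1, 1],
     [0, 1, 0, 0],
     [1, 1, 0, 0]]

print(sb_star_partitioned(a, [{1, 2}, {3, 4}]))  # True
print(sb_star_partitioned(a, [{1, 2, 3, 4}]))    # False: rows 2 and 3 clash
```

With a single part the predicate coincides with the unpartitioned one; splitting the vertices into {1,2} and {3,4} removes the comparison between rows 2 and 3, so the same matrix is accepted.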

Theorem 6

Let G be a canonical partitioned graph for an ordered partition P. Then \(\textsf {sb}^{*}_{\ell }(A_{G},P)\) holds.

Proof

Let A be the canonical adjacency matrix for graph G and assume to the contrary that A does not satisfy \(\textsf {sb}^{*}_{\ell }(A,P)\). That is, there exists a part Pk and {i, j} ⊆ Pk with i < j where A[i] ⋠{i,j} A[j]. Let \(B = A_{\pi (G)}\) where π swaps i and j. Note that π is a partition preserving permutation for P. Using a proof similar to that of Theorem 4 we can show that B ≺ A. □

Example 7

Let G be the graph G2 from Fig. 1 and let P = {{1,2,3,4},{5,6,7,8},{9}} be the ordered partition which partitions vertices by degree. Then G is canonical for P (even though, as shown in Example 3, G is not canonical). Clearly \(\textsf {sb}^{*}_{\ell }(A_{G},P)\) holds, although \(\textsf {sb}^{*}_{\ell }(A_{G})\) does not.

In Section 4 we exploit the fact that symmetry breaking can be applied without disturbing a defined partitioning when enforcing the existence of embedded stars (but not otherwise).

4 Extremal graph problems

Extremal graph theory [8] is the study of graphs that are maximal (or minimal) in some way (for example in terms of number of edges) and which satisfy a given property. Extremal graph theory has many applications, both in other areas of mathematics and in fields such as chemistry [21], biology [22] and cryptography [23].

We apply a constraint-based approach to some extremal graph problems and illustrate the advantage of symmetry breaking on the graph representation.

The girth of a graph is the size of the smallest cycle contained in it. Let \(\mathcal {F}_{k}(v)\) denote the set of graphs with v vertices and girth at least k + 1. Let fk(v) denote the maximum number of edges in a graph in \(\mathcal {F}_{k}(v)\). A graph in \(\mathcal {F}_{k}(v)\) with fk(v) edges is called extremal. The number of non-isomorphic extremal graphs in \(\mathcal {F}_{k}(v)\) is denoted Fk(v). Extremal graph problems involve discovering values of fk(v) and Fk(v) and finding witnesses. In [24] the authors attribute the discovery of values f4(v) for v ≤ 24 to [9] and for 25 ≤ v ≤ 30 to [25]. Hand proofs for f4(v) for 40 ≤ v ≤ 49 are presented in [26]. In [9] the authors report values of F4(v) for v ≤ 21. In [9] and [27] algorithms are applied to compute lower bounds on f4(v) for 31 ≤ v ≤ 200. Some of these lower bounds are improved in [28] and improved upper bounds for 33 ≤ v ≤ 42 are proved in [29]. The currently known values of f4(v) and of F4(v) are available as sequences A006856 and A159847 of the On-Line Encyclopedia of Integer Sequences [30].

Fig. 11 Basic constraint model for extremal graph problems (no cycles of length 4 or less) with v vertices and e edges

Our basic constraint model is shown in Fig. 11, where we assume given values of v (number of vertices) and e (number of edges) and that A is a v × v matrix of Boolean variables. Constraint (1) states that the graph is simple (symmetric with no self loops), Constraints (2) and (3) express that there are no cycles of length 3 or 4, and Constraint (4) states that the number of edges is e. In practice, Constraints (2) and (3) are implemented more efficiently. We introduce additional Boolean variables for each triplet of (distinct) vertices i, j, k with i < k: \(x_{i,j,k} \Leftrightarrow A[i,j] \wedge A[j,k]\) represents a length 2 path between i and k via j; and \(x_{i,k} \Leftrightarrow \bigvee \{x_{i,j,k} \mid j\neq i, j\neq k\}\) represents the existence of any length 2 path between i and k. We then express Constraints (2) and (3) as \(\forall _{i,k}.\ A[i,k] + x_{i,k} < 2\) and \(\forall _{i,k}. \sum \nolimits _{j} x_{i,j,k} < 2\).
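The reformulation of Constraints (2) and (3) in terms of length 2 paths can be illustrated on concrete (fully assigned) matrices. The Python sketch below is a checker, not a constraint encoding, and its function names are hypothetical: a triangle corresponds to an edge between i and k together with a length 2 path, and a 4-cycle to two distinct length 2 paths between i and k.

```python
def no_short_cycles(adj):
    """True iff the graph has no cycle of length 3 or 4."""
    n = len(adj)
    for i in range(n):
        for k in range(i + 1, n):
            # number of true x_{i,j,k}: length 2 paths from i to k via j
            paths = sum(adj[i][j] * adj[j][k]
                        for j in range(n) if j != i and j != k)
            x_ik = 1 if paths > 0 else 0
            if adj[i][k] + x_ik >= 2:  # edge plus a 2-path: a triangle
                return False
            if paths >= 2:             # two 2-paths: a 4-cycle
                return False
    return True

def cycle_graph(n):
    """Adjacency matrix of the cycle on n vertices."""
    adj = [[0] * n for _ in range(n)]
    for a in range(n):
        b = (a + 1) % n
        adj[a][b] = adj[b][a] = 1
    return adj

print(no_short_cycles(cycle_graph(5)))  # True: girth 5
print(no_short_cycles(cycle_graph(4)))  # False: a 4-cycle
print(no_short_cycles(cycle_graph(3)))  # False: a triangle
```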

To explain Constraints (5)–(7) we recall Propositions 2.6 and 2.7 from [9] which state that for every graph in \(\mathcal {F}_{4}(v)\) with e edges the minimum and maximum vertex degrees, denoted δ and Δ, satisfy the following equations (assuming v ≥ 1):

$$v \geq 1+{\Delta}\delta \geq 1+\delta^{2}, \qquad \delta \geq e-f_{4}(v-1), \qquad {\Delta} \geq \lceil 2e/v\rceil \tag{1}$$

Given values for v and e we model the problem separately for each potential pair (δ,Δ) introducing constraints (5) to (7). In addition to the above constraints we introduce symmetry breaking constraints sb or \(\textsf {sb}^{*}_{\ell }\).

Example 8

For v = 31 and e = 80 the possible (δ,Δ) pairs satisfying Eq. 1 are {(4,6),(4,7),(5,6)}. Similarly, for v = 31 and e = 81 there is a single pair, (5,6).
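The pairs in Example 8 can be reproduced by enumerating the degrees allowed by Eq. 1. In the Python sketch below the function name is hypothetical, δ ≤ Δ is taken for granted, and the value f4(30) = 76 passed as the third argument is an assumption taken from OEIS sequence A006856; it is not stated in the text above.

```python
def degree_pairs(v, e, f4_prev):
    """All (delta, Delta) with delta <= Delta that satisfy Eq. 1.

    f4_prev is the value f4(v - 1), supplied by the caller.
    """
    min_delta = max(0, e - f4_prev)   # delta >= e - f4(v - 1)
    min_Delta = -(-2 * e // v)        # Delta >= ceil(2e / v)
    pairs = []
    for delta in range(min_delta, v):
        for Delta in range(max(delta, min_Delta), v):
            if v >= 1 + Delta * delta:  # v >= 1 + Delta * delta
                pairs.append((delta, Delta))
    return pairs

print(degree_pairs(31, 80, 76))  # [(4, 6), (4, 7), (5, 6)]
print(degree_pairs(31, 81, 76))  # [(5, 6)]
```

The condition v ≥ 1 + δ² needs no separate test, since it follows from v ≥ 1 + Δδ and δ ≤ Δ.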

We describe three experiments to evaluate the impact of different symmetry breaking strategies. Experiments were run using the BEE [31] constraint solver.

Note that for v < 20 our experiments often take longer to run (even with \(\textsf {sb}^{*}_{\ell }\)) than using nauty’s geng tool [6] – for example we take 4.7 seconds to find F4(15), whereas geng takes only 1.8 seconds. However, for larger values of v, geng can not find a solution within the timeout period – for example geng takes over 60 hours to find F4(20). We stress though that the goal of our experiments is to demonstrate the benefit of our symmetry breaks, not to compare the speed of our approach to other algorithms.

We present the results obtained using BEE which compiles finite domain constraints to CNF and solves them using an underlying SAT solver. Our configuration uses CryptoMiniSat v2.5.1 [32]. BEE performs CNF simplification by applying a constraint-driven technique called equi-propagation [33] and partial evaluation. All experiments are performed on a single core of an Intel(R) Core(TM) i5-2400 3.10GHz CPU with 4GB memory under Linux (Ubuntu lucid, kernel 2.6.32-24-generic). BEE is written in Prolog and run using SWI Prolog v6.0.2 64-bits. All experiments were replicated and verified using the Choco constraint programming toolkit. As run times were considerably higher in this case we only give results for the BEE experiments.

4.1 Experiment 1: computing f4(v)

Fig. 12 (a) the star S6,4, (b) its adjacency matrix, and (c) a member of \(\mathcal {F}_{4}(31)\) with 80 edges (0 = white, 1 = black, not determined = gray)

Table 1 summarizes the results for a constraint-based approach to compute values of f4(v). We compare the computation time for four configurations: (1) no symmetry breaking, (2) breaking symmetries using sb, (3) breaking symmetries using \(\textsf {sb}^{*}_{\ell }\), and (4) breaking symmetries using sb with an embedded star. The columns in Table 1 specify (from left to right) the number of vertices, v, and the value f4(v). Then, for each of the four configurations, we specify the computation time to compute a graph with f4(v) edges (columns labeled “sat”), and to show the non-existence of a graph with f4(v) + 1 edges (columns labeled “unsat”).

Table 1 Computing f4(v) (time in seconds; timeout 4hrs)

For the first three configurations, we apply the constraint model from Fig. 11. For the fourth configuration, we add an additional constraint to the model. To this end we follow [9], where it is noted that every graph in \(\mathcal {F}_{4}(v)\) with at least 5 vertices and minimum/maximum vertex degrees (δ,Δ) contains a (Δ, δ − 1)-star. In general, an (m, n)-star is a rooted tree, denoted Sm,n, where the root has m children, each of which has n ≥ 1 children, all of which are leaves. The existence of an SΔ,δ−1 in any extremal graph (with at least 5 vertices) follows from the fact that the children and grandchildren of any vertex are distinct (as there are no 3- or 4-cycles). So, we add constraints to explicitly embed SΔ,δ−1 in the adjacency matrix. In Section 4.2, in order to distinguish SΔ,δ−1 stars from new stars that we introduce, we refer to such a star as a simple Garnick star. In this setting, based on Theorem 6, we impose symmetry breaking on the Δ clusters of δ − 1 leaves of SΔ,δ−1 as well as on the cluster of vertices not in the star (and use the partition preserving symmetry break as described in Theorem 6). Figure 12a illustrates the star S6,4.

A 31 × 31 adjacency matrix with an embedded S6,4 is depicted as Fig. 12b. Black and white cells indicate values 1 and 0 respectively, and gray cells indicate unassigned Boolean variables. The last row of the matrix corresponds to the root. Moving up we find the 6 children of the root, and then its 24 grandchildren. Note that although in this particular example all vertices are in the star, this is not generally true. We will later return to explain Fig. 12c.
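The layout just described can be sketched as a partially assigned matrix. The Python code below is an illustration under explicit assumptions: the function name is hypothetical, only the diagonal and the star edges are fixed (the matrix in Fig. 12b fixes further cells, which are left undetermined here), and any vertices outside the star are placed first, which is our own choice and is not specified above.

```python
def embed_star(v, m, n):
    """v x v matrix with an (m, n)-star embedded.

    Cells hold 1 (edge), 0 (non-edge) or None (undetermined). The root is
    the last vertex, preceded by its m children, preceded by the m clusters
    of n grandchildren. Also returns the clusters of grandchildren.
    """
    size = 1 + m + m * n
    if v < size:
        raise ValueError("the star does not fit into v vertices")
    extra = v - size                 # vertices outside the star (assumed first)
    adj = [[None] * v for _ in range(v)]
    for x in range(v):
        adj[x][x] = 0                # no self loops
    root = v - 1
    first_child = extra + m * n
    for c in range(m):
        child = first_child + c
        adj[root][child] = adj[child][root] = 1
        for g in range(n):
            leaf = extra + c * n + g
            adj[child][leaf] = adj[leaf][child] = 1
    clusters = [list(range(extra + c * n, extra + (c + 1) * n))
                for c in range(m)]
    return adj, clusters

adj, clusters = embed_star(31, 6, 4)
print(sum(row.count(1) for row in adj) // 2)  # 30 star edges
print(clusters[0], clusters[-1])
```

The returned clusters are the parts on which a partitioned symmetry break in the sense of Definition 12 could then be imposed.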

Examining the first three columns labeled “sat” it appears that there is no significant gain in symmetry breaking when the instance is satisfiable and we need only find a single witness. In fact for many of the instances, the computation with no symmetry break is faster. When instances are unsatisfiable (the “unsat” columns), we encounter two types of instances: those which involve search and those which do not. For the latter type, unsatisfiability derives from the propagation of the constraints in Eq. 1 and the computation is fast for all configurations. For the other instances, the solver must explore the entire search space and the first three “unsat” columns indicate that symmetry breaking is then useful.

The bottom two rows in Table 1 describe our results for two open instances, computing f4(31) and f4(32). A lower bound of f4(31) ≥ 80 is given in [9] and a witness (discovered using our model in less than 22 seconds) is depicted as Fig. 12c. It is canonical with respect to a partitioning where the first 24 rows form 6 clusters of size 4 each (the grandchildren), the next 6 rows form a cluster (the children), and the last row is a singleton cluster (the root). With the proof that there is no witness with 81 edges (determined using our model in 33 minutes of CPU time) we conclude that f4(31) = 80. Given that f4(31) = 80, Eq. 1 implies that f4(32) ≤ 85. Hence the lower bound f4(32) ≥ 85 reported in [9] is the precise value, and f4(32) = 85. These are both new results.
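The step from f4(31) = 80 to f4(32) ≤ 85 can be checked with a small calculation. Eq. 1 is not reproduced in this section, so the sketch below uses a standard vertex-deletion argument as an assumption: it may differ in form from Eq. 1, although it yields the same bound for this instance. Deleting a vertex of minimum degree δ ≤ ⌊2e/n⌋ from a graph with n vertices, e edges and no 3- or 4-cycles leaves such a graph on n − 1 vertices, so e − ⌊2e/n⌋ ≤ f4(n − 1). The function name is hypothetical.

```python
def vertex_deletion_upper_bound(n, f_prev):
    """Largest e with e - floor(2e / n) <= f_prev, where f_prev = f4(n - 1).

    The left-hand side never decreases as e grows (for n >= 2), so the
    first e that violates the inequality ends the scan.
    """
    e = f_prev  # e = f_prev always satisfies the inequality
    while (e + 1) - (2 * (e + 1)) // n <= f_prev:
        e += 1
    return e


print(vertex_deletion_upper_bound(32, 80))
```

With e = 86 the minimum degree is at most 5, so deleting such a vertex would leave at least 81 edges on 31 vertices, contradicting f4(31) = 80.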

A comparison between the configuration sb and sb with an embedded star demonstrates that search is much faster when the star is embedded.

4.2 Experiment 2: computing F4(v)

In this experiment we apply a constraint-based approach to compute the number of non-isomorphic extremal graphs with v vertices. We apply a constraint solver to generate all graphs satisfying the constraint model for v vertices and e = f4(v) edges with corresponding symmetry breaking constraints. We then apply nauty to determine the number of non-isomorphic graphs within this set. The time required to run nauty is negligible and not detailed in our results.
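The generate-then-filter pipeline can be illustrated at toy scale. The sketch below is not the constraint model of Fig. 11 and does not call nauty: as a stand-in, it enumerates edge sets by brute force and removes isomorphs with a naive canonical form (the minimum relabeled edge list over all vertex permutations). All function names are hypothetical, and the approach is only feasible for very small v.

```python
from itertools import combinations, permutations


def has_short_cycle(n, edges):
    """True if the graph contains a cycle of length 3 or 4."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for u, w in combinations(range(n), 2):
        common = len(adj[u] & adj[w])
        if common >= 2:                  # u-x-w-y-u is a 4-cycle
            return True
        if common >= 1 and w in adj[u]:  # u-w-x-u is a triangle
            return True
    return False


def canonical_form(n, edges):
    """Naive canonical form: smallest sorted edge list over all relabelings."""
    return min(
        tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
        for p in permutations(range(n))
    )


def extremal_graphs(n):
    """Return (max edge count, all labeled graphs attaining it)."""
    pairs = list(combinations(range(n), 2))
    for e in range(len(pairs), -1, -1):
        sols = [es for es in combinations(pairs, e)
                if not has_short_cycle(n, es)]
        if sols:
            return e, sols


e, sols = extremal_graphs(5)
classes = {canonical_form(5, es) for es in sols}
print(e, len(sols), len(classes))
```

For v = 5 the labeled solutions are the relabelings of the 5-cycle, which collapse to a single isomorphism class. The gap between the number of labeled solutions and the number of classes is what the symmetry breaking constraints aim to reduce.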

For smaller values, v ≤ 15, we consider the constraint model of Fig. 11. Table 2 shows for each value of v the maximum number of edges f4(v), the number of non-isomorphs F4(v), and the number of graphs generated (columns sols) and computation time (time, in seconds), for each of the three configurations. Our results are as expected: improving symmetry breaking makes a significant difference.

Table 2 Computing F4(v) (time in seconds; timeout 4hrs)

For 16 ≤ v ≤ 19, for all configurations we again follow [9] and apply an additional constraint to embed a (Δ, δ − 1) star (as we did for the final configuration in Table 1). An interesting observation is that for 17 ≤ v ≤ 19 the number of graphs generated is the same for sb and \(\textsf {sb}^{*}_{\ell }\). This is due to the structure of the solutions in these cases. Note that for \(\textsf {sb}^{*}_{\ell }\) to generate fewer graphs than sb, there must be some partition Pk and {i, j}⊆ Pk with i < j such that \(A[i] \preceq A[j]\) but \(A[i] \not\preceq_{\{i,j\}} A[j]\). This can only happen if (i, j) is an edge, and so can only occur in the partition containing the vertices that are not in the star. For the case v = 17 and (δ,Δ) = (4,5) there is only one vertex not contained in the star and so this is not possible. The other cases require further analysis but can be explained in a similar way.

For the larger instances where v ≥ 20, we first extend the approach regarding embedding a star in extremal graphs. We denote by \(S_{m,[n_{1},\ldots ,n_{m}]}\) a rooted tree whose root has m children, which respectively have n1,…, nm children (grandchildren of the root). In [13], for each v ≥ 21 and feasible pair (δ,Δ) we have identified stars which must be contained in any extremal graph (see Proposition 1). We embed the indicated structure in the adjacency matrix when performing the encoding, using clusters in the same way as for the simple Garnick embedded stars (and again use the partition preserving symmetry break as described in Theorem 6). Note that we have identified stars containing as many vertices as possible, so that our “not in the star” cluster is as small as possible.

The bottom rows in Table 2 describe our results for five open instances, computing F4(22), F4(23), F4(24), F4(25) and F4(32). These are new results and we did not rely on any previously known bounds to reduce the search space. Because of the embedded star, there is no noticeable difference between the symmetry break using \(\textsf {sb}^{*}_{\ell }\) or when using sb (and so we only present the results for \(\textsf {sb}^{*}_{\ell }\) in the table). However, as indicated by the table, not applying either is catastrophic. The single element of F4(32) is shown in Fig. 13.

Fig. 13 Unique girth 5 graph of order 32 with 85 edges

Note that values of F4(v) for 26 ≤ v ≤ 31 have also been calculated (and presented in [13]). In this case, although our approach was used to eliminate some sub-cases, the graphs were largely constructed by hand from graphs in F4(v − 1), so we do not include experimental results here. Note that the values of F4(v) in these cases are however included in Table 3.

Table 3 Embedded stars for 20 ≤ v ≤ 25, and v = 32

Proposition 1

Let G be an extremal graph with v vertices where 20 ≤ v ≤ 27 or v = 32. Then, the minimal and maximal degrees, (δ,Δ), of a vertex in G correspond to one of the cases indicated in Table 3. Furthermore, G has an embedded star of one of the forms indicated in Table 3 for the respective values of (δ,Δ). Note that if more than one star is indicated then either all graphs have embedded stars of at least one of the indicated types (where the form is given as \(S_{m,[n_{1},\ldots ,n_{m}]}\) or \(S_{m^{\prime },[n_{1}^{\prime },\ldots ,n_{m}^{\prime }]}\)), or all graphs have both of the indicated types (where the form is given as \(S_{m,[n_{1},\ldots ,n_{m}]}\) and \(S_{m^{\prime },[n_{1}^{\prime },\ldots ,n_{m}^{\prime }]}\)).

The full proof of Proposition 1 is presented in [13].

4.3 Experiment 3: computing f5(v)

For our final experiment regarding girth we consider the extremal graphs which contain no cycles of length 5 or less. To this end we extend the basic constraint model of Fig. 11 with an additional constraint stating that no sequence of five vertices forms a cycle, and we consider the optimization problem which computes values of f5(v). Table 4 shows our results. To better illustrate the impact of improved lexicographic symmetry breaking we consider also a predicate \(\textsf {sb}_{\ell }^{+}\) which is like \(\textsf {sb}_{\ell }^{*}\) but only compares consecutive rows of the matrix. We did this to determine whether the additional work required for \(\textsf {sb}_{\ell }^{*}\) (\(O(n^{2})\) row comparisons compared to \(O(n)\)) is worth it. It is clear that the much larger \(\textsf {sb}_{\ell }^{*}\) pays off. The stricter conditions force partial solutions to be rejected sooner, reducing computation time considerably for the larger instances.
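The additional constraint can be read as a check over every sequence of five distinct vertices. The following sketch tests a fully assigned adjacency matrix directly rather than posting constraints to a solver. The names `has_cycle_of_length` and `cycle_graph` are hypothetical, and the enumeration is a brute-force illustration for k ≥ 3, not the encoding used in our experiments.

```python
from itertools import permutations


def has_cycle_of_length(adj, k):
    """True if some sequence of k distinct vertices forms a cycle (k >= 3)."""
    n = len(adj)
    for seq in permutations(range(n), k):
        # Consecutive vertices must be adjacent, wrapping around at the end.
        if all(adj[seq[i]][seq[(i + 1) % k]] for i in range(k)):
            return True
    return False


def cycle_graph(n):
    """Adjacency matrix of the cycle on n vertices."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][(i + 1) % n] = adj[(i + 1) % n][i] = 1
    return adj


c6 = cycle_graph(6)
print([has_cycle_of_length(c6, k) for k in (3, 4, 5, 6)])
```

A graph is admissible for f5(v) when the check fails for k = 3, 4 and 5, as it does for the 6-cycle.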

Table 4 Computing f5(v) (time in seconds; timeout 4hrs)

5 Symmetry breaking for graph colorings

In this section we show that all of our results carry over to the search for graph edge colorings where the adjacency matrix contains integer values representing the color of edges instead of Boolean values representing the presence of edges. A graph coloring, in k colors, is a pair (G, κ) consisting of a simple graph G = (V, E) and a mapping κ: E →{1,…, k}. We typically represent (G, κ) with V = {1,…, n}, as an n × n adjacency matrix, A, defined such that

$$A[i,j]= \left\{\begin{array}{lll} \kappa(i,j) & \text{ if } (i,j) \in E\\ 0 & \text{otherwise} \end{array}\right. $$
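As a small illustration of this definition, the sketch below builds the matrix A from an edge-to-color mapping. The function name `coloring_matrix` and the dictionary representation of κ are assumptions made for the example. Vertices are numbered 1,…, n as in the text and stored at 0-based positions.

```python
def coloring_matrix(n, kappa):
    """Adjacency matrix of a coloring: A[i][j] = kappa(i, j) on edges, else 0.

    kappa maps an edge (i, j), with 1 <= i < j <= n, to a color in 1..k.
    """
    A = [[0] * n for _ in range(n)]
    for (i, j), color in kappa.items():
        if not (1 <= i < j <= n) or color < 1:
            raise ValueError(f"bad edge or color: {(i, j)} -> {color}")
        A[i - 1][j - 1] = A[j - 1][i - 1] = color  # keep A symmetric
    return A


# A path 1 - 2 - 3 whose two edges receive colors 1 and 2.
A = coloring_matrix(3, {(1, 2): 1, (2, 3): 2})
print(A)
```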

A graph coloring problem is a formula φ(A) where A is an n × n adjacency matrix of integer variables (representing edge colors) together with a set (conjunction) of constraints φ on these variables. A solution is an assignment of integer values to the variables in A which satisfies φ and determines both the graph edges and their colors.

Suppose we are looking for a graph coloring A that satisfies a given property P. Here, solutions are typically closed under permutations both of vertices and of colors. Restricting the search space for a solution modulo such permutations is crucial when trying to solve hard graph coloring problems.

We observe that the notion of a canonical form given as Definition 5 and the lexicographic symmetry breaking predicates sb and \(\textsf {sb}_{\ell }^{*}\) of Definitions 6 and 8 are well-defined also for graph colorings where the non-zero entries in an adjacency matrix have values other than 1. Moreover, the proofs of Theorems 1 and 4 do not rely on the non-zero entries in an adjacency matrix having only value 1. Therefore our symmetry breaking constraints can be applied also in the search for graph colorings. We illustrate the impact of our approach on the search for Ramsey colorings (3,3,3;n) of the complete graph Kn.

An (r1,…, rk;n) Ramsey coloring is an assignment of one of k colors to each edge in the complete graph Kn such that it does not contain a monochromatic complete sub-graph \(K_{r_{i}}\) in color i for 1 ≤ i ≤ k. The Ramsey number R(r1,…, rk) is the least n > 0 such that no (r1,…, rk;n) coloring exists. The only known value of a nontrivial multicolor classical Ramsey number is R(3,3,3) = 17. The number of (3,3,3;n) colorings of the complete graph Kn on n vertices is known for 14 ≤ n ≤ 16 [14]. We illustrate here that we can reproduce these numbers using the symmetry break techniques proposed in this paper. Unfortunately, our technique does not suffice to compute the number of (3,3,3;13) colorings.

Figure 14 illustrates the constraints of the Ramsey coloring problem (3,3,3;n). Constraint (8) states that the graph has n vertices, is 3 colored, and is simple (symmetric, and with no self loops). Constraint (9) states that the n vertex graph has no embedded monochromatic sub-graph K3.

Fig. 14 Constraint model for Ramsey colorings (3,3,3;n)
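The two constraints can be read as a checker on a fully assigned matrix. The sketch below follows the textual description of Constraints (8) and (9), not the encoding of Fig. 14 itself, and the function name is hypothetical. The example uses the familiar coloring of K5 in which the pentagon takes one color and the pentagram another, leaving the third color unused.

```python
from itertools import combinations


def is_ramsey_333(A):
    """Check that A is a (3,3,3;n) Ramsey coloring of the complete graph."""
    n = len(A)
    # As in Constraint (8): symmetric, no self loops, every edge colored 1..3.
    for i in range(n):
        if A[i][i] != 0:
            return False
        for j in range(i + 1, n):
            if A[i][j] != A[j][i] or A[i][j] not in (1, 2, 3):
                return False
    # As in Constraint (9): no monochromatic triangle.
    for i, j, k in combinations(range(n), 3):
        if A[i][j] == A[j][k] == A[i][k]:
            return False
    return True


def circulant(n, color_of_distance):
    """Color edge (i, j) by the cyclic distance between i and j."""
    return [[0 if i == j else color_of_distance(min((i - j) % n, (j - i) % n))
             for j in range(n)] for i in range(n)]


k5 = circulant(5, lambda d: 1 if d == 1 else 2)
print(is_ramsey_333(k5))
```

Any coloring of K6 with only two colors fails the check, since R(3,3) = 6.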

Table 5 illustrates the impact of the symmetry breaking predicate sb on the search for (3,3,3;n) Ramsey colorings. The column headed by “#∖” specifies the known number of colorings modulo weak isomorphism [14].

Table 5 The search for (3,3,3;n) Ramsey colorings with and without the symmetry break sb (time in seconds with 24 hr. timeout)

Definition 13 ((weak) isomorphism of graph colorings)

Let (G, κ1) and (H, κ2) be k-color graph colorings with G = ([n], E1) and H = ([n], E2). We say that (G, κ1) and (H, κ2) are weakly isomorphic, denoted (G, κ1) ≈ (H, κ2) if there exist permutations π: [n] → [n] and σ: [k] → [k] such that (u, v) ∈ E1 ⇔ (π(u), π(v)) ∈ E2 and κ1((u, v)) = σ(κ2((π(u), π(v)))). When σ is the identity permutation, we say that (G, κ1) and (H, κ2) are isomorphic.
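Definition 13 can be tested directly on small colorings by trying every pair of permutations (π, σ). The sketch below is a brute-force illustration with hypothetical names and is practical only for very small n and k. The value 0, which marks a non-edge, is always mapped to itself.

```python
from itertools import permutations


def weakly_isomorphic(A, B, k, fix_colors=False):
    """Brute-force test of Definition 13 on coloring matrices A and B.

    With fix_colors=True only the identity color permutation is tried,
    which tests (plain) isomorphism.
    """
    n = len(A)
    if len(B) != n:
        return False
    colors = tuple(range(1, k + 1))
    sigmas = [colors] if fix_colors else permutations(colors)
    for sig in sigmas:
        sigma = (0,) + tuple(sig)  # sigma[0] = 0: non-edges stay non-edges
        for pi in permutations(range(n)):
            if all(A[u][v] == sigma[B[pi[u]][pi[v]]]
                   for u in range(n) for v in range(n)):
                return True
    return False


# Two colorings of K3 that differ by swapping colors 1 and 2.
A = [[0, 1, 1], [1, 0, 2], [1, 2, 0]]
B = [[0, 2, 2], [2, 0, 1], [2, 1, 0]]
print(weakly_isomorphic(A, B, 2), weakly_isomorphic(A, B, 2, fix_colors=True))
```

The two colorings are weakly isomorphic but not isomorphic: A has two edges of color 1 while B has only one, so no vertex permutation alone can match them.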

The columns headed by “#vars” and “#clauses” indicate, respectively, the number of variables and clauses in the corresponding CNF encodings of the coloring problems with and without the symmetry breaking constraint. The columns headed by “time” indicate the time (in seconds) to find all colorings iterating with a SAT solver. The timeout assumed here is 24 hours. The column headed by “#” specifies the number of colorings found when solving with the symmetry break. These include colorings which are weakly isomorphic, but far fewer than the hundreds of thousands generated without the symmetry break (until the timeout). We have verified using nauty that the colorings obtained using the symmetry break (the last column) reduce modulo isomorphism to the known numbers (the second column).

Figure 15 presents the two non-isomorphic colorings (3,3,3;16) represented as adjacency matrices in the form found using our encoding. Note the lexicographic order on the rows in both matrices. These colorings are isomorphic to the two colorings reported in 1968 by Kalbfleisch and Stanton [34] where it is also proven that there are no others (modulo weak isomorphism).

Fig. 15 R(3,3,3;16) graphs

There are many papers that consider ways to exploit symmetries in the search for Ramsey numbers and colorings. In particular: Gent and Smith [35], building on the work of Puget [36], study symmetries in graph coloring problems and recognize the importance of breaking symmetries during search. Meseguer and Torras [37] present a framework for exploiting symmetries to heuristically guide a depth first search, and also show results for (3,3,3;n) Ramsey colorings with 14 ≤ n ≤ 17. An advantage of our approach is that it is not specialized. We simply require the rows of the adjacency matrix to be lexicographically ordered.

In [15] we apply the results of this paper to show that the precise value of R(4,3,3) is 30. This was an open problem for over 30 years.

6 Conclusion

We have considered the problem of breaking symmetries during search when identifying undirected graphs satisfying a given property P, assuming a setting where testing for the existence of a graph G satisfying P is posed as a Boolean constraint P(AG) on the variables of the Boolean adjacency matrix AG of G. We have presented two symmetry breaking constraints (sb and \(\textsf {sb}^{*}_{\ell }\)) and formally proved their correctness. We have demonstrated the benefit of our approach by applying it to solve a variety of problems related to extremal graphs of girth at least k + 1, for k = 4 and k = 5, solving some open instances. In particular we have shown how combining our technique with known properties of extremal graphs (in our case, the existence of embedded stars) can increase its effectiveness. We have also extended the approach to apply to graph coloring problems and demonstrated its impact on the search for Ramsey numbers and colorings.