Archive for September, 2009

Hungarian algorithm, take 1

23 September, 2009

Hungarian algorithm is an efficient procedure to find a maximum weight matching in a bipartite graph \Gamma=(V_1\cup V_2,E) with parts V_1 and V_2 and a weight function w:E\to\mathbb{R}_+. Usually it is described in terms of potential functions on V=V_1\cup V_2, and looks quite mysterious, as it hides the origin of the procedure, that is a variation of the augmenting path method for finding a maximum (unweighted) matching in \Gamma. Here we describe this point of view.

Let M be a matching of maximum weight w(M) among all matchings of size |M|. If M is a maximum (unweighted) matching then we are done.

Otherwise, we construct a bipartite digraph D=(V,A) by orienting all the edges in M from V_2 to V_1, and the remaining edges E\setminus M in the opposite direction. We also construct the length function \ell: A\to\mathbb{R} by setting \ell(a)=w(a) for all a\in M, and \ell(a)=-w(a) for all a\in A\setminus M. Observe that there no negative length (directed) cycles in D. Indeed, any directed cycle C in D contains equally many edges in C\cap M from M and in C\setminus M. Note that \ell(C)<0 would mean that w(C\cap M)< w(C\setminus M), and so w(M) is not maximum among all matchings of size |M|, as w(C\triangle M)> w(M), whereas |C\triangle M|=|M|.

Let V_1^M and V_2^M be subsets of V_1, resp. V_2 that are unsaturated by M. We find (say, using the Bellman-Ford algorithm) a minimum length directed path P between V_1^M and V_2^M. By construction of D, P is an augmenting path for M. So M'=M\triangle P is a matching satisfying |M'|=|M|+1.

Claim: w(M') is maximum among all matchings of size |M'|.

We’ll prove this claim below. For now, we remark that it is obvious that this procedure, repeated with M' taking place of M, will find matchings of maximum weight among all matchings of size 1,2,3,\dots In the end it remains to select among these at most \min(|V_1|,|V_2|) matchings one with maximum weight.

Now let us prove the claim. It certainly holds for |M'|=1, as the procedure will select a maximum weight edge in E and M' will consist of this edge. So we can assume |M'|\geq 2. Let N be a matching of size |M'|, and of maximum weight. We need to show that w(N)\leq w(M'). As |N|=|M|+1, in the subgraph N\cup M there
exists a connected component Q that is a M-augmenting path Q.
But P is a shortest M-augmenting path, so \ell(Q)\geq\ell(P).
On the other hand, N\triangle Q is a matching of size |M|, and so w(N\triangle Q)\leq w(M), by induction. Now w(N)=w(N\triangle Q)-\ell(Q)\leq w(M)-\ell(Q)\leq w(M)-\ell(P)=w(M'), as claimed. QED.

As described, we need to do less that n=|V| loops of this procedure. In each loop, the Bellman-Ford dominates the complexity, so in total we need O(n^2|E|) operations. In fact, we can stop as soon as we reached the situation when w(M')\leq w(M).

Let a matching M have maximum w(M) among matchings of size |M|, and any matching M' of size |M'|=|M|+1 satisfy w(M')\leq w(M). Then w(N)\leq w(M) for any matching N of size bigger than |M|.

Proof. Suppose N be a matching contradicting the claim, minimal w.r.t. |N\triangle M|. Let D be the digraph constructed from \Gamma, w, and M as above. The connected components of |N\triangle M| induce oriented cycles and directed paths on D. Each such component will have a nonnegative total length, as neither there can be a negative length cycle in D (see above) nor can there be a negative length M-augmenting path (otherwise we can construct M'). But this implies that w(M)\geq w(N), contradiction. QED.

In particular, the complexity above can be improved to O(n|M||E|).


Bellman-Ford algorithm for shortest paths and potentials

18 September, 2009

Bellman-Ford shortest paths algorithm. Finding minimal distances in directed graphs, when some arc lengths are negative, cannot be done by Dijkstra’s algorithm (at least not without some preparation). Moreover, it is obvious that one cannot even define the minimal distances when the digraph has a negative length cycle.
While negative length paths look artificial, they arise naturally in modeling (say, a driver has to pay to use certain stretches of the road, but is paid for driving up other stretches) and algorithms, not the least in Hungarian method for weighted bipartite matching.

Let \Gamma=(V,A) be a directed graph and \ell:A\to \mathbb{R} a length function on the arcs of \Gamma. Suppose we are interested in the shortest, w.r.t. \ell, (directed) paths from v\in V to all the vertices of \Gamma (reachable from v, that is). We define d_k(u) to be the minimal length of at most k-arc path from v to u\in V.

Certainly, d_0(v)=0 (as our graph does not have a negative length cycle) and d_0(u)=\infty for all v\neq u\in V. Then, by induction,

d_k(u)=\min (d_{k-1}(u),\min_{(w,u)\in A}d_{k-1}(w)+\ell(w,u)).

That’s good, but can k grow arbitrarily? Well, if a walk to u visits a vertex more than once, this means it contains a cycle — unless this cycle has negative length (and we don’t allow it here) it can be removed from the walk, and it will not increase walk’s length. Thus without loss in generality, a shortest path to u has at most |V| vertices, and so d_n(u) is the distance from v to u.

Thus in at most O(|V||E|) operations we will find the distances from v to all the vertices of \Gamma.

Excercise. Modify the procedure to do a simultaneous computation of the distances from each v\in V to all u\in V, i.e. the distance matrix, in O(n^3) operations.

Potentials. In this setting, a useful concept is that of potential, a function p:V\to\mathbb{R} such that

\ell(u,v)\geq p(v)-p(u),\quad for all (u,v)\in A.

When all the arc lengths are nonnegative, the zero function v\mapsto 0 is a potential. When there are negative length arcs, a potential can be used to adjust the distance function so that the new one is nonnegative, and a path is shortest w.r.t. to the new function iff it is shortest w.r.t. the old one: namely, set \ell'(u,v)=\ell(u,v)-p(v)+p(u) for all arcs (u,v). Indeed, when we compute the \ell'-length \ell'(P) of the path P between x and y, we see that \ell'(P)=\ell(P)-p(y)+p(u).

Constructing a potential. Attach to \Gamma an extra vertex v_0 and join it with the vertices v of \Gamma by arcs (v_0,v) of length 0. Run Bellman-Ford algorithm for the distances d(v_0,v), and set p(v)=d(v_0,v). As for any arc (u,v) one has p(v)\leq p(u)+\ell(u,v), we indeed obtain a potential.

Again, here we assume that there are no negative length cycles in \Gamma – indeed, otherwise a potential does not exist (this can be proved by summing the inequalities \ell(u,v)\geq p(v)-p(u) along a cycle.)

Bipartite matchings via linear programming

14 September, 2009

An important tool for combinatorial optimisation problems, and graph problems among them, is linear programming, a method (and a problem) to find the optimum of a linear function on a polyhedron (i.e. on an intersection of half-spaces.) Often, such an intersection is bounded, and then one talks about a (convex) polytope rather than a polyhedron.
The optimum of a linear function on a polytope P is always reached on a vertex, or, more generally, on a set of vertices of P generating an optimal face. (A vertex is a point in P that cannot be written as a nontrivial convex combination of two other points of P.)

Here we demonstrate a way to encode the problem of finding the size of a maximum matching (or, more generally, the maximum weight of a matching) in a bipartite graph \Gamma=(V,E) as a linear programming problem.

As a warmup, we treat perfect matchings. Consider the vector of |E| variables x=(x_{e_1},x_{e_2}...), and the system of inequalities

x_e\geq 0,\ e\in E,\quad \sum_{v\in e\in E} x_e=1,\ v\in V,\qquad (*)

Certainly, if x=x_M is the characteristic vector of a perfect matching M of \Gamma, i.e. x is a 0-1 vector with 1s corresponding to the edges of M, it satisfies the system (*) above. Thus maximizing the linear function \sum_{e\in E} w_e x_e on the polytope defined by (*) gives an upper bound on the size of a maximum weight perfect matching in \Gamma w.r.t. the edge weights w_e, e\in E.
But in fact much more holds: namely, this bound is tight, as follows from the following theorem.

For a bipartite graph \Gamma, the vertices of the polytope P defined by (*) have 0-1 coordinates, and are in the one-to-one correspondence with the perfect matchings of \Gamma.

Proof. Let x be a vertex of P, and X be the support of x, i.e. X=\{e\in E\mid x_e> 0\}. Then X does not contain a cycle. Indeed, a cycle in X can be decomposed into the disjoint union of two (not necessarily perfect) matchings, say M and N, as \Gamma is bipartite and so each cycle in \Gamma is even. Observe that \sum_{v\in e\in E} ((x_M)_e-(x_N)_e)=0 for any v\in V.
Thus there exists \epsilon>0 such that x\pm\epsilon(x_M-x_N)\in P, and so 2x=(x+\epsilon(x_M-x_N))+(x-\epsilon(x_M-x_N))), contradicting the assumption that x is a vertex (and so it cannot be a convex combination of two points in P).

Thus X is a forest. Let e\in X be a leaf of a subtree of X. Then (*) implies that x_e=1. Therefore any connected component of X is an edge, and so X is a matching. It is a perfect matching by (*). QED.

By this theorem, the maximum of \sum_{e\in E} w_e x_e on P is reached on a perfect matching x=x_M. Another almost immediate corollary is the following well-known result on doubly stochastic matrices (an n\times n nonnegative real matrix is called doubly stochastic if its row and column sums are equal to 1.)

A doubly stochastic matrix is a convex combination of permutation matrices.

Indeed, n\times n-permutation matrices (the 0-1 doubly stochastic matrices) are in one-to-one correspondence to the perfect matchings of K_{n,n}, (the characteristic vector of such a matching can be viewed as an n\times n matrix, with rows corresponding to one part of the graph, and columns to the other) and so the doubly stochastic matrices are exactly the points of the perfect matching polytope of K_{n,n}, that has, by our theorem, perfect matchings as vertices (and so any point in it is a convex combination of perfect matchings).

Excercise. Formulate an analogy of system (*) for matchings (not necessarily perfect) in bipartite graphs, and prove the corresponding result for vertices of the polytope in question. Using it, formulate the linear programming problem with optimal value being the size of maximum matching.

Excercise. Show that these constructions fail for non-bipartite graphs.

Bipartite matchings and SDRs

14 September, 2009

Matchings in bipartite graphs and systems of distinct representatives (SDRs) of systems of subsets are closely related. More precisely, let \mathcal{A}=\{A_1,...,A_m\}\subseteq 2^X, for X a finite set. We can construct a bipartite graph \Gamma=(\mathcal{A}\cup X,E), where E=\{(A,x)\mid x\in A\in\mathcal{A}\}.

An SDR of \mathcal{A} is Y\subseteq X such that A_i\cap A_j\cap Y=\emptyset for all i\neq j and |A_i\cap Y|=1 for all i. In particular Y contains exactly one element y_A for each A\in\mathcal{A}.
Then \{(A,y_A)\mid A\in\mathcal{A}\} is a matching of \Gamma saturating the part \mathcal{A}. In particular, it is a maximum matching.
Conversely, a matching of \Gamma saturating \mathcal{A} gives rise to an SDR for \mathcal{A}.

An obvious necessary condition for existing of an SDR for \mathcal{A}
is the following Hall’s condition:

| \cup_{A\in\mathcal{B}} A|\geq |\mathcal{B}|,\qquad \forall\ \mathcal{B}\subseteq\mathcal{A}.

In view of this, Hall’s Theorem for matchings is equivalent to the following

An SDR for \mathcal{A} exists iff it satisfies Hall’s condition.

To see that Hall’s condition is sufficient, we proceed by induction on m. For m=1 the claim is obvious. Assume that it is true for all collections of less than m sets. We now establish that it holds for m-set collections, too.

If | \cup_{A\in\mathcal{B}} A|> |\mathcal{B}|,\quad \forall\ \mathcal{B}\subset\mathcal{A}, then we take any A\in \mathcal{A} and set y_A=y for some y\in A. Now by induction there is an SDR Y' for \mathcal{A}\setminus\{A\}. Then \{y_A\}\cup Y' is an SDR for \{B\setminus\{y_A\}\mid B\in\mathcal{A}\}.

In the remaining case there exists k< m such that | \cup_{A\in\mathcal{B}} A|=|\mathcal{B}|=k,\quad  \mathcal{B}\subset\mathcal{A}. By induction, there is an SDR Y' for \mathcal{B}.
Let \mathcal{C}\subseteq \mathcal{A}\setminus\mathcal{B}. Then | \cup_{A\in\mathcal{C}} A\setminus Y'|\geq |\mathcal{C}|=s, for otherwise | \cup_{A\in\mathcal{C}\cup\mathcal{B}} A|< s+k, contradicting Hall’s condition. Now we can take an SDR Y'' for \{A\setminus Y'\mid A\in\mathcal{A}\setminus\mathcal{B}, and observe that Y'\cup Y'' is an SDR for \mathcal{A}.

5-list-colouring of planar graphs

8 September, 2009

A \mathcal{C}-list-colouring of a graph \Gamma=(V,E) with a colour list \mathcal{C}=\{C(v_1),\dots,C(v_n)\}, for V=\{v_1,\dots,v_n\}, is a proper colouring of V by elements of \cup_j C(v_j) so that the adjacent vertices u,v are coloured in different colours and the colour for v\in V is in C(v). In particular, when all the C(v)'s are equal to each other we have a classic graph colouring.

\Gamma is called k-choosable, or k-list-colourable if it is \mathcal{C}-list-colourable for any \mathcal{C} satisfying |C(v)|\geq k for all v\in V. The minimal k for which is \Gamma is k-choosable is called the list chromatic number of \Gamma and denoted by \chi_\ell(\Gamma). Certainly, \chi(\Gamma)\leq \chi_\ell(\Gamma).
It is not hard to construct examples when \chi(\Gamma)< \chi_\ell(\Gamma), e.g. take \Gamma=K_{2,4} and consider \mathcal{C}=\{\{1,2\},\{3,4\},\{1,3\},\{1,4\},\{2,3\},\{2,4\}\}, where vertices v_1,v_2 in the size two part of \Gamma get colour sets C(v_1)=\{1,2\}, C(v_2)=\{3,4\}.

Using the fact that each planar graph has a vertex of degree at most 5, as can be easily established using the Euler formula, one can prove that each planar graph is 6-choosable (employ induction by removing a minimal degree vertex).

It is less easy to show that

any planar graph \Gamma is 5-choosable.

This was established in 1994 by Thomassen.
In fact \chi_\ell(\Gamma)=5 cannot be improved — there are examples of planar \Gamma for which \chi_\ell(\Gamma)=5. The smallest known (in 2003) example has 63 vertices.

The classical analogy of this, the Heawood theorem that says that each planar graph is 5-colourable, is known since 1890, and the 4-colourability is a recent and quite difficult result by Appel and Haken.

To show the 5-choosability, one first observes that it suffices to prove it for the planar graphs for which every face, except possibly the exterior face of the plane embedding, is a triangle. We call such graphs almost triangulated. Then, one proceeds by induction on |V|. Assume that we know that for any almost triangulated planar \Gamma with |V\Gamma|\leq n, exterior face B, and \mathcal{C} such that |C(v)|\geq 3 for all v \in B, resp. |C(v)|\geq 5 for all v\in V\setminus B, we can extend a proper \mathcal{C}-list-colouring of an edge on B to a full proper \mathcal{C}-list-colouring (as is obviously true provided |V|\leq 3, giving us a basis of induction).

Let us now show this for \Gamma with |V\Gamma|=n+1. Let (x,y) be an edge of B that we colour using \alpha\in C(x) and \beta\in C(y)\setminus\{\alpha\}.
If B has a chord, i.e. there is an edge (u,v) that joins two non-adjacent vertices of the cycle induced on B (and so (u,v) is not cutting through the fact bounded by the cycle on B), we spilt \Gamma into two planar parts \Gamma_1 and \Gamma_2, so that \Gamma_1 and \Gamma_2 have (u,v) in common. W.l.o.g. (x,y)\in E\Gamma_1; by induction, we can extend this colouring to a \mathcal{C}-list-colouring \mathcal{A}_1 of \Gamma_1. This colouring will fix a proper \mathcal{C}-list-colouring of the edge (u,v), that is on the exterior face of \Gamma_2. Again, by induction, we can extend this colouring of (u,v) to a \mathcal{C}-list-colouring \mathcal{A}_2 of \Gamma_2. Combining \mathcal{A}_1 and \mathcal{A}_2, we obtain a \mathcal{C}-list-colouring of \Gamma, as required.

It remains to consider the case when B does not have a chord.
Then let z\in B\setminus\{x,y\} be the vertex adjacent to x. Take any two colours \delta,\mu in C(z)\setminus\{\alpha\} (this is possible as |C(z)|\geq 3) and remove \delta,\mu from C(t) for all t\in\Gamma(z)\setminus B. Then remove z and all the edges on it from \Gamma. We obtain a planar graph \Gamma' with one vertex less, and with the valid exterior face conditions. By induction, we can extend the colouring of (x,y) to a \mathcal{C}-list-colouring of \Gamma'. It remains to see that z can be properly coloured, too. Indeed, we can colour it with either \delta or \mu, depending on the colour of the neighbour of z on B distinct from x. Q.E.D.