Bipartite Matching in Haskell

In this post, we implement in Haskell, the augmenting path method to find a maximum matching in a bipartite graph. We use function composition and recursion mostly. We obtain a simple implementation of an algorithm to compute maximum matching in a bipartite graph. As a side affect of this exercise, you should know about algorithms to compute terminal objects, and fix points in a category. Only a basic familiarity with Haskell is assumed here. We do not use any higher-order functions, so read on.

Data Types

Let us start with data types to represent a graph and a matching. A bipartite graph is represented as (A, B, E) where A, B is the bipartition of the vertices and E is the list of edges with ends points in A and B. We create two types to represent the vertices. The graph is stored a Map, in which the key corresponds to a vertex in A, and the value is the list of the adjacent vertices in B. The matching is simply stored as Map from B to A, with the key being a vertex in B, and value is the end point of the edge in A. The imports and the type declarations are below.

import Data.Map as Map
import Data.List.Split
import Data.List
import Data.Maybe
import System.Random
type A = Int
type B = Int
type Graph = Map A [B]
type Matching = Map B A

Next, we construct a random bipartite graph over $(10,10)$ vertices, with approximately $50$ edges. We first generate a list of $50$ random numbers. The $i^{th}$ node in A is connected to approximately $5$ nodes starting at position $(i-1)*5$ in this list. The graph g is constructed using the builtin fromList which creates a Map from a list of (key, value) pairs. Index of vertices in A is different from indices in B.

nbhrs :: A -> [B]    -- random list of edges
nbhrs k = 
      take k $ randomRs (11,20) (mkStdGen 1024) :: [B]  
g :: Graph        -- random graph
g = fromList $ [(x, nub y) | x <- [1..10], y <- nbhrList]
      where nbhrList = chunksOf 5 (nbhrs 50)
m :: Matching        -- empty matching
m = fromList []

Augmenting Paths

A pair of edges is called independent is they do not share a vertex. A matching $M$ is a collection of independent edges. Given a matching $M$, a vertex $v$, is $M$-saturated if some edge in $M$ is incident on $v$, it is $M$-unsaturated otherwise. Given two matchings $M$ and $M’$ where $|M| < |M’|$, the edges in the symmetric difference $M \Delta M’$, is a union of disjoint paths or even cycles. The edges in such a path or cycle alternate, in the sense that one is from $M$, the next one is from $M*$ and so on. One of these paths starts and ends at a $M$-unsaturated vertex. Given a matching $M$, and an augmenting path $P$ wrt $M$, we can increase the size of the matching, by adding the edges of the path and removing the edges that are common to $M$ and $P$.

Function extendb takes a graph, matching and a current alternating path as an input, and extends the path. The path is given as a list and the next vertex is added to the front of the path. As the first vertex is of type B we extend the path using an edge from the matching if one exists. The resulting alternating path is added to the front of a list of lists containing all the alternating paths.

extendb :: Graph ->  Matching -> [Int] -> [[Int]]
extendb g m (b : p)              -- last vertex added was in B
   | e == Nothing = [[]]        -- is a deadend
   | otherwise = [[fromJust e] ++ (b:p)]    -- extend the path
       where e = Map.lookup b m

If the first vertex is of type A then we extend the path by using an edge not in the matching m. The edge should not lead to a previously visited vertex in the path p. There are multiple such extensions, possibly. The function extenda below returns a list of lists all such possible extensions. The path cannot be extended to a vertex is $b \in B$, if $b$ is already visited, or $b$ is not connected by an edge, not in the matching. To test whether a vertex is repeated, we check if the length of the list decreases when the duplicate elements are removed from the list (using builtin nub).

extenda :: Graph ->  Matching -> [Int] -> [[Int]]
extenda g m (a : p)              -- last vertex added was in A
   | es == Nothing = [[]]        -- deadend
   | otherwise =  [ y | y <- res, length (nub y) == length y] 
        where es = Map.lookup a g 
               -- list of neighbhours of a
                 res = [ [x] ++ (a:p) | x <- fromJust es, Map.member x m == False] 

An augmenting path is discovered by performing a breadth first search starting at $M$-unsaturated vertices in A as shown in augPath.
The initial matching is empty. We assume that an augmenting path starts at a vertex in A and ends at a vertex in B. The absence of an augmenting path is denoted by the empty list []. The third argument is the set of candidate paths.

augPath :: Graph -> Matching -> [[Int]] -> [Int]
augPath g m []  = []
augPath g m (p:ps)
       |  p == []  = []
       | (mod (length p) 2) == 0 && Map.lookup (p!!0) m == Nothing = p
       | mod (length p) 2 == 1 = augPath g m ( ps ++ (extenda g m p) )
       | mod (length p) 2 == 0 = augPath g m ( ps ++ (extendb g m p) )
       -- new candidates are added to the end of the list 

On a categorical note, augPath computes a terminal object in a category in which the objects are the alternating paths, the morphisms are the functions extenda and extendb, the composition is function composition, and identity is the id function. This approach (bfs) can be used to find a terminal object in any category, as long as we can compute (follow) the morphisms.

Repeat while

The final step is to repeatedly augment the current matching using an augmenting path if one exists. The function augmenting returns an augmenting path as a list of vertices. From this path, we construct two Maps one for the edges in the current matching $M$, and the second for the edges not in $M$. The new matching is obtained by removing the edge in the first Map and adding the edges in the second Map. Care should be taken here in the construction of the Maps, as the augmenting paths are listed in the reverse order.

matching :: Graph -> Matching -> Matching
matching g m 
   | augmenting == [] = m
   | otherwise = matching g augMatch
       augmenting = augPath g m us         
       augnotMedges = chunksOf 2 augmenting
       augMedges = chunksOf 2 $ take ((length augmenting)-1) 
                  $ drop 1 augmenting
       augMMap = fromList [(y, x) | [x,y] <- augMedges]
       augnotMMap = fromList [(x,y) | [x,y] <- augnotMedges]
       augMatch = Map.union (difference m augMMap) augnotMMap
       unsat =  (keys g) Data.List.\\ (elems m)
       us =  [[x] | x <- unsat]    -- list of M-unsaturated vertices

Function matching computes a fix point in the category of matchings. Objects in this category are matchings. Morphisms are the functions that return augmenting paths. Two objects a connected by a morphism, if one matching can be augmented into another one. Augmenting paths compose, therefore the morphism compose. Identity is the id.

Finally, we compute a maximum matching in graph g starting with empty matching m.

r :: Matching
r =  matching g m


The random graph and the resulting maximum matching are shown below.

*Main> g
*Main> m
fromList []
*Main> r
fromList [(11,4),(17,2),(18,3),(19,1)]

The code is available here. As an exercise try to extract all the vertex disjoint paths of the shortest length and augment them in a single iteration (Hopcroft-Karp).

For details and the background please see, “The design and analysis of algorithms” by Kozen and “Basic category theory” (Ch 2) by Leinster. Please send an email if you have any comments.