Network Flows and Matching Problems
Network flow theory provides the mathematical backbone for optimising the movement of resources through interconnected systems, from data packets across the internet to goods in a supply chain. By abstracting these systems into graphs—networks of nodes (junctions) and edges (connections)—we can solve complex logistical problems with precision. This field hinges on powerful dualities, most notably the relationship between maximum flow and minimum cut, and extends elegantly into solving matching problems, such as assigning workers to optimal tasks.
Modelling a Flow Network and the Maximum Flow Problem
A flow network is a directed graph $G = (V, E)$ where each edge $(u, v) \in E$ has a non-negative capacity $c(u, v)$. We designate two special nodes: a source $s$, where flow originates, and a sink $t$, where flow terminates. A flow in this network is a function $f$ that assigns a real value $f(u, v)$ to each edge, subject to two key constraints.
First, the capacity constraint ensures flow on any edge does not exceed its limit: $0 \le f(u, v) \le c(u, v)$ for all edges $(u, v) \in E$. Second, the flow conservation constraint requires that, at every node other than $s$ and $t$, the total flow into the node equals the total flow out of it. Formally, for all $v \in V \setminus \{s, t\}$, we have $\sum_{u} f(u, v) = \sum_{w} f(v, w)$. The value of a flow, written $|f|$, is the net flow out of the source, which must equal the net flow into the sink. The maximum flow problem asks: what is the greatest possible flow value that can be pushed from $s$ to $t$ without violating these constraints?
Consider a simplified road network between a factory ($s$) and a port ($t$). Each road is an edge with a capacity representing the maximum number of trucks per hour. The flow is the actual number of trucks using that road. The goal is to maximise the total throughput from factory to port, ensuring no road is overloaded and that, at every intermediate town (node), the incoming trucks equal the outgoing ones: no trucks are magically created or lost.
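The two constraints can be checked mechanically. The sketch below is only an illustration: the function names `is_feasible_flow` and `flow_value`, the dictionary representation, and the three-road network (factory `s`, one town `a`, port `t`) are all assumptions made for this example, not part of the formal definition.

```python
def is_feasible_flow(capacity, flow, source, sink):
    """Return True if `flow` respects capacity and conservation.

    capacity and flow map (u, v) edge tuples to numbers; edges missing
    from `flow` are treated as carrying zero flow.
    """
    # Capacity constraint: 0 <= f(u, v) <= c(u, v) on every edge.
    for edge, cap in capacity.items():
        f = flow.get(edge, 0)
        if f < 0 or f > cap:
            return False
    # Conservation: inflow equals outflow at every intermediate node.
    nodes = {n for edge in capacity for n in edge}
    for v in nodes - {source, sink}:
        inflow = sum(flow.get(e, 0) for e in capacity if e[1] == v)
        outflow = sum(flow.get(e, 0) for e in capacity if e[0] == v)
        if inflow != outflow:
            return False
    return True


def flow_value(capacity, flow, source):
    """Net flow leaving the source."""
    out = sum(flow.get(e, 0) for e in capacity if e[0] == source)
    back = sum(flow.get(e, 0) for e in capacity if e[1] == source)
    return out - back


# Hypothetical road network: capacities in trucks per hour.
capacity = {("s", "a"): 3, ("a", "t"): 2, ("s", "t"): 1}
good_flow = {("s", "a"): 2, ("a", "t"): 2, ("s", "t"): 1}
bad_flow = {("s", "a"): 3, ("a", "t"): 2, ("s", "t"): 1}  # a truck vanishes at town a

print(is_feasible_flow(capacity, good_flow, "s", "t"))  # True
print(is_feasible_flow(capacity, bad_flow, "s", "t"))   # False
print(flow_value(capacity, good_flow, "s"))             # 3
```

The second flow stays within every road's capacity but breaks conservation at the town, which is why it is rejected.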
The Ford-Fulkerson Method and the Labelling Algorithm
The Ford-Fulkerson method is a seminal approach for finding maximum flow. Its core idea is iterative: start with a zero flow, repeatedly find a path from $s$ to $t$ along which more flow can be sent (an augmenting path), and augment the flow along that path. The method stops when no augmenting path remains, which is guaranteed to happen after finitely many steps when all capacities are integers. An implementation that searches for augmenting paths systematically is often called the labelling algorithm; when breadth-first search is used, so that each augmenting path has as few edges as possible, it is known as the Edmonds-Karp algorithm.
The labelling procedure works by systematically exploring the network from the source. We "label" nodes we can reach with additional flow, keeping track of the path and the maximum possible flow increase along it. Crucially, we must consider not just forward edges with leftover capacity, but also backward edges. A backward edge in the residual network represents our ability to reduce flow on an existing edge, effectively redirecting it. This is the key to finding globally optimal solutions.
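To see why backward edges matter, the following sketch runs a simple depth-first augmenting-path search twice on a small unit-capacity network, once with backward residual edges and once without. The function name `max_flow_dfs`, the `use_backward` switch, and the five-edge network are assumptions chosen for illustration; the edge order is deliberately arranged so that the first path found is the unlucky one through the middle edge.

```python
def max_flow_dfs(edges, source, sink, use_backward=True):
    """Augmenting-path max flow; edges is a list of (u, v, capacity).

    Neighbours are explored in the order the edges are listed.
    """
    residual = {}
    neighbours = {}
    for u, v, cap in edges:
        residual[(u, v)] = residual.get((u, v), 0) + cap
        neighbours.setdefault(u, []).append(v)
        if use_backward:
            # Backward edge starts with zero residual capacity.
            residual.setdefault((v, u), 0)
            neighbours.setdefault(v, []).append(u)

    def augment(u, limit, visited):
        """Try to push up to `limit` units from u to the sink."""
        if u == sink:
            return limit
        visited.add(u)
        for v in neighbours.get(u, []):
            room = residual.get((u, v), 0)
            if room > 0 and v not in visited:
                pushed = augment(v, min(limit, room), visited)
                if pushed > 0:
                    residual[(u, v)] -= pushed
                    if use_backward:
                        residual[(v, u)] += pushed  # allow later undo
                    return pushed
        return 0

    total = 0
    while True:
        pushed = augment(source, float("inf"), set())
        if pushed == 0:
            return total
        total += pushed


# Unit capacities; the first path found is s -> a -> b -> t.
edges = [("s", "a", 1), ("a", "b", 1), ("a", "t", 1),
         ("s", "b", 1), ("b", "t", 1)]

print(max_flow_dfs(edges, "s", "t", use_backward=True))   # 2
print(max_flow_dfs(edges, "s", "t", use_backward=False))  # 1
```

With backward edges, the second search travels from `s` to `b`, then back along `b -> a` (cancelling the unit sent on `a -> b`), and on to `t`. Without them the search is stuck after the first path.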
Worked Example (Labelling Algorithm): Consider a network with nodes $s, a, b, t$ and edges: $s \to a$ (capacity 10), $s \to b$ (capacity 10), $a \to b$ (capacity 5), $a \to t$ (capacity 8), $b \to t$ (capacity 10). We aim to find the maximum flow from $s$ to $t$.
- Start with zero flow. Find an augmenting path, e.g., $s \to a \to t$. The bottleneck capacity is $\min(10, 8) = 8$. Send 8 units of flow along this path. Update residual capacities: $s \to a$ now has residual capacity 2, and $a \to t$ is saturated (residual capacity 0).
- In the residual network, find another augmenting path. From $s$, we can go to $a$ (residual capacity 2) or to $b$ (residual capacity 10). Consider path $s \to b \to t$. Bottleneck capacity is $\min(10, 10) = 10$. Send 10 units along this path. Update residuals: $s \to b$ is saturated, and $b \to t$ is saturated.
- Now, check for more augmenting paths. From $s$, the only edge with residual capacity is $s \to a$ (2). From $a$, we can go to $b$ (residual capacity 5, since no flow has been sent on $a \to b$ yet) but not to $t$ (saturated). From $b$, the edge to $t$ is saturated, there is no usable backward edge from $b$ to $a$ because no flow has been sent on $a \to b$, and the backward edge from $b$ to $s$ only leads back to the source. Thus, no augmenting path from $s$ to $t$ exists in the residual graph.
The total flow is $8 + 10 = 18$. To confirm, consider the cut $S = \{s, a, b\}$, $T = \{t\}$ with capacity $c(a, t) + c(b, t) = 8 + 10 = 18$, equal to the flow value, so by the max-flow min-cut theorem, this is optimal.
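The hand computation can be cross-checked with a short breadth-first (Edmonds-Karp style) implementation. This is a minimal sketch rather than a tuned implementation: the function name `edmonds_karp` and the dictionary layout are assumptions, and the network is the five-edge example with capacities 10, 10, 5, 8, 10, with the intermediate nodes called `a` and `b` here.

```python
from collections import deque


def edmonds_karp(capacity, source, sink):
    """Return the maximum flow value; capacity maps (u, v) -> capacity."""
    residual = {}
    neighbours = {}
    for (u, v), cap in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + cap
        residual.setdefault((v, u), 0)          # backward edge
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)

    total = 0
    while True:
        # Breadth-first "labelling": parent[v] records how v was reached.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbours.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total                         # no augmenting path left

        # Find the bottleneck along the path, then augment.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        total += bottleneck


network = {("s", "a"): 10, ("s", "b"): 10, ("a", "b"): 5,
           ("a", "t"): 8, ("b", "t"): 10}
max_flow = edmonds_karp(network, "s", "t")
print(max_flow)  # 18
```

The order in which the two augmenting paths are found may differ from the hand calculation, but the final flow value does not depend on that order.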
The Maximum Flow-Minimum Cut Theorem
The maximum flow-minimum cut theorem is the cornerstone of network flow theory, establishing a profound duality. A cut partitions the graph's nodes into two sets, $S$ and $T$, such that $s \in S$ and $t \in T$. The capacity of a cut is the sum of the capacities of all edges going from $S$ to $T$: $c(S, T) = \sum_{u \in S,\, v \in T} c(u, v)$.
The theorem links two statements:
- The maximum flow value from $s$ to $t$ is equal to the minimum capacity of any $s$-$t$ cut.
- A flow is maximum if and only if the residual network contains no augmenting path from $s$ to $t$.
This means the "bottleneck" for the entire network is defined by the minimum cut. In our road network example, the minimum cut identifies the most constrained set of roads whose total capacity limits the entire system's throughput. Strengthening roads that are not part of the minimum cut will not increase overall throughput.
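On a network small enough to enumerate, the theorem can be checked by brute force: list every way of choosing the source side of a cut and take the cheapest. The sketch below does this for a five-edge example network (capacities 10, 10, 5, 8, 10, with nodes named `s`, `a`, `b`, `t` as an assumption). The helper names `cut_capacity` and `min_cut_brute_force` are hypothetical, and the approach is exponential in the number of nodes, so it is meant for illustration only.

```python
from itertools import combinations


def cut_capacity(capacity, source_side):
    """Sum of capacities of edges leaving `source_side`."""
    return sum(cap for (u, v), cap in capacity.items()
               if u in source_side and v not in source_side)


def min_cut_brute_force(capacity, source, sink):
    """Try every source side containing `source` but not `sink`."""
    nodes = {n for edge in capacity for n in edge}
    inner = sorted(nodes - {source, sink})
    best = None
    for size in range(len(inner) + 1):
        for extra in combinations(inner, size):
            side = {source, *extra}
            cap = cut_capacity(capacity, side)
            if best is None or cap < best[0]:
                best = (cap, side)
    return best


network = {("s", "a"): 10, ("s", "b"): 10, ("a", "b"): 5,
           ("a", "t"): 8, ("b", "t"): 10}
best_capacity, best_side = min_cut_brute_force(network, "s", "t")
print(best_capacity, sorted(best_side))  # 18 ['a', 'b', 's']
```

The four candidate cuts have capacities 20, 23, 20 and 18, so the cheapest one matches a maximum flow value of 18 for this network. Note that only edges going from the source side to the sink side are counted.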
Bipartite Graphs, Matching, and the Hungarian Algorithm
A bipartite graph is one whose vertices can be divided into two disjoint sets, $X$ and $Y$, such that every edge connects a vertex in $X$ to one in $Y$. These graphs naturally model assignment problems: $X$ could be a set of workers, $Y$ a set of tasks, and an edge indicates a worker is qualified for a task.
A matching is a set of edges with no shared vertices. A maximum matching is a matching of the largest possible size. In our worker-task model, a maximum matching assigns the maximum number of workers to distinct, suitable tasks. A matching is perfect if every vertex is incident to an edge in the matching (requiring $|X| = |Y|$).
The connection to network flows is elegant. We can transform a bipartite matching problem into a maximum flow problem: add a super-source $s$ connected to all vertices in $X$, and a super-sink $t$ connected from all vertices in $Y$, with the original edges directed from $X$ to $Y$. Assign a capacity of 1 to every edge in the original bipartite graph and to these new edges. A maximum integer flow in this network corresponds directly to a maximum matching, where a flow of 1 on an original edge means that worker is matched to that task.
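The reduction can be sketched in a few lines. In the example below the worker and task names, the function name `max_bipartite_matching`, and the tuple-based node labels are all assumptions made for illustration; the augmenting-path search is a plain breadth-first one on the unit-capacity network described above.

```python
from collections import deque


def max_bipartite_matching(workers, tasks, qualified):
    """Return a maximum matching as a sorted list of (worker, task) pairs.

    qualified is a list of (worker, task) pairs the worker can perform.
    """
    src, snk = ("source",), ("sink",)  # 1-tuples cannot clash with node labels
    residual = {}
    neighbours = {}

    def add_unit_edge(u, v):
        residual[(u, v)] = 1
        residual[(v, u)] = 0
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)

    for w in workers:
        add_unit_edge(src, ("w", w))
    for t in tasks:
        add_unit_edge(("t", t), snk)
    for w, t in qualified:
        add_unit_edge(("w", w), ("t", t))

    while True:
        parent = {src: None}
        queue = deque([src])
        while queue and snk not in parent:
            u = queue.popleft()
            for v in neighbours.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if snk not in parent:
            break
        v = snk
        while parent[v] is not None:   # every bottleneck is 1
            u = parent[v]
            residual[(u, v)] -= 1
            residual[(v, u)] += 1
            v = u

    # An original edge carries flow 1 exactly when its residual is 0.
    return sorted((w, t) for w, t in qualified
                  if residual[(("w", w), ("t", t))] == 0)


workers = ["ann", "bob", "cy"]
tasks = ["weld", "paint", "drive"]
qualified = [("ann", "weld"), ("ann", "paint"), ("bob", "weld"),
             ("cy", "paint"), ("cy", "drive")]
matching = max_bipartite_matching(workers, tasks, qualified)
print(matching)  # [('ann', 'paint'), ('bob', 'weld'), ('cy', 'drive')]
```

Assigning ann to welding first would leave bob without a task; the backward residual edges let the search undo that choice, which is why the result here is a perfect matching.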
For weighted bipartite graphs where edges have costs or profits (e.g., the efficiency of a worker on a task), the goal shifts to finding a minimum-cost maximum matching or a maximum-weight matching. The classic solution is the Hungarian algorithm.
The Hungarian algorithm operates on a cost matrix and finds an optimal assignment by manipulating potential values (row and column reductions). It systematically covers zero entries in a reduced cost matrix using a minimum number of lines. The core steps are:
- Row and Column Reduction: Subtract the smallest element in each row from all elements in that row, then do the same for each column. This creates a matrix with at least one zero in every row and column.
- Cover Zeros: Attempt to cover all zeros in the matrix using a minimum number of horizontal and vertical lines. If the number of lines equals $n$, where the matrix is $n \times n$, an optimal assignment exists among the zeros.
- Matrix Adjustment: If fewer than $n$ lines are needed, find the smallest uncovered element. Subtract it from all uncovered elements and add it to elements covered twice. This creates new zeros. Return to the Cover Zeros step and repeat until an optimal assignment is found.
This algorithm efficiently solves the assignment problem without directly invoking network flow, though it is underpinned by similar primal-dual principles.
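The reduction step relies on one fact: subtracting a constant from a whole row or column lowers the cost of every complete assignment by the same amount, so the best assignment does not change. The sketch below illustrates only that step, and it checks the claim with an exhaustive search rather than the line-covering procedure; it is not a full Hungarian implementation. The function names and the 3x3 cost matrix are assumptions for the example.

```python
from itertools import permutations


def reduce_matrix(cost):
    """Row reduction then column reduction on a square matrix.

    Returns the reduced matrix and the total amount subtracted.
    """
    n = len(cost)
    row_min = [min(row) for row in cost]
    reduced = [[c - row_min[i] for c in row] for i, row in enumerate(cost)]
    col_min = [min(reduced[i][j] for i in range(n)) for j in range(n)]
    reduced = [[reduced[i][j] - col_min[j] for j in range(n)]
               for i in range(n)]
    return reduced, sum(row_min) + sum(col_min)


def brute_force_assignment(cost):
    """Cheapest assignment by trying every permutation (small n only)."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return list(best), sum(cost[i][best[i]] for i in range(n))


cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]

reduced, subtracted = reduce_matrix(cost)
print(reduced)      # [[2, 0, 2], [1, 0, 5], [0, 0, 0]]
print(subtracted)   # 4

assignment, total = brute_force_assignment(cost)
reduced_assignment, reduced_total = brute_force_assignment(reduced)
print(assignment, total)                   # [1, 0, 2] 5
print(reduced_assignment, reduced_total)   # [1, 0, 2] 1
```

In this example the zeros of the reduced matrix can be covered by two lines (the middle column and the bottom row), fewer than three, so a zero-cost assignment does not yet exist and the adjustment step would be needed. That agrees with the reduced optimum of 1 rather than 0.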
Applications to Real-World Optimisation
Network flow and matching models are ubiquitous in operations research and computer science.
- Transportation & Logistics: Maximising the flow of goods from warehouses to retail outlets subject to road/rail capacity. The minimum cut identifies critical infrastructure vulnerabilities.
- Communication Networks: Data packets routed through the internet can be modelled as flow. Maximum flow algorithms help in designing bandwidth-efficient network topologies.
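- Resource Allocation: Assigning computing jobs to servers in a cloud data centre, or staff to shifts, fits the matching models above. Matching medical residents to hospitals is a closely related problem, although the National Resident Matching Program uses a stable-matching algorithm of the deferred-acceptance type rather than a flow-based or Hungarian method.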
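- Project Planning: In critical path method (CPM) analysis, the tasks whose delay directly delays the entire project are those on a longest path through the project network. Flow models enter when a project must be shortened: a minimum cut across the critical activities, with capacities set to the cost of speeding each one up, identifies the cheapest set of tasks to accelerate.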
Common Pitfalls
- Ignoring Backward Edges in Augmenting Paths: A common error in executing the labelling algorithm is only looking for paths with available forward capacity. Failing to consider backward edges, which allow the algorithm to undo and re-route previous flow, will prevent you from finding the true maximum flow in many networks. Always work with the full residual graph.
- Confusing Flow Value with Edge Flows: The objective is to maximise the value of the flow—the total net out of the source. It is possible for individual edge flows to be less than their capacity in an optimal solution. The maximum flow is about the system-wide throughput, not maximising every single edge.
- Misapplying the Hungarian Algorithm to Non-Bipartite or Unbalanced Problems: The Hungarian algorithm requires a complete square cost matrix for a balanced assignment. For unbalanced problems (more workers than tasks, say), it does not apply directly: first pad the matrix with "dummy" rows or columns of zero cost so that it becomes square, or fall back on a more general minimum-cost flow or linear programming formulation. For graphs that are not bipartite, padding does not help, and a general-graph matching algorithm is needed instead.
- Equating Minimum Cut with Smallest Set of Edges: The capacity of a cut is the sum of capacities, not the count of edges. The minimum cut is defined by the smallest total capacity crossing from $S$ to $T$, which may involve a few high-capacity edges or many low-capacity ones.
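For the unbalanced case mentioned in the pitfalls above, the padding transformation might look like the following. The function name `pad_to_square` and the 3-worker, 2-task cost matrix are assumptions for illustration. Zero-cost dummy entries are the usual choice, and a worker matched to a dummy column is simply left unassigned.

```python
def pad_to_square(cost, fill=0):
    """Pad a rectangular cost matrix with dummy rows/columns of `fill`."""
    rows = len(cost)
    cols = max(len(row) for row in cost)
    n = max(rows, cols)
    # Extend each existing row with dummy columns, then add dummy rows.
    padded = [list(row) + [fill] * (n - len(row)) for row in cost]
    padded += [[fill] * n for _ in range(n - rows)]
    return padded


# Three workers (rows) but only two tasks (columns).
unbalanced = [[4, 1],
              [2, 0],
              [3, 2]]
square = pad_to_square(unbalanced)
print(square)  # [[4, 1, 0], [2, 0, 0], [3, 2, 0]]
```

The padded matrix can then be handed to any solver that expects a square assignment problem.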
Summary
- The maximum flow problem seeks the greatest possible resource movement from a source to a sink in a capacity-constrained network, solvable using iterative augmenting path algorithms like Ford-Fulkerson.
- The maximum flow-minimum cut theorem establishes the dual relationship in which the value of the maximum flow equals the capacity of the minimum cut, identifying the network's true bottleneck.
- Bipartite graphs model assignment problems, where a maximum matching can be found by converting the problem into an equivalent network flow problem with unit capacities.
- The Hungarian algorithm provides an efficient method for solving the weighted assignment problem (minimum-cost or maximum-weight matching) on a bipartite graph by manipulating a cost matrix.
- These techniques have powerful real-world applications in optimising transportation, communications, resource allocation, and logistics.