Matrix Chain Multiplication

Deciding how to group a sequence of matrix multiplications isn't just a syntactic choice; it is an optimization problem with dramatic consequences for computational speed. The matrix chain multiplication problem asks you to find the most efficient way to multiply a chain of matrices, minimizing the number of scalar multiplications. Because matrix multiplication is associative, the chosen parenthesization (the order in which the pairwise products are evaluated) has no effect on the final product, but it has a massive impact on the computational cost. Understanding the dynamic programming solution to this problem teaches you a powerful interval-based DP paradigm applicable to numerous other algorithmic challenges.

The Core Problem and Cost

The fundamental operation cost is tied to the dimensions of the matrices. When multiplying two matrices, where the first is $p \times q$ and the second is $q \times r$ , the result is a $p \times r$ matrix. The cost, in terms of scalar multiplications, is $p \times q \times r$ . This is because each of the $p \times r$ entries in the result requires $q$ multiplications.
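To make the $p \times q \times r$ count concrete, here is a minimal sketch that multiplies two small matrices with the textbook triple loop and counts each scalar multiplication. The function name `multiply_counting` and the sample matrices are illustrative assumptions, not part of the original notes.

```python
def multiply_counting(a, b):
    """Multiply two matrices (lists of rows) and count scalar multiplications."""
    p, q, r = len(a), len(b), len(b[0])
    if any(len(row) != q for row in a):
        raise ValueError("inner dimensions do not match")
    result = [[0] * r for _ in range(p)]
    count = 0
    for i in range(p):          # each of the p rows of the result
        for j in range(r):      # each of the r columns of the result
            for k in range(q):  # q multiplications per entry
                result[i][j] += a[i][k] * b[k][j]
                count += 1
    return result, count


a = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
b = [[1, 0], [0, 1], [1, 1]]  # 3 x 2
product, count = multiply_counting(a, b)
print(product)  # [[4, 5], [10, 11]]
print(count)    # 12, which is 2 * 3 * 2
```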

Now, consider a chain of matrices $A_{1}, A_{2}, \ldots, A_{n}$, where matrix $A_{i}$ has dimensions $p_{i - 1} \times p_{i}$. The goal is to parenthesize the product $A_{1} A_{2} \cdots A_{n}$ to minimize the total scalar multiplication cost. A naive brute-force approach would consider every possible parenthesization. The number of them is the Catalan number $C_{n - 1}$, which grows exponentially in $n$ (roughly as $4^{n} / n^{3/2}$), so this is computationally infeasible for long chains.
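To get a feel for how quickly brute force becomes hopeless, the sketch below counts the full parenthesizations of a chain of $n$ matrices using the closed form for the Catalan number $C_{n - 1}$. The function name `count_parenthesizations` is an illustrative assumption.

```python
from math import comb


def count_parenthesizations(n):
    """Ways to fully parenthesize a chain of n matrices: Catalan number C_{n-1}."""
    if n < 1:
        raise ValueError("chain must contain at least one matrix")
    m = n - 1
    return comb(2 * m, m) // (m + 1)


for n in (1, 2, 3, 4, 10):
    print(n, count_parenthesizations(n))
# 1 1
# 2 1
# 3 2
# 4 5
# 10 4862
```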

Dynamic Programming Formulation

The efficient solution uses dynamic programming (DP) to build optimal solutions from smaller chains to larger ones. The key insight is that any optimal parenthesization of a chain from $i$ to $j$ must break the chain at some point $k$ , forming two optimal sub-chains $(A_{i} ... A_{k})$ and $(A_{k + 1} ... A_{j})$ . The total cost is the cost of computing each optimal sub-chain plus the cost of multiplying the two resulting matrices.

We define a DP table m[i][j] as the minimum number of scalar multiplications needed to compute the product $A_{i} A_{i + 1} \cdots A_{j}$. The array p[] holds the $n + 1$ dimensions of the chain, so that matrix $A_{i}$ has size $p_{i - 1} \times p_{i}$.

The DP recurrence relation is: $m[i][j] = \begin{cases} 0 & \text{if } i = j \\ \min_{i \leq k < j} \{ m[i][k] + m[k + 1][j] + p_{i - 1} \times p_{k} \times p_{j} \} & \text{if } i < j \end{cases}$
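The recurrence can be transcribed almost literally as a memoized recursive function. This is a minimal top-down sketch rather than the table-filling version; the name `min_cost_top_down` and the three-matrix dimension array are illustrative assumptions.

```python
from functools import lru_cache


def min_cost_top_down(p):
    """Minimum scalar multiplications for the chain described by dimension array p."""
    n = len(p) - 1  # number of matrices in the chain

    @lru_cache(maxsize=None)
    def m(i, j):
        if i == j:  # a single matrix needs no multiplication
            return 0
        # Try every split point k with i <= k < j and keep the cheapest.
        return min(
            m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
            for k in range(i, j)
        )

    return m(1, n) if n >= 1 else 0


# Hypothetical chain: A1 is 10 x 100, A2 is 100 x 5, A3 is 5 x 50.
print(min_cost_top_down([10, 100, 5, 50]))  # 7500
```

For this chain the two possible groupings cost 7500 and 75000 multiplications, so the choice of split changes the work by a factor of ten.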

We also maintain a table s[i][j] to store the optimal split point k that achieved the minimum cost for the chain [i, j]. This allows us to reconstruct the optimal parenthesization later.
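A bottom-up version that fills both tables might look like the following sketch. It uses 1-based indices to match the notation above (row and column 0 are unused); the function name `matrix_chain_order` and the sample dimension array are assumptions made for illustration.

```python
from math import inf


def matrix_chain_order(p):
    """Return (m, s): minimum-cost table and split-point table, 1-indexed."""
    n = len(p) - 1  # number of matrices
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][i] = 0 is the base case
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):           # chain length L
        for i in range(1, n - length + 2):   # start of the chain
            j = i + length - 1               # end of the chain
            m[i][j] = inf
            for k in range(i, j):            # candidate split points
                cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k              # remember the best split
    return m, s


# Hypothetical chain: A1 is 10 x 100, A2 is 100 x 5, A3 is 5 x 50.
m, s = matrix_chain_order([10, 100, 5, 50])
print(m[1][3], s[1][3])  # 7500 2
```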

Walking Through a Concrete Example

Let's trace the algorithm on a chain of four matrices with dimensions:

$A_{1}$ : $10 \times 20$
$A_{2}$ : $20 \times 30$
$A_{3}$ : $30 \times 40$
$A_{4}$ : $40 \times 30$

Thus, the dimension array p[] = [10, 20, 30, 40, 30]. We will compute the m table for chains of increasing length L. We initialize all m[i][i] = 0.

For chain length L=2:

m[1][2]: Possible k=1 only. Cost = $m[1][1] + m[2][2] + (10 \times 20 \times 30) = 0 + 0 + 6000 = 6000$.
m[2][3]: $0 + 0 + (20 \times 30 \times 40) = 24000$.
m[3][4]: $0 + 0 + (30 \times 40 \times 30) = 36000$.

For chain length L=3:

m[1][3]:
k=1: $m[1][1] + m[2][3] + (10 \times 20 \times 40) = 0 + 24000 + 8000 = 32000$.
k=2: $m[1][2] + m[3][3] + (10 \times 30 \times 40) = 6000 + 0 + 12000 = 18000$ (minimum).

So, m[1][3]=18000, s[1][3]=2.

m[2][4]:
k=2: $m[2][2] + m[3][4] + (20 \times 30 \times 30) = 0 + 36000 + 18000 = 54000$.
k=3: $m[2][3] + m[4][4] + (20 \times 40 \times 30) = 24000 + 0 + 24000 = 48000$ (minimum).

So, m[2][4]=48000, s[2][4]=3.

For chain length L=4:

m[1][4]:
k=1: $m[1][1] + m[2][4] + (10 \times 20 \times 30) = 0 + 48000 + 6000 = 54000$.
k=2: $m[1][2] + m[3][4] + (10 \times 30 \times 30) = 6000 + 36000 + 9000 = 51000$.
k=3: $m[1][3] + m[4][4] + (10 \times 40 \times 30) = 18000 + 0 + 12000 = 30000$ (minimum).

So, m[1][4]=30000, s[1][4]=3.

The minimal cost is 30,000 multiplications. Using the s table, we reconstruct the optimal parenthesization. Since s[1][4]=3, the optimal split is $(A_{1} A_{2} A_{3}) (A_{4})$ . Then, since s[1][3]=2, the first group splits as $(A_{1} A_{2}) (A_{3})$ . The final optimal parenthesization is $((A_{1} A_{2}) A_{3}) A_{4}$ .
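The reconstruction step can be sketched as a short recursion over the split table. Here the table is written out by hand as a dictionary keyed by (i, j), using the split points found in the trace above; the entries for chains of length 2 are implied, since only one split is possible. The function name `parenthesize` and the dictionary layout are illustrative assumptions.

```python
def parenthesize(s, i, j):
    """Build the optimal parenthesization of A_i..A_j from split table s."""
    if i == j:
        return f"A{i}"
    k = s[(i, j)]  # optimal split point for the chain [i, j]
    return f"({parenthesize(s, i, k)} {parenthesize(s, k + 1, j)})"


# Split points from the worked example; length-2 chains have a single choice.
s = {
    (1, 2): 1, (2, 3): 2, (3, 4): 3,
    (1, 3): 2, (2, 4): 3,
    (1, 4): 3,
}
print(parenthesize(s, 1, 4))  # (((A1 A2) A3) A4)
```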

Applications and Broader Paradigm

This problem is a cornerstone for learning a specific dynamic programming pattern. The solution strategy—defining a table over intervals [i, j] and iterating by chain length—is a classic example of interval-based dynamic programming. This exact paradigm translates directly to other complex problems:

Optimal polygon triangulation: Finding the triangulation of a convex polygon that minimizes the sum of the weights of its component triangles.
Optimal binary search tree (BST) construction: Arranging keys with given access probabilities to minimize expected search cost.

In both cases, the DP state represents an interval (a sequence of vertices or keys), and the recurrence considers all possible ways to split that interval into two optimal sub-intervals, plus a cost for combining them, mirroring the matrix chain logic.
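To show how closely the pattern carries over, here is a minimal sketch of minimum-weight triangulation of a convex polygon. It assumes, purely for illustration, that each vertex carries a number and that a triangle's weight is the product of its three vertex values; other weight functions fit the same loop structure. The function name `min_triangulation_cost` is likewise an assumption.

```python
def min_triangulation_cost(values):
    """Minimum total triangle weight over all triangulations of a convex polygon.

    values[i] is the number attached to vertex i (vertices listed in order).
    Assumed weight of triangle (i, k, j): values[i] * values[k] * values[j].
    """
    n = len(values)
    if n < 3:
        return 0
    # dp[i][j]: best cost for the sub-polygon on vertices i, i+1, ..., j.
    dp = [[0] * n for _ in range(n)]
    for gap in range(2, n):          # interval length, as with chain length
        for i in range(n - gap):
            j = i + gap
            dp[i][j] = min(
                dp[i][k] + dp[k][j] + values[i] * values[k] * values[j]
                for k in range(i + 1, j)  # apex of the triangle on edge (i, j)
            )
    return dp[0][n - 1]


print(min_triangulation_cost([3, 7, 4, 5]))  # 144
```

The only structural difference from the matrix chain recurrence is that the two sub-intervals share the vertex k instead of meeting at k and k + 1.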

Common Pitfalls

Misunderstanding the dimension array: A common error is incorrectly defining the dimension array p. For a chain of $n$ matrices, you need $n + 1$ dimensions. Remember: matrix $A_{i}$ has dimensions $p_{i - 1} \times p_{i}$ . Confusing this leads to incorrect cost calculations in the recurrence.
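Incorrect DP filling order: Every entry m[i][j] depends on m[i][k] and m[k+1][j] for shorter chains, so those entries must be computed first. Filling the table in increasing order of chain length L, from 2 to $n$, guarantees this. A plain nested loop with both i and j ascending does not: it reads m[k+1][j] from a row that has not been filled yet. Iterate by length first (alternatively, running i downward and j upward also respects the dependencies).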
Forgetting the reconstruction table: While the m table gives you the optimal cost, the s table (storing the split index k) is essential for outputting the actual optimal parenthesization. Neglecting to build it means you cannot report how to achieve the minimum cost.
Overlooking the base case: The base case m[i][i] = 0 must be set explicitly. This represents the cost of a "chain" of a single matrix, which requires no multiplication. Starting with uninitialized values will propagate errors through all calculations.
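One way to guard against the first pitfall is to derive the dimension array from the matrix shapes and check compatibility along the way. The helper below is a minimal sketch; the name `dimension_array` and the (rows, cols) tuple format are assumptions made for illustration.

```python
def dimension_array(shapes):
    """Build p from a list of (rows, cols) shapes, so A_i is p[i-1] x p[i]."""
    if not shapes:
        raise ValueError("chain must contain at least one matrix")
    p = [shapes[0][0]]
    for index, (rows, cols) in enumerate(shapes, start=1):
        # Rows of each matrix must equal the columns of the one before it.
        if rows != p[-1]:
            raise ValueError(
                f"A{index} has {rows} rows but the previous matrix has {p[-1]} columns"
            )
        p.append(cols)
    return p  # n matrices always yield n + 1 entries


shapes = [(10, 20), (20, 30), (30, 40), (40, 30)]
p = dimension_array(shapes)
print(p)  # [10, 20, 30, 40, 30]
```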

Summary

The matrix chain multiplication problem seeks the optimal parenthesization of a matrix sequence to minimize the number of scalar multiplications, which is determined by the matrices' dimensions.
The efficient solution uses a dynamic programming approach that builds solutions from smaller sub-chains to larger ones, employing the cost recurrence $m[i][j] = \min_{i \leq k < j} (m[i][k] + m[k + 1][j] + p_{i - 1} p_{k} p_{j})$ for $i < j$, with $m[i][i] = 0$. Filling the table takes $O(n^{3})$ time and $O(n^{2})$ space.
The algorithm requires proper initialization, filling a DP table in increasing order of chain length, and maintaining a separate table to reconstruct the optimal solution.
This problem is a foundational example of interval-based dynamic programming, a paradigm directly applicable to other optimization problems like optimal polygon triangulation and optimal binary search tree construction.
Avoiding pitfalls like misindexing the dimension array or filling the DP table in the wrong order is crucial for a correct implementation.
