Inclusion-Exclusion Principle and Applications

Counting the union of overlapping sets is a fundamental challenge in combinatorics. Directly summing the sizes of the individual sets overcounts, because elements in the intersections are counted multiple times. The Inclusion-Exclusion Principle provides a systematic, alternating formula that corrects for this overcounting, turning a seemingly messy problem into a manageable calculation. Its power extends far beyond simple set theory: it offers elegant solutions to problems in number theory, graph theory, and algebra, and it is itself generalized by Möbius inversion on partially ordered sets.

Statement and Proof of the Principle

The Inclusion-Exclusion Principle is a combinatorial formula that gives the cardinality of a finite union of sets in terms of the sizes of their various intersections. For a collection of finite sets $A_{1}, A_{2}, \dots, A_{n}$ , the principle states:

$\left| \bigcup_{i=1}^{n} A_{i} \right| = \sum_{i=1}^{n} |A_{i}| - \sum_{1 \leq i < j \leq n} |A_{i} \cap A_{j}| + \sum_{1 \leq i < j < k \leq n} |A_{i} \cap A_{j} \cap A_{k}| - \dots + (-1)^{n+1} |A_{1} \cap A_{2} \cap \dots \cap A_{n}|.$

The proof proceeds by induction or, more intuitively, by considering an arbitrary element $x$ that belongs to exactly $r \geq 1$ of the $n$ sets. On the right-hand side of the formula, $x$ is counted once in each of the $\binom{r}{1}$ single-set terms that contain it, subtracted once in each of the $\binom{r}{2}$ pairwise intersection terms, added back in each of the $\binom{r}{3}$ triple intersections, and so on. The total contribution of $x$ is therefore:

$\binom{r}{1} - \binom{r}{2} + \binom{r}{3} - \dots + (-1)^{r+1} \binom{r}{r}.$

This alternating sum equals 1. By the Binomial Theorem, $0 = (1 - 1)^{r} = \sum_{j=0}^{r} (-1)^{j} \binom{r}{j}$, and moving every term with $j \geq 1$ to the other side leaves $\binom{r}{0} = 1$ equal to the sum above. Thus every element in the union is counted exactly once, and elements outside the union are counted zero times, which proves the formula.
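As a quick sanity check of the statement above, the following minimal Python sketch evaluates the alternating sum for a small collection of sets and compares it with the union computed directly. The function name `union_size_by_inclusion_exclusion` and the sample sets are illustrative choices, not part of the original text.

```python
from itertools import combinations


def union_size_by_inclusion_exclusion(sets):
    """Size of the union of finite sets, computed via the alternating sum."""
    total = 0
    n = len(sets)
    for k in range(1, n + 1):
        # Terms involving k sets carry the sign (-1)^(k+1).
        sign = (-1) ** (k + 1)
        for group in combinations(sets, k):
            total += sign * len(set.intersection(*group))
    return total


# Illustrative data: three overlapping sets.
example_sets = [{1, 2, 3, 4}, {3, 4, 5}, {1, 4, 5, 6}]
direct = len(set().union(*example_sets))
assert union_size_by_inclusion_exclusion(example_sets) == direct
```

The loop visits every non-empty sub-collection, so it takes time exponential in the number of sets; it is meant for checking small cases, not for large inputs.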

Application to Derangements

A derangement is a permutation of $n$ elements where no element appears in its original position. Counting derangements is the classic "hat-check" problem. Let $A_{i}$ be the set of permutations where element $i$ is in its correct position (a fixed point). The union $⋃_{i = 1}^{n} A_{i}$ is the set of permutations with at least one fixed point. We want the complement: $D_{n} = n! - ∣ ⋃_{i = 1}^{n} A_{i} ∣$ .

Using Inclusion-Exclusion, the size of any $k$-fold intersection $|A_{i_{1}} \cap \dots \cap A_{i_{k}}|$ is $(n - k)!$, because we fix $k$ elements and permute the remaining $n - k$ freely. There are $\binom{n}{k}$ such intersections. Applying the principle:

$\left| \bigcup_{i=1}^{n} A_{i} \right| = \binom{n}{1} (n-1)! - \binom{n}{2} (n-2)! + \dots + (-1)^{n+1} \binom{n}{n} 0!.$

Since $\binom{n}{k} (n - k)! = n! / k!$, subtracting this from $n!$ gives the number of derangements:

$D_{n} = n! (1 - \frac{1}{1 !} + \frac{1}{2 !} - \frac{1}{3 !} + \dots + (- 1)^{n} \frac{1}{n !}) .$

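The parenthesized sum is the truncated Taylor series of $e^{-1}$, so $D_{n}$ is approximately $n! / e$ for large $n$; in fact, $D_{n}$ is the integer nearest to $n! / e$ for every $n \geq 1$.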
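The derangement count can be checked numerically. The sketch below, with illustrative function names, evaluates the inclusion-exclusion sum in integer arithmetic, compares it with a brute-force enumeration for small $n$, and confirms that rounding $n!/e$ gives the same values in that range.

```python
from itertools import permutations
from math import comb, e, factorial


def derangements_formula(n):
    """D_n as the alternating sum of C(n, k) * (n - k)! over k = 0..n."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


def derangements_brute_force(n):
    """Count permutations of range(n) with no fixed point by enumeration."""
    return sum(
        all(p[i] != i for i in range(n)) for p in permutations(range(n))
    )


for n in range(8):
    assert derangements_formula(n) == derangements_brute_force(n)

# For small n >= 1, floating-point rounding of n!/e reproduces D_n.
for n in range(1, 11):
    assert derangements_formula(n) == round(factorial(n) / e)
```

The brute-force check is limited to small $n$ because it enumerates all $n!$ permutations. The rounding check is limited to small $n$ because floating-point precision eventually becomes insufficient.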

Counting Surjections and the Stirling Connection

How many surjective (onto) functions are there from a set of $m$ elements to a set of $n$ elements? Write $[m] = \{1, 2, \dots, m\}$ for the domain and let the target set be $B = \{1, 2, \dots, n\}$. Define $A_{i}$ as the set of functions $f : [m] \to B$ that miss element $i$ (i.e., $i \notin f([m])$). The union $\bigcup_{i=1}^{n} A_{i}$ is then the set of non-surjective functions. Its complement is the set of surjections, whose size we denote $S(m, n)$.

The size of a $k$-fold intersection $|A_{i_{1}} \cap \dots \cap A_{i_{k}}|$ is the number of functions whose image avoids $k$ specified elements and is therefore contained in a set of size $n - k$, which is $(n - k)^{m}$. There are $\binom{n}{k}$ ways to choose the $k$ missed elements. By Inclusion-Exclusion on the complement:

$S(m, n) = n^{m} - \left| \bigcup_{i=1}^{n} A_{i} \right| = \sum_{k=0}^{n} (-1)^{k} \binom{n}{k} (n-k)^{m}.$

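This sum is closely related to the Stirling numbers of the second kind, which count the number of ways to partition a set of $m$ objects into $n$ non-empty unlabeled subsets. Writing ${m \brace n}$ for the Stirling number, we have $S(m, n) = n! \cdot {m \brace n}$: a surjection is the same as a partition of the domain into $n$ non-empty blocks together with one of the $n!$ ways to assign the blocks to distinct target elements.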
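The surjection count and its relation to Stirling numbers of the second kind can be checked on small inputs. In this sketch the function names are illustrative, and `stirling2` uses the standard recurrence (the element $m$ either joins one of the $n$ existing blocks or forms a new block), which is an added assumption rather than something derived in the text.

```python
from itertools import product
from math import comb, factorial


def surjections(m, n):
    """Number of onto functions from an m-set to an n-set, by inclusion-exclusion."""
    return sum((-1) ** k * comb(n, k) * (n - k) ** m for k in range(n + 1))


def surjections_brute_force(m, n):
    """Enumerate all n^m functions and keep those whose image has size n."""
    return sum(len(set(f)) == n for f in product(range(n), repeat=m))


def stirling2(m, n):
    """Stirling number of the second kind via the standard recurrence."""
    if m == n:
        return 1
    if n == 0 or n > m:
        return 0
    return n * stirling2(m - 1, n) + stirling2(m - 1, n - 1)


for m in range(1, 7):
    for n in range(1, 5):
        assert surjections(m, n) == surjections_brute_force(m, n)
        assert surjections(m, n) == factorial(n) * stirling2(m, n)
```

When $n > m$ the alternating sum evaluates to 0, matching the fact that no surjection exists onto a larger set.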

Computing the Euler Totient Function

The Euler totient function $φ (n)$ counts the integers in $\{1, \dots, n\}$ that are relatively prime to $n$. This is a prime candidate for Inclusion-Exclusion. Let $n$ have prime factorization $p_{1}^{a_{1}} p_{2}^{a_{2}} \dots p_{r}^{a_{r}}$. An integer shares a common factor greater than 1 with $n$ if and only if it is divisible by at least one of these primes.

Define $A_{i}$ as the set of integers in $\{1, \dots, n\}$ divisible by $p_{i}$. Then $φ (n) = n - \left| \bigcup_{i=1}^{r} A_{i} \right|$. The size of an intersection $A_{i_{1}} \cap \dots \cap A_{i_{k}}$ is $n / (p_{i_{1}} p_{i_{2}} \dots p_{i_{k}})$: it counts the multiples of that product up to $n$, and the product divides $n$ because the primes are distinct prime factors of $n$. Applying Inclusion-Exclusion:

$φ (n) = n - \sum_{i} \frac{n}{p_{i}} + \sum_{i < j} \frac{n}{p_{i} p_{j}} - \sum_{i < j < k} \frac{n}{p_{i} p_{j} p_{k}} + \dots.$

Factoring out $n$ and recognizing the remaining alternating sum as the expansion of a product yields the classic product formula:

$φ (n) = n \prod_{i=1}^{r} \left( 1 - \frac{1}{p_{i}} \right).$
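The totient computation can be sketched directly from the alternating sum over subsets of the distinct prime factors. The helper names below are illustrative, trial division is used only to keep the example self-contained, and the result is compared with a direct gcd-based count.

```python
from itertools import combinations
from math import gcd, prod


def prime_factors(n):
    """Distinct prime factors of n, found by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def totient_inclusion_exclusion(n):
    """phi(n) as the sum over prime subsets T of (-1)^|T| * n / prod(T)."""
    primes = prime_factors(n)
    total = 0
    for k in range(len(primes) + 1):
        for subset in combinations(primes, k):
            # prod(()) is 1, so the empty subset contributes n itself.
            total += (-1) ** k * (n // prod(subset))
    return total


def totient_brute_force(n):
    """Count integers in 1..n that are coprime to n."""
    return sum(gcd(a, n) == 1 for a in range(1, n + 1))


for n in range(1, 200):
    assert totient_inclusion_exclusion(n) == totient_brute_force(n)
```

For $n = 12$ the sum is $12 - 6 - 4 + 2 = 4$, matching the four integers 1, 5, 7, 11.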

Chromatic Polynomials of Graphs

The chromatic polynomial $P (G, k)$ counts the number of proper vertex colorings of a graph $G$ using at most $k$ colors. For a graph with $n$ vertices and $m$ edges, a naive count gives $k^{n}$ colorings, but we must exclude those where adjacent vertices share a color.

Let the edges be $e_{1}, e_{2}, \dots, e_{m}$ . For edge $e_{i}$ , let $A_{i}$ be the set of colorings where the endpoints of $e_{i}$ have the same color (a bad coloring). Then the set of proper colorings is the complement of the union $⋃ A_{i}$ . Therefore:

$P (G, k) = k^{n} - \left| \bigcup_{i=1}^{m} A_{i} \right|.$

Applying Inclusion-Exclusion, the size of an intersection $A_{i_{1}} \cap \dots \cap A_{i_{j}}$ is $k^{c}$, where $c$ is the number of connected components of the graph on all $n$ vertices whose edges are the $j$ chosen ones (isolated vertices count as components). Each chosen edge forces its endpoints to share a color, so every component must be monochromatic, while different components can be colored independently. Thus,

$P (G, k) = \sum_{F \subseteq E (G)} (-1)^{|F|} k^{c (F)},$

where the sum is over all subsets of edges, and $c (F)$ is the number of components in the graph with vertex set $V (G)$ and edge set $F$ . This formula shows $P (G, k)$ is indeed a polynomial and connects directly to the graph's matroid structure.
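The edge-subset formula can be evaluated directly on small graphs. The sketch below assumes vertices are labeled $0, \dots, n-1$ and edges are given as pairs; the function names and the small union-find helper are illustrative choices. It compares the formula with a brute-force count of proper colorings.

```python
from itertools import combinations, product


def count_components(n, edges):
    """Number of connected components of the graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def chromatic_polynomial_value(n, edges, k):
    """P(G, k) as the sum over edge subsets F of (-1)^|F| * k^c(F)."""
    total = 0
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            total += (-1) ** size * k ** count_components(n, subset)
    return total


def proper_colorings_brute_force(n, edges, k):
    """Count colorings of n vertices with k colors where no edge is monochromatic."""
    return sum(
        all(c[u] != c[v] for u, v in edges)
        for c in product(range(k), repeat=n)
    )


triangle = [(0, 1), (1, 2), (0, 2)]
four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
for k in range(1, 5):
    assert chromatic_polynomial_value(3, triangle, k) == k * (k - 1) * (k - 2)
    assert chromatic_polynomial_value(4, four_cycle, k) == proper_colorings_brute_force(4, four_cycle, k)
```

The sum has $2^{m}$ terms, so this approach is only practical for graphs with few edges.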

Möbius Inversion as a Generalization

Möbius inversion generalizes the alternating sum pattern of Inclusion-Exclusion to any locally finite partially ordered set (poset). In the classic principle, the poset is the Boolean lattice of subsets of $\{1, \dots, n\}$, ordered by inclusion.

Given two functions $f, g$ on a poset, Möbius inversion states that if $g (x) = \sum_{y \leq x} f (y)$ for all $x$, then we can invert to find $f$: $f (x) = \sum_{y \leq x} μ (y, x) g (y).$ Here, $μ$ is the Möbius function of the poset, defined recursively by $μ (x, x) = 1$ and $\sum_{y \leq z \leq x} μ (y, z) = 0$ for $y < x$. In the subset poset, $μ (Y, X) = (-1)^{|X| - |Y|}$ for $Y \subseteq X$. To recover the Inclusion-Exclusion Principle, use the dual form of the inversion, which sums over supersets instead of subsets: let $g (X)$ be the size of the intersection of the sets indexed by $X$, and let $f (X)$ be the number of elements lying in exactly those sets and no others, so that $g (X) = \sum_{Y \supseteq X} f (Y)$. Inverting gives $f (X) = \sum_{Y \supseteq X} (-1)^{|Y| - |X|} g (Y)$, and taking $X = \emptyset$ counts the elements in none of the sets, which is the complement form of the principle. This powerful abstraction applies to number theory (the divisor poset, where $μ$ is the classical Möbius function), combinatorics, and beyond.
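On the divisor poset the inversion can be illustrated concretely. The sketch below relies on the standard identity $\sum_{d \mid n} φ (d) = n$, which is an added assumption not derived in the text, and inverts it to recover $φ (n) = \sum_{d \mid n} μ (n / d) \, d$, where $μ (n / d)$ is the classical one-argument Möbius function. All function names are illustrative.

```python
from math import gcd


def mobius(n):
    """Classical Moebius function: 0 if n has a squared prime factor,
    otherwise (-1)^(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius_invert(g, n):
    """Recover f(n) from g(n) = sum of f(d) over divisors d of n."""
    return sum(mobius(n // d) * g(d) for d in divisors(n))


def totient(n):
    return sum(gcd(a, n) == 1 for a in range(1, n + 1))


for n in range(1, 60):
    # Forward direction: summing phi over the divisors of n gives n.
    assert sum(totient(d) for d in divisors(n)) == n
    # Inverse direction: Moebius inversion of g(d) = d recovers phi.
    assert mobius_invert(lambda d: d, n) == totient(n)
```

Here the order relation is divisibility, so "sum over $y \leq x$" becomes "sum over divisors $d$ of $n$".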

Common Pitfalls

Misidentifying the "Bad" Events: In applications like derangements or surjections, a common error is incorrectly defining the sets $A_{i}$ . Always define $A_{i}$ as the set of outcomes violating a specific, simple condition you wish to exclude. For derangements, $A_{i}$ is "element i is fixed," not "element i is not fixed."
Incorrect Intersection Cardinality: Calculating $|A_{i} \cap A_{j}|$ requires careful thought about how the conditions combine. For derangements, fixing both $i$ and $j$ leaves the other $n - 2$ elements free, giving $(n - 2)!$. For surjections, missing both $i$ and $j$ leaves $n - 2$ allowed targets for each of the $m$ inputs, giving $(n - 2)^{m}$. Always model the combined condition logically.
Sign Errors in the Alternating Sum: The formula alternates, starting with a positive sum of single sets. A useful mnemonic: the sign for a term involving $k$ sets is $(- 1)^{k + 1}$ . When applying the principle to count a desired set that is the complement of a union (as in derangements), remember to subtract the union's size from the total universe.
Over-generalizing Without Care: The principle as stated requires finitely many finite sets. Extending it to infinite unions, or to probabilities (where set sizes are replaced by probabilities of events), requires attention to additivity and measure-theoretic details.

Summary

The Inclusion-Exclusion Principle corrects for overcounting by providing an alternating sum formula for the size of a union of sets: add singles, subtract pairs, add triples, etc.
Its power is showcased in counting derangements (permutations with no fixed points) and surjections (onto functions), yielding compact formulas connected to $e$ and Stirling numbers.
It provides a direct combinatorial proof for the product formula of the Euler totient function $φ (n)$ .
In graph theory, it leads to a defining formula for the chromatic polynomial, expressing it as a sum over edge subsets weighted by $(- 1)^{∣ F ∣} k^{c (F)}$ .
The principle is a special case of the more general Möbius inversion on partially ordered sets, where the alternating signs are encoded by the poset's Möbius function.
