
Linear Algebra: Markov Chains and Stochastic Matrices


Markov chains provide a powerful framework for modeling systems that transition randomly between states, from predicting weather patterns to ranking web pages. At their core, they transform complex probabilistic dynamics into a problem of linear algebra, where matrices become tools for forecasting long-term behavior. Mastering this connection equips you to analyze everything from game algorithms to population genetics with mathematical precision.

Probability Vectors and Stochastic Matrices

The foundation of any Markov model is the concept of state. A probability vector is a column vector whose entries are non-negative and sum to 1. Each entry x_i represents the probability of being in state i. For example, if a weather model has states "Sunny" and "Rainy," the vector x = (0.7, 0.3) means a 70% chance of sun and a 30% chance of rain now.

The engine of change is the stochastic matrix (or transition matrix) P. This is a square matrix in which every column is a probability vector. The entry P_ij represents the probability of moving to state i from state j in one step. Crucially, columns sum to 1. If our weather model states that a sunny day is followed by another sunny day 80% of the time and a rainy day 20% of the time, while a rainy day is followed by a sunny day 50% of the time and rain 50% of the time, the stochastic matrix is:

P = [ 0.8  0.5 ]
    [ 0.2  0.5 ]

Here, P_11 = 0.8 is the probability of going from Sunny (column 1) to Sunny (row 1).
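As a quick sketch in plain Python (using the Sunny/Rainy example and the column convention above), the matrix can be stored as nested lists and sanity-checked:

```python
# Column-stochastic weather transition matrix: P[i][j] = Pr(next = i | current = j).
# State 0 = Sunny, state 1 = Rainy.
P = [
    [0.8, 0.5],  # row 0: probabilities of moving TO Sunny
    [0.2, 0.5],  # row 1: probabilities of moving TO Rainy
]

# Every column of a column-stochastic matrix must sum to 1.
for j in range(2):
    col_sum = sum(P[i][j] for i in range(2))
    assert abs(col_sum - 1.0) < 1e-12, f"column {j} sums to {col_sum}"
print("all columns sum to 1")
```

The column check catches the most common modeling error (rows and columns swapped) before any forecasting is done.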

State Transition Diagrams and Step-by-Step Evolution

A state transition diagram is a weighted graph that visually represents the Markov chain. Each state is a node, and directed edges show possible transitions, labeled with their probabilities. For our matrix P, the diagram has two nodes (Sunny, Rainy). An edge from Sunny to Sunny has weight 0.8, from Sunny to Rainy has weight 0.2, from Rainy to Sunny has weight 0.5, and from Rainy to Rainy has weight 0.5. This diagram makes the process intuitive and helps verify that outgoing probabilities from each node sum to 1.

To compute the probability distribution after one step, you use matrix multiplication. If the current state is given by probability vector x_k, then the distribution after one step is x_{k+1} = P x_k. Starting with a sunny day, x_0 = (1, 0). The forecast for tomorrow is:

x_1 = P x_0 = (0.8, 0.2)

For the day after tomorrow, you apply P again: x_2 = P x_1 = P^2 x_0 = (0.74, 0.26). This reveals the core principle: the probability distribution after k steps is given by x_k = P^k x_0. Computing long-term behavior via matrix powers becomes a central task, often done by diagonalization or finding the steady state directly.
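The step-by-step evolution can be sketched in a few lines of plain Python (assuming the same 2x2 weather matrix):

```python
# Weather transition matrix, column convention: P[i][j] = Pr(next = i | current = j).
P = [[0.8, 0.5],
     [0.2, 0.5]]

def step(P, x):
    """One Markov step: the matrix-vector product P x."""
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]

x0 = [1.0, 0.0]   # certainly sunny today
x1 = step(P, x0)  # tomorrow: [0.8, 0.2]
x2 = step(P, x1)  # day after tomorrow: approximately [0.74, 0.26]
print(x1, x2)
```

Repeated calls to `step` compute x_k = P^k x_0 without ever forming the matrix power explicitly.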

Steady-State Distributions

For many Markov chains, as k grows large, P^k converges to a matrix where all columns are identical. This common column is the steady-state distribution, often denoted π. It represents the long-run probability of being in each state, independent of the starting point. Mathematically, π satisfies Pπ = π, making it an eigenvector of P with eigenvalue 1. It is also a probability vector (entries sum to 1).

To find π = (s, r) for our weather model, solve Pπ = π. This gives the system:

0.8s + 0.5r = s
0.2s + 0.5r = r

Both equations simplify to 0.2s = 0.5r, or s = 2.5r. Using the probability condition s + r = 1, we substitute to get 2.5r + r = 1, so r = 2/7 ≈ 0.286. Thus, s = 5/7 ≈ 0.714 and π = (5/7, 2/7). In the long run, about 71.4% of days are sunny.
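The same steady state can be approximated numerically by power iteration, i.e., applying P repeatedly to any starting distribution (a minimal sketch; the exact answer is π = (5/7, 2/7)):

```python
# Weather matrix, column convention: P[i][j] = Pr(next = i | current = j).
P = [[0.8, 0.5],
     [0.2, 0.5]]

# Power iteration: apply P repeatedly until the distribution stops changing.
x = [0.5, 0.5]  # any probability vector works as a starting point
for _ in range(100):
    x = [sum(P[i][j] * x[j] for j in range(2)) for i in range(2)]

print(x)  # converges toward [5/7, 2/7], roughly [0.714, 0.286]
```

Convergence is fast here because the second eigenvalue of P is 0.3, so the error shrinks by a factor of 0.3 per step.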

Absorbing Markov Chains and Their Analysis

An absorbing Markov chain has at least one absorbing state—a state that, once entered, cannot be left (its transition probability to itself is 1). Other states are called transient. A classic example is a simple gambling game where you go broke or reach a target wealth; both are absorbing states.

To analyze these, we rearrange the transition matrix into canonical form:

P = [ I  S ]
    [ 0  Q ]

Here, I is an identity matrix for the absorbing states, Q governs transitions between transient states, and S covers transitions from transient to absorbing states. The fundamental matrix N = (I − Q)⁻¹ is key. Its entry N_ij gives the expected number of visits to transient state i, starting from transient state j, before absorption. Furthermore, B = SN gives the probabilities of being absorbed in each absorbing state.
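A minimal sketch of this analysis, using a hypothetical gambler's-ruin chain (fair coin flips, absorbing states $0 and $3, transient states $1 and $2) and the 2x2 inverse formula for N:

```python
# Gambler's ruin, fair coin. Transient states: $1 (index 0) and $2 (index 1).
# Column convention: Q[i][j] = Pr(transient j -> transient i) in one step.
Q = [[0.0, 0.5],
     [0.5, 0.0]]
# S[a][j] = Pr(transient j -> absorbing a); absorbing states: $0 then $3.
S = [[0.5, 0.0],
     [0.0, 0.5]]

# Fundamental matrix N = (I - Q)^{-1}, via the 2x2 inverse formula.
a, b = 1 - Q[0][0], -Q[0][1]
c, d = -Q[1][0], 1 - Q[1][1]
det = a * d - b * c
N = [[d / det, -b / det],
     [-c / det, a / det]]

# Absorption probabilities B = S N; B[a][j] = Pr(absorbed in a | start at j).
B = [[sum(S[i][k] * N[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

print(B[0][0])  # starting with $1: go broke with probability ~2/3
print(B[1][0])  # starting with $1: reach $3 with probability ~1/3
```

These match the classical gambler's-ruin result: with a fair coin, the chance of reaching the target is your current fortune divided by the target.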

Applications to Web Ranking and Population Modeling

The power of Markov chains shines in diverse applications. Web ranking, as in the original PageRank algorithm, models a "random surfer" who clicks links at random. Each webpage is a state. The stochastic matrix has entries P_ij representing the probability of moving from page j to page i. A steady-state vector π then ranks pages by their long-run visitation probability, interpreted as importance. To handle pages with no outgoing links (absorbing states), the model adds a "teleportation" factor, ensuring a unique steady state.
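A toy sketch of the idea (the three-page link structure and the damping value 0.85 are illustrative assumptions, not real PageRank data):

```python
# Hypothetical 3-page web; column j spreads page j's rank over its out-links.
# Page 0 links to 1 and 2; page 1 links to 2; page 2 links to 0.
P = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]

d = 0.85          # damping: with probability 1 - d the surfer teleports anywhere
n = 3
rank = [1 / n] * n  # start from the uniform distribution

# Power iteration on the teleportation-adjusted (ergodic) chain.
for _ in range(100):
    rank = [(1 - d) / n + d * sum(P[i][j] * rank[j] for j in range(n))
            for i in range(n)]

print(rank)  # long-run importance scores; the entries sum to 1
```

The (1 - d)/n term is the teleportation fix described above: it makes every page reachable from every other, so a unique steady state exists.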

In population modeling, Markov chains can track genetic traits, species migration, or disease states. For instance, a population might be divided into "Susceptible," "Infected," and "Recovered" compartments in a simplified SIR model. A stochastic matrix can model the weekly probabilities of individuals moving between these health states. Analyzing the steady state or absorption times (like time until an epidemic ends) provides critical insights for public health planning.
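As an illustration, a stripped-down two-transient-state version of this setup (the weekly transition probabilities are made-up numbers) lets the fundamental matrix answer "how many weeks until recovery?":

```python
# Hypothetical weekly SIR-style chain with Recovered absorbing.
# Transient states: Susceptible (0), Infected (1). Q[i][j] = Pr(j -> i) per week.
Q = [[0.9, 0.0],   # stay Susceptible with probability 0.9
     [0.1, 0.6]]   # S -> I with probability 0.1; stay Infected with 0.6

# Fundamental matrix N = (I - Q)^{-1}, via the 2x2 inverse formula.
a, b = 1 - Q[0][0], -Q[0][1]
c, d = -Q[1][0], 1 - Q[1][1]
det = a * d - b * c
N = [[d / det, -b / det],
     [-c / det, a / det]]

# Expected weeks until absorption (recovery) = column sums of N.
weeks_from_S = N[0][0] + N[1][0]
weeks_from_I = N[0][1] + N[1][1]
print(weeks_from_S, weeks_from_I)  # ~12.5 weeks from S, ~2.5 from I
```

The 2.5-week figure agrees with the geometric-distribution mean 1/0.4 for leaving the Infected state, a useful sanity check on the matrix algebra.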

Common Pitfalls

  1. Misinterpreting matrix entries: The most common error is confusing rows and columns. Remember: in a stochastic matrix P, the entry P_ij is the probability of going to state i (row) from state j (column). Writing the matrix with rows as "from" and columns as "to" will invert all your calculations.
  • Correction: Always define your matrix explicitly as P with P_ij = Pr(next state = i | current state = j). Verify that each column sums to 1.
  2. Assuming a unique steady state always exists: Not all Markov chains have a single, well-defined steady-state distribution. Chains that are periodic or have multiple disconnected communicating classes may not converge.
  • Correction: Check if the chain is ergodic (irreducible and aperiodic). For an ergodic chain, a unique steady-state vector π is guaranteed. For absorbing chains, look to the fundamental matrix and absorption probabilities instead.
  3. Incorrectly solving for the steady state: When solving Pπ = π, it's tempting to treat it as a standard homogeneous system (P − I)π = 0 and expect a unique answer. However, P − I is always singular (its columns sum to zero), so the system has infinitely many solutions.
  • Correction: Use (P − I)π = 0 along with the normalization condition π_1 + π_2 + ⋯ + π_n = 1. Replace one of the equations from (P − I)π = 0 with this normalization condition to get a unique, invertible system.
  4. Neglecting to check for absorption in applications: When modeling real-world systems like web graphs, failing to account for dangling nodes (pages with no links) can create absorbing states and break the model.
  • Correction: As in PageRank, incorporate a damping factor or teleportation probability. This adds a small chance to jump to any random page, making the adjusted transition matrix ergodic and ensuring a meaningful steady state.

Summary

  • Markov chains model systems that transition randomly between states using probability vectors to describe current state distributions and stochastic matrices (columns sum to 1) to define transition probabilities.
  • The state distribution after k steps is computed by x_k = P^k x_0, and long-term behavior is analyzed by finding the steady-state distribution π, an eigenvector of P with eigenvalue 1.
  • Absorbing Markov chains contain states that cannot be left; their analysis uses the canonical form and the fundamental matrix N = (I − Q)⁻¹ to calculate expected visits and absorption probabilities.
  • These tools are applied in foundational algorithms like web ranking (PageRank), where the steady-state vector determines page importance, and in population modeling for genetics, ecology, and epidemiology.
  • Success requires careful attention to the definition of the transition matrix, conditions for steady-state existence, and practical adjustments (like teleportation) to ensure models are robust and analytically tractable.
