[Interactive visualization: a 3-state Markov chain you can step through, with presets, adjustable speed and initial state, an editable transition matrix P, and analysis panels showing the steady-state distribution π, expected return times, state classification, and the n-step transition matrix P^n. A bar chart compares the current distribution with the steady state.]

What is a Markov chain?

A Markov chain is a stochastic process where the probability of moving to any future state depends only on the current state — not on the sequence of events that preceded it. This is the Markov property, or memorylessness. Formally, for states X_0, X_1, X_2, … the chain satisfies P(X_{n+1} = j | X_n = i, X_{n-1}, …, X_0) = P(X_{n+1} = j | X_n = i) = p_{ij}. The transition probabilities p_{ij} are collected into a transition matrix P, where each row sums to 1. Andrey Markov introduced these chains in 1906, originally to prove that the law of large numbers could apply to dependent random variables.
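The definition translates directly into code. The sketch below simulates one step of a hypothetical 3-state chain (the matrix entries are illustrative, not taken from the visualization): the next state is sampled using only the current state's row of P, which is exactly the Markov property.

```python
import random

# Hypothetical 3-state transition matrix; each row sums to 1.
P = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
]

def step(state, P):
    """Sample the next state using only the current state (the Markov property)."""
    r, cum = random.random(), 0.0
    for j, p in enumerate(P[state]):
        cum += p
        if r < cum:
            return j
    return len(P) - 1  # guard against floating-point round-off

# Sanity check: every row of a valid transition matrix sums to 1.
assert all(abs(sum(row) - 1.0) < 1e-9 for row in P)
```

Note that `step` never looks at how the chain arrived at `state`; the entire history is irrelevant to the next move.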

Steady-state distributions

If you run a Markov chain long enough, the probability distribution over states often converges to a fixed vector π that satisfies π = πP. This is the stationary, or steady-state, distribution. It is the left eigenvector of P corresponding to eigenvalue 1. For ergodic chains (irreducible and aperiodic), this distribution is unique and the chain converges to it from any starting distribution. The power iteration method computes it by repeatedly multiplying any initial distribution by P. In the visualization above, the bar chart converges to π as you step forward — watch the gold bars approach the dashed green lines.
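Power iteration is a few lines of code. This sketch (using the same hypothetical 3-state matrix as above) multiplies a distribution by P until it stops changing, at which point it satisfies π = πP to within tolerance:

```python
def power_iteration(P, dist, tol=1e-12):
    """Repeatedly apply dist <- dist * P until the change falls below tol."""
    n = len(P)
    while True:
        new = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new

# Hypothetical ergodic 3-state chain.
P = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.2, 0.2, 0.6]]

pi = power_iteration(P, [1/3, 1/3, 1/3])
# pi now satisfies pi = pi * P (the fixed point of the iteration)
```

Because the chain is ergodic, starting from any other initial distribution (say `[1, 0, 0]`) converges to the same π.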

Absorbing chains

A state is absorbing if, once entered, it cannot be left: p_{ii} = 1. A chain with at least one absorbing state and the property that every non-absorbing state can eventually reach an absorbing state is called an absorbing chain. In such chains, the system is guaranteed to eventually be absorbed. The classic example is the gambler’s ruin: a gambler repeatedly bets, and the game ends when they go broke (or reach a target). The probability of being absorbed into each absorbing state, and the expected number of steps before absorption, can be computed from the fundamental matrix N = (I − Q)^{-1}, where Q is the sub-matrix of transitions among transient states.
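As a concrete check, here is a sketch of the fundamental-matrix calculation for a small fair gambler's ruin (states 0..4, with 0 and 4 absorbing). Rather than inverting I − Q, it solves (I − Q)t = 1 directly, which gives the expected steps to absorption t = N·1; the Gaussian-elimination solver is a minimal helper written for this example.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Fair gambler's ruin on states 0..4; 0 and 4 absorb.
# Q holds transitions among the transient states 1, 2, 3.
Q = [[0.0, 0.5, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
I_minus_Q = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(3)] for i in range(3)]

# Expected steps to absorption: t = N * 1 = (I - Q)^{-1} * 1.
t = solve(I_minus_Q, [1.0, 1.0, 1.0])
# For a fair game with target 4, theory gives t_i = i(4 - i): [3, 4, 3]
```

The closed form i(N − i) for the fair game is a standard result, so the solver's output doubles as a correctness check.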

PageRank and the web

Larry Page and Sergey Brin’s insight was to model a web surfer as a Markov chain. Each web page is a state; the links on it define transition probabilities (uniformly distributed among outgoing links). The stationary distribution of this chain ranks pages by their “importance” — a page is important if important pages link to it. To ensure the chain is ergodic (so a unique stationary distribution exists), a damping factor α ≈ 0.85 is added: with probability α the surfer follows a link, and with probability 1 − α they jump to a random page. This simple Markov model was the foundation of Google’s original search algorithm.
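The damped-surfer chain can be sketched in a few lines. This toy version (a hypothetical 4-page web, not Google's production algorithm) runs power iteration on the PageRank chain: with probability α the surfer follows a uniformly chosen outgoing link, otherwise they teleport to a random page, and dangling pages are treated as linking to everything so probability is conserved.

```python
def pagerank(links, alpha=0.85, iters=100):
    """Power iteration on the damped random-surfer chain.

    links[i] lists the pages that page i links to."""
    n = len(links)
    r = [1.0 / n] * n
    for _ in range(iters):
        new = [(1 - alpha) / n] * n  # teleportation mass, spread uniformly
        for i, outs in enumerate(links):
            if outs:
                share = alpha * r[i] / len(outs)
                for j in outs:
                    new[j] += share
            else:
                # Dangling page: treat it as linking to every page.
                for j in range(n):
                    new[j] += alpha * r[i] / n
        r = new
    return r

# Hypothetical 4-page web: 0 -> {1, 2}, 1 -> {2}, 2 -> {0}, 3 -> {0, 2}.
ranks = pagerank([[1, 2], [2], [0], [0, 2]])
```

In this little graph, page 2 collects links from three pages and ends up with the highest rank; page 3, which nothing links to, gets only its teleportation share.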

Connection to random walks and diffusion

A random walk on a graph is a Markov chain where the walker moves to a uniformly random neighbor at each step. The mixing time measures how many steps the walk needs to approach its stationary distribution. On a complete graph, mixing is instant. On a line or cycle, it takes O(n²) steps. The Ehrenfest model — gas molecules diffusing between two chambers — is a classic example: the chain models the number of molecules in one chamber, and the stationary distribution is binomial. The connection between Markov chains and physical diffusion runs deep: the master equation governing the chain’s evolution is the discrete analogue of the heat equation.
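The Ehrenfest claim can be verified directly. This sketch (with an arbitrary choice of M = 10 molecules) writes down the chain's transition probabilities and checks detailed balance for the Binomial(M, 1/2) distribution; detailed balance implies the distribution is stationary.

```python
from math import comb

M = 10  # number of molecules (arbitrary choice for this example)

# Ehrenfest chain: from state k (molecules in chamber A), one of the M
# molecules is picked uniformly and moved to the other chamber.
def p(k, j):
    if j == k - 1:
        return k / M          # a chamber-A molecule was picked
    if j == k + 1:
        return (M - k) / M    # a chamber-B molecule was picked
    return 0.0

# Claimed stationary distribution: Binomial(M, 1/2).
pi = [comb(M, k) / 2**M for k in range(M + 1)]

# Detailed balance: pi_k p(k, k+1) == pi_{k+1} p(k+1, k) for all k,
# which is sufficient for pi = pi P.
assert all(abs(pi[k] * p(k, k + 1) - pi[k + 1] * p(k + 1, k)) < 1e-12
           for k in range(M))
```

Note the chain is periodic (the parity of k alternates), so the distribution of X_n does not converge pointwise; the binomial π is still the unique stationary distribution and the long-run fraction of time spent in each state.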