Kingman's Coalescent — Genealogy & Neutral Evolution

Ancestral lineages merge backward in time, forming the gene tree of a population sample

Coalescent Statistics

TMRCA (in units of 2N_e generations), E[TMRCA] (theory), total branch length, number of mutations, segregating sites S, Tajima's D (estimate)

Kingman's Coalescent (1982)

Looking backward in time from a sample of n gene copies in a diploid population of effective size N_e, the probability that two given lineages share a common ancestor in the immediately preceding generation is 1/(2N_e). Kingman (1982) showed that when the sample is small relative to the population (n ≪ N_e, formally N_e → ∞ with n fixed), the genealogy converges to the coalescent: while there are k lineages, the waiting time T_k until the next coalescence is exponentially distributed with rate k(k−1)/2, with time measured in units of 2N_e generations.
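The waiting-time description above translates directly into a simulation. The following is a minimal sketch, not the code behind this page: the function names, the seed, the sample size n = 10, and the replicate count are all illustrative assumptions. It draws one exponential waiting time per coalescence level and averages the TMRCA and total branch length over many replicates.

```python
import random

def simulate_coalescent_times(n, rng):
    """Draw waiting times T_n, ..., T_2 in units of 2*N_e generations.

    While k lineages remain, any of the k*(k-1)/2 pairs can coalesce,
    so the waiting time is exponential with that rate.
    """
    times = {}
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2
        times[k] = rng.expovariate(rate)
    return times

def tmrca_and_total_length(times):
    """TMRCA is the sum of the T_k; total branch length weights each T_k by k."""
    tmrca = sum(times.values())
    total_length = sum(k * t for k, t in times.items())
    return tmrca, total_length

# Illustrative settings (assumptions, not values taken from the page).
rng = random.Random(42)
n = 10
reps = 20000

results = [tmrca_and_total_length(simulate_coalescent_times(n, rng))
           for _ in range(reps)]
mean_tmrca = sum(r[0] for r in results) / reps
mean_length = sum(r[1] for r in results) / reps

print(f"mean TMRCA        = {mean_tmrca:.3f} (units of 2N_e generations)")
print(f"mean total length = {mean_length:.3f} (units of 2N_e generations)")
```

With n = 10 the sample mean of the TMRCA should settle near 2(1 − 1/n) = 1.8 coalescent units. Only the number of lineages matters for the waiting times, so the tree topology does not need to be tracked here.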

E[T_k] = 2N_e × 2/(k(k−1)) generations
E[TMRCA] = 2N_e × 2(1 − 1/n) generations
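The closed form for E[TMRCA] follows from summing the expected waiting times and telescoping the partial fractions. A short derivation, written in units of 2N_e generations (multiply by 2N_e to convert to generations); the symbol L for total branch length is introduced here for convenience:

```latex
\begin{align*}
\mathbb{E}[T_{\mathrm{MRCA}}]
  &= \sum_{k=2}^{n} \mathbb{E}[T_k]
   = \sum_{k=2}^{n} \frac{2}{k(k-1)} \\
  &= 2 \sum_{k=2}^{n} \left( \frac{1}{k-1} - \frac{1}{k} \right)
   = 2\left(1 - \frac{1}{n}\right), \\[4pt]
\mathbb{E}[L]
  &= \sum_{k=2}^{n} k\,\mathbb{E}[T_k]
   = \sum_{k=2}^{n} \frac{2}{k-1}
   = 2 H_{n-1}.
\end{align*}
```

The expected TMRCA therefore never exceeds 2 coalescent units (4N_e generations) however large the sample, and the final coalescence T_2 alone contributes 1 unit. Total branch length, by contrast, grows with the harmonic number H_{n−1}.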

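Under the infinite-sites mutation model, mutations fall on the tree at rate μ per lineage per generation, which is θ/2 per lineage per unit of coalescent time, where θ = 4N_eμ. Conditional on the genealogy, the number of segregating sites S is Poisson with mean θL/2, where L is the total branch length in units of 2N_e generations. Averaging over genealogies gives E[S] = θ × H_{n−1}, where H_{n−1} = Σ_{i=1}^{n−1} 1/i is the (n−1)-th harmonic number; the marginal distribution of S is overdispersed relative to a Poisson because L itself varies. The site frequency spectrum (SFS) records how many mutations appear in exactly i of the n sampled copies; under neutrality and constant population size, E[ξ_i] = θ/i (Tajima 1983). Deviations from this spectrum can signal selection, population expansion, or bottlenecks.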
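The neutral expectations for S and the SFS can be checked by dropping mutations onto simulated genealogies. The sketch below is illustrative only: the helper names, the seed, and the values n = 8 and θ = 5 are assumptions. It tracks how many sampled copies descend from each lineage, so that a mutation on a lineage of size s is recorded as appearing in s copies.

```python
import random

def poisson(rng, mean):
    """Draw a Poisson(mean) count by counting unit-rate arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def simulate_sfs(n, theta, rng):
    """Simulate one neutral genealogy with infinite-sites mutations.

    Returns a list xi of length n where xi[i] (1 <= i <= n-1) is the number
    of mutations carried by exactly i of the n sampled copies.
    """
    xi = [0] * n
    sizes = [1] * n  # sampled copies descending from each current lineage
    while len(sizes) > 1:
        k = len(sizes)
        t_k = rng.expovariate(k * (k - 1) / 2)  # units of 2*N_e generations
        for s in sizes:
            # theta/2 mutations per lineage per unit of coalescent time
            xi[s] += poisson(rng, theta / 2 * t_k)
        i, j = rng.sample(range(k), 2)  # the pair of lineages that coalesces
        merged = sizes[i] + sizes[j]
        sizes = [s for idx, s in enumerate(sizes) if idx not in (i, j)]
        sizes.append(merged)
    return xi

# Illustrative settings (assumptions, not values taken from the page).
rng = random.Random(1)
n, theta, reps = 8, 5.0, 5000

totals = [0] * n
for _ in range(reps):
    xi = simulate_sfs(n, theta, rng)
    for i in range(1, n):
        totals[i] += xi[i]

mean_sfs = [totals[i] / reps for i in range(1, n)]  # mean_sfs[i-1] estimates E[xi_i]
mean_S = sum(mean_sfs)
harmonic = sum(1 / i for i in range(1, n))          # H_{n-1}

for i, m in enumerate(mean_sfs, start=1):
    print(f"i={i}: simulated {m:.3f}, neutral expectation {theta / i:.3f}")
print(f"mean S = {mean_S:.3f}, theta * H_(n-1) = {theta * harmonic:.3f}")
```

The simulated averages should track θ/i across frequency classes, with singletons the most common class. Changing the waiting-time rates, for example to mimic population growth, would be one way to see the spectrum shift away from this neutral baseline.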