Ancestral lineages merge backward in time; the gene tree of a population sample
Looking backward in time from a sample of n gene copies in a diploid population of effective size N_e, the probability that two given lineages share a common ancestor in the immediately preceding generation is 1/(2N_e). Kingman (1982) showed that when the sample is much smaller than the population (formally, as N_e → ∞ with n fixed), the genealogy converges to the coalescent: while there are k lineages, the waiting time until the next coalescence is exponentially distributed with rate k(k−1)/2, where time is measured in units of 2N_e generations.
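The waiting-time description can be checked numerically. The following is a minimal sketch, not a full genealogy simulator: it draws only the inter-coalescence times and ignores tree topology. The function names, the sample size n = 10, the seed, and the number of replicates are illustrative assumptions. It compares the simulated mean time to the most recent common ancestor (TMRCA) with the sum of the mean waiting times, which works out to 2(1 − 1/n).

```python
import random

def expected_tmrca(n):
    """Expected TMRCA for n lineages, in units of 2*N_e generations."""
    # Mean waiting time while k lineages remain is 1 / (k(k-1)/2);
    # summing over k = 2..n gives 2 * (1 - 1/n).
    return sum(2.0 / (k * (k - 1)) for k in range(2, n + 1))

def simulate_waiting_times(n, rng):
    """Draw one set of waiting times {k: T_k} for k = n, ..., 2."""
    times = {}
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2.0          # coalescence rate with k lineages
        times[k] = rng.expovariate(rate)  # exponential waiting time
    return times

# Illustrative settings (assumptions, not from the text).
rng = random.Random(1)
n = 10
reps = 20000

mean_tmrca = sum(
    sum(simulate_waiting_times(n, rng).values()) for _ in range(reps)
) / reps

print("expected TMRCA: ", expected_tmrca(n))
print("simulated TMRCA:", mean_tmrca)
```

To convert either value to generations, multiply by 2N_e for whatever effective size is assumed.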
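Under the infinite-sites mutation model, mutations fall on the tree at rate μ per generation per lineage, and each mutation hits a previously unmutated site. Conditional on the genealogy, the number of segregating sites S is Poisson with mean (θ/2) × L, where θ = 4N_eμ and L is the total branch length in units of 2N_e generations; S is not Poisson once the genealogy is averaged over, but its expectation is E[S] = θ × H_{n−1}, where H_{n−1} = Σ_{i=1}^{n−1} 1/i is the (n−1)-th harmonic number. The site frequency spectrum (SFS) records ξ_i, the number of sites at which the mutant allele appears in exactly i of the n sampled copies. Under neutrality and constant population size, E[ξ_i] = θ/i for i = 1, …, n−1 (Tajima 1983). Deviations from this expected spectrum can signal selection, population expansion, or bottlenecks.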
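The neutral expectations are simple enough to compute directly. The sketch below uses hypothetical helper names and illustrative values of θ and n. It also includes the moment estimator of θ obtained by inverting E[S] = θ × H_{n−1}, commonly called Watterson's estimator, which is not discussed in the text above and is added here only for illustration.

```python
def harmonic(m):
    """m-th harmonic number H_m = 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / i for i in range(1, m + 1))

def expected_segregating_sites(theta, n):
    """E[S] = theta * H_{n-1} for a sample of n gene copies."""
    return theta * harmonic(n - 1)

def expected_sfs(theta, n):
    """Neutral expected SFS: E[xi_i] = theta / i for i = 1, ..., n-1."""
    return [theta / i for i in range(1, n)]

def watterson_theta(s, n):
    """Moment estimator of theta from an observed count s of segregating sites."""
    return s / harmonic(n - 1)

# Illustrative values (assumptions, not from the text).
theta, n = 5.0, 10
sfs = expected_sfs(theta, n)

print("E[S]:", expected_segregating_sites(theta, n))
print("expected SFS:", [round(x, 3) for x in sfs])
print("sum of SFS classes:", sum(sfs))  # matches E[S], up to floating-point rounding
```

Because every segregating site falls into exactly one frequency class, the expected SFS entries sum to E[S]; an observed spectrum can be compared against these values class by class.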