Stochastic Block Model — Spectral Gap & Recovery

When does the spectral algorithm detect planted communities?

Adjacency matrix (sorted by true community)

Eigenvalue spectrum of normalized Laplacian

Recovery: true (color) vs recovered (shape)

Kesten-Stigum threshold: SNR = ?  (KS: SNR > 1 required)  |  Overlap: ?  |  Gap: ?

Spectral Community Detection

The Stochastic Block Model (SBM) is the canonical planted partition model: n nodes split into k equal communities. Edges within communities form with probability p_in, between communities with p_out < p_in.

The Kesten-Stigum (KS) threshold (sparse regime): when (p_in − p_out)²n > k(p_in + (k−1)p_out), recovery above chance is information-theoretically possible and spectral methods succeed. Below KS, no efficient algorithm can do better than chance.

The normalized Laplacian L = D⁻¹/²AD⁻¹/² has its top-k eigenvectors encode community structure. When SNR > KS, a spectral gap opens between eigenvalue k and k+1 — the algorithm reads off communities from eigenvector signs/clusters via k-means.

Abbe (2018): the information-theoretic and computational thresholds coincide for symmetric SBM with k=2 (no gap). For k≥5 communities, a hard phase may exist (open conjecture).