Bias-Variance Tradeoff

The expected squared error splits into three parts: MSE = Bias² + Variance + Noise. As model complexity increases, bias falls but variance rises, which is the classic tradeoff before double descent.

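Decomposition: E[(ŷ−y)²] = (E[ŷ]−f*)² + E[(ŷ−E[ŷ])²] + σ², where ŷ is the model's prediction, f* is the true function value, and σ² is the noise variance. The three terms are bias², variance, and noise, in that order.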
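
The decomposition can be derived in two steps. The sketch below rests on assumptions the page does not spell out: the observation is y = f* + ε, the noise ε has mean zero and variance σ², and ε is independent of the fitted prediction ŷ. Expectations are taken over training sets and noise, at a fixed input.

```latex
% Requires amsmath and amssymb. Assumes y = f^* + \varepsilon with
% E[\varepsilon] = 0, Var(\varepsilon) = \sigma^2, \varepsilon independent of \hat{y}.
\begin{align*}
\mathbb{E}\big[(\hat{y}-y)^2\big]
  &= \mathbb{E}\big[(\hat{y}-f^*-\varepsilon)^2\big] \\
  &= \mathbb{E}\big[(\hat{y}-f^*)^2\big]
     - 2\,\mathbb{E}\big[(\hat{y}-f^*)\,\varepsilon\big]
     + \mathbb{E}\big[\varepsilon^2\big] \\
  &= \mathbb{E}\big[(\hat{y}-f^*)^2\big] + \sigma^2, \\[4pt]
\mathbb{E}\big[(\hat{y}-f^*)^2\big]
  &= \mathbb{E}\big[(\hat{y}-\mathbb{E}[\hat{y}] + \mathbb{E}[\hat{y}]-f^*)^2\big] \\
  &= \mathbb{E}\big[(\hat{y}-\mathbb{E}[\hat{y}])^2\big]
     + 2\,(\mathbb{E}[\hat{y}]-f^*)\,\mathbb{E}\big[\hat{y}-\mathbb{E}[\hat{y}]\big]
     + (\mathbb{E}[\hat{y}]-f^*)^2 \\
  &= \underbrace{\mathbb{E}\big[(\hat{y}-\mathbb{E}[\hat{y}])^2\big]}_{\text{variance}}
     + \underbrace{(\mathbb{E}[\hat{y}]-f^*)^2}_{\text{bias}^2}.
\end{align*}
```

Both cross terms vanish: the first because ε has mean zero and is independent of ŷ, the second because E[ŷ−E[ŷ]] = 0.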
Left plot: Polynomial fits to several different datasets (thin lines) and their mean (bold). The spread of the thin lines around the mean is the variance; the gap between the mean and the true function is the bias.
Right plot: Bias², variance, and total error as functions of polynomial degree. The minimum of the U-shaped total error curve marks the optimal complexity.
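
The experiment behind the two plots can be approximated numerically. The following is a minimal sketch, not the page's own code. The ground-truth function `true_fn`, the fixed grid of inputs, the sample sizes, and the noise level are all assumptions chosen for illustration. It fits a polynomial of a given degree to many noisy datasets, then estimates bias² from the mean fit and variance from the spread of the fits around that mean.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def true_fn(x):
    # Assumed ground truth f*; the page does not say which function it uses.
    return np.sin(2 * np.pi * x)


def bias_variance(degree, n_datasets=200, n_points=30, sigma=0.1, seed=0):
    """Estimate (bias^2, variance, noise) for polynomial fits of one degree.

    Inputs are a fixed, evenly spaced grid on [0, 1] (an assumption); only
    the noise changes from one dataset to the next.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    f_star = true_fn(x)

    preds = np.empty((n_datasets, n_points))
    for i in range(n_datasets):
        y = f_star + rng.normal(0.0, sigma, n_points)  # one noisy dataset
        coeffs = P.polyfit(x, y, degree)               # least-squares fit
        preds[i] = P.polyval(x, coeffs)                # one "thin line"

    mean_pred = preds.mean(axis=0)                     # the "bold" mean fit
    bias_sq = np.mean((mean_pred - f_star) ** 2)       # gap from the truth
    variance = np.mean(preds.var(axis=0))              # spread around the mean
    return bias_sq, variance, sigma ** 2


if __name__ == "__main__":
    for degree in (1, 3, 5, 9):
        b, v, n = bias_variance(degree)
        print(f"degree={degree}: bias^2={b:.4f} variance={v:.4f} "
              f"noise={n:.4f} total={b + v + n:.4f}")
```

With a fixed design and least-squares fitting, the average variance at the training inputs is σ²(d+1)/n for degree d and n points, so the variance term grows linearly with degree in this setup while bias² shrinks as the polynomial becomes flexible enough to follow the assumed sine curve.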