t-SNE — Dimensionality Reduction

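t-distributed Stochastic Neighbor Embedding (t-SNE) models pairwise similarities as Gaussians in the high-dimensional space and as Student t-distributions in the low-dimensional embedding. The heavy tails of the t-distribution give moderately distant points room to spread out, which relieves crowding. Watch clusters emerge as the optimization runs.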

Perplexity: 30
N points: 200
Clusters: 5
Input dimensions: 10
How t-SNE works:
1. Compute pairwise similarities Pᵢⱼ in high-D (Gaussian kernel; each point's bandwidth σᵢ is chosen so its neighbor distribution matches the target perplexity)

2. Initialize a random 2D embedding, then compute low-D similarities Qᵢⱼ between the embedded points using a Student t-distribution kernel

3. Gradient descent moves the embedded points to minimize the KL divergence KL(P‖Q)

4. The heavy tails of Q let moderately distant points sit farther apart in 2D, so separate clusters are not crushed together (the crowding problem)
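
The four steps above can be sketched in plain NumPy. This is a minimal illustration, not the code behind this demo: the function names (joint_p, low_dim_q, tsne), the learning rate, the iteration count, and the synthetic Gaussian-blob data are all assumptions chosen for the example. It also omits the momentum and early exaggeration that practical implementations usually add.

```python
import numpy as np

def joint_p(X, perplexity=30.0, tol=1e-4, max_steps=50):
    """Step 1: symmetric high-D similarities P, one Gaussian bandwidth per point."""
    n = X.shape[0]
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    target = np.log(perplexity)  # perplexity = exp(entropy), entropy in nats
    P = np.zeros((n, n))
    for i in range(n):
        d = np.delete(sq[i], i)           # squared distances to the other points
        lo, hi, beta = 0.0, np.inf, 1.0   # beta = 1 / (2 * sigma_i ** 2)
        for _ in range(max_steps):
            p = np.exp(-(d - d.min()) * beta)
            p /= p.sum()
            entropy = -np.sum(p * np.log(p + 1e-12))
            if abs(entropy - target) < tol:
                break
            if entropy > target:          # too flat: narrow the Gaussian
                lo = beta
                beta = beta * 2 if hi == np.inf else (lo + hi) / 2
            else:                         # too peaked: widen the Gaussian
                hi = beta
                beta = (lo + hi) / 2
        P[i, np.arange(n) != i] = p
    P = (P + P.T) / (2 * n)               # symmetrize and normalize to sum 1
    return np.maximum(P, 1e-12)

def low_dim_q(Y):
    """Step 2: low-D similarities Q from a Student t kernel (1 degree of freedom)."""
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    num = 1.0 / (1.0 + sq)
    np.fill_diagonal(num, 0.0)
    return np.maximum(num / num.sum(), 1e-12), num

def tsne(X, perplexity=30.0, n_iter=300, learning_rate=50.0, seed=0):
    """Steps 3 and 4: plain gradient descent on KL(P || Q)."""
    rng = np.random.default_rng(seed)
    P = joint_p(X, perplexity)
    Y = rng.normal(scale=1e-4, size=(X.shape[0], 2))  # random 2D start
    kl_history = []
    for _ in range(n_iter):
        Q, num = low_dim_q(Y)
        kl_history.append(float(np.sum(P * np.log(P / Q))))
        # Gradient: 4 * sum_j (p_ij - q_ij) * (y_i - y_j) / (1 + |y_i - y_j|^2)
        W = (P - Q) * num
        grad = 4.0 * (np.diag(W.sum(axis=1)) - W) @ Y
        Y -= learning_rate * grad
    return Y, kl_history

# Synthetic data matching the settings listed above: 200 points, 5 clusters, 10-D.
# How the demo generates its own data is not stated; Gaussian blobs are assumed here.
rng = np.random.default_rng(42)
centers = rng.normal(scale=5.0, size=(5, 10))
labels = rng.integers(0, 5, size=200)
X = centers[labels] + rng.normal(size=(200, 10))

Y, kl_history = tsne(X, perplexity=30.0)
print(f"KL at start: {kl_history[0]:.3f}, KL at end: {kl_history[-1]:.3f}")
```

Plotting Y colored by labels (for example with matplotlib) should show the groups separating as the KL value falls. The bisection in joint_p is one common way to find each σᵢ; any one-dimensional root finder on the entropy would serve the same purpose.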