CONTRASTIVE LEARNING

Self-supervised representation learning by attracting positives, repelling negatives

Contrastive learning (e.g., SimCLR, MoCo) learns representations without labels by pulling together positive pairs (two augmented views of the same image) and pushing apart negative pairs (views of different images). For a positive pair (i, j), the NT-Xent loss is L_i = −log [ exp(sim(z_i, z_j)/τ) / ∑_{k≠i} exp(sim(z_i, z_k)/τ) ], where sim is cosine similarity and τ is the temperature. A low temperature sharpens the softmax distribution and forces tighter clustering. The key insight from the alignment-and-uniformity analysis (Wang & Isola, 2020) is that good representations are aligned on positive pairs AND uniformly distributed on the hypersphere. The left panel shows the embedding space evolving: same-color points cluster together while different classes repel.
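The NT-Xent loss above can be sketched in NumPy. This is a minimal illustration, not the SimCLR reference implementation; it assumes a hypothetical batch layout where row i and row i+N are the two views of the same image:

```python
import numpy as np

def nt_xent_loss(z, tau=0.2):
    """NT-Xent loss for 2N embeddings; z[i] and z[i+N] are positives
    (assumed layout for this sketch)."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit vectors -> dot product = cosine sim
    sim = (z @ z.T) / tau                             # pairwise similarities scaled by temperature
    n = len(z) // 2
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # index of each row's positive
    np.fill_diagonal(sim, -np.inf)                    # exclude k = i from the denominator sum
    log_prob = sim[np.arange(2 * n), pos] - np.log(np.exp(sim).sum(axis=1))
    return -log_prob.mean()
```

Because the loss is a negative log-softmax, it is always positive, and it shrinks as positives become more similar than negatives at the given temperature.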
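The alignment and uniformity properties can also be measured directly. A sketch of the two metrics from Wang & Isola (2020), written for unit-normalized embeddings (the default exponents alpha=2 and t=2 follow the paper; lower is better for both):

```python
import numpy as np

def alignment(z1, z2, alpha=2):
    # Mean distance between positive pairs; 0 means perfectly aligned.
    return (np.linalg.norm(z1 - z2, axis=1) ** alpha).mean()

def uniformity(z, t=2):
    # Log of the mean Gaussian potential over all distinct pairs;
    # more negative means the points are spread more uniformly.
    sq_dists = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)
    mask = ~np.eye(len(z), dtype=bool)  # drop self-pairs
    return np.log(np.exp(-t * sq_dists[mask]).mean())
```

A fully collapsed embedding (all points identical) gives uniformity 0, the worst possible value, which is why the uniformity term counteracts the collapse that pure alignment would otherwise reward.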