Self-Attention Visualizer

Edit tokens • Adjust Q/K/V weights • See attention heatmap and output

Tokens

Attention Heatmap

Rows = query tokens • Cols = key tokens • Brighter = higher attention
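
A minimal sketch of how one heatmap row could be produced, assuming the visualizer normalizes each query's scores with a softmax over the keys. The `scores` values are hypothetical and chosen only for illustration; they are not taken from the visualizer.

```python
import math

def softmax(row):
    """Turn one row of raw scores into weights that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for 3 tokens (rows = queries, cols = keys).
scores = [
    [2.0, 0.5, 0.1],
    [0.3, 1.5, 0.3],
    [0.0, 0.0, 0.0],
]

# Each row is normalized independently, so each row of the heatmap sums to 1.
heatmap = [softmax(row) for row in scores]

for row in heatmap:
    print(" ".join(f"{w:.2f}" for w in row))
```

A row of identical scores (the third row here) gives uniform attention, which would show as an evenly lit row in the heatmap.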

Output Representation

Each token's attended output (weighted sum of values)

Q/K/V Weight Matrices

Head 1
Head 2
Query (Q) weights — click to perturb
Key (K) weights
Value (V) weights
d_model=4, d_k=4 • Attention(Q,K,V)=softmax(QKᵀ/√d_k)V
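
The formula above can be sketched for a single head in plain Python. This is an illustration under stated assumptions, not the visualizer's actual code: the helper names (`matmul`, `transpose`, `self_attention`), the token embeddings `x`, and the identity weight matrices are all hypothetical. Only d_model=4 and d_k=4 come from the text.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    """Return (attention weights, output) for one head."""
    q = matmul(x, w_q)  # queries: one row per token
    k = matmul(x, w_k)  # keys
    v = matmul(x, w_v)  # values
    d_k = len(k[0])
    scores = matmul(q, transpose(k))  # QK^T
    # Scale by sqrt(d_k), then softmax each query row over the keys.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Each output row is a weighted sum of the value rows.
    return weights, matmul(weights, v)

# Hypothetical setup: 3 tokens, d_model = d_k = 4, identity Q/K/V weights.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
x = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]

weights, output = self_attention(x, identity, identity, identity)
for row in weights:
    print(" ".join(f"{w:.2f}" for w in row))
```

Changing an entry of `w_q` here plays the same role as perturbing a Q weight in the panel: the scores change, so the attention weights and the output change with them.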