Rotary Positional Embedding (RoPE)

0
1000
RoPE: multiply query/key by rotation matrix R(mθ) — dot product depends only on relative position m−n