In-Context Learning — Implicit Gradient Descent
Tasks: linear regression, binary classification, translation pattern
Shots: 4
Akyürek et al. (2022): Transformers can implement gradient descent on linear models in their forward pass, using attention over the in-context examples.
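
For the linear regression task, the link between gradient descent and attention can be made concrete with a small sketch. It is not the construction from Akyürek et al.; it only illustrates a standard identity: one gradient step from zero weights on the squared loss gives the same prediction as an unnormalized (softmax-free) attention readout with the shots' inputs as keys and their targets as values. The function names, the learning rate, and the random data are assumptions made for illustration.

```python
import numpy as np

def gd_one_step_prediction(X, y, x_query, lr):
    """Predict after one gradient step from w = 0 on (1/2N) * sum((w @ x_i - y_i)**2)."""
    n = X.shape[0]
    w0 = np.zeros(X.shape[1])
    grad = X.T @ (X @ w0 - y) / n   # gradient of the mean squared loss at w0
    w1 = w0 - lr * grad             # single gradient descent update
    return w1 @ x_query

def linear_attention_prediction(X, y, x_query, lr):
    """Softmax-free attention: keys = x_i, values = y_i, query = x_query."""
    n = X.shape[0]
    scores = X @ x_query            # key-query dot products, one per shot
    return (lr / n) * (scores @ y)  # score-weighted sum of the values

def demo(n_shots=4, dim=3, lr=0.1, seed=0):
    """Build a random in-context regression task (hypothetical data) and compare both predictors."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=dim)
    X = rng.normal(size=(n_shots, dim))   # in-context inputs (the "shots")
    y = X @ w_true                        # in-context targets
    x_query = rng.normal(size=dim)
    return (gd_one_step_prediction(X, y, x_query, lr),
            linear_attention_prediction(X, y, x_query, lr))

if __name__ == "__main__":
    gd_pred, attn_pred = demo()
    print(np.isclose(gd_pred, attn_pred))
```

Both functions compute (lr / N) * sum_i y_i (x_i · x_query), so they agree up to floating-point error for any number of shots. Adding a shot adds one more key-value pair to the attention sum, which corresponds to one more term in the gradient.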