TD(0) Learning — Iris Lab

States N: 7
α (TD step): 0.10
α_MC (MC step): 0.05
γ: 1.00
Speed: 5 ep/frame

Episodes: 0
TD RMSE: —
MC RMSE: —

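Random walk over N non-terminal states with absorbing boundaries: exiting on the left yields reward 0, exiting on the right yields reward 1. With γ = 1 the true values are linear in the state index: V(i) = i/(N+1) for i = 1, …, N, running from 1/(N+1) to N/(N+1).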
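
The comparison this demo runs can be sketched offline in Python. This is a minimal sketch under stated assumptions, not the page's actual implementation: the function names, the middle start state, the initial value of 0.5, the equal left/right step probabilities, and the every-visit form of Monte Carlo are all illustrative choices that the page does not specify. Only N, α, α_MC, γ, and the reward layout come from the settings above.

```python
import random

def true_values(n):
    """Closed-form values for the walk with gamma = 1: V(i) = i / (n + 1)."""
    return [i / (n + 1) for i in range(1, n + 1)]

def run_episode(n, rng):
    """Walk from the middle state (assumed start) until absorption.

    Returns the visited non-terminal states (0-indexed) and the terminal
    reward: 0.0 for exiting left, 1.0 for exiting right.
    """
    s = n // 2
    visited = []
    while 0 <= s < n:
        visited.append(s)
        s += 1 if rng.random() < 0.5 else -1  # assumed 50/50 step
    return visited, (1.0 if s == n else 0.0)

def td0_update(v, visited, reward, alpha, gamma=1.0):
    """TD(0): move V(s) toward r + gamma * V(s'), one step at a time."""
    for t, s in enumerate(visited):
        if t + 1 < len(visited):
            target = gamma * v[visited[t + 1]]  # intermediate reward is 0
        else:
            target = reward                     # terminal state has value 0
        v[s] += alpha * (target - v[s])

def mc_update(v, visited, reward, alpha, gamma=1.0):
    """Every-visit Monte Carlo: move V(s) toward the full episode return."""
    last = len(visited) - 1
    for t, s in enumerate(visited):
        g = gamma ** (last - t) * reward        # only the final step pays out
        v[s] += alpha * (g - v[s])

def rmse(v, target):
    return (sum((a - b) ** 2 for a, b in zip(v, target)) / len(v)) ** 0.5

def simulate(n=7, alpha_td=0.10, alpha_mc=0.05, gamma=1.0, episodes=1000, seed=0):
    """Train both estimators on the same episodes; return (v_td, v_mc)."""
    rng = random.Random(seed)
    v_td = [0.5] * n  # assumed initial estimate
    v_mc = [0.5] * n
    for _ in range(episodes):
        visited, reward = run_episode(n, rng)
        td0_update(v_td, visited, reward, alpha_td, gamma)
        mc_update(v_mc, visited, reward, alpha_mc, gamma)
    return v_td, v_mc

if __name__ == "__main__":
    v_td, v_mc = simulate()
    target = true_values(7)
    print("TD RMSE:", round(rmse(v_td, target), 4))
    print("MC RMSE:", round(rmse(v_mc, target), 4))
```

Both estimators see the same episodes, so the two RMSE figures are directly comparable. The difference lies only in the target: TD(0) bootstraps from the next state's current estimate, while Monte Carlo waits for the final outcome of the episode.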

Plot legend: TD(0), Monte Carlo, True V