Markov Decision Processes

Value Iteration & Policy Iteration on a Grid World

Controls: Algorithm, Discount γ (0.90), Grid Size (6), Edit Mode

Click Run to start
Iterations: 0
Max ΔV: —

Click a grid cell to edit it. Arrows show the optimal policy; color shows each state's value.

Legend: Goal, Penalty, Wall
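For readers without the interactive page, here is a minimal Python sketch of value iteration on a 6×6 grid with γ = 0.90, matching the defaults listed above. Everything else is an assumption made for illustration: the cell positions, the rewards (+1 for the goal, -1 for the penalty, 0 per step), deterministic moves, terminal goal and penalty cells, and the stopping tolerance. The demo's actual settings are not stated here. The `max_delta` variable is the largest absolute change in any state's value during one sweep, which is presumably the quantity the Max ΔV readout reports.

```python
GAMMA = 0.90   # discount factor, as in the demo's default
SIZE = 6       # grid is SIZE x SIZE, as in the demo's default
TOL = 1e-4     # assumed stopping tolerance on the largest value change

# Hypothetical layout and rewards, chosen only for this sketch.
GOAL = (0, 5)
PENALTY = (1, 5)
WALLS = {(2, 2), (3, 2)}
TERMINAL_VALUE = {GOAL: 1.0, PENALTY: -1.0}
ACTIONS = {"↑": (-1, 0), "↓": (1, 0), "←": (0, -1), "→": (0, 1)}


def step(state, move):
    """Deterministic move; bumping into a wall or the edge leaves the state unchanged."""
    row, col = state[0] + move[0], state[1] + move[1]
    blocked = not (0 <= row < SIZE and 0 <= col < SIZE) or (row, col) in WALLS
    return state if blocked else (row, col)


def value_iteration():
    states = [(r, c) for r in range(SIZE) for c in range(SIZE) if (r, c) not in WALLS]
    values = {s: 0.0 for s in states}
    iterations = 0
    while True:
        updated = {}
        for s in states:
            if s in TERMINAL_VALUE:
                # Terminal cells keep a fixed value equal to their reward.
                updated[s] = TERMINAL_VALUE[s]
            else:
                # Bellman optimality backup with zero step reward.
                updated[s] = max(GAMMA * values[step(s, m)] for m in ACTIONS.values())
        max_delta = max(abs(updated[s] - values[s]) for s in states)
        values = updated
        iterations += 1
        if max_delta < TOL:
            break
    # Greedy policy: in each non-terminal state, pick the action leading to the best successor.
    policy = {
        s: max(ACTIONS, key=lambda a: values[step(s, ACTIONS[a])])
        for s in states
        if s not in TERMINAL_VALUE
    }
    return values, policy, iterations
```

Calling `values, policy, iterations = value_iteration()` returns the state values (what the color would encode), the greedy action per cell (what the arrows would encode), and the number of sweeps performed. Policy iteration would reach the same policy on this grid by alternating policy evaluation with greedy improvement instead of repeating the max backup.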