Pólya Urn & Reinforcement Learning

Rich-get-richer dynamics — path dependence, power laws, and lock-in

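The Pólya urn: draw a ball at random, then return it with α extra balls of the same color, so colors that are already common become more likely to be drawn again (rich-get-richer). The Chinese Restaurant Process (concentration parameter θ) adds new colors: after n draws with α = 1, the next draw introduces a new color with probability θ/(n+θ). Adding a discount parameter to this scheme gives the Pitman-Yor process, which produces power-law (Zipf-like) frequency distributions; without a discount, the number of colors grows only logarithmically in n. Early randomness locks in permanently: each color's share converges to a limit that is largely determined by the early draws, so different runs can have wildly different winners.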
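
The urn described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the demo: the names `simulate_urn` and `leader_share` are hypothetical, and it assumes that a new color enters with weight α and that the new-color probability is θ/(W+θ), where W is the total weight in the urn (so W = n when α = 1). It covers only the θ-only case, with no Pitman-Yor discount.

```python
import random


def simulate_urn(n_draws, alpha=1.0, theta=1.0, seed=None):
    """Simulate a Polya urn with new-color innovation (theta-only sketch).

    Assumptions (not taken from the original page): a new color enters
    with weight `alpha`, and each draw of an existing color adds `alpha`
    to that color's weight.
    """
    if theta <= 0 or alpha <= 0:
        raise ValueError("alpha and theta must be positive")
    rng = random.Random(seed)
    weights = []   # weights[i] = current weight of color i
    total = 0.0    # running sum of all weights
    for _ in range(n_draws):
        # New color with probability theta / (total + theta).
        # On the first draw total == 0, so a new color is always created.
        if rng.random() < theta / (total + theta):
            weights.append(alpha)
        else:
            # Existing color chosen in proportion to its weight, then reinforced.
            idx = rng.choices(range(len(weights)), weights=weights)[0]
            weights[idx] += alpha
        total += alpha
    return weights


def leader_share(weights):
    """Fraction of the total weight held by the most common color."""
    return max(weights) / sum(weights)


if __name__ == "__main__":
    # Same parameters, different seeds: the leader's share varies from run to run.
    for seed in range(5):
        w = simulate_urn(10_000, alpha=1.0, theta=1.0, seed=seed)
        print(f"seed={seed} colors={len(w)} leader share={leader_share(w):.3f}")
```

Running the loop with several seeds is one way to see the path dependence: the parameters are identical, yet the number of colors and the leader's share differ between runs because the early draws differ.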