Prisoner’s dilemma tournament
Eight strategies compete in an Axelrod-style iterated Prisoner’s Dilemma. Run round-robin tournaments, then evolve the population: successful strategies reproduce, poor performers die out. Tit-for-Tat wins tournaments despite never beating any single opponent — cooperation emerges from self-interest.
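The round-robin stage can be sketched as below. This is an illustrative skeleton rather than the project's actual code: `play_match` is assumed to be a pairing function returning both players' scores, and self-play is omitted for brevity.

```python
from itertools import combinations

def round_robin(strategies, play_match):
    """Accumulate each strategy's total score over one match per pairing.
    `play_match(a, b)` is assumed to return (score_a, score_b)."""
    totals = {name: 0 for name in strategies}
    for a, b in combinations(strategies, 2):  # each distinct pair meets once
        score_a, score_b = play_match(a, b)
        totals[a] += score_a
        totals[b] += score_b
    return totals
```

Total score across all pairings, not head-to-head wins, is what gets ranked; this is why a strategy can win the tournament without winning any single match.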
The prisoner’s dilemma
Two players simultaneously choose to cooperate or defect. Mutual cooperation pays well (R=3 each). Mutual defection pays poorly (P=1 each). But if one defects while the other cooperates, the defector gets the highest payoff (T=5) and the cooperator gets the lowest (S=0). Rational self-interest says defect — but mutual defection leaves both worse off than mutual cooperation.
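In code, the payoff structure is just a lookup table. A minimal sketch, where the move labels `"C"`/`"D"` are my own convention:

```python
# Row player's payoff for each (my_move, their_move) pair.
PAYOFF = {
    ("C", "C"): 3,  # R: reward for mutual cooperation
    ("D", "D"): 1,  # P: punishment for mutual defection
    ("D", "C"): 5,  # T: temptation (defect against a cooperator)
    ("C", "D"): 0,  # S: sucker's payoff (cooperate against a defector)
}

T, R, P, S = 5, 3, 1, 0
# The dilemma requires T > R > P > S. The iterated game also needs 2R > T + S,
# so that taking turns exploiting each other never outscores steady cooperation.
assert T > R > P > S and 2 * R > T + S
```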
Axelrod’s tournament
In 1980, political scientist Robert Axelrod invited game theorists to submit strategies for an iterated Prisoner’s Dilemma tournament. The winner was Tit-for-Tat, submitted by Anatol Rapoport: cooperate on the first move, then copy whatever the opponent did last. It never “beat” any opponent (at best it could tie), yet it accumulated the highest total score. The key properties: it is nice (never defects first), retaliatory (punishes defection immediately), forgiving (returns to cooperation after punishment), and clear (opponents quickly learn what it does).
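Tit-for-Tat is tiny when written out. In the sketch below, a strategy is assumed to be a function of its own and the opponent's move histories; an iterated-match loop and an Always-Defect opponent are included for contrast:

```python
def tit_for_tat(my_history, opp_history):
    """Cooperate on the first move, then copy the opponent's last move."""
    return "C" if not opp_history else opp_history[-1]

def always_defect(my_history, opp_history):
    return "D"

def play_match(strat_a, strat_b, rounds=10):
    """Iterate the dilemma for a fixed number of rounds; return both totals."""
    payoffs = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
               ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(hist_a, hist_b)
        move_b = strat_b(hist_b, hist_a)
        pay_a, pay_b = payoffs[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

Against Always Defect over ten rounds, Tit-for-Tat loses the pairing 9 to 14, conceding only the opening move; against itself it earns the full cooperative score of 30, which is where its tournament totals come from.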
Population dynamics
In the evolutionary version, strategies reproduce in proportion to their tournament scores. Initially, exploitative strategies may thrive by preying on cooperators. But as cooperators dwindle, the exploiters lose their victims and decline. Retaliatory cooperators like Tit-for-Tat tend to dominate in the long run — demonstrating how cooperation can emerge and stabilize through evolutionary dynamics, without any central authority.
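One generation of this process can be sketched as fitness-proportional reproduction, a simplified discrete replicator update. The naive rounding here is my own shortcut and can drift the total population size slightly:

```python
def evolve(population, avg_score):
    """population: {strategy: count}; avg_score: {strategy: mean score per
    individual this generation}. Returns next generation's counts, with each
    strategy's share proportional to count * score."""
    total_fitness = sum(population[s] * avg_score[s] for s in population)
    size = sum(population.values())
    return {
        s: round(size * population[s] * avg_score[s] / total_fitness)
        for s in population
    }
```

Note the feedback loop: the scores fed into `evolve` come from playing against the current population, so as exploiters deplete their victims, their own fitness falls in later generations.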
The effect of noise
With noise (random errors in execution), Tit-for-Tat suffers: a single accidental defection triggers mutual retaliation that can last many rounds. More forgiving strategies like Generous Tit-for-Tat (which occasionally cooperates even after the opponent defects) and Pavlov (which repeats its last move if it got a good payoff, and switches otherwise) handle noise better. The “optimal” strategy depends on the environment — and the environment depends on which strategies are present.
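Both noise-tolerant strategies can be sketched in the same history-based style; the 10% generosity rate is an illustrative default, not a canonical value:

```python
import random

def generous_tit_for_tat(my_history, opp_history, generosity=0.1):
    """Tit-for-Tat, but forgive a defection with probability `generosity`."""
    if not opp_history or opp_history[-1] == "C":
        return "C"
    return "C" if random.random() < generosity else "D"

def pavlov(my_history, opp_history):
    """Win-stay, lose-shift: repeat the last move after a good payoff,
    switch after a bad one."""
    if not my_history:
        return "C"
    # If the opponent cooperated, my payoff was R or T (good) -> stay.
    # If the opponent defected, my payoff was S or P (bad) -> switch.
    if opp_history[-1] == "C":
        return my_history[-1]
    return "C" if my_history[-1] == "D" else "D"
```

Pavlov's rule falls directly out of the payoff table: the opponent's cooperation means the last payoff was R or T, both satisfactory, while their defection means S or P, both poor. In particular, after a mutual-defection round Pavlov switches back to cooperation, which lets two Pavlov players recover from a noise-induced defection in a single round.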