[Interactive demo: decision-boundary plot with live epoch, loss, and accuracy readouts, and controls for learning rate (0.10), hidden neurons (3), and steps per frame (50).]

About this lab

XOR is the logical "exclusive or" function: it outputs 1 when exactly one input is 1. XOR was historically significant because Minsky and Papert's 1969 book "Perceptrons" proved that a single-layer neural network (perceptron) cannot learn XOR — the data is not linearly separable. This contributed to the first AI winter. The resolution came with multilayer networks and backpropagation.
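The truth table, using Python's built-in bitwise XOR operator as a reference:

```python
# XOR: outputs 1 when exactly one input is 1, 0 otherwise.
for a in (0, 1):
    for b in (0, 1):
        print(f"{a} XOR {b} = {a ^ b}")
# The points mapping to 1, (0,1) and (1,0), sit diagonally across
# from (0,0) and (1,1), so no single line separates the two classes.
```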

A network with one hidden layer can approximate any continuous function (universal approximation theorem). For XOR, a hidden layer with as few as 2 neurons suffices. The hidden layer learns an internal representation that makes the problem linearly separable. You can see this in the decision boundary visualization: the network carves out two separate regions corresponding to XOR=1.
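One way to see why two hidden neurons suffice: with hand-picked weights (illustrative values, not what the trained network in this lab actually learns), one hidden unit approximates OR, the other approximates AND, and the output unit computes "OR and not AND", which is exactly XOR. A minimal NumPy sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hand-picked weights (illustrative, not learned). Large magnitudes
# push the sigmoids toward hard 0/1 decisions.
W1 = np.array([[20.0, 20.0],      # both hidden units see both inputs
               [20.0, 20.0]])
b1 = np.array([-10.0, -30.0])     # unit 1 ~ OR, unit 2 ~ AND
W2 = np.array([[20.0], [-40.0]])  # output ~ (OR) AND NOT (AND)
b2 = np.array([-10.0])

hidden = sigmoid(X @ W1 + b1)     # the internal representation
out = sigmoid(hidden @ W2 + b2)
print(out.round().ravel())        # → [0. 1. 1. 0.]
```

In the hidden-unit coordinates, the four inputs become linearly separable, which is the internal representation the text describes.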

Backpropagation computes gradients of the loss function with respect to all weights using the chain rule. The network here uses sigmoid activations and binary cross-entropy loss. The network diagram shows the learned weights as edge thickness, with positive weights in gold and negative in blue.
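The training loop can be sketched as a standalone NumPy version of the same setup (sigmoid activations, binary cross-entropy, gradient descent). This is not the lab's actual code, and the seed, 5000 epochs, and learning rate of 1.0 are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# 2 inputs -> 3 hidden sigmoid units -> 1 sigmoid output
W1 = rng.normal(0.0, 1.0, (2, 3)); b1 = np.zeros(3)
W2 = rng.normal(0.0, 1.0, (3, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr, losses = 1.0, []
for epoch in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    losses.append(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

    # backward pass (chain rule); sigmoid + BCE simplifies to
    # dL/dz_out = p - y, averaged over the batch
    dz2 = (p - y) / len(X)
    dW2, db2 = h.T @ dz2, dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * h * (1 - h)   # sigmoid'(z) = h * (1 - h)
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0)

    # gradient descent step
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"loss {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The `p - y` shortcut is the standard cancellation between the sigmoid's derivative and the cross-entropy's; it is why this pairing of activation and loss trains stably.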