Neural Network XOR
A 2-3-1 neural network trained with backpropagation to learn the XOR function. The decision boundary and live loss curve update in real time. XOR is the classic example of a problem that requires a hidden layer to solve.
About this lab
XOR is the logical "exclusive or" function: it outputs 1 when exactly one input is 1. XOR was historically significant because Minsky and Papert's 1969 book "Perceptrons" proved that a single-layer neural network (perceptron) cannot learn XOR — the data is not linearly separable. This contributed to the first AI winter. The resolution came with multilayer networks and backpropagation.
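The truth table, and why no single line can separate it, fits in a few lines of Python (a quick illustration, not part of the lab's own code):

```python
# XOR truth table: output is 1 when exactly one input is 1.
cases = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
for (a, b), out in cases:
    assert (a ^ b) == out
    print(a, b, "->", a ^ b)

# Why a single-layer perceptron fails: it would need weights w1, w2
# and bias c with w1*a + w2*b + c > 0 exactly when XOR = 1, i.e.
#   (0,0): c <= 0          (1,1): w1 + w2 + c <= 0
#   (0,1): w2 + c > 0      (1,0): w1 + c > 0
# Adding the two strict inequalities gives w1 + w2 + 2c > 0; with
# c <= 0 that forces w1 + w2 + c > 0, contradicting the (1,1)
# constraint. So the four points are not linearly separable.
```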
A network with one hidden layer can approximate any continuous function on a bounded domain (the universal approximation theorem). For XOR, a hidden layer with as few as 2 neurons suffices. The hidden layer learns an internal representation that makes the problem linearly separable. You can see this in the decision boundary visualization: the network carves out two separate regions where it predicts XOR=1.
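One concrete weight setting makes the idea explicit (a hand-crafted sketch, not the weights this network actually learns): one hidden unit computes OR, the other AND, and the output fires when OR is on but AND is off.

```python
import numpy as np

def step(z):
    # Hard threshold, used here instead of sigmoid for clarity.
    return (z > 0).astype(float)

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])

def xor_net(X):
    W1 = np.array([[1.0, 1.0],
                   [1.0, 1.0]])
    b1 = np.array([-0.5, -1.5])
    h = step(X @ W1 + b1)        # h[:, 0] = OR(x1, x2), h[:, 1] = AND(x1, x2)
    w2 = np.array([1.0, -1.0])   # fire when OR is on and AND is off
    b2 = -0.5
    return step(h @ w2 + b2)

print(xor_net(X))  # [0. 1. 1. 0.]
```

In the hidden layer's (OR, AND) coordinates the four points become linearly separable, which is exactly the internal representation the trained network has to discover on its own.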
Backpropagation computes gradients of the loss function with respect to all weights using the chain rule. The network here uses sigmoid activations and binary cross-entropy loss. The network diagram shows the learned weights as edge thickness, with positive weights in gold and negative in blue.
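The forward and backward passes for this setup can be sketched in NumPy (2-3-1, sigmoid activations, binary cross-entropy; the seed, init scale, learning rate, and epoch count below are illustrative choices, not the lab's actual hyperparameters):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The four XOR examples and their labels.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# 2-3-1 architecture; hyperparameters here are illustrative.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)
lr, n = 1.0, len(X)

losses = []
for epoch in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)          # hidden activations, shape (4, 3)
    p = sigmoid(h @ W2 + b2)          # predicted P(XOR=1), shape (4, 1)
    losses.append(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

    # Backward pass (chain rule). With a sigmoid output and binary
    # cross-entropy, the output-layer error simplifies to (p - y).
    dz2 = (p - y) / n
    dW2, db2 = h.T @ dz2, dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * h * (1 - h)  # sigmoid'(z) = h * (1 - h)
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0)

    # Gradient-descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(p.ravel())  # typically close to [0, 1, 1, 0] after training
```

The simplification in the output layer is why sigmoid plus binary cross-entropy is a common pairing: the awkward sigmoid derivative cancels against the loss derivative, leaving the plain residual (p - y).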