2-2-1 MLP learning XOR via backpropagation — watch weights and decision boundary evolve
[Interactive demo panels: Decision Boundary, Network Architecture, and Loss Curve, with a live Loss/Epoch readout]
XOR is not linearly separable: no single perceptron can compute it, a limitation highlighted by Minsky and Papert in 1969 that famously stalled neural-network research through the 1970s. Adding a hidden layer with nonlinear activations lets the network carve a curved decision boundary that separates the XOR classes. Backpropagation computes ∂Loss/∂w via the chain rule, propagating error gradients backward from the output layer. Activation choice matters: the sigmoid σ(x) = 1/(1+e⁻ˣ) saturates, so its gradient σ′(x) = σ(x)(1−σ(x)) vanishes for large |x|; tanh is zero-centered, which tends to speed learning; ReLU(x) = max(0, x) avoids saturation for positive inputs.
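The training loop behind the demo can be sketched as follows. This is a minimal NumPy version of a 2-2-1 network (tanh hidden layer, sigmoid output, MSE loss, full-batch gradient descent); the hyperparameters and variable names are my own assumptions, not the demo's exact configuration, and with only two hidden units convergence can depend on the random initialization.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR dataset: four input pairs and their labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh_grad(a):
    # Derivative of tanh expressed via its activation: 1 - tanh(z)^2.
    return 1.0 - a ** 2

# 2-2-1 architecture: 2 inputs -> 2 hidden units -> 1 output.
W1 = rng.normal(0, 1, (2, 2)); b1 = np.zeros((1, 2))
W2 = rng.normal(0, 1, (2, 1)); b2 = np.zeros((1, 1))

lr = 0.5  # assumed learning rate for this sketch
for epoch in range(10000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)       # hidden activations (zero-centered)
    out = sigmoid(h @ W2 + b2)     # output in (0, 1)
    loss = np.mean((out - y) ** 2) # MSE loss

    # Backward pass: chain rule, layer by layer.
    d_out = (out - y) * out * (1 - out)   # ∝ ∂Loss/∂z at the output
    dW2 = h.T @ d_out
    db2 = d_out.sum(axis=0, keepdims=True)
    d_h = (d_out @ W2.T) * tanh_grad(h)   # gradient pushed through tanh
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0, keepdims=True)

    # Gradient-descent update.
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= lr * g

preds = (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int)
print("final loss:", loss)
print("predictions:", preds.ravel())
```

The `out * (1 - out)` factor is the sigmoid gradient from the paragraph above; when the output saturates near 0 or 1 that factor shrinks toward zero, which is exactly the vanishing-gradient behavior the demo visualizes.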