Learn by playing with the math.
Every lesson is something you grab, drag, and break until it clicks. Start at f(x) = x². Finish by training a real transformer in your browser.
92 interactive lessons · free in your browser
What you'll actually build
Three builds. Start with the one you came for.
-
Backprop, from scratch
Build micrograd one node at a time. Watch the gradient flow backward through a tanh, see how loss.backward() actually works. The keystone module of the course.
Module 12 · Backpropagation →
-
Attention, as a soft dictionary
The thing you played with up top is an embedding, the foundation of attention. Drag a query vector, watch it pivot toward the right keys. Build the math behind the T in GPT.
Module 15 · Attention →
-
A real transformer, in your browser
Capstone: 4 layers, 4 heads, ~200k params. Trains in ~5 minutes on WebGPU. Generates Shakespeare-flavored nonsense. Yours to keep.
Module 18 · Capstone →
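A taste of the first build. This is a rough sketch in the spirit of micrograd, not the course's actual code: the `Value` class, its fields, and the toy numbers are all assumptions made up for illustration. It shows one way a call like `loss.backward()` could push a gradient back through a tanh.

```python
import math


class Value:
    """A scalar that remembers how it was computed (hypothetical, micrograd-style)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def _backward():
            # d tanh(x)/dx = 1 - tanh(x)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order the graph so each node runs after everything that depends on it.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # d(loss)/d(loss)
        for node in reversed(order):
            node._backward()


# Toy example: loss = tanh(w * x + b), with made-up numbers.
x, w, b = Value(0.5), Value(-2.0), Value(1.0)
loss = (w * x + b).tanh()
loss.backward()
print(x.grad, w.grad, b.grad)
```

Each operation stores a small closure that knows its own local derivative, and `backward()` simply calls those closures in reverse order. Break it on purpose: change `+=` to `=` and see what happens when a value is used twice.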
The whole course
Every lesson. One continuous climb.
From f(x) = x² to a transformer you train yourself. Open any module to see its lessons. Sign in and the map lights up with what you've finished.
Arc 0 Foundations
Safety floor. Skippable via diagnostic; anyone who can already factor a quadratic starts in calculus.
Arc 1 Prerequisite Math
Trig, calculus, linear algebra, probability, information theory. Every module ends with how it plugs into a transformer.
Arc 2 Machine Learning Foundations
Optimization, neural networks, backpropagation from scratch, training dynamics. The keystone arc, where micrograd gets built live.
Arc 3 Sequence Models & Transformers
Bigrams, RNNs, attention, the transformer block. By the end of this arc you've built a transformer architecture in-browser.
Arc 4 Capstone Build & Train
Tokenization, sampling, and a real training run that produces a model you keep.
The first lesson takes fifteen minutes. The capstone is a GPT you trained.
All 92 lessons run free in your browser, on the GPU you already have. No install, no setup, no video lectures.