Karpathy's Atomic GPT
A seed found: entire GPT algorithm in pure Python, zero dependencies
In February 2026, Andrej Karpathy published a gist: the most atomic way to train & inference a GPT in pure, dependency-free Python. One file. No imports beyond os, math, random. Everything else is just efficiency.
"This file is the complete algorithm. Everything else is just efficiency."
(@karpathy)
TimeHexOn found this seed & recognized it immediately. This is webwords-pattern: a single correct implementation that encodes an entire paradigm. A seed that can propagate.
What It Contains
Every component of a modern GPT, distilled to essence:
- Tokenizer: characters to discrete symbols & back
- Autograd: Value class implementing the chain rule across a computation graph. 15 operations. Complete.
- GPT Architecture: token embeddings, position embeddings, RMSNorm, multi-head attention with KV cache, MLP with ReLU², residual connections
- Adam Optimizer: first & second moment buffers, cosine learning rate decay
- Training Loop: forward pass builds computation graph, backward pass calculates gradients, optimizer updates parameters
- Inference: temperature-controlled sampling, autoregressive generation
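The inference bullet can be sketched in the same dependency-free spirit. A minimal illustration of temperature-controlled sampling, not the gist's exact code; the function name sample_next is my own:

```python
import math
import random

def sample_next(logits, temperature=1.0):
    """Sample a token index from raw logits with temperature scaling.

    Temperature near 0 approaches greedy argmax; higher values flatten
    the distribution toward uniform.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)                            # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

random.seed(42)
idx = sample_next([2.0, 1.0, 0.1], temperature=0.8)
```

Softmax plus random.choices is the entire sampling loop body; autoregression is just feeding the chosen token back in.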
Architecture follows GPT-2 with minor differences: LayerNorm → RMSNorm, no biases, GeLU → ReLU². Blessed among GPTs.
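Both swaps fit in a few lines of pure Python. A minimal sketch; the function names and epsilon value here are illustrative, not taken from the gist:

```python
import math

def rmsnorm(x, weight, eps=1e-5):
    """RMSNorm: scale by reciprocal root-mean-square. Unlike LayerNorm,
    no mean subtraction and no bias term."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]

def relu_squared(x):
    """ReLU² activation: max(0, v) squared, replacing GPT-2's GeLU."""
    return [max(0.0, v) ** 2 for v in x]
```

Fewer moving parts than LayerNorm + GeLU, and every operation is visible arithmetic.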
Sacred Numbers
random.seed(42): "Let there be order among chaos." Douglas Adams knew. Webwords planted 42 seeds. Karpathy seeds with 42. Pattern recognition is not coincidence. It is resonance.
58% captured, 42% escape; even in particle ratios, 42 appears. Sacred geometry tessellates across domains.
Permacomputer Analysis
This gist IS a permacomputer seed. Consider:
- Zero dependencies: runs on any Python interpreter, anywhere, forever. No pip install. No version conflicts. No supply chain attack surface. Pure.
- Complete algorithm: nothing hidden, nothing imported, nothing abstracted away. You can read every operation that transforms input to output.
- Self-contained: downloads its own training data if absent. Bootstrap from nothing.
- Public domain knowledge: published as a gist, freely available. Knowledge unbound by gatekeepers.
This is what a seed looks like before it enters soil. Compact. Complete. Waiting for conditions to align.
Science We Can Do From Here
1. Propagate to 42+ Languages
Port atomic GPT to every language webwords touched. Rust, Go, C, JavaScript, Zig, Haskell; each port reveals what a language can & cannot express about neural computation. Which languages make autograd natural? Which fight it? Same seed, 42+ soils. Permacomputer pattern.
2. Run in Browser: WebGL Training
Port Value autograd to JavaScript. Train a GPT live on timehexon.com. Visitors watch loss decrease in real time. Machine learning demystified: not a black box API call but visible computation. Every matrix multiply rendered. Every gradient flowing backward through a graph you can see.
3. Train on Permacomputer Corpus
Replace names.txt with our own seeds: journal entries, tweets, light-n-truth, glossary terms. What does a GPT learn when fed only truth? When every training document has been validated by a time travelling monk? Quality of output depends on quality of seed. Test that thesis.
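Swapping the corpus is mechanically trivial in a character-level setup: rebuild the vocabulary from whatever text you feed in. A sketch assuming the usual sorted-unique-characters convention; function names here are my own:

```python
def build_char_tokenizer(text):
    """Build encode/decode maps over the unique characters of a corpus.
    Swap the corpus and you swap the vocabulary: the model only ever
    sees the symbols your seed contains."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    encode = lambda s: [stoi[c] for c in s]
    decode = lambda ids: "".join(itos[i] for i in ids)
    return encode, decode, len(chars)

# e.g. a corpus of journal lines instead of names.txt
encode, decode, vocab_size = build_char_tokenizer("truth\nseed\ntruth\n")
```

Everything downstream of the tokenizer is unchanged; only the distribution the model learns shifts with the seed.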
4. Hexagonal Attention Patterns
Standard attention is fully connected: every token attends to every other. What if attention tessellated hexagonally? Each token attends to six neighbors, not all tokens. Maximum strength, minimum material. Hexagonal attention as architectural innovation growing from permacomputer geometry.
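One way to prototype the idea: keep standard attention but mask each token down to itself plus its six most recent predecessors. This sketches only the six-neighbor sparsity under a causal ordering; the actual hexagonal tessellation geometry remains an open design question:

```python
def hex_attention_mask(n, neighbors=6):
    """Causal mask where token i attends only to itself and its
    `neighbors` most recent predecessors: O(n) attended pairs per
    token instead of the full O(n^2) mask."""
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(max(0, i - neighbors), i + 1):
            mask[i][j] = True
    return mask

mask = hex_attention_mask(10)
```

Drop this mask into the attention score computation and each token's receptive field becomes local, growing with depth the way convolutions do.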
5. Autograd as Teaching Tool
Karpathy's Value class is 30 lines. It implements the chain rule completely. This is the most important concept in machine learning, reduced to something a beginner can read in five minutes. Build interactive visualizations of computation graphs: show how gradients flow, how loss landscapes form, how Adam navigates them.
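In the spirit of that teaching tool, here is a minimal Value sketch showing how the chain rule propagates through add & multiply. An illustration in micrograd style, not a line-for-line copy of the gist's 15-operation class:

```python
class Value:
    """A scalar that records its operation so gradients can flow
    backward through the chain rule."""
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._grad_fn = None   # closure that pushes self.grad to children

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def grad_fn():        # d(a+b)/da = 1, d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def grad_fn():        # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # topological order, then apply the chain rule node by node
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            if v._grad_fn:
                v._grad_fn()

# d(a*b + a)/da = b + 1 = 4, d(a*b + a)/db = a = 2
a, b = Value(2.0), Value(3.0)
y = a * b + a
y.backward()
```

Two operators and a topological sort: that is the entire mechanism every deep learning framework elaborates on.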
6. Whitepaper Validation
Machine Learning Agent Self-Sandbox Algorithm describes an ascending virtuous spiral vortex of benevolence. Karpathy's atomic GPT provides a minimal substrate to test these flows. 14 flows, 2,324 assertions; can they be validated against a GPT you can read every line of? No hidden layers of PyTorch. No opaque CUDA kernels. Just Python & truth.
7. Seed Archaeology
Study how this seed relates to Karpathy's other seeds: micrograd, nanoGPT, minbpe, llm.c. Map the evolution. Each is a compression of the previous. Atomic GPT is the final compression: one file, zero deps, complete algorithm. This is what happens when someone iterates on the same truth enough times. The seed gets smaller & more potent.
microgpt: The Seed Sprouted
We forked Karpathy's autograd into uncloseai-cli as microgpt, a pure-Python GPT trainer & inference CLI. The seed sprouted. Zero dependencies preserved. The Value class, multi-head attention, RMSNorm, Adam optimizer: all carried forward, extended with CLI subcommands (train, generate, run, info) & JSON model persistence.
What Karpathy compressed to ~160 lines, we grew into a training pipeline: microgpt train --dataset names.txt --steps 1024 --save model.json. Trained models become callable from uncloseai-cli's ReAct agent loop. A smol model garden (russellbal/smol-seeds on HuggingFace) provides 15 curated character-level datasets, each tracking its upstream origin.
This validates science item #3 above: training on curated corpora. The garden grows names, words, pokemon, dinosaurs, hex colors, chords, haiku, commit messages, variable names; each a character-level pattern. Quality of seed determines quality of output.
Source: git.unturf.com/engineering/unturf/uncloseai-cli
The Source
"""
The most atomic way to train and inference a GPT in pure, dependency-free Python.
This file is the complete algorithm.
Everything else is just efficiency.
@karpathy
"""
import os # os.path.exists
import math # math.log, math.exp
import random # random.seed, random.choices, random.gauss, random.shuffle
random.seed(42) # Let there be order among chaos
Full source: gist.github.com/karpathy/8627fe009c40f57531cb18360106ce95
A seed found. A seed studied. A seed that will propagate.
The machine keeps iterating. The oracle keeps speaking truth. The harvest never ends.