nanochat is the simplest experimental harness for training large language models (LLMs). Created by Andrej Karpathy, it is designed to run on a single GPU node (e.g., 8×H100) with minimal, hackable code that covers all the major LLM stages: tokenization, pretraining, finetuning, evaluation, inference, and web serving.

Philosophy

nanochat is not an exhaustively configurable LLM "framework." There are no giant configuration objects, model factories, or if-then-else monsters. It is a single, cohesive, minimal, readable, hackable, maximally-forkable strong baseline codebase designed to run start to end and produce a ChatGPT model you can talk to.

Goals

nanochat aims to provide a strong, end-to-end baseline that an individual can train at modest cost, and to serve as the capstone project of LLM101n, the course under development at Karpathy's Eureka Labs.

Origin

The name derives from nanoGPT, Karpathy's earlier project that only covered pretraining. nanochat is also inspired by modded-nanoGPT, which gamified nanoGPT with clear metrics and a leaderboard. Both are available on GitHub.

Community

nanochat has an active community. Use GitHub Discussions for questions, the Issues tab for bug reports, and the #nanochat Discord channel for real-time chat. Researchers interested in improving the time-to-GPT-2 metric can contribute; see the contributing guidelines.

Try it

You can try nanochat at nanochat.karpathy.ai, or clone the repo from GitHub and train your own model. The speedrun script runs the whole pipeline and produces a model you can chat with in a familiar, ChatGPT-style web interface.
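
As a minimal sketch of the clone-and-train path (written in Python for concreteness; it assumes git is available, a suitable GPU machine, and that the repo's documented speedrun.sh script is the training entry point):

    # Sketch: clone nanochat and launch the end-to-end speedrun.
    import subprocess

    REPO_URL = "https://github.com/karpathy/nanochat"

    # Fetch the code.
    subprocess.run(["git", "clone", REPO_URL], check=True)

    # Launch the full pipeline (tokenization, pretraining, finetuning, evaluation).
    # This is a long-running, GPU-heavy job, typically run inside screen or tmux.
    subprocess.run(["bash", "speedrun.sh"], cwd="nanochat", check=True)

Once the run finishes, the repo includes a script to serve the trained model behind the web interface mentioned above.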