Contributing to nanochat

The goal of nanochat is to improve the state of the art in micro models that are accessible to work with end-to-end on budgets under $1,000. Contributions that speed up time-to-GPT-2, simplify the codebase, or improve documentation are especially welcome.

Current focus

The most interesting target is reducing the time to reach GPT-2-level capability (i.e., a CORE score above 0.256525), which currently takes about 3 hours. Improvements to the pretraining stage can reduce this further.

How to contribute

AI disclosure policy

When submitting a PR, please disclose any parts that were substantially written by an LLM and that you did not write yourself or do not fully understand. Full transparency helps maintain code quality.

Code philosophy

nanochat is minimal and hackable. Avoid giant config objects, model factories, and complex conditionals. Keep changes cohesive and readable. The goal is a single, coherent pipeline, not a framework. See the About section for the project philosophy.

Reporting bugs

Use the Issues tab. Include your environment (Python version, PyTorch, GPU), the command you ran, and the full error output. Reproducible bugs get fixed faster.
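One convenient way to collect the requested environment details is a short Python snippet (a sketch, not an official tool of the project; it assumes only the standard library plus an optional PyTorch install):

```python
import platform

# Print the environment details requested in bug reports.
print("Python:", platform.python_version())
print("OS:", platform.platform())

try:
    import torch  # PyTorch may not be installed in every environment
    print("PyTorch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
except ImportError:
    print("PyTorch: not installed")
```

Paste its output into the issue along with the exact command you ran and the full traceback.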