nanochat Guides
Documentation and how-to guides for customizing and extending nanochat. Whether you want to give your model a distinct personality, teach it new skills, or understand the pipeline in depth, these guides have you covered.
Infusing identity
Tune your nanochat's personality through synthetic data generation and mixing that data into the SFT stage. See the Guide: infusing identity to your nanochat in GitHub Discussions.
Example: dev/gen_synthetic_data.py shows synthetic data for identity.
Adding abilities
Learn how to add new abilities to your model. The Guide: counting r in strawberry (and how to add abilities generally) walks through the process.
Original nanochat post
The Oct 13 2025 original nanochat post introduced the project. Some info is deprecated and the model has improved significantly since.
Miniseries v1
Jan 7 miniseries v1 documents the first nanochat miniseries of models and scaling experiments.
Data and packaging
dev/repackage_data_reference.py shows how to generate pretraining data shards. nanochat uses FineWeb-Edu for pretraining and SmolTalk for SFT. See datasets and file structure for the full layout.
Quick links
All guides live in GitHub Discussions. The original nanochat post, miniseries v1 docs, identity guide, and abilities guide are linked from the sections above. For code examples, check dev/gen_synthetic_data.py for synthetic identity data.