A small, educational codebase and notebook that implements and documents core concepts behind large language models (LLMs) from first principles. This repository collects notes, math derivations, and simple implementations so you can learn the building blocks of modern LLMs and experiment with training & inference.
This repo is intended for learning and research — not a production LLM.
1. Create and activate a virtual environment

   ```bash
   python3 -m venv .venv
   source .venv/bin/activate   # fish shell: source .venv/bin/activate.fish
   ```

2. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

3. Try the demo (simple runner)

   ```bash
   python demo/main.py
   ```

4. Open the interactive tutorial

   The primary notebook is `tutorial.ipynb`; open it with Jupyter or JupyterLab:

   ```bash
   jupyter notebook tutorial.ipynb
   ```
- `tutorial.ipynb`: the core learning notebook with notes, math, and runnable code.
- `models/`: simple model code and experimental model wrappers (e.g. `modeling_gemma.py`, `paligemma.py`, `siglip.py`).
- `training/`: training scripts for text and image experiments (`train_text.py`, `train_image.py`).
- `demo/`: small demo runner (`main.py`) to exercise parts of the project.
- `data/`: data utilities (see `data/clean.py`) for cleaning and preparing data used by experiments; a small illustrative example follows this list.
- `checkpoints/`: directory to save and load model checkpoints (large files are gitignored).
- `explanability/`: supporting code that documents explainability concepts used in the repo.
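The repo's actual cleaning utilities aren't reproduced here, but to give a flavor of the kind of pass a module like `data/clean.py` might perform, here is a small hypothetical example (the function name and rules are invented for illustration):

```python
# Hypothetical text-cleaning pass, illustrating the kind of utility
# data/clean.py provides; the repo's actual functions may differ.
import re

def clean_text(text: str) -> str:
    """Normalize whitespace and strip control characters from raw text."""
    text = re.sub(r"[\x00-\x08\x0b-\x1f\x7f]", "", text)  # drop control chars
    text = re.sub(r"\s+", " ", text)                      # collapse whitespace
    return text.strip()

print(clean_text("Hello,\tworld!\n\nThis  is   messy."))
# -> "Hello, world! This is messy."
```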
Below is a short, practical summary of what the codebase actually implements and what the tutorial covers:
- Code implemented (what's done)
  - Lightweight transformer components: input embeddings, positional encodings, multi-head self-attention, feed-forward blocks, layer-norm + residual wrappers (see `tutorial.ipynb` and `models/`); a minimal sketch follows this list.
  - Training loop examples and a data pipeline for small experiments using the TinyStories dataset (see `training/` and `data/`).
  - KV caching and simple quantization helpers for inference efficiency (`tutorial.ipynb` shows `KVCache`/`QuantizedKVCache`).
  - A minimal demo runner in `demo/main.py` to try out model routines locally.
  - Explainability visualizations and diagnostics (`explanability/four_pillars.py`) for data and attention analysis.
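To make the component list concrete, here is a minimal pre-norm transformer block in PyTorch. It is an illustrative sketch of the pattern, not the repo's code; the class name `TinyBlock` and the hyperparameter defaults are invented:

```python
# Illustrative pre-norm transformer block: layer norm -> multi-head
# self-attention -> residual, then layer norm -> feed-forward -> residual.
import torch
import torch.nn as nn

class TinyBlock(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4, d_ff: int = 1024):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True marks positions a token may NOT attend to.
        seq_len = x.size(1)
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
            diagonal=1,
        )
        # Pre-norm multi-head self-attention with a residual connection.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        # Pre-norm feed-forward with a residual connection.
        return x + self.ff(self.norm2(x))

x = torch.randn(2, 10, 256)   # (batch, seq, d_model)
print(TinyBlock()(x).shape)   # torch.Size([2, 10, 256])
```

The pre-norm arrangement (normalize before each sublayer, add the residual after) is a common choice in small educational implementations, largely because it tends to train stably.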
- What the tutorial covers (high-level)
  - Concept walkthrough: tokenization, embeddings, positional encodings, attention, feed-forward layers, normalization, and the language modeling head.
  - Code walkthroughs for each transformer piece with small runnable examples (forward pass, loss computation, simple training on TinyStories).
  - Data analysis and explainability: topic modeling, readability metrics, and dataset diagnostics to surface biases or imbalances.
  - Practical tips for inference: KV caching, basic quantization, and lightweight optimizations for small-scale experiments; a minimal KV-cache sketch follows this list.
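As a taste of the inference tips, here is a stripped-down version of the KV-caching idea. The notebook's `KVCache`/`QuantizedKVCache` classes may be organized differently; the class and tensor shapes below are assumptions for illustration:

```python
# Toy KV cache: store past attention keys/values so each decoding step
# only computes projections for the newest token.
import torch

class SimpleKVCache:
    """Append-only cache of past keys and values for one attention layer."""

    def __init__(self):
        self.keys = None     # (batch, seq_so_far, head_dim)
        self.values = None

    def update(self, k_new: torch.Tensor, v_new: torch.Tensor):
        # Concatenate the new step's keys/values onto the cached history,
        # so attention over past tokens is never recomputed.
        if self.keys is None:
            self.keys, self.values = k_new, v_new
        else:
            self.keys = torch.cat([self.keys, k_new], dim=1)
            self.values = torch.cat([self.values, v_new], dim=1)
        return self.keys, self.values

# At each decoding step, only the newest token's query attends over the
# full cached keys/values:
cache = SimpleKVCache()
for step in range(3):
    k_new = torch.randn(1, 1, 64)   # keys for the single new token
    v_new = torch.randn(1, 1, 64)
    k_all, v_all = cache.update(k_new, v_new)
    print(step, k_all.shape)        # seq dimension grows: 1, 2, 3
```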
This repository is aimed at learning and experimentation.
- Training scripts are lightweight examples designed for learning and small-scale experiments.
- Check `python training/train_text.py --help` for available flags and configuration options.
- Checkpoints and logs are stored under `checkpoints/` by default; back them up if you want to keep long runs. A minimal save/load sketch follows these notes.
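For reference, a minimal save/resume pattern with plain `torch.save`/`torch.load` looks like the sketch below; the actual checkpoint format used by the training scripts may differ:

```python
# Sketch of saving and resuming a run under checkpoints/ using plain
# torch.save/torch.load; file naming and contents are illustrative.
from pathlib import Path
import torch

ckpt_dir = Path("checkpoints")
ckpt_dir.mkdir(exist_ok=True)

def save_checkpoint(model, optimizer, step: int) -> Path:
    path = ckpt_dir / f"step_{step}.pt"
    torch.save(
        {"model": model.state_dict(),
         "optimizer": optimizer.state_dict(),
         "step": step},
        path,
    )
    return path

def load_checkpoint(model, optimizer, path) -> int:
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"]   # resume training from this step
```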
- If you'd like to contribute, open an issue describing the change or improvement.
- Keep changes small and focused; add tests or notebook examples when possible.
- A recent Python 3 (3.8+) installation. See `requirements.txt` for Python packages used in the project.