This repository qualitatively benchmarks generative models.
In its first iteration, it includes implementations of:
- VAE (Variational Autoencoder)
- GAN (Generative Adversarial Network)
- DDPM (Denoising Diffusion Probabilistic Model)
Currently, the only supported dataset is MNIST (handwritten digits at 28×28 resolution).
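As a quick illustration of the technique behind the DDPM entry above: the forward (noising) process corrupts a clean image in closed form. The sketch below assumes PyTorch (which the `.pt` checkpoints suggest) and uses an illustrative linear beta schedule, not necessarily the repository's actual hyperparameters.

```python
import torch

# Hedged sketch of the DDPM forward process, q(x_t | x_0):
#   x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
# The linear beta schedule is illustrative, not the repo's settings.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative product of (1 - beta)

x0 = torch.rand(8, 1, 28, 28) * 2 - 1           # stand-in MNIST batch in [-1, 1]
t = torch.randint(0, T, (8,))                   # one random timestep per sample
eps = torch.randn_like(x0)                      # Gaussian noise
ab = alpha_bar[t].view(-1, 1, 1, 1)             # broadcast over image dims
x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps    # noised batch, same shape as x0
```

At large `t`, `alpha_bar_t` is close to zero, so `x_t` is nearly pure noise; the trained model learns to reverse this process.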
## Setup

Set up a conda environment from `environment.yml` and install the package:

```bash
conda env create -f environment.yml
conda activate gm
pip install -e .
```
## Usage

Run training and evaluation with:

```bash
python src/main.py --config_file="<PATH-TO-CONFIG.yaml>"
```

Examples with the included configs:

```bash
python src/main.py --config_file="assets/config/train_vae.yaml"
python src/main.py --config_file="assets/config/train_gan.yaml"
python src/main.py --config_file="assets/config/train_ddpm.yaml"
```
## Configuration

All training/evaluation settings are handled via YAML config files (see `assets/config/`).
Each config controls:
- Dataset name and image resolution
- Model type (vae, gan, ddpm)
- Optimizer settings and training schedule
- Evaluation and sample generation
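A config along these lines would cover the fields listed above. The key names here are illustrative assumptions, not the repository's actual schema; consult the shipped files under `assets/config/` for the real keys.

```yaml
# Hypothetical config sketch -- key names are illustrative only
dataset: mnist
image_size: 28
model: ddpm
lr: 2e-4
epochs: 100
eval: False
```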
## Evaluation

The repository includes checkpoints for the three trained models under their respective output folders. To evaluate a model without training, set the following in your config:

```yaml
eval: True
eval_ckpt_path: <path-to-ckpt.pt>
```

The script stores the resulting image `samples_eval.png` in the respective output folder.
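Under the hood, evaluation presumably restores the checkpoint before sampling. A minimal sketch with `torch.load`, assuming the `.pt` file holds a plain model `state_dict` (the repository's actual checkpoint layout may differ):

```python
import torch
import torch.nn as nn

# Hedged sketch: restore a checkpoint for evaluation only.
# Assumes the .pt file stores a plain state_dict; the repo's actual
# checkpoint contents may differ (e.g., optimizer state, epoch counters).
model = nn.Sequential(nn.Linear(784, 64), nn.Linear(64, 784))  # stand-in model
torch.save(model.state_dict(), "ckpt.pt")                      # stand-in for a trained checkpoint

state = torch.load("ckpt.pt", map_location="cpu")
model.load_state_dict(state)
model.eval()  # disable dropout / batch-norm updates before sampling
```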
## Samples

| VAE | GAN | DDPM |
|---|---|---|
| ![]() | ![]() | ![]() |
## Future Work

- Further hyperparameter tuning of the GAN
- Add support for higher-resolution datasets (e.g., CIFAR-10, CelebA)
- Include quantitative metrics (FID/IS)
- Experiment with GAN and DDPM architectural variants
## Acknowledgements

- DDPM implementation: https://github.com/bot66/MNISTDiffusion