LLM-Based State Abstraction

This repository contains the code for a thesis exploring how large language models can derive state abstractions for a simple grid‑world game. Heavy computation runs in Rust for performance and memory safety, while Python orchestrates configuration, LLM calls and evaluation.

Full documentation is available at dennislent.github.io/llm-abstraction

Project layout

├── src/                # Rust crate with core game logic and utilities
├── llm_abstraction/    # Python package wrapping the Rust library and analysis helpers
├── main.py             # Command line entry point for experiments
├── container/          # Apptainer/Singularity definition used in CI and HPC environments
└── tests/              # Python and Rust tests

Requirements

Python 3.10+
rustup (nightly toolchain)
Linux (tested on Ubuntu 24.04 LTS & Manjaro 6.12)

Install all dependencies and build the Rust extension with:

./setup.sh

Usage

Configuration lives in config.yml and config_prompts.yml. The CLI exposes several commands:

python main.py preview-prompts                      # print generated prompts
python main.py preview-maps                         # save map PNGs and metadata to outputs/
python main.py mcts                                 # run baseline MCTS agents
python main.py score-prompts -i 0 -m llama2         # score abstractions for a model
python main.py benchmark-llm -i 0 -m llama2         # run MCTS with LLM abstraction
python main.py analysis                             # produce plots and ranking tables

Results are written to the outputs/ directory.
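
For repeated runs it can be convenient to chain the commands from a small script. The following is a minimal sketch, not part of the repository: the subcommands and the -i / -m flags are copied from the list above, while the helper names (build_command, pipeline, run_pipeline) and the assumption that the steps are meant to run in the listed order are illustrative only.

```python
import subprocess
import sys


def build_command(subcommand, i=None, m=None):
    """Return the argv list for one main.py invocation.

    `i` and `m` mirror the -i and -m flags shown in the usage list.
    """
    argv = [sys.executable, "main.py", subcommand]
    if i is not None:
        argv += ["-i", str(i)]
    if m is not None:
        argv += ["-m", m]
    return argv


def pipeline(i, m):
    """Commands for one model, in the order they are listed above (assumed)."""
    return [
        build_command("mcts"),
        build_command("score-prompts", i=i, m=m),
        build_command("benchmark-llm", i=i, m=m),
        build_command("analysis"),
    ]


def run_pipeline(i, m):
    """Run each step in turn; check=True stops at the first failing step."""
    for argv in pipeline(i, m):
        subprocess.run(argv, check=True)


# Dry run: print the commands instead of executing them.
for argv in pipeline(0, "llama2"):
    print(" ".join(argv[1:]))
```

Calling run_pipeline(0, "llama2") from the repository root would execute the steps for real.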

Configuration files

config.yml – specifies grid maps, simulation settings under mcts_variables, and which prompt compositions to use via llm.
config_prompts.yml – defines reusable prompt fragments referenced by config.yml.

The utilities read these files to decide which maps to process, how prompts are assembled and how evaluations run.
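
The idea of composing prompts from reusable fragments can be sketched as follows. This is an illustration under stated assumptions, not the repository's loader: the dictionaries stand in for already-parsed YAML, and the fragment names, fragment texts and the assemble_prompt helper are all hypothetical. The real schema is defined by config.yml and config_prompts.yml.

```python
# Hypothetical stand-ins for the parsed contents of config_prompts.yml
# (fragments) and the prompt compositions listed in config.yml.
# The real key names and layout may differ.
prompt_fragments = {
    "intro": "You are given a grid-world map.",
    "task": "Group states that should be treated as equivalent.",
    "format": "Answer with a JSON list of clusters.",
}
compositions = [
    ["intro", "task"],
    ["intro", "task", "format"],
]


def assemble_prompt(fragment_names, fragments):
    """Join the named fragments in order, failing loudly on unknown names."""
    missing = [name for name in fragment_names if name not in fragments]
    if missing:
        raise KeyError(f"unknown prompt fragments: {missing}")
    return "\n".join(fragments[name] for name in fragment_names)


prompts = [assemble_prompt(names, prompt_fragments) for names in compositions]
print(prompts[1])
```

Rejecting unknown fragment names up front keeps a typo in one file from silently producing a truncated prompt.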

Linting

Python style is enforced with flake8. The list of ignored rules and their justification is documented in docs/flake8-ignores.md.

Rust core library

The rust_core Python module exposes the Rust computation routines:

PyRunner – run simulations and MCTS from Python
max_returns and min_turns – compute theoretical bounds for a world
visualize_world_map and visualize_abstraction – render grids and abstractions as PNGs
generate_representations_py – produce JSON, text and adjacency list representations
generate_mdp – build transition and reward matrices with cluster labels
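
To make the relationship between a transition matrix and cluster labels concrete, here is a small pure-Python sketch of generic state aggregation. It does not call rust_core and does not claim to reproduce what generate_mdp computes; the function name aggregate_transitions, the uniform weighting of states within a cluster, and the four-state example are assumptions made for illustration.

```python
def aggregate_transitions(P, labels):
    """Collapse a state-level transition matrix to cluster level.

    P[s][t] is the probability of moving from state s to state t under one
    fixed action; labels[s] is the cluster index of state s. States inside a
    cluster are weighted uniformly, which is a simplifying assumption.
    """
    k = max(labels) + 1
    sizes = [labels.count(c) for c in range(k)]
    abstract = [[0.0] * k for _ in range(k)]
    for s, row in enumerate(P):
        for t, p in enumerate(row):
            abstract[labels[s]][labels[t]] += p / sizes[labels[s]]
    return abstract


# Four states in a corridor, action "move right"; the last state is absorbing.
P = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
labels = [0, 0, 1, 1]  # hypothetical abstraction: left half and right half

abstract_P = aggregate_transitions(P, labels)
for row in abstract_P:
    print(row)
```

Each row of the aggregated matrix still sums to 1, so the result remains a valid transition matrix over clusters.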

Testing

Run the full test suite (Rust and Python):

cargo test
pytest

Continuous integration additionally runs a small end‑to‑end check that executes the CLI on sample data to ensure the Python and Rust components integrate correctly.

License

This project is released under the MIT license.
