This project implements a comprehensive reinforcement learning environment for the Monopoly board game, featuring a novel hybrid Deep Q-Network (DQN) architecture that combines algorithmic strategies with learned behaviors. The system was developed as part of a bachelor's thesis project to explore advanced AI decision-making in complex, multi-agent, stochastic environments.
- Novel Hybrid Architecture: The project introduces a groundbreaking approach where multiple specialized neural networks handle different decision types, while algorithmic agents provide expert knowledge for complex scenarios with large action spaces.
- Multi-Network DQN Design: Unlike traditional single-network approaches, this implementation uses separate networks for each major action type (property management, trading, financial decisions), enabling focused learning and improved decision accuracy.
- Expert Learning Integration: The system implements supervised pre-training using expert agents, followed by reinforcement learning through environmental interaction, significantly reducing training time and improving convergence.
- Optimized Game Simulation: Custom-built Monopoly environment with faithful rule implementation, optimized for fast simulation and RL training with proper state validation and error handling.
- 90% win rate against the parent algorithmic agent
- 5.3 percentage-point improvement in 12-agent round-robin tournaments compared to the baseline model
- Faithful reproduction of official Monopoly rules with RL-friendly adaptations
- Real-time game visualization with React frontend
- Comprehensive tournament management system for agent comparison
- Support for multiple agent types: random, algorithmic, strategic variants, and advanced DQN
- Ideal for research, benchmarking, and educational purposes in multi-agent reinforcement learning
- Python 3.10+ (developed and tested with Python 3.10.15)
- macOS with MPS support (for GPU acceleration) or Windows/Linux with CUDA
- Node.js 16+ and npm for the frontend interface
NOTE: For this you will need to have `conda` installed. If you do not have it, you can install it by following the instructions from the official documentation.
1. Create and activate conda environment:
conda create -n monopoly-rl python=3.10
conda activate monopoly-rl
2. Install dependencies:
# Install requirements
pip install -r requirements.txt
# For macOS with MPS (GPU acceleration)
conda install -c apple tensorflow-deps
pip install tensorflow-macos==2.10.0
pip install tensorflow-metal==0.6.0
# For Windows/Linux (CPU/CUDA)
pip install tensorflow
1. Install dependencies:
cd frontend
npm install
Play a game against the DQN Agent:
1. Run the script, which will take care of everything:
cd src
python main.py
2. Access the interface:
Open your browser and navigate to http://localhost:5173
Train a new DQN agent:
For training a new agent, please refer to the training scripts found in the dqn folder. There you will find a training script for each of the specialized networks, which you can configure as needed.
The system is built with a modular, object-oriented design following software engineering best practices:
- Game State Management: Centralized state class that preserves all game attributes (player positions, balances, properties). All state updates are validated to ensure legal moves and maintain game integrity.
- Game Manager: Coordinates game logic, player turns, and rule enforcement. Acts as the main controller, interfacing with specialized managers for different game aspects.
- Specialized Managers: Modular components handle specific game mechanics: dice rolling, chance/community chest cards, trading, and property management with built-in validation.
- Player Base Class: Abstract interface defining callback methods for agent decision-making. Supports multiple agent types with consistent API for easy extensibility.
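As an illustration of this callback-based design, here is a minimal sketch of what such an abstract player interface could look like (class and method names are illustrative, not the project's actual API):

```python
import random
from abc import ABC, abstractmethod

class Player(ABC):
    """Illustrative base class: the game manager calls these decision
    callbacks at the relevant points of a turn."""

    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def decide_buy_property(self, game_state, property_id: int) -> bool:
        """Return True to buy the unowned property, False to pass."""

    @abstractmethod
    def decide_jail_strategy(self, game_state) -> str:
        """Return one of 'pay_bail', 'use_card', or 'roll'."""

class RandomPlayer(Player):
    """Baseline agent that answers every callback at random."""

    def decide_buy_property(self, game_state, property_id: int) -> bool:
        return random.random() < 0.5

    def decide_jail_strategy(self, game_state) -> str:
        return random.choice(["pay_bail", "use_card", "roll"])
```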
---
config:
theme: neutral
---
graph TD
GS[Game State] --> SE[State Encoder<br/>100 features]
SE --> MN[Multiple Specialized Networks]
subgraph "Specialized Q-Networks"
MN --> BP[Buy Property<br/>Network]
MN --> UP[Property Upgrade<br/>Network]
MN --> FN[Financial Management<br/>Network]
MN --> JL[Jail Decision<br/>Network]
end
BP --> HD{Hybrid Decision<br/>Layer}
UP --> HD
FN --> HD
JL --> HD
SA[Strategic Agent<br/>Fallback] --> HD
subgraph "Training Pipeline"
EL[Expert Learning<br/>Pre-training] --> RL[Reinforcement Learning<br/>Self-play]
RL --> ER[Experience Replay<br/>Separate buffers]
ER --> TN[Target Networks<br/>Stable learning]
end
HD --> AC[Action]
EL -.-> BP
EL -.-> UP
EL -.-> FN
EL -.-> JL
RL -.-> BP
RL -.-> UP
RL -.-> FN
RL -.-> JL
classDef coreNode fill:#e3f2fd,stroke:#1976d2,stroke-width:3px,color:#000000
classDef networkNode fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000000
classDef trainingNode fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000000
classDef hybridNode fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#000000
class SE,MN coreNode
class BP,UP,FN,JL networkNode
class EL,RL,ER,TN trainingNode
class HD hybridNode
The core innovation lies in the hybrid approach that combines the best of algorithmic and learning-based strategies:
- Property Purchase Network: Specialized for buy/pass decisions on unowned properties
- Property Management Network: Handles upgrade/downgrade decisions for owned properties
- Jail Network: Manages player interactions while in jail, including decisions to pay bail or use a "Get Out of Jail Free" card
- Financial Network: Handles mortgage/unmortgage and cash management decisions
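A minimal sketch of how such per-decision Q-networks could be built with Keras (layer sizes and action counts are assumptions, not the project's exact architecture):

```python
import tensorflow as tf

STATE_SIZE = 100  # size of the encoded state vector

def build_q_network(num_actions: int, name: str) -> tf.keras.Model:
    """Small fully connected Q-network; one instance per decision type."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(STATE_SIZE,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(num_actions),  # raw Q-values, no activation
    ], name=name)

# One specialized network per decision type (action counts are illustrative)
networks = {
    "buy_property": build_q_network(2, "buy_property"),  # buy / pass
    "upgrade":      build_q_network(2, "upgrade"),       # upgrade / hold
    "financial":    build_q_network(3, "financial"),     # mortgage / unmortgage / hold
    "jail":         build_q_network(3, "jail"),          # pay bail / use card / roll
}
```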
The system uses supervised learning to bootstrap the networks:
- Collect gameplay data from expert algorithmic agents
- Pre-train each network on relevant decision scenarios
- Initialize with expert knowledge to reduce exploration time
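Continuing the sketch above, expert pre-training could look roughly like this, treating the Q-value outputs as classification logits over the expert's chosen actions (hyperparameters are illustrative):

```python
import tensorflow as tf

def pretrain_on_expert_data(network, expert_states, expert_actions, epochs=5):
    """Supervised pre-training: fit a Q-network to imitate the expert agent's
    action choices before reinforcement learning. The Q-value outputs are
    treated here as classification logits over the expert's actions."""
    network.compile(
        optimizer=tf.keras.optimizers.Adam(1e-3),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    network.fit(expert_states, expert_actions, batch_size=64, epochs=epochs)

# Usage with the networks dict from the earlier sketch; the state/action arrays
# are assumed to be collected from games played by the expert algorithmic agent:
# pretrain_on_expert_data(networks["buy_property"], expert_states, expert_actions)
```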
Networks continue learning through environmental interaction:
- Experience Replay: Store and sample past experiences for stable learning
- Target Networks: Separate target networks for stable Q-value updates
- Epsilon-Greedy Exploration: Balance exploration vs exploitation
- Reward Shaping: Custom reward functions for different game phases
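A condensed sketch of these mechanisms: one DQN update with a replay buffer, a periodically synced target network, and epsilon-greedy action selection (hyperparameters and buffer sizes are illustrative):

```python
import random
from collections import deque
import numpy as np
import tensorflow as tf

GAMMA = 0.99
replay_buffer = deque(maxlen=50_000)  # one buffer per specialized network

def make_target_network(network):
    """Copy of the online network whose weights are synced only periodically."""
    target = tf.keras.models.clone_model(network)
    target.set_weights(network.get_weights())
    return target

def epsilon_greedy(network, state, epsilon, num_actions):
    """Explore with probability epsilon, otherwise pick the argmax Q-value."""
    if random.random() < epsilon:
        return random.randrange(num_actions)
    q_values = network(state[None, :], training=False)
    return int(tf.argmax(q_values[0]))

def train_step(network, target_network, optimizer, batch_size=64):
    """One DQN update from a sampled mini-batch of stored transitions."""
    batch = random.sample(replay_buffer, batch_size)
    states, actions, rewards, next_states, dones = zip(*batch)
    states = np.array(states, dtype=np.float32)
    next_states = np.array(next_states, dtype=np.float32)
    rewards = np.array(rewards, dtype=np.float32)
    dones = np.array(dones, dtype=np.float32)
    actions = np.array(actions, dtype=np.int32)
    # Bootstrapped targets come from the (frozen) target network
    next_q = target_network(next_states, training=False).numpy()
    targets = rewards + GAMMA * (1.0 - dones) * next_q.max(axis=1)
    with tf.GradientTape() as tape:
        q = network(states, training=True)
        q_taken = tf.reduce_sum(q * tf.one_hot(actions, q.shape[-1]), axis=1)
        loss = tf.reduce_mean(tf.square(targets - q_taken))
    grads = tape.gradient(loss, network.trainable_variables)
    optimizer.apply_gradients(zip(grads, network.trainable_variables))
    return float(loss)
```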
The environment encodes the complex Monopoly state into a format suitable for neural networks:
State Vector Components:
- Player positions, cash, and property ownership (40 properties × 4 players)
- Property development levels and mortgage status
- Game phase indicators (early, mid, late game)
- Recent action history and opponent behavior patterns
- Dice roll outcomes and card draw results
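A rough sketch of such an encoder; the attribute names (`players`, `properties`, `turn_count`, etc.) and the exact feature set are assumptions, while the project's real encoder produces a 100-feature vector:

```python
import numpy as np

NUM_PLAYERS = 4

def encode_state(game_state) -> np.ndarray:
    """Flatten the game state into a fixed-length feature vector."""
    features = []
    # Normalized player positions and cash balances
    for player in game_state.players:
        features.append(player.position / 39.0)
        features.append(min(player.balance / 5000.0, 1.0))
    # Per-property ownership, development level, and mortgage status
    for prop in game_state.properties:
        owner = -1.0 if prop.owner is None else prop.owner / (NUM_PLAYERS - 1)
        features.append(owner)
        features.append(prop.houses / 5.0)
        features.append(1.0 if prop.mortgaged else 0.0)
    # Coarse game-phase indicator based on elapsed turns
    features.append(min(game_state.turn_count / 100.0, 1.0))
    return np.asarray(features, dtype=np.float32)
```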
Action Space Discretization:
- Binary decisions: Buy/Pass, Upgrade/Hold, Accept/Reject trade
- Categorical choices: Which properties to develop, mortgage priorities
- Hybrid decisions: Algorithmic for complex trades, learned for simple choices
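A minimal sketch of the hybrid routing idea, with names chosen purely for illustration:

```python
import tensorflow as tf

def choose_action(decision_type, state_vector, game_state, q_networks, strategic_agent):
    """Hybrid routing: small, discrete decisions go to the learned Q-networks;
    large-action-space decisions such as trade construction fall back to the
    algorithmic strategic agent."""
    if decision_type == "trade":
        return strategic_agent.propose_trade(game_state)  # expert fallback
    q_values = q_networks[decision_type](state_vector[None, :], training=False)
    return int(tf.argmax(q_values[0]))
```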
Comprehensive evaluation framework to assess agent performance:
- Round-Robin Tournaments: All agents play against each other multiple times
- Statistical Analysis: Win rates, average game length, financial performance
- Strategy Analysis: Property acquisition patterns, trading behavior
- Performance Visualization: Real-time dashboard showing training progress
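A simplified sketch of a round-robin tournament loop over 4-player tables; `play_game` is a hypothetical helper that runs one full game and returns the winner's name:

```python
import itertools
from collections import defaultdict

def round_robin_tournament(agents, play_game, games_per_table=10):
    """Every 4-agent combination plays a fixed number of games; win rates
    are aggregated per agent."""
    wins, played = defaultdict(int), defaultdict(int)
    for table in itertools.combinations(agents, 4):
        for _ in range(games_per_table):
            winner_name = play_game(list(table))
            wins[winner_name] += 1
            for agent in table:
                played[agent.name] += 1
    return {name: wins[name] / played[name] for name in played}
```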
- Language: Python 3.10.15
- AI Framework: TensorFlow 2.10.0 with Keras
- GPU Acceleration: TensorFlow Metal 0.6.0 (macOS), CUDA (Windows/Linux)
- Frontend: React 18.3.1 with Vite 6.0.1
- API: FastAPI for backend services
- Styling: Tailwind CSS 3.4.16
- Algorithm: Deep Q-Networks (DQN) with experience replay
- Network Architecture: Multiple specialized networks for different action types
- Training Paradigm: Hybrid approach combining supervised learning and reinforcement learning
- Optimization: Adam optimizer with learning rate scheduling
- Memory Management: Experience replay buffer
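For example, Adam with a learning-rate schedule can be set up in Keras along these lines (the decay values are illustrative, not the project's settings):

```python
import tensorflow as tf

# Exponential learning-rate decay paired with Adam
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,
    decay_steps=10_000,
    decay_rate=0.95,
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```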
| Component | Minimum |
|---|---|
| RAM | 16GB (shared memory) |
| CPU | 10-core |
| Storage | 10GB |
| GPU | M2 Pro GPU (16 cores) |
| OS | macOS Sequoia 15.4.1 |
- Vectorized Operations: NumPy and TensorFlow operations for fast computation
- Batch Processing: Efficient batch training with configurable batch sizes
- Memory Pooling: Reuse of game state objects to reduce garbage collection
- Parallel Simulation: Multiprocessing for tournament execution
- Model Optimization: TensorFlow model optimization for deployment
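As an illustration of parallel tournament execution, here is a sketch using Python's multiprocessing pool; `simulate_game` is a placeholder for the project's actual game runner:

```python
from multiprocessing import Pool

def simulate_game(seed: int) -> str:
    """Hypothetical worker: run one full game with the given seed and return
    the winner's name. The real project would plug in its game manager here."""
    import random
    random.seed(seed)
    return random.choice(["dqn", "strategic", "random"])  # placeholder result

if __name__ == "__main__":
    # Each worker process simulates games independently; results are gathered
    # in the parent process for aggregation.
    with Pool(processes=8) as pool:
        winners = pool.map(simulate_game, range(1_000))
    print({name: winners.count(name) for name in set(winners)})
```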