
Redbench-Eval

Evaluation scripts and benchmarking tools for Redbench


📋 Overview

Redbench-Eval provides evaluation and benchmarking infrastructure for workloads generated by Redbench. This repository includes:

  • Query Execution: Tools to execute workloads on various database systems (e.g., DuckDB)
  • Performance Analysis: Scripts to analyze execution traces and caching behavior
  • Visualization: Jupyter notebooks for generating performance plots and similarity analyses

🚀 Quick Start

Prerequisites

  • Python 3.11 or higher
  • uv package manager (recommended) or pip

Installation

  1. Clone the repository

    git clone https://github.com/DataManagementLab/Redbench-Eval.git
    cd Redbench-Eval
  2. Set up Python environment

    uv sync
  3. Activate the environment

    source .venv/bin/activate
  4. Generate workload files

    Run Redbench to generate the workload files. All required data files (including the DuckDB databases) are downloaded automatically.
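Before running any benchmarks, it can help to confirm that the Redbench output is in place. A minimal check, using the example paths from the commands below (adjust them to your own setup):

from pathlib import Path

# Example locations used in the sample commands in this README.
result_dir = Path("../Redbench/output")
db_file = result_dir / "tmp_generation/imdb/db_augmented_x2.duckdb"

assert result_dir.exists(), "run Redbench first to create the output directory"
assert db_file.exists(), "expected DuckDB database file is missing"
print("Redbench output found:", result_dir.resolve())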

📊 Usage

Running Benchmarks

DuckDB Workload Execution

Execute generated workloads on DuckDB using the following command:

python src/redbench_eval/duckdb/execute_queries.py \
  --result_dir "../Redbench/output" \
  --dataset imdb \
  --redset_dataset serverless \
  --exp_hash ede5387599ee1e65c105eaa9b17c5c3c \
  --cluster_id 0 \
  --database_id 0 \
  --strategy generation \
  --db_file "../Redbench/output/tmp_generation/imdb/db_augmented_x2.duckdb"

Parameters:

  • --result_dir: Directory containing Redbench output artifacts
  • --dataset: Dataset name (e.g., imdb)
  • --redset_dataset: Redset dataset type (e.g., serverless)
  • --exp_hash: Experiment hash identifier
  • --cluster_id: Cluster identifier
  • --database_id: Database identifier
  • --strategy: Execution strategy (e.g., generation)
  • --db_file: Path to the DuckDB database file

Output: The execution trace will be saved to {result_dir}/{dataset}/{redset_dataset}/cluster_{cluster_id}/database_{database_id}/{strategy}_{exp_hash}/run_duckdb.parquet
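For a quick sanity check, the resulting trace can be loaded with pandas. This is only a sketch: the columns of run_duckdb.parquet are not documented here, so inspect the schema before relying on specific fields.

import pandas as pd

# Example path following the output layout above, using the parameter
# values from the sample command (adjust to your own run).
trace_path = (
    "../Redbench/output/imdb/serverless/cluster_0/database_0/"
    "generation_ede5387599ee1e65c105eaa9b17c5c3c/run_duckdb.parquet"
)

df = pd.read_parquet(trace_path)
print(df.columns.tolist())  # inspect the trace schema first
print(df.head())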

Generating Visualizations

Redset Similarity Analysis

Run the Jupyter notebook for similarity analysis:

jupyter notebook src/redbench_eval/plots/paper_plots_redset_similarity.ipynb

Speedup and Caching Analysis

Generate speedup plots and caching drilldown visualizations:

jupyter notebook src/redbench_eval/plots/paper_plots.ipynb

Adding New Database Systems

To add support for additional database systems:

  1. Create a new directory under src/redbench_eval/
  2. Implement an execution script similar to duckdb/execute_queries.py (see the sketch after this list)
  3. Update the plotting notebooks to include the new system's results
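As a starting point, a skeleton for such a script might look like the sketch below. This is illustrative only: it mirrors the CLI documented above, uses sqlite3 as a stand-in driver, and assumes a one-statement-per-line workload file and a simple trace schema; the actual interface of duckdb/execute_queries.py and the Redbench workload format may differ.

import argparse
import os
import sqlite3  # stand-in driver; swap in your target system's client library
import time

import pandas as pd

def main():
    # Mirrors the CLI documented for the DuckDB runner above; the real
    # script's interface may differ, so treat this as a template.
    parser = argparse.ArgumentParser()
    parser.add_argument("--result_dir", required=True)
    parser.add_argument("--dataset", required=True)
    parser.add_argument("--redset_dataset", required=True)
    parser.add_argument("--exp_hash", required=True)
    parser.add_argument("--cluster_id", type=int, required=True)
    parser.add_argument("--database_id", type=int, required=True)
    parser.add_argument("--strategy", required=True)
    parser.add_argument("--db_file", required=True)
    args = parser.parse_args()

    # Hypothetical input: one SQL statement per line; adapt this to the
    # actual workload format produced by Redbench.
    workload_file = os.path.join(args.result_dir, "workload.sql")
    with open(workload_file) as f:
        queries = [q.strip() for q in f if q.strip()]

    conn = sqlite3.connect(args.db_file)
    rows = []
    for i, sql in enumerate(queries):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        rows.append({"query_id": i, "runtime_s": time.perf_counter() - start})

    # Write the trace into the same directory layout the DuckDB runner uses.
    out_dir = os.path.join(
        args.result_dir, args.dataset, args.redset_dataset,
        f"cluster_{args.cluster_id}", f"database_{args.database_id}",
        f"{args.strategy}_{args.exp_hash}",
    )
    os.makedirs(out_dir, exist_ok=True)
    pd.DataFrame(rows).to_parquet(os.path.join(out_dir, "run_mysystem.parquet"))

if __name__ == "__main__":
    main()

The output file name (run_mysystem.parquet) and the trace columns are assumptions; matching the columns produced by the DuckDB runner will make it easier to reuse the plotting notebooks.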

👥 Authors

🔗 Related Projects

  • Redbench - Original workload generator

Note: The DuckDB execution example shown above is provided for reference and was not used in the original paper evaluation.