# Automating the Optimal Privacy Budget Selection for Differential Privacy in Federated Learning Environments
This repository contains the official implementation for the research paper on dynamically and automatically selecting the optimal privacy budget (epsilon, ε) in a Differentially Private Federated Learning (DP-FL) system. Our method removes the need for manual tuning by introducing an epsilon-aware strategy that adapts the privacy-utility trade-off in real time.
- The Challenge: Balancing Privacy and Utility
- System Architecture
- The Epsilon-Aware Strategy
- How It Works: The Algorithm
- Getting Started
- Key Results
- Citation
## The Challenge: Balancing Privacy and Utility

Federated Learning (FL) enables collaborative machine learning without sharing raw data. When combined with Differential Privacy (DP), it offers strong privacy guarantees. However, this introduces the critical privacy budget (ε) parameter.
- Low ε: Stronger privacy, but high noise can hurt model accuracy.
- High ε: Better accuracy, but weaker privacy guarantees.
Finding the optimal ε is crucial, but manual tuning is inefficient and doesn't adapt to changing conditions during training. This project solves that problem by creating a system that autonomously selects the best ε in each round of federated training.
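To make the trade-off concrete: under the classic Laplace mechanism, the noise scale is `sensitivity / ε`, so halving ε doubles the injected noise. The snippet below is a generic DP illustration only, not code from this repository (which uses Opacus):

```python
# Illustration (not from the repository): for the Laplace mechanism,
# noise scale = sensitivity / epsilon, so smaller epsilon means more noise.
def laplace_noise_scale(sensitivity: float, epsilon: float) -> float:
    """Scale parameter b of the Laplace distribution for a given budget."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon

for eps in [0.5, 1.0, 2.0, 5.0]:
    print(f"epsilon={eps}: noise scale={laplace_noise_scale(1.0, eps):.2f}")
```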
## System Architecture

Our system uses a standard client-server FL architecture, built with the following key technologies:
- Federated Learning Framework: Flower (flwr) to manage communication and aggregation.
- Differential Privacy: Opacus to inject noise and provide DP guarantees during client-side training.
- Deep Learning: PyTorch for building and training neural network models.
The core logic is orchestrated by `auto_DP_FL.py`, which simulates the entire federated network and implements our dynamic epsilon selection strategy.
## The Epsilon-Aware Strategy

Our core innovation is an adaptive, epsilon-aware strategy that intelligently selects the best privacy budget after each round of federated training.
Here is the high-level workflow:
- Client Training: Clients train their local models and send updates to the server.
- Aggregation & Cloning: The server aggregates the updates using FedAvg. It then creates multiple clones of this new global model.
- Candidate Evaluation: Each model clone is assigned a different candidate ε from a predefined list (e.g., `[0.5, 1.0, 2.0, 5.0]`).
- Proxy Training: The server trains each clone for a few epochs on a small, representative proxy dataset. This step is crucial for evaluating how the model performs under different privacy constraints.
- Optimal Epsilon Selection: The server calculates an "optimal budget score" for each clone based on its performance (e.g., F1-score) and its assigned ε.
- Distribution: The ε that yields the highest score is selected as the optimal budget for the next round of federated training and is sent back to the clients along with the updated global model.
This creates a closed-loop system that continuously adapts the privacy budget based on empirical performance.
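The workflow above can be sketched as a single server round. All names here (`run_round`, `fedavg`, `proxy_train_and_score`) are hypothetical placeholders, not the repository's actual identifiers; the real implementation lives in `auto_DP_FL.py`:

```python
# High-level sketch of one round of the closed loop (hypothetical names).
def run_round(server, clients, candidate_epsilons, current_epsilon):
    # 1. Clients train locally under DP with the current budget.
    updates = [c.train(server.global_model, epsilon=current_epsilon)
               for c in clients]
    # 2. The server aggregates the updates with FedAvg.
    server.global_model = server.fedavg(updates)
    # 3-5. Clone the global model, proxy-train each clone under a
    #      candidate epsilon, and compute its optimal budget score.
    scores = {eps: server.proxy_train_and_score(server.global_model, eps)
              for eps in candidate_epsilons}
    # 6. The best-scoring epsilon is distributed for the next round.
    return max(scores, key=scores.get)
```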
## How It Works: The Algorithm

The selection of the optimal epsilon is based on a weighted objective function that balances model performance (F1 Score) and the privacy budget (epsilon).
Algorithm: Calculate Optimal Epsilon
- Define Objective Function: Create a function to balance utility and privacy.
- Normalize Metrics:
  - Normalize the F1 scores of all model clones to a `[0, 1]` range.
  - Normalize the candidate epsilon values to a `[0, 1]` range.
- Assign Weights: Define weights `w1` (for performance) and `w2` (for privacy) based on the specific requirements of the task.
- Combine Metrics: Calculate the `optimal_budget_score` for each candidate: `score = (w1 * Normalized_F1) - (w2 * Normalized_Epsilon)`
- Select Best Epsilon: The epsilon corresponding to the highest `optimal_budget_score` is chosen.
- Return: The optimal epsilon for the next round.
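A minimal sketch of this scoring step, assuming min-max normalization and illustrative default weights (`minmax_normalize`, `optimal_epsilon`, and the weight values are assumptions, not the repository's identifiers):

```python
# Hedged sketch of the weighted objective described above.
def minmax_normalize(values):
    """Scale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # all candidates equal -> no preference
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def optimal_epsilon(f1_scores, epsilons, w1=0.7, w2=0.3):
    """Pick the candidate epsilon with the highest weighted score.

    score = w1 * normalized_F1 - w2 * normalized_epsilon
    (w1/w2 are illustrative defaults, not values from the paper.)
    """
    norm_f1 = minmax_normalize(f1_scores)
    norm_eps = minmax_normalize(epsilons)
    scores = [w1 * f - w2 * e for f, e in zip(norm_f1, norm_eps)]
    return epsilons[max(range(len(epsilons)), key=scores.__getitem__)]
```

For example, with hypothetical proxy F1 scores `[0.60, 0.72, 0.78, 0.80]` for candidates `[0.5, 1.0, 2.0, 5.0]`, the highest-ε clone is penalized enough that ε = 2.0 wins despite not having the best raw F1.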
## Getting Started

Prerequisites:

- Python 3.8+
- PyTorch
- Flower (flwr)
- Opacus
- NumPy, Pandas, Scikit-learn
Clone the repository and install the required dependencies:

```bash
git clone https://github.com/fms-faisal/Auto-Optimal-Privacy-Budget-DP-FL.git
cd Auto-Optimal-Privacy-Budget-DP-FL
pip install torch torchvision pandas numpy scikit-learn opacus flwr tqdm Pillow psutil
```
To start the federated learning process with automatic epsilon selection, run the main script:
```bash
python auto_DP_FL.py
```
During training, it will:
- Adjust the fraction of data used by each client per round
- Apply differential privacy with dynamic epsilon selection
- Log training loss, accuracy, time, and memory usage
- Save the final model as `dynamic_epsilon_final.pth`
- Record progress and metrics in `training_log_final.txt`
## Key Results

Our experiments show that the system effectively balances the privacy-utility trade-off:
- Dynamic Selection: The optimal epsilon dynamically changes from round to round, typically converging to values between 0.5 and 2.0, demonstrating the system's ability to adapt.
- Peak Performance: The optimal budget score consistently peaked at ε = 2.0 in early rounds, achieving the best balance of high accuracy and strong privacy.
- Efficiency: The epsilon selection process adds a consistent and manageable computational overhead, making it practical for real-world applications.
## Citation

If you use this work in your research, please cite our paper:
The BibTeX entry will be added here once the paper is published.