Automating the Optimal Privacy Budget Selection for Differential Privacy in Federated Learning Environments
An innovative epsilon-aware strategy that dynamically adapts the privacy-utility trade-off, eliminating the need for manual tuning.
The Privacy-Utility Dilemma
The core challenge in Differentially Private Federated Learning (DP-FL) is setting the privacy budget, epsilon (ε).
Low Epsilon (ε)
Provides strong privacy by adding more noise to the shared model updates. However, this high level of noise can significantly reduce the model's accuracy, making it less useful.
High Epsilon (ε)
Leads to better model accuracy by adding less noise. The trade-off is a weaker privacy guarantee, increasing the risk of exposing sensitive information.
Finding the perfect balance is critical, but manual tuning is inefficient and rarely optimal.
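To make the trade-off concrete, here is an illustrative sketch using the standard Laplace mechanism, where the noise scale is sensitivity divided by epsilon, so a smaller epsilon directly means larger noise (this is a textbook relationship, not code from our system):

```python
def laplace_noise_scale(sensitivity: float, epsilon: float) -> float:
    """Scale b of Laplace noise for epsilon-DP: b = sensitivity / epsilon.
    Smaller epsilon -> larger noise scale -> stronger privacy, lower utility."""
    return sensitivity / epsilon

# With sensitivity fixed at 1.0, halving epsilon doubles the noise scale.
scales = {eps: laplace_noise_scale(1.0, eps) for eps in (0.5, 1.0, 2.0, 5.0)}
```

The inverse relationship is why no single epsilon is optimal for every round: the "right" amount of noise depends on how much utility the model can afford to lose at that point in training.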
Our Solution: The Epsilon-Aware Strategy
We've developed an adaptive system that automates the selection of the optimal epsilon in each round of federated training.

How It Works: The Algorithm Explained
Our algorithm finds the best epsilon by treating the selection as a mini-optimization problem after each round of federated learning.
Step 1: Candidate Evaluation & Proxy Training
After aggregating client updates, the server creates several copies (clones) of the new global model. Each clone is paired with a different candidate epsilon (e.g., 1.0, 2.0, 5.0). The server then trains each clone for a few epochs on a small, public proxy dataset to quickly estimate its performance.
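A minimal sketch of this step, assuming a hypothetical `proxy_train_and_eval` helper that stands in for the few-epoch training on the public proxy dataset (the candidate list and helper name are illustrative, not part of the published system):

```python
import copy

CANDIDATE_EPSILONS = [1.0, 2.0, 5.0]  # example candidates from the text

def evaluate_candidates(global_model, proxy_train_and_eval):
    """Clone the aggregated global model once per candidate epsilon and
    estimate each clone's performance via short proxy training.
    Returns a dict mapping epsilon -> estimated F1-score."""
    results = {}
    for eps in CANDIDATE_EPSILONS:
        clone = copy.deepcopy(global_model)          # independent copy per candidate
        results[eps] = proxy_train_and_eval(clone, eps)
    return results
```

Because each clone is trained independently, the candidates can also be evaluated in parallel on the server if resources allow.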
Step 2: Performance Measurement & Normalization
The performance (F1-score) of each trained clone is measured. To make a fair comparison between performance and privacy cost, both the F1-scores and the epsilon values are normalized to a common scale of [0, 1].
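A sketch of the normalization, assuming standard min-max scaling (the exact normalization used is an assumption; any mapping to [0, 1] would fit the description above):

```python
def min_max_normalize(values):
    """Rescale a list of values to [0, 1] via min-max scaling.
    A constant list maps to all zeros to avoid division by zero."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Applying the same rescaling to both the F1-scores and the epsilon values puts them on a common scale, so the weighted score in the next step compares like with like.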
Step 3: Weighted Scoring Calculation
A custom score is calculated for each candidate using a weighted formula. This formula is designed to reward high model performance while penalizing high epsilon values (weaker privacy).
score = (w1 * Norm_F1) - (w2 * Norm_Epsilon)
Here, `w1` and `w2` are weights that can be tuned to prioritize either model utility or privacy stringency.
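The scoring formula translates directly into code; the weight values below (0.7 / 0.3) are placeholders for illustration, not the values used in our experiments:

```python
def weighted_score(norm_f1: float, norm_eps: float,
                   w1: float = 0.7, w2: float = 0.3) -> float:
    """score = (w1 * Norm_F1) - (w2 * Norm_Epsilon).
    Rewards high utility, penalizes high (less private) epsilon."""
    return w1 * norm_f1 - w2 * norm_eps
```

Raising `w2` relative to `w1` shifts the selection toward smaller epsilons, i.e. stricter privacy at some cost in accuracy.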
Step 4: Selection and Distribution
The candidate epsilon that achieves the highest final score is selected as the optimal choice for the next round. This winning epsilon is then distributed to all clients along with the updated global model, ensuring the system adapts for the next iteration of training.
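Putting Steps 2-4 together, a self-contained sketch of the selection (normalization and weights as assumed above; input is the epsilon-to-F1 map produced by the proxy evaluation):

```python
def select_best_epsilon(f1_by_eps: dict, w1: float = 0.7, w2: float = 0.3) -> float:
    """Normalize F1-scores and epsilons to [0, 1], score each candidate,
    and return the epsilon with the highest weighted score."""
    def norm(vals):
        lo, hi = min(vals), max(vals)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in vals]

    epsilons = sorted(f1_by_eps)
    norm_f1 = norm([f1_by_eps[e] for e in epsilons])
    norm_eps = norm(epsilons)
    scores = {e: w1 * f - w2 * n
              for e, f, n in zip(epsilons, norm_f1, norm_eps)}
    return max(scores, key=scores.get)  # winning epsilon for the next round
```

The returned epsilon would then be broadcast to clients alongside the updated global model, e.g. via the server-to-client configuration channel of the FL framework.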
Key Results
Our experiments demonstrate the effectiveness of the adaptive system.
Dynamic Selection
The optimal epsilon dynamically adapted each round, typically converging between 0.5 and 2.0.
Peak Performance
The system consistently found that ε = 2.0 offered the best balance of accuracy and privacy in early rounds.
Practical Efficiency
The selection process adds a consistent and manageable computational overhead, making it viable for real-world use.
Technology Stack
PyTorch
Flower
Opacus
Python
Citation & Resources
If you use this work in your research, please cite our paper.
The BibTeX entry will be added here once the paper is published.