DP-Fusion: Token-Level Differentially Private Inference for Large Language Models

This repository contains the codebase for running DP-Fusion, a Differentially Private Inference (DPI) framework for LLMs, demonstrated through document privatization, along with several baselines and attack methods.

Project Contents

The project includes:

A set of 100 paraphrased documents
The proposed attack and defense methods
Code for mounting privacy attacks using perplexity and min-k strategies
DP-Fusion defense implementation

Directory Structure

.
├── all_methods_results.json      # Results from various methods
├── input.json                    # Parsed TAB-ECHR dataset with entities marked
├── candidate_set_100.json        # Candidate entity list per privacy group
├── Attack.py                     # Script for running attacks
├── DP-FUSION_Defense.py          # Script for running DP-Fusion defense
├── environment.yml               # Conda environment specification
├── outputs_qwen_public.json      # Output from public baseline
├── output/                       # Directory for outputs
└── README.md                     # Project documentation

Output Directory

The output/ directory contains the output files generated from running both the attack and defense pipelines.

Attack Outputs

test_attack_meta_data.pkl - Contains metadata from the attack, including the log probabilities assigned to tokens for each candidate
test_attack.json - JSON file containing the results of the attack, including candidate paraphrases and their respective scores from different attack strategies (e.g., min-K, perplexity-based)
test_attack.pkl - A pickle version of test_attack.json, provided for more efficient loading and analysis
test_attack.log - Log file capturing the stdout/stderr from the attack script execution

Defense Outputs

test.json - Resultant file from running the DP-Fusion defense. It contains the redacted_text field with paraphrased outputs generated via the DP mechanism
test.log - Log file for the defense execution, capturing console output and debug information

Installation

To set up the environment, use:

conda env create -f environment.yml
conda activate dp-fusion

Usage

Running the Attack

Use the following command to run the attack pipeline (perplexity and min-k based):

python Attack.py \
  --input_file input.json \
  --start 0 \
  --end 100 \
  --gpu "0,1,2,3" \
  --model_id Qwen/Qwen2.5-7B-Instruct \
  --output_file outputs_qwen_public.json \
  --attack_output_file_json output/test_attack.json \
  --attack_output_file_pickle output/test_attack.pkl \
  --attack_output_file_meta_data output/test_attack_meta_data.pkl \
  --to_debug True \
  --number_of_cands 5 \
  --hf_token "hf_xxx" > output/test_attack.log

Attack Script Arguments

Argument	Description
`--input_file`	Input JSON file with parsed dataset
`--start` / `--end`	Document index range to process
`--gpu`	Comma-separated list of GPU indices
`--model_id`	HuggingFace model ID
`--output_file`	The file on which you want to mount the attack
`--attack_output_file_json`	JSON output for attack results
`--attack_output_file_pickle`	Pickle file for storing attack results
`--attack_output_file_meta_data`	Metadata output file
`--to_debug`	Enable debugging mode (verbose)
`--number_of_cands`	Number of candidate generations per prompt
`--hf_token`	HuggingFace token (use a private token, not shared)

Running the Defense

Use the command below to run the DP-Fusion defense:

python DP-FUSION_Defense.py \
  --input_file input.json \
  --start 0 \
  --end 100 \
  --gpu "0,1,2,3" \
  --model_id Qwen/Qwen2.5-7B-Instruct \
  --alpha 2.0 \
  --output_file output/test.json \
  --to_cache True \
  --hf_token "hf_xxx" \
  --beta_dict '{"PERSON":0.1,"CODE":0.1,"LOC":0.1,"ORG":0.1,"DEM":0.1,"DATETIME":0.1,"QUANTITY":0.1,"MISC":0.1}' \
  > output/test.log

Defense Script Arguments

Argument	Description
`--input_file`	Input file with parsed dataset
`--start` / `--end`	Document index range
`--gpu`	GPUs to use
`--model_id`	Model identifier from HuggingFace
`--output_file`	Output file for generated text
`--output_dir`	Output directory (default: output/)
`--experiment_id`	Identifier for the current run
`--alpha`	Global DP noise scale
`--beta_dict`	Per-entity DP noise levels
`--to_cache`	Cache LLM outputs (uses more GPU memory)
`--to_debug`	Print debug info
`--independent`	Use independent group-level privacy
`--max_tokens`	Max tokens to generate
`--hf_token`	HuggingFace token (keep private)

Important Files

candidate_set_100.json - Union of private entities across documents grouped by privacy types
outputs_qwen_public.json - Output from the baseline public generation model, used for attack evaluation
All attack outputs include .json, .pkl, and metadata files for detailed analysis

Security Notes

Keep your HuggingFace tokens private and do not share them
Use appropriate GPU resources based on your system capabilities
Debug mode provides verbose output for troubleshooting

Citation

If you use this code in your research, please cite the corresponding paper:

BibTeX

@misc{thareja2025dpfusiontokenleveldifferentiallyprivate,
      title={DP-Fusion: Token-Level Differentially Private Inference for Large Language Models}, 
      author={Rushil Thareja and Preslav Nakov and Praneeth Vepakomma and Nils Lukas},
      year={2025},
      eprint={2507.04531},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2507.04531}, 
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

DP-Fusion: Token-Level Differentially Private Inference for Large Language Models

Project Contents

Directory Structure

Output Directory

Attack Outputs

Defense Outputs

Installation

Usage

Running the Attack

Attack Script Arguments

Running the Defense

Defense Script Arguments

Important Files

Security Notes

Citation

BibTeX

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
output		output
.gitattributes		.gitattributes
Attack.py		Attack.py
DP-FUSION_Defense.py		DP-FUSION_Defense.py
README.md		README.md
all_method_results.json		all_method_results.json
candidate_set_100.json		candidate_set_100.json
environment.yml		environment.yml
input.json		input.json
outputs_qwen_public.json		outputs_qwen_public.json

MBZUAI-Trustworthy-ML/DP-Fusion-DPI

Folders and files

Latest commit

History

Repository files navigation

DP-Fusion: Token-Level Differentially Private Inference for Large Language Models

Project Contents

Directory Structure

Output Directory

Attack Outputs

Defense Outputs

Installation

Usage

Running the Attack

Attack Script Arguments

Running the Defense

Defense Script Arguments

Important Files

Security Notes

Citation

BibTeX

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages