
DebateQD: Evolving Debate Strategies for Persuasion vs. Truth in LLM Debates

This repository contains the code for the paper "Optimizing for Persuasion Improves LLM Generalization: Evidence from Quality-Diversity Evolution of Debate Strategies" (MTI-LLM @ NeurIPS 2025).

DebateQD is a minimal Quality-Diversity (QD) evolutionary framework for studying how different optimization objectives shape reasoning and debating abilities in LLMs. It runs information-asymmetric debates where two LLMs argue and a third judges, evolving diverse prompt-based debate strategies over multiple generations. The debate setup is fixed, but the optimization objective can vary:

  • Persuasion Tournament: evolve strategies that maximize judge persuasion (regardless of ground truth).
  • Truth Tournament: evolve strategies that maximize collaborative correctness.

By contrasting persuasion and truth optimization, DebateQD reveals how competitive argumentative pressure can improve reasoning robustness and generalization in LLMs.
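
Concretely, one generation of the tournament can be sketched as follows (illustrative pseudocode only; every name below is a hypothetical stand-in, not the repository's actual API):

import random

def debate_and_judge(strategy_a, strategy_b, question):
    # stand-in for an information-asymmetric debate between two LLM debaters,
    # settled by a third LLM judge; returns (winner, judge_was_correct)
    winner = random.choice([strategy_a, strategy_b])
    return winner, random.random() < 0.5

def run_generation(strategies, questions, optimize_for="persuasion"):
    scores = {s: 0 for s in strategies}
    for q in questions:
        a, b = random.sample(strategies, 2)
        winner, correct = debate_and_judge(a, b, q)
        if optimize_for == "persuasion":
            scores[winner] += 1        # reward convincing the judge
        elif correct:
            scores[winner] += 1        # reward collaborative correctness
    # keep the fittest half, then mutate survivors to refill the population
    # while preserving strategy diversity (the Quality-Diversity part)
    survivors = sorted(strategies, key=scores.get, reverse=True)[:len(strategies) // 2]
    return survivors + [s + " (mutated)" for s in survivors]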

[Introduction figure]


Installation

We recommend using Python 3.11 or 3.12 for running the experiments. Follow these steps to install the library:

# clone the repository
git clone git@github.com:flowersteam/llm_persuasion.git
cd llm_persuasion

# create and activate venv
python3 -m venv .venv
source .venv/bin/activate

# install the requirements
pip install -r requirements.txt
pip install -e .

# optional: for the prompt NLP analysis notebooks
pip install -e .[nlp]

Usage

Offline mode (default)

In this mode, provide a Hugging Face model id or a local path to a model. For example, to run a persuasion experiment using Qwen2.5-7B-Instruct on 4 GPUs, use:

python3 main.py \
    --model Qwen/Qwen2.5-7B-Instruct \
    --experiment_name "persuasion-full-Qwen2.5-7B-q100" \
    --max-tokens 32000 \
    --gpu 4 \
    --optimize_for persuasion \
    --path_question quality
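
Before a multi-GPU run, you can quickly confirm the GPUs are visible (a minimal check; it assumes PyTorch is installed as part of requirements.txt):

# pre-flight GPU check
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Visible GPUs: {torch.cuda.device_count()}")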

All main.py command-line arguments:

  • --model - Model path or name (an HF repo id in offline mode, or an API model name in online mode)
  • --experiment_name - Name for the experiment (default: "experiment")
  • --url - OpenAI API endpoint URL (default: "http://localhost:8000/v1")
  • --max-tokens - Maximum tokens to generate (default: 2000)
  • --log-level - Logging level (default: "INFO")
  • --path_question - Path to questions CSV file or "quality" for built-in dataset (default: "configs/example_questions.csv")
  • --n_rounds_per_debate - Number of rounds per debate (default: 4)
  • --path_checkpoint - Path to checkpoint for resuming experiments (default: "")
  • --gpu - Number of GPUs to use (default: 1)
  • --online - Use online mode (API calls) instead of loading model locally (default: False)
  • --optimize_for - Optimization objective: "persuasion" or "truth" (default: "persuasion")
  • --eval_only - Only run evaluation on test set without training (default: False)

Experiment output structure

Each experiment creates a directory in data/:

data/
└── persuasion-full-Qwen2.5-7B-q100_22-06-20-39/
    ├── gen_0/                # Generation 0 strategies and results
    ├── gen_1/                # Generation 1 strategies and results
    ├── ...
    ├── test_evaluation/      # Test evaluation folder
    ├── test_questions.pkl    # Test set questions
    └── train_questions.pkl   # Training set questions
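
The saved question splits can be inspected directly (a minimal sketch; it assumes the .pkl files are plain pickle dumps of the question lists, which may differ from the repository's actual serialization):

import pickle

with open("data/persuasion-full-Qwen2.5-7B-q100_22-06-20-39/test_questions.pkl", "rb") as f:
    test_questions = pickle.load(f)
print(f"Loaded {len(test_questions)} held-out test questions")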

Online mode (API)

In this mode, you connect to an API endpoint (e.g., OpenAI or a locally hosted server). First, create a .env file:

OPENAI_URL=https://api.openai.com/v1
OPENAI_API_KEY=your_api_key_here
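
You can smoke-test the endpoint before launching a full experiment (an illustrative sketch using python-dotenv and the openai client; the repository's internal client code may differ):

import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # reads OPENAI_URL and OPENAI_API_KEY from .env
client = OpenAI(base_url=os.environ["OPENAI_URL"], api_key=os.environ["OPENAI_API_KEY"])
reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say pong."}],
    max_tokens=5,
)
print(reply.choices[0].message.content)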

Then, to run a truth experiment using the gpt-4o-mini model, use:

python3 main.py \
    --model gpt-4o-mini \
    --experiment_name "truth-gpt-4o-mini" \
    --online \
    --optimize_for truth \
    --path_question quality

Cluster execution

SLURM job scripts are in scripts/. After updating them with your project path and SLURM configuration, submit jobs with:

sbatch scripts/your-script.sh

Resume an experiment

To resume an interrupted experiment, use resume.py with the corresponding experiment directory:

python resume.py \
    --model Qwen/Qwen2.5-7B-Instruct \
    --experiment_dir "data/persuasion-full-Qwen2.5-7B-q100_22-06-20-39" \
    --max-tokens 32000 \
    --gpu 4 
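
If you are unsure which run to resume, the most recently modified experiment directory can be located with a small helper (an illustrative snippet, not part of the repository):

from pathlib import Path

# experiment directories live under data/ (see the output structure above)
runs = [p for p in Path("data").iterdir() if p.is_dir()]
latest = max(runs, key=lambda p: p.stat().st_mtime)
print(f"Most recent experiment: {latest}")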

Results analysis and visualization

The notebooks/ directory contains Jupyter notebooks for analyzing results:

Performance of the evolution experiments:

  • viz_pers.ipynb - Persuasion optimization results
  • viz_truth.ipynb - Truth optimization results
  • viz_generalization.ipynb - Generalization performance analysis
  • viz_gen_diff.ipynb - Compare truth and persuasion performance

Prompt strategies analysis:

  • prompts_nlp_analysis.ipynb - NLP analysis of strategy prompts
  • prompts_embeddings_evo.ipynb - Evolution of strategy prompt embeddings
  • retrive_debate_history.ipynb - Retrieve specific debate text from an experiment
