Bi-IRRA: Multilingual Text-to-Image Person Retrieval via Bidirectional Relation Reasoning and Alignment

This repository provides the official PyTorch implementation of Bi-IRRA.

  • Release of code and multilingual annotation JSON files
  • Release of pretraining checkpoints

Environment

All experiments are conducted on 4 NVIDIA A40 (48GB) GPUs with CUDA 11.7.

Install the required packages using:

pip install -r requirements.txt
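
To confirm the installation matches the environment above, a quick sanity check such as the following can be run. It is not part of this repository and uses only standard PyTorch calls to print the detected CUDA setup and visible GPUs:

# Hypothetical environment check (not part of the repository).
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available: ", torch.cuda.is_available())
print("CUDA version:   ", torch.version.cuda)
print("Visible GPUs:   ", torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    print(f"  GPU {i}: {torch.cuda.get_device_name(i)}")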

Download

  1. Datasets

    • Download CUHK-PEDES from here
    • Download ICFG-PEDES from here
    • Download RSTPReid from here
    • Download UFineBench from here
  2. Annotations

    • Download the multilingual annotation JSON files from here (a quick format check of these files is sketched after this list)
  3. Pretrained Models

    • Download the xlm-roberta-base checkpoint from here
    • Download the CCLM-X2VLM checkpoint from here
    • We also provide the checkpoint pretrained on the LUPerson dataset, which can be used as a replacement for the CCLM-X2VLM checkpoint to achieve better retrieval performance.
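
After downloading the annotation files from step 2, their structure can be inspected with a short script like the one below. This is a minimal sketch, not repository code; it only assumes each JSON file is a list or dict of entries, and the file name and path follow the layout shown in the Configuration section:

# Hypothetical sanity check for a downloaded annotation JSON file.
import json

path = "/path/to/CUHK-PEDES/annos/train_reid_en.json"
with open(path, "r", encoding="utf-8") as f:
    data = json.load(f)

if isinstance(data, list):
    print(len(data), "annotation entries")
    print("Keys of the first entry:", sorted(data[0].keys()))
else:
    print("Top-level keys:", sorted(data.keys()))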

Configuration

Edit config/config.yaml (for CUHK-PEDES, ICFG-PEDES, RSTPReid) or config/config_UFine.yaml (for UFineBench) to set the paths for annotation files, image directories, tokenizer, and checkpoints.

A recommended directory structure:

|- CUHK-PEDES
   |- imgs
   |- annos
      |- train_reid_en.json
      |- train_reid_ch.json
      |- test_reid_en.json
      |- test_reid_ch.json

Example configuration in config.yaml:

anno_dir: /path/to/CUHK-PEDES/annos
image_dir: /path/to/CUHK-PEDES/imgs
text_encoder: /path/to/xlm-roberta-base
resume: /path/to/CCLM-X2VLM
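
Before training, it can help to verify that every configured path resolves. The sketch below is not part of the repository; it assumes PyYAML is installed and that the four keys shown above sit at the top level of config/config.yaml:

# Hypothetical helper: load config.yaml and check that each path exists.
import os
import yaml

with open("config/config.yaml", "r") as f:
    cfg = yaml.safe_load(f)

for key in ("anno_dir", "image_dir", "text_encoder", "resume"):
    path = cfg.get(key)
    status = "ok" if path and os.path.exists(path) else "MISSING"
    print(f"{key:12s} {status}: {path}")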

Training

Start training with:

CUDA_VISIBLE_DEVICES=0,1,2,3 \
torchrun --rdzv_id=3 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 --nnodes=1 --nproc_per_node=4 \
main.py

or simply run:

bash shell/train.sh

Acknowledgements

This repository is partially based on TBPS-CLIP and X2-VLM. We thank the authors for their contributions.


License

This project is licensed under the MIT License.
