This repository provides the official PyTorch implementation of Bi-IRRA.
- Release of code and multilingual annotation JSON files
- Release of pretraining checkpoints
All experiments are conducted on 4 Nvidia A40 (48GB) GPUs with CUDA 11.7.
Install the required packages using:

```bash
pip install -r requirements.txt
```
## Datasets

### Annotations
- Download the multilingual annotation JSON files from here
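Once downloaded, the annotation files can be given a quick sanity check. A minimal sketch, assuming the top level of each file is a JSON list of per-caption records; the `lang` field used here is illustrative, since the exact schema of the Bi-IRRA annotation files is not documented in this README:

```python
import json
from collections import Counter
from pathlib import Path

def summarize_annotations(anno_path):
    """Load an annotation JSON file and count records per language.

    Assumes a JSON list of record dicts; the `lang` key is a guess at
    the schema and may need to be adapted to the actual files.
    """
    records = json.loads(Path(anno_path).read_text(encoding="utf-8"))
    langs = Counter(r.get("lang", "unknown") for r in records)
    return len(records), dict(langs)
```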
## Pretrained Models
- Download the xlm-roberta-base checkpoint from here
- Download the CCLM-X2VLM checkpoint from here
- We also provide a checkpoint pretrained on the LUPerson dataset, which can replace the CCLM-X2VLM checkpoint for better retrieval performance.
Edit `config/config.yaml` (for CUHK-PEDES, ICFG-PEDES, and RSTPReid) or `config/config_UFine.yaml` (for UFineBench) to set the paths for annotation files, image directories, the tokenizer, and checkpoints.
A recommended directory structure:

```
|-- CUHK-PEDES
    |-- imgs
    |-- annos
        |-- train_reid_en.json
        |-- train_reid_ch.json
        |-- test_reid_en.json
        |-- test_reid_ch.json
```
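A short script can verify that a dataset root matches the tree above before training. This is a hypothetical helper, not part of this repository; the file names follow the recommended layout:

```python
from pathlib import Path

# Expected annotation files, mirroring the recommended tree above.
ANNO_FILES = [
    "train_reid_en.json",
    "train_reid_ch.json",
    "test_reid_en.json",
    "test_reid_ch.json",
]

def check_dataset_layout(root):
    """Return the list of expected paths missing under a CUHK-PEDES root."""
    root = Path(root)
    missing = []
    if not (root / "imgs").is_dir():
        missing.append(root / "imgs")
    for name in ANNO_FILES:
        path = root / "annos" / name
        if not path.is_file():
            missing.append(path)
    return missing
```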
Example configuration in `config.yaml`:

```yaml
anno_dir: /path/to/CUHK-PEDES/annos
image_dir: /path/to/CUHK-PEDES/imgs
text_encoder: /path/to/xlm-roberta-base
resume: /path/to/CCLM-X2VLM
```
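Mistyped paths in this config are a common cause of failed launches, so checking that each configured path exists before training can save a wasted run. A minimal sketch, assuming the relevant entries are flat `key: value` lines as in the example above (parsed here with a naive split rather than a YAML library; the helper names are hypothetical):

```python
from pathlib import Path

# Keys from the example config whose values should be existing paths.
PATH_KEYS = {"anno_dir", "image_dir", "text_encoder", "resume"}

def read_flat_yaml(text):
    """Naively parse flat `key: value` lines (no nesting, no YAML types)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config

def missing_paths(config):
    """Return the configured path values that do not exist on disk."""
    return [v for k, v in config.items()
            if k in PATH_KEYS and not Path(v).exists()]
```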
Start training with:

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 \
torchrun --rdzv_id=3 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 --nnodes=1 --nproc_per_node=4 \
main.py
```

or simply run:

```bash
bash shell/train.sh
```
This repository is partially based on TBPS-CLIP and X2-VLM. We thank the authors for their contributions.
This project is licensed under the MIT License.