Closed
Description
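I have already downloaded the dataset from Hugging Face, but it is re-downloaded every time I run finetuning. Below is my command, which already specifies the dataset path: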
NPROC_PER_NODE=8 \
MAX_PIXELS=1003520 \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
swift sft \
    --model cache/hub/Qwen2.5_VL \
    --model_type qwen2_5_vl \
    --train_type full \
    --use_hf True \
    --dataset cache/hub/datasets--lmms-lab--LLaVA-Video-178K \
    --torch_dtype bfloat16 \
    --attn_impl flash_attn \
    --freeze_vit true \
    --freeze_llm true \
    --freeze_aligner false \
    --num_train_epochs 3 \
    --per_device_train_batch_size 2 \
    --learning_rate 5e-6 \
    --gradient_accumulation_steps 8 \
    --eval_steps -1 \
    --save_steps 1000 \
    --save_total_limit 10 \
    --logging_steps 5 \
    --max_length 8192 \
    --output_dir output \
    --warmup_ratio 0.05 \
    --dataloader_num_workers 4 \
    --dataset_num_proc 8 \
    --deepspeed zero2
The dataset is still re-downloaded every time:
- Specifying the local path raises an error:
[rank2]: datasets.data_files.EmptyDatasetError: The directory at /group/30105/weizhaoyang/cache/hub/datasets--lmms-lab--LLaVA-Video-178K doesn't contain any data files
[rank0]:[W606 17:26:52.212998651 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
- Specifying the dataset id instead triggers a re-download.
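One possible reading of the `EmptyDatasetError`, offered as an assumption rather than a confirmed diagnosis: a folder named `datasets--<org>--<name>` is the root of a Hugging Face hub cache entry, and in the standard cache layout the repository files sit under `snapshots/<commit-hash>/` rather than directly in that root. The sketch below uses only the Python standard library to locate that snapshot directory. `resolve_snapshot_dir` is a hypothetical helper, not part of ms-swift or `datasets`, and whether passing the resulting path to `--dataset` resolves this issue has not been verified here.

```python
from pathlib import Path
from typing import Optional


def resolve_snapshot_dir(repo_cache_dir: str, revision: str = "main") -> Optional[Path]:
    """Return the snapshot directory inside a Hugging Face hub cache entry.

    Assumes the standard hub cache layout (an assumption about this setup):
        datasets--<org>--<name>/
            refs/<revision>      text file holding a commit hash
            snapshots/<hash>/    the repository files for that commit
    Returns None if no snapshot directory can be found.
    """
    root = Path(repo_cache_dir)
    snapshots = root / "snapshots"
    if not snapshots.is_dir():
        return None

    # Prefer the commit that refs/<revision> points to, if it exists on disk.
    ref_file = root / "refs" / revision
    if ref_file.is_file():
        candidate = snapshots / ref_file.read_text().strip()
        if candidate.is_dir():
            return candidate

    # Otherwise fall back to the most recently modified snapshot.
    dirs = [p for p in snapshots.iterdir() if p.is_dir()]
    return max(dirs, key=lambda p: p.stat().st_mtime) if dirs else None


if __name__ == "__main__":
    # Path taken from the command in this issue; adjust to your own cache.
    print(resolve_snapshot_dir("cache/hub/datasets--lmms-lab--LLaVA-Video-178K"))
```

If this prints a path, that directory (not its `datasets--...` parent) is where the downloaded data files would be expected. If it prints `None`, the folder does not follow the hub cache layout and the cause lies elsewhere.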