How to improve GPU utilization when fine-tuning the seaco_paraformer model #2232

Open
JVfisher opened this issue Nov 26, 2024 · 4 comments
Labels
question Further information is requested

Comments

@JVfisher

JVfisher commented Nov 26, 2024

Notice: In order to resolve issues more efficiently, please raise issues following the template.

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

When fine-tuning the seaco_paraformer model on a single GPU with 120,000 audio clips, with dataset_conf.num_workers set to 16, CPU utilization sits at 100%, but GPU utilization keeps fluctuating between 5% and 20% while GPU memory usage is at 95%. How can I raise GPU utilization?

Code

torchrun \
  --nnodes 1 \
  --nproc_per_node ${gpu_num} \
  /mnt/data/finetunedir/FunASR/funasr/bin/train_ds.py \
  ++model="${model_name_or_model_dir}" \
  ++train_data_set_list="${train_data}" \
  ++valid_data_set_list="${val_data}" \
  ++dataset="AudioDatasetHotword" \
  ++dataset_conf.index_ds="IndexDSJsonl" \
  ++dataset_conf.data_split_num=1 \
  ++dataset_conf.batch_sampler="BatchSampler" \
  ++dataset_conf.batch_size=7500 \
  ++dataset_conf.max_token_length=2000 \
  ++dataset_conf.batch_type="token" \
  ++dataset_conf.num_workers=16 \
  ++train_conf.max_epoch=60 \
  ++train_conf.log_interval=1 \
  ++train_conf.resume=true \
  ++train_conf.validate_interval=8000 \
  ++train_conf.save_checkpoint_interval=8000 \
  ++train_conf.avg_keep_nbest_models_type='loss' \
  ++train_conf.keep_nbest_models=10 \
  ++optim_conf.lr=0.0002 \
  ++output_dir="${output_dir}" &> ${log_file}

What have you tried?

I have already tried both increasing and decreasing num_workers; it made no difference.
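
One way to confirm where the time goes is to measure how long the trainer waits for each batch versus how long the GPU step takes. A minimal sketch follows; `loader` and `train_step` are hypothetical stand-ins for your actual DataLoader and training step, not FunASR API:

  import time
  import torch

  # Hypothetical names: `loader` is your torch DataLoader, `train_step`
  # runs one forward/backward pass. If fetch time dominates step time,
  # the CPU-side data pipeline (not the GPU) is the bottleneck.
  fetch_t = step_t = 0.0
  it = iter(loader)
  for _ in range(50):
      t0 = time.perf_counter()
      batch = next(it)             # CPU-side: wait for the dataloader
      t1 = time.perf_counter()
      train_step(batch)            # forward + backward + optimizer step
      torch.cuda.synchronize()     # wait for the GPU to finish
      t2 = time.perf_counter()
      fetch_t += t1 - t0
      step_t += t2 - t1
  print(f"avg fetch {fetch_t/50*1000:.1f} ms, avg step {step_t/50*1000:.1f} ms")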

What's your environment?

  • OS (e.g., Linux): ubuntu
  • FunASR Version (e.g., 1.0.0): 1.1.6
  • How you installed funasr (pip, source): source
  • GPU (e.g., V100M32): 4070
  • CUDA/cuDNN version (e.g., cuda11.7): 12.5
JVfisher added the question (Further information is requested) label Nov 26, 2024
@rumi-taorui

When I run Japanese ASR inference in the modelscope framework, GPU utilization is only around 15% and GPU memory usage is modest; for sfmn inference, GPU utilization is only around 3%. Is there a way to push GPU utilization higher so inference runs faster?

@R1ckShi
Collaborator

R1ckShi commented Dec 5, 2024

The difference between training seaco and a plain paraformer is that seaco has to randomly extract hotwords inside the dataloader at training time. That CPU-side step may be the slow part that keeps GPU utilization down.
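
For illustration only (this is not FunASR's actual implementation), the kind of per-sample work involved looks roughly like the sketch below; it runs on CPU, inside the dataloader workers, for every utterance:

  import random

  # Illustrative sketch, NOT FunASR's actual code: SeACo-style training
  # randomly cuts contiguous token spans out of each transcript to use
  # as hotwords. This happens per sample, on CPU, in dataloader workers.
  def sample_hotwords(tokens, max_num=3, min_len=2, max_len=4):
      hotwords = []
      for _ in range(random.randint(0, max_num)):
          if len(tokens) < min_len:
              break
          span_len = random.randint(min_len, min(max_len, len(tokens)))
          start = random.randint(0, len(tokens) - span_len)
          hotwords.append(tokens[start:start + span_len])
      return hotwords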

@JVfisher
Author

JVfisher commented Dec 6, 2024

> The difference between training seaco and a plain paraformer is that seaco has to randomly extract hotwords inside the dataloader at training time. That CPU-side step may be the slow part that keeps GPU utilization down.

Thanks for the reply! That is very likely the cause. I split the seaco training into two stages. In stage one I froze the hotword-related layers ("bias_encoder", "seaco_decoder", "hotword_output_layer"): CPU utilization was 100% and GPU memory usage was above 90%, but GPU utilization stayed stuck at 5-20%. In stage two I froze the ASR part and fine-tuned only the hotword layers ("bias_encoder", "seaco_decoder", "hotword_output_layer"): GPU memory usage was under 40%, but GPU utilization reached 60-70%.

So far I have tried raising the number of workers with no effect. Are there other parameters that could raise GPU utilization during the ASR fine-tuning stage?
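
If the bottleneck really is CPU-side preprocessing, the standard PyTorch dataloader knobs are worth trying. Whether FunASR's dataset_conf forwards all of these to the underlying torch DataLoader is an assumption to verify in train_ds.py; in plain PyTorch they look like:

  from torch.utils.data import DataLoader

  # Generic PyTorch options that often help when CPU loading is slow.
  # Assumption: your FunASR version exposes equivalents via dataset_conf.
  loader = DataLoader(
      dataset,                  # hypothetical dataset instance
      batch_size=32,
      num_workers=16,
      pin_memory=True,          # faster host-to-device transfers
      prefetch_factor=4,        # batches prepared ahead per worker
      persistent_workers=True,  # keep workers alive across epochs
  )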

@R1ckShi
Collaborator

R1ckShi commented Dec 6, 2024

You probably still need to check what the utilization is when fine-tuning a plain paraformer, to work out whether the hotword dataloader is the problem. We honestly have not paid attention to GPU utilization before.
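
To make that comparison concrete, one option is to log utilization once per second during each run, e.g. by polling nvidia-smi. A minimal sketch; the log file name is arbitrary:

  import subprocess, time

  # Poll nvidia-smi once per second and append utilization samples to a
  # log, so a seaco run can be compared against a plain paraformer run.
  with open("gpu_util.log", "w") as f:
      while True:
          f.write(subprocess.check_output(
              ["nvidia-smi",
               "--query-gpu=timestamp,utilization.gpu,memory.used",
               "--format=csv,noheader"],
          ).decode())
          f.flush()
          time.sleep(1)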
