Notice: In order to resolve issues more efficiently, please raise issues following the template.
When fine-tuning the seaco_paraformer model on a single GPU with 120k audio clips, with dataset_conf.num_workers set to 16, CPU utilization is at 100%, but GPU utilization keeps fluctuating between 5% and 20% while GPU memory usage is at 95%. How can I increase GPU utilization?
torchrun --nnodes 1 --nproc_per_node ${gpu_num} \
  /mnt/data/finetunedir/FunASR/funasr/bin/train_ds.py \
  ++model="${model_name_or_model_dir}" \
  ++train_data_set_list="${train_data}" \
  ++valid_data_set_list="${val_data}" \
  ++dataset="AudioDatasetHotword" \
  ++dataset_conf.index_ds="IndexDSJsonl" \
  ++dataset_conf.data_split_num=1 \
  ++dataset_conf.batch_sampler="BatchSampler" \
  ++dataset_conf.batch_size=7500 \
  ++dataset_conf.max_token_length=2000 \
  ++dataset_conf.batch_type="token" \
  ++dataset_conf.num_workers=16 \
  ++train_conf.max_epoch=60 \
  ++train_conf.log_interval=1 \
  ++train_conf.resume=true \
  ++train_conf.validate_interval=8000 \
  ++train_conf.save_checkpoint_interval=8000 \
  ++train_conf.avg_keep_nbest_models_type='loss' \
  ++train_conf.keep_nbest_models=10 \
  ++optim_conf.lr=0.0002 \
  ++output_dir="${output_dir}" &> ${log_file}
I have already tried both increasing and decreasing num_workers, with no effect.
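One generic way to confirm whether the dataloader is the bottleneck (this is a stand-alone diagnostic sketch, not FunASR code; `loader` and `step_fn` stand in for your real batch iterator and training step) is to time how long the loop waits for each batch versus how long the compute step takes:

```python
import time

def measure_loader_overlap(loader, step_fn, max_batches=50):
    """Return the fraction of wall time spent waiting for data.

    `loader` is any iterable of batches and `step_fn(batch)` runs one
    training step; both are placeholders for the real objects.
    """
    wait, compute = 0.0, 0.0
    it = iter(loader)
    for _ in range(max_batches):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent blocked on the dataloader
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)        # time spent in the training step
        t2 = time.perf_counter()
        wait += t1 - t0
        compute += t2 - t1
    total = wait + compute
    return wait / total if total else 0.0

# Toy demo: a "slow" loader (10 ms per batch) with a near-instant step.
# A high wait fraction is the signature of a CPU-bound dataloader.
slow_loader = (time.sleep(0.01) or i for i in range(10))
frac = measure_loader_overlap(slow_loader, step_fn=lambda b: None, max_batches=10)
print(f"time spent waiting for data: {frac:.0%}")
```

If the wait fraction is high while GPU utilization is low, the GPU is starved by data preparation rather than limited by model size.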
Environment: FunASR installed via pip.
When running Japanese ASR inference in the ModelScope framework, GPU utilization is only around 15% and GPU memory usage is low; during sfmn inference GPU utilization is only around 3%. Can GPU utilization be raised any further, to speed up inference?
The difference between training seaco and a regular paraformer is that seaco needs to randomly crop hotwords inside the dataloader during training. That CPU-side step may be slow and keep GPU utilization from going up.
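To illustrate why per-batch hotword cropping can be CPU-heavy, here is a hypothetical sketch (I have not looked at FunASR's actual AudioDatasetHotword implementation; the function names and span logic are made up for illustration). A naive version re-enumerates every candidate span on each call, while precomputing the spans once per utterance makes each sample O(1):

```python
import random

def crop_hotwords_naive(tokens, min_len=2, max_len=4):
    """Per call: enumerate every candidate span, then pick one (O(n) work each time)."""
    spans = [tokens[i:i + n]
             for n in range(min_len, max_len + 1)
             for i in range(len(tokens) - n + 1)]
    return random.choice(spans) if spans else tokens

def make_hotword_sampler(tokens, min_len=2, max_len=4):
    """Precompute (start, length) pairs once; each subsequent sample is O(1)."""
    spans = [(i, n)
             for n in range(min_len, max_len + 1)
             for i in range(len(tokens) - n + 1)]
    def sample():
        if not spans:
            return tokens
        i, n = random.choice(spans)
        return tokens[i:i + n]
    return sample
```

Whether this specific optimization applies depends on how the real dataloader does the cropping; the point is only that work repeated per batch inside a worker is a common cause of CPU-bound loading.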
Thanks for the reply! That is very likely the cause. I train seaco in two stages. In stage one I freeze the hotword-related layers ("bias_encoder", "seaco_decoder", "hotword_output_layer"): CPU utilization is 100% and GPU memory usage is high (above 90%), but GPU utilization stays at 5-20% and never rises. In stage two I freeze the ASR part and fine-tune only the hotword layers ("bias_encoder", "seaco_decoder", "hotword_output_layer"): GPU memory usage is under 40%, but GPU utilization reaches 60-70%.
So far I have tried raising num_workers, which made no difference. Are there other parameters that could raise GPU utilization during the ASR fine-tuning stage?
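For what it's worth, num_workers is only one of several knobs: the underlying idea is to overlap batch preparation with the GPU step, which PyTorch's DataLoader also exposes via `prefetch_factor`, `persistent_workers`, and `pin_memory` (whether FunASR's dataset_conf forwards these is something to verify against your FunASR version). The principle can be sketched with just the standard library, as a wrapper that produces batches in a background thread while the consumer works on the current one:

```python
import queue
import threading

def prefetch(iterable, buffer_size=4):
    """Yield items from `iterable`, producing up to `buffer_size` of them
    ahead of the consumer in a background thread. This is the overlap that
    DataLoader workers / prefetch_factor provide, in miniature."""
    q = queue.Queue(maxsize=buffer_size)
    _END = object()  # sentinel marking the end of the stream

    def producer():
        for item in iterable:
            q.put(item)   # blocks when the buffer is full
        q.put(_END)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = q.get()
        if item is _END:
            return
        yield item
```

Note that more overlap only helps if the producer side has spare CPU; at 100% CPU utilization, as described above, reducing per-sample CPU work (or adding cores) matters more than adding buffering.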
It may be necessary to check the GPU utilization when fine-tuning a regular paraformer, to work out whether the hotword dataloader is the problem. We have not paid attention to GPU utilization before.