Hello, I ran into a problem while using this repo: if I don't use distributed training, I have to modify the source code. Simply setting distributed to False on the command line does not fix it. How should I deal with the issues caused by the distributed-training code? Is commenting out the relevant lines the only option?
----log-----
Traceback (most recent call last):
File "F:\Projects\Multi Modal\ALBEF\Pretrain.py", line 203, in
main(args, config)
File "F:\Projects\Multi Modal\ALBEF\Pretrain.py", line 175, in main
dist.barrier()
File "F:\anaconda3\envs\albef\lib\site-packages\torch\distributed\c10d_logger.py", line 72, in wrapper
return func(*args, **kwargs)
File "F:\anaconda3\envs\albef\lib\site-packages\torch\distributed\distributed_c10d.py", line 3428, in barrier
opts.device = _get_pg_default_device(group)
File "F:\anaconda3\envs\albef\lib\site-packages\torch\distributed\distributed_c10d.py", line 644, in _get_pg_default_device
group = group or _get_default_group()
File "F:\anaconda3\envs\albef\lib\site-packages\torch\distributed\distributed_c10d.py", line 977, in _get_default_group
raise ValueError(
ValueError: Default process group has not been initialized, please make sure to call init_process_group.
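For context, the traceback shows that dist.barrier() is reached even though init_process_group() was never called, which is what happens when the script runs in a non-distributed setup. One common workaround is to guard every torch.distributed call so it only executes when a default process group actually exists. Below is a minimal sketch of that guard; the helper name is_dist_avail_and_initialized is illustrative (check whether the repo's utils.py already ships an equivalent), and the guarded barrier would replace the bare dist.barrier() call in Pretrain.py:

```python
import torch.distributed as dist

def is_dist_avail_and_initialized() -> bool:
    # True only when torch.distributed is built into this PyTorch install
    # AND a default process group was created via init_process_group().
    if not dist.is_available():
        return False
    if not dist.is_initialized():
        return False
    return True

# In main(), guard the synchronization point so the script also works
# on a single GPU / CPU without a process group:
if is_dist_avail_and_initialized():
    dist.barrier()
```

Other distributed-only pieces (e.g. wrapping the model in DistributedDataParallel, using DistributedSampler, or calling dist.get_rank()/dist.get_world_size()) would likely need similar guards or single-process fallbacks, depending on how the rest of the script is written.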