Hi, after installing tensorflow-allreduce I tried to run allreduce-test.py. Below are the command and its output:
$srun --ntasks=1 python allreduce-test.py --train-data train.txt --validation-data valid.txt --vocab vocab.txt --vocab-size 5 --batch-size 64 --max-iterations 10
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so.8.0 locally
Traceback (most recent call last):
  File "allreduce-test.py", line 93, in <module>
    MPI_RANK = int(os.environ["PMI_RANK"])
  File "/home/deng/anaconda/lib/python2.7/UserDict.py", line 23, in __getitem__
    raise KeyError(key)
KeyError: 'PMI_RANK'
It seems the PMI_RANK environment variable is not set. How should I solve this? Thanks.
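For anyone hitting the same KeyError, here is a minimal sketch of one possible workaround, not the project's own fix. It assumes the script only needs an integer rank and that the launcher exports it under some other name. PMI_RANK is typically set by PMI-based launchers such as MPICH's mpiexec; OMPI_COMM_WORLD_RANK (Open MPI) and SLURM_PROCID (Slurm) are the fallbacks tried here. The helper name `detect_rank` and the default of 0 are hypothetical, and whether tensorflow-allreduce itself works under those launchers is not confirmed in this thread.

```python
import os

# Candidate rank variables, checked in order. Which one is set depends on
# the launcher; this list is an assumption, not taken from allreduce-test.py.
RANK_VARS = ("PMI_RANK", "OMPI_COMM_WORLD_RANK", "SLURM_PROCID")


def detect_rank(environ=os.environ, default=0):
    """Return the first rank found in `environ`, or `default` if none is set."""
    for name in RANK_VARS:
        value = environ.get(name)
        if value is not None:
            return int(value)
    return default


# Stand-in for the line that raised KeyError in the traceback above.
MPI_RANK = detect_rank()
```

Falling back to 0 only makes sense for a single-process run such as `--ntasks=1`; with several tasks, a missing rank variable should probably remain an error.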
@t-brito Can you share your testbed environment: OS version, Open MPI version (and whether any special build options are required for Open MPI), Python version, TensorFlow version, CUDA version, and cuDNN version?
I tried the code, but it fails with several different errors.