Problem with PyTorch knn #165

Closed
KatharinaSchmidt opened this issue Aug 6, 2020 · 8 comments

@KatharinaSchmidt

Hey,

thanks for making the DenseFusion code available!
I want to use it for my own synthetic dataset (created with NDDS), but I'm having some problems getting started with DenseFusion.

My System:

  • Ubuntu 18.02
  • Cuda 10.2
  • Python 3.6.9
  • PyTorch 0.4.1
  • Torchvision 0.2.2

The error:
When running ./experiments/scripts/eval_linemod.sh I get the following error message:

+ set -e
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ python3 ./tools/eval_linemod.py --dataset_root ./datasets/linemod/Linemod_preprocessed --model trained_checkpoints/linemod/pose_model_9_0.01310166542980859.pth --refine_model trained_checkpoints/linemod/pose_refine_model_493_0.006761023565178073.pth
Traceback (most recent call last):
  File "./tools/eval_linemod.py", line 20, in <module>
    from lib.loss import Loss
  File "/home/katharina/Schreibtisch/DenseFusion/lib/loss.py", line 9, in <module>
    from lib.knn.__init__ import KNearestNeighbor
  File "/home/katharina/Schreibtisch/DenseFusion/lib/knn/__init__.py", line 7, in <module>
    from lib.knn import knn_pytorch as knn_pytorch
  File "/home/katharina/Schreibtisch/DenseFusion/lib/knn/knn_pytorch/__init__.py", line 3, in <module>
    from ._knn_pytorch import lib as _lib, ffi as _ffi
ImportError: /home/katharina/Schreibtisch/DenseFusion/lib/knn/knn_pytorch/_knn_pytorch.so: undefined symbol: state

I don't know where that error comes from. Can you help me with this issue, please?
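
An `undefined symbol` failure at import time generally means the compiled extension was built against a different library version than the one that is now loaded. As a first diagnostic step, the shared object and the missing symbol can be pulled out of the message. The following is a minimal sketch using only the standard library; the helper name `parse_undefined_symbol` is made up for illustration and is not part of DenseFusion or PyTorch.

```python
import re
from typing import NamedTuple, Optional


class MissingSymbol(NamedTuple):
    library: str  # path of the shared object that failed to load
    symbol: str   # symbol the dynamic linker could not resolve


# Matches messages of the form "<path>: undefined symbol: <name>".
_PATTERN = re.compile(r"^(?P<library>.+): undefined symbol: (?P<symbol>\S+)$")


def parse_undefined_symbol(message: str) -> Optional[MissingSymbol]:
    """Return the library path and symbol name, or None if the text does not match."""
    match = _PATTERN.match(message.strip())
    if match is None:
        return None
    return MissingSymbol(match.group("library"), match.group("symbol"))
```

Passing `str(exc)` from a caught `ImportError` to this helper gives the two pieces needed to decide which binary has to be rebuilt.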

@KatharinaSchmidt
Author

Solved it by using PyTorch 1.6.0 together with the repository's PyTorch 1.0 branch.

@drapado

drapado commented Aug 18, 2020

Hi @KatharinaSchmidt, I'm curious: how did you manage to make it run with PyTorch 1.6.0? Could you give some hints?
I'm running the PyTorch 1.0 branch with PyTorch 1.2.0. However, if I try PyTorch 1.6.0, I get an error similar to yours:
Traceback (most recent call last):
  File "./tools/train.py", line 30, in <module>
    from lib.loss import Loss
  File "/mnt/data/code/Repos/phd4_david/world_modeling/code/utils/densefusion/lib/loss.py", line 9, in <module>
    from lib.knn.__init__ import KNearestNeighbor
  File "/mnt/data/code/Repos/phd4_david/world_modeling/code/utils/densefusion/lib/knn/__init__.py", line 7, in <module>
    from lib.knn import knn_pytorch as knn_pytorch
ImportError: /mnt/data/code/Repos/phd4_david/world_modeling/code/utils/densefusion/lib/knn/knn_pytorch.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c1011CPUTensorIdEv
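
The missing symbol in this message is a mangled C++ name. `c++filt` is the usual tool for decoding such names; the sketch below is only a minimal illustration that handles the simplest nested-name form of the Itanium C++ ABI mangling (`_ZN<length><identifier>...E`), with no templates or substitutions. The function name `demangle_nested_name` is hypothetical.

```python
def demangle_nested_name(symbol: str) -> str:
    """Decode a simple nested name of the form _ZN<len><id>...E.

    Anything after the terminating 'E' (for example the parameter
    encoding) is ignored. Unsupported manglings raise ValueError.
    """
    prefix = "_ZN"
    if not symbol.startswith(prefix):
        raise ValueError(f"not a simple nested name: {symbol!r}")
    pos = len(prefix)
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        # Each component is a decimal length followed by that many characters.
        start = pos
        while pos < len(symbol) and symbol[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError(f"unsupported mangling at offset {pos}")
        length = int(symbol[start:pos])
        if pos + length > len(symbol):
            raise ValueError("component length runs past end of symbol")
        parts.append(symbol[pos:pos + length])
        pos += length
    if pos >= len(symbol):
        raise ValueError("missing terminating 'E'")
    return "::".join(parts)
```

Applied to `_ZN3c1011CPUTensorIdEv`, this yields `c10::CPUTensorId`, a name in PyTorch's `c10` namespace. That is consistent with, though not proof of, the extension having been compiled against a different PyTorch version than the one being imported.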

Thanks! :D

@KatharinaSchmidt
Author

I rebuilt the knn module as described in #33.
Afterwards, some minor fixes were needed (but I don't remember them all).

@drapado

drapado commented Aug 19, 2020

@KatharinaSchmidt could you share your lib/knn folder? When I try to rebuild the knn module with PyTorch 1.6.0 and CUDA 10.2, I get errors regarding THCState_getCurrentStream.
Thanks!! :D

@KatharinaSchmidt
Author

I don't think I ever got an error message containing THCState_getCurrentStream.
I'm very sorry, but I can't give you any advice on that.
My lib/knn folder contains:

  • build
  • dist
  • knn_pytorch
  • __pycache__
  • src
  • __init__.py
  • knn_pytorch.so
  • setup.py
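
Since a binary left over from an earlier build is a common cause of undefined-symbol errors, it can help to list the compiled artifacts in the folder before rebuilding. The sketch below is a rough illustration using only the standard library. Treating `build`, `dist`, `__pycache__`, and top-level `*.so` files as build output is an assumption based on the listing above, and `find_build_artifacts` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List

# Directory names taken from the folder listing above; assumed to be
# generated by the build rather than checked into the repository.
BUILD_DIRS = ("build", "dist", "__pycache__")


def find_build_artifacts(root: Path) -> List[Path]:
    """Return build directories and top-level shared objects found under root."""
    found = [root / name for name in BUILD_DIRS if (root / name).is_dir()]
    # Only shared objects directly inside root are reported.
    found.extend(sorted(root.glob("*.so")))
    return found
```

Calling `find_build_artifacts(Path("lib/knn"))` only reports candidates; deleting them is left to the user.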

@drapado

drapado commented Aug 19, 2020

@KatharinaSchmidt thanks a lot for your help. I managed to solve it and build the knn lib. However, I'm now facing a problem that I already knew existed with newer versions of PyTorch:
Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)
Did you face this error? How did you solve it?
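
The message refers to `torch.autograd.Function` subclasses whose `forward` is an instance method and which are called through an instance. The new-style form declares `forward` and `backward` as static methods that take a context object, and is invoked through `.apply`. The following is a minimal sketch of that pattern. `BruteForceKNN` is a hypothetical pure-PyTorch stand-in written for illustration; it is neither the repository's compiled knn module nor the change made in the pull request.

```python
import torch
from torch.autograd import Function


class BruteForceKNN(Function):
    """New-style autograd Function returning nearest-neighbour indices."""

    @staticmethod
    def forward(ctx, ref, query, k):
        # ref: (N, D) reference points, query: (M, D) query points.
        # Squared Euclidean distance between every query and every reference.
        dist = ((query[:, None, :] - ref[None, :, :]) ** 2).sum(dim=2)  # (M, N)
        _, idx = dist.topk(k, dim=1, largest=False)  # (M, k)
        # Indices are integers, so no gradient flows through them.
        ctx.mark_non_differentiable(idx)
        return idx

    @staticmethod
    def backward(ctx, grad_output):
        # One return value per forward input (ref, query, k).
        return None, None, None


ref = torch.tensor([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
query = torch.tensor([[0.9, 0.9], [4.0, 4.0]])
# Called through .apply instead of instantiating the class.
idx = BruteForceKNN.apply(ref, query, 1)
```

The key differences from the legacy style are the `@staticmethod` decorators, the explicit `ctx` argument in place of `self`, and the call through `.apply`.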

@drapado

drapado commented Aug 19, 2020

I managed to solve it. It now runs for me on PyTorch 1.6.0 and CUDA 10.2. I created a pull request in case someone can make use of my changes.

@Destinycjk

Hello @drapado! Thank you very much for your contribution! I also face the same problem:

/code/DenseFusion-Pytorch/lib/knn/knn_pytorch.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_10E

I followed the steps you described in #170. Do you know how to solve the problem? Thank you!
