Problem with PyTorch knn #165

KatharinaSchmidt · 2020-08-06T07:23:36Z

Hey,

thanks for the available code for DenseFusion!
I want to use it for my own synthetic dataset (created with NDDS), but I've got some problems getting started with DenseFusion.

My System:

Ubuntu 18.02
Cuda 10.2
Python 3.6.9
PyTorch 0.4.1
Torchvision 0.2.2

The error:
When running ./experiments/scripts/eval_linemod.sh I get the following error message:

+ set -e
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ python3 ./tools/eval_linemod.py --dataset_root ./datasets/linemod/Linemod_preprocessed --model trained_checkpoints/linemod/pose_model_9_0.01310166542980859.pth --refine_model trained_checkpoints/linemod/pose_refine_model_493_0.006761023565178073.pth
Traceback (most recent call last):
  File "./tools/eval_linemod.py", line 20, in <module>
    from lib.loss import Loss
  File "/home/katharina/Schreibtisch/DenseFusion/lib/loss.py", line 9, in <module>
    from lib.knn.__init__ import KNearestNeighbor
  File "/home/katharina/Schreibtisch/DenseFusion/lib/knn/__init__.py", line 7, in <module>
    from lib.knn import knn_pytorch as knn_pytorch
  File "/home/katharina/Schreibtisch/DenseFusion/lib/knn/knn_pytorch/__init__.py", line 3, in <module>
    from ._knn_pytorch import lib as _lib, ffi as _ffi
ImportError: /home/katharina/Schreibtisch/DenseFusion/lib/knn/knn_pytorch/_knn_pytorch.so: undefined symbol: state

I don't know where that error comes from. Can you help me with this issue, please?


KatharinaSchmidt · 2020-08-06T10:00:33Z

Solved it by using PyTorch 1.6.0 and the branch for PyTorch 1.0.

drapado · 2020-08-18T10:57:38Z

Hi @KatharinaSchmidt, I'm curious: how did you manage to make it run with PyTorch 1.6.0? Could you give some hints?
I'm running the PyTorch 1.0 branch with PyTorch 1.2.0. However, if I try PyTorch 1.6.0, I get an error similar to yours:
Traceback (most recent call last):
  File "./tools/train.py", line 30, in <module>
    from lib.loss import Loss
  File "/mnt/data/code/Repos/phd4_david/world_modeling/code/utils/densefusion/lib/loss.py", line 9, in <module>
    from lib.knn.__init__ import KNearestNeighbor
  File "/mnt/data/code/Repos/phd4_david/world_modeling/code/utils/densefusion/lib/knn/__init__.py", line 7, in <module>
    from lib.knn import knn_pytorch as knn_pytorch
ImportError: /mnt/data/code/Repos/phd4_david/world_modeling/code/utils/densefusion/lib/knn/knn_pytorch.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c1011CPUTensorIdEv

Thanks! :D

KatharinaSchmidt · 2020-08-19T08:10:37Z

I rebuilt the knn module as described in #33.
Afterwards, some minor fixes were needed (but I don't remember them all).

drapado · 2020-08-19T16:33:56Z

@KatharinaSchmidt could you share the lib/knn folder? When I try to rebuild the knn module with PyTorch 1.6.0 and CUDA 10.2, I get errors regarding THCState_getCurrentStream.
Thanks!! :D

KatharinaSchmidt · 2020-08-19T17:47:31Z

I don't think I ever had an error message containing THCState_getCurrentStream.
Very sorry, but I can't give you any advice on that.
My lib/knn folder contains:

build
dist
knn_pytorch
__pycache__
src
__init__.py
knn_pytorch.so
setup.py

drapado · 2020-08-19T18:42:10Z

@KatharinaSchmidt thanks a lot for your help, I managed to solve it and build the knn lib. However, I'm now facing a problem that I already knew existed with newer versions of PyTorch:
Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)
Did you face this error? How did you solve it?

drapado · 2020-08-19T19:34:01Z

I managed to solve it. Now it runs for me on PyTorch 1.6.0 and CUDA 10.2. I created a pull request in case someone can make use of my changes.
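
For anyone who hits the same deprecation message: the new-style pattern it asks for looks roughly like the sketch below. This is a generic illustration and not the code from the pull request or from DenseFusion's lib/knn. The class name `NearestIndex`, the arguments `ref` and `query`, and the brute-force distance computation are all assumptions made for the example. The relevant parts are that `forward` and `backward` are static methods taking a `ctx` argument, and that the function is called through `.apply(...)` instead of being instantiated.

```python
import torch
from torch.autograd import Function


class NearestIndex(Function):
    """Hypothetical new-style autograd Function (not the DenseFusion KNN)."""

    @staticmethod
    def forward(ctx, ref, query):
        # ref: (N, D) reference points, query: (M, D) query points.
        dist = torch.cdist(query, ref)   # (M, N) pairwise Euclidean distances
        idx = dist.argmin(dim=1)         # index of the closest ref point per query
        # Integer indices carry no gradient; say so explicitly.
        ctx.mark_non_differentiable(idx)
        return idx

    @staticmethod
    def backward(ctx, grad_idx):
        # One return value per forward input; nothing to propagate here.
        return None, None


ref = torch.tensor([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
query = torch.tensor([[0.9, 1.2], [4.0, 4.0]])

# New-style Functions are invoked via .apply, not by creating an instance.
idx = NearestIndex.apply(ref, query)
print(idx.tolist())
```

In the legacy style, `forward` was an instance method and options were typically passed to `__init__`. With the static form, such options would instead be passed as extra arguments to `.apply(...)`.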

Destinycjk · 2022-07-24T14:18:56Z

Hello @drapado! Thank you very much for your contribution! I also face the same problem: /code/DenseFusion-Pytorch/lib/knn/knn_pytorch.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_10E. I followed the steps you described in #170. Do you know how to solve the problem? Thank you!
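
A note on reading these "undefined symbol" messages: the symbols reported in this thread are mangled C++ names, and decoding them shows which library they belong to. The sketch below is a deliberately minimal decoder for the simple `_ZN<len><name>...E` form only (no templates, substitutions, or argument types). The function name `demangle_nested_name` is made up for this example; a real tool such as `c++filt` handles the general case.

```python
def demangle_nested_name(symbol: str) -> str:
    """Decode the nested-name part of a simple Itanium-ABI mangled symbol.

    Only handles `_ZN<len><name>...E`; anything after the closing `E`
    (for example argument types) is ignored.
    """
    prefix = "_ZN"
    if not symbol.startswith(prefix):
        raise ValueError(f"not a simple nested mangled name: {symbol!r}")

    pos = len(prefix)
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        # Each component is a decimal length followed by that many characters.
        start = pos
        while pos < len(symbol) and symbol[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError(f"unsupported mangling construct at offset {pos}")
        length = int(symbol[start:pos])
        parts.append(symbol[pos:pos + length])
        pos += length
    return "::".join(parts)


print(demangle_nested_name("_ZN3c1011CPUTensorIdEv"))
# c10::CPUTensorId
print(demangle_nested_name(
    "_ZN6caffe26detail37_typeMetaDataInstance_preallocated_10E"))
# caffe2::detail::_typeMetaDataInstance_preallocated_10
```

Both names live in the `c10` and `caffe2` namespaces, which are part of PyTorch's own C++ libraries. That would be consistent with the compiled knn extension having been built against a different PyTorch version than the one being imported, although this is an inference from the symbol names and not something confirmed in this thread.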

KatharinaSchmidt closed this as completed Aug 6, 2020
