Cuda error when the depth value is large. #47

Open
jarvishou829 opened this issue Oct 31, 2023 · 0 comments
jarvishou829 commented Oct 31, 2023

I recorded my own data sequence and ran the code. After processing about 800 frames, the following error appeared. It seems that the size of map_states["voxel_vertex_idx"] and map_states["voxel_center_xyz"] exceeds num_embeddings, which is set to 20000 in the config file. When I set num_embeddings to 40000, the error appeared again after 1400+ frames. How can I solve this correctly? I found that when the depth values are large, the size of map_states["voxel_vertex_idx"] and map_states["voxel_center_xyz"] tends to grow large. When I scale the depth values to 0.5 of their original values, the error no longer appears, but the rendered result is not good.

../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [164,0,0], thread: [123,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [164,0,0], thread: [124,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [164,0,0], thread: [125,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [164,0,0], thread: [126,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [164,0,0], thread: [127,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Process Process-2:
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/voxfusion/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/user/miniconda3/envs/voxfusion/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/user/nerf_ws/ori/voxfusion/src/mapping.py", line 128, in spin
    self.do_mapping(share_data, tracked_frame, writer=writer)
  File "/home/user/nerf_ws/ori/voxfusion/src/mapping.py", line 182, in do_mapping
    bundle_adjust_frames(
  File "/home/user/nerf_ws/ori/voxfusion/src/utils/renderer.py", line 496, in bundle_adjust_frames
    final_outputs = render_rays(
  File "/home/user/nerf_ws/ori/voxfusion/src/utils/renderer.py", line 288, in render_rays
    chunk_inputs = get_features(chunk_samples, map_states, voxel_size)
  File "/home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/nerf_ws/ori/voxfusion/src/utils/renderer.py", line 96, in get_features
    point_feats = F.embedding(F.embedding(
  File "/home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/nn/functional.py", line 2199, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered
terminate called after throwing an instance of 'c10::CUDAError'
  what():  CUDA error: device-side assert triggered
Exception raised from operator() at ../c10/cuda/CUDACachingAllocator.cpp:1808 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7f29bcada20e in /home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x2759b (0x7f29bcb5559b in /home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)
frame #2: <unknown function> + 0x27621 (0x7f29bcb55621 in /home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)
frame #3: <unknown function> + 0x608180 (0x7f29afd28180 in /home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x4669f8 (0x7f29afb869f8 in /home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #5: c10::TensorImpl::release_resources() + 0x175 (0x7f29bcac17a5 in /home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #6: <unknown function> + 0x3628c5 (0x7f29afa828c5 in /home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #7: <unknown function> + 0x67ca08 (0x7f29afd9ca08 in /home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: THPVariable_subclass_dealloc(_object*) + 0x2d5 (0x7f29afd9cdd5 in /home/user/miniconda3/envs/voxfusion/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #9: <unknown function> + 0x114b78 (0x55f151578b78 in /home/user/miniconda3/envs/voxfusion/bin/python)
frame #10: <unknown function> + 0x13b248 (0x55f15159f248 in /home/user/miniconda3/envs/voxfusion/bin/python)
frame #11: <unknown function> + 0x121e38 (0x55f151585e38 in /home/user/miniconda3/envs/voxfusion/bin/python)
frame #12: <unknown function> + 0x1330d8 (0x55f1515970d8 in /home/user/miniconda3/envs/voxfusion/bin/python)
frame #13: <unknown function> + 0x1330c1 (0x55f1515970c1 in /home/user/miniconda3/envs/voxfusion/bin/python)
frame #14: <unknown function> + 0x1330c1 (0x55f1515970c1 in /home/user/miniconda3/envs/voxfusion/bin/python)
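
One possible reading of the depth dependence described above: at a fixed voxel size, a surface seen at a larger depth covers more area, so more voxels are needed to represent it. The sketch below is only an illustration under simplifying assumptions. It back-projects a flat wall through a hypothetical pinhole camera and counts the distinct voxels it touches. The function name `count_occupied_voxels`, the camera parameters, and the voxel size are made up for this example and are not taken from the Vox-Fusion code or config.

```python
import math


def count_occupied_voxels(depth_m, voxel_size, width=64, height=48, fov_deg=60.0):
    """Count distinct voxels hit by a flat wall at constant depth.

    Hypothetical pinhole camera model, used only to illustrate how the
    voxel count scales with depth. This is not Vox-Fusion's allocation code.
    """
    # Focal length in pixels, derived from the horizontal field of view.
    fx = (width / 2) / math.tan(math.radians(fov_deg) / 2)
    cx, cy = width / 2, height / 2

    voxels = set()
    for v in range(height):
        for u in range(width):
            # Back-project pixel (u, v) to a 3D point at the given depth.
            x = (u - cx) / fx * depth_m
            y = (v - cy) / fx * depth_m
            z = depth_m
            # Quantize the point to integer voxel coordinates.
            voxels.add((
                math.floor(x / voxel_size),
                math.floor(y / voxel_size),
                math.floor(z / voxel_size),
            ))
    return len(voxels)


if __name__ == "__main__":
    full = count_occupied_voxels(depth_m=4.0, voxel_size=0.2)
    half = count_occupied_voxels(depth_m=2.0, voxel_size=0.2)
    print("voxels at original depth:", full)
    print("voxels at 0.5 x depth:  ", half)
```

In this toy setup, halving the depth cuts the voxel count to somewhere between a third and a quarter, which would be consistent with the error disappearing after scaling the depth by 0.5 and with a larger num_embeddings only postponing it. Whether the real sequence behaves the same way depends on the scene and on how the map allocates voxels, so this is a rough estimate rather than a diagnosis.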