feat(localization): add nerf_based_localizer #5312

Contributor

@SakodaShintaro SakodaShintaro commented Oct 15, 2023

Description

Add the NeRF Based Localizer package.

nerf_result_movie.mp4

See README for details

https://github.com/SakodaShintaro/autoware.universe/tree/feat/add_nerf_based_localizer/localization/nerf_based_localizer

Tests performed

It has been confirmed that logging_simulator works with the trained data and rosbag shown in README.md.

A quantitative comparison on the rosbag for ar_tag_based_localizer is shown in the table below.

| Method | Mean Error [m] | Notes |
| --- | --- | --- |
| NeRF Based Localizer | 0.123 | Initial pose is provided by GT |
| AR Tag Based Localizer | 0.624 | Initial pose is provided by GT |
| YabLoc | 0.535 | - |
| NDT | 0.055 | - |

Effects on system behavior

It does not affect Autoware unless libtorch is installed, because the package is not built when libtorch is not found.

Signed-off-by: Shintaro Sakoda <[email protected]>
@github-actions github-actions bot added type:documentation Creating or refining documentation. (auto-assigned) component:localization Vehicle's position determination in its environment. (auto-assigned) component:launch Launch files, scripts and initialization tools. (auto-assigned) labels Oct 15, 2023
Contributor

@kminoda kminoda left a comment

Awesome! 🎉

@SakodaShintaro
Contributor Author

This pull request would create a new dependency on libtorch, and it is difficult to decide how to handle that. I think it could be rejected and closed. Even so, it is valuable that it was submitted as a pull request and discussed at least once.

I have not yet fixed the spell check, linter, etc., but if there is a chance that it can be merged, I will do so.

@kminoda
Contributor

kminoda commented Oct 16, 2023

@SakodaShintaro I think that's a reasonable concern. IMHO it seems OK, since the package won't be built by CMake as long as libtorch is not found. How about creating a Discussion or asking the perception team internally?

@KYabuuchi
Contributor

In my environment, the build failed due to atomicAdd().
I added the following code into hash_3d_anchored.cu based on this link, and it successfully built.

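#include <cuda_fp16.h>

// Software fallback for atomicAdd() on a 16-bit __half value.
// It runs a compare-and-swap loop on the aligned 32-bit word that contains the target.
__device__ float atomicAdd(__half * a, float b)
{
  // True if the target is the upper 16 bits of the aligned 32-bit word, false if the lower 16 bits.
  const bool uplo = (reinterpret_cast<unsigned long long>(a) & 2) != 0;
  // Address of the 32-bit aligned word that contains the target.
  unsigned * addr =
    reinterpret_cast<unsigned *>(reinterpret_cast<unsigned long long>(a) & 0xFFFFFFFFFFFFFFFCULL);
  unsigned old = *addr;
  unsigned val;
  do {
    val = old;
    const unsigned short cur_s = static_cast<unsigned short>(uplo ? (val >> 16) : val);
    const float newval = __half2float(__ushort_as_half(cur_s)) + b;
    const unsigned short newval_s = __half_as_ushort(__float2half(newval));
    // Keep the other half of the word and replace only the target 16 bits.
    unsigned newval_u = val & (uplo ? 0x0000FFFFU : 0xFFFF0000U);
    newval_u |= uplo ? (static_cast<unsigned>(newval_s) << 16) : newval_s;
    old = atomicCAS(addr, old, newval_u);
  } while (old != val);
  // Return the value that was stored before the update.
  return __half2float(__ushort_as_half(static_cast<unsigned short>(uplo ? (old >> 16) : old)));
}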
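
The core of the workaround above is updating one 16-bit lane of a 32-bit word with only a 32-bit compare-and-swap. A minimal host-side sketch of that pattern follows. It is an illustration, not the code from this pull request: it uses `std::atomic` instead of `atomicCAS`, adds plain 16-bit integers instead of half-precision floats, and the function name `atomic_add_u16_lane` is hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Adds `delta` to one 16-bit lane of a 32-bit word using only a 32-bit
// compare-and-swap. `upper` selects the high (true) or low (false) 16 bits.
// Returns the lane's value before the update.
// Assumption: integer addition stands in for the half-precision add.
std::uint16_t atomic_add_u16_lane(
  std::atomic<std::uint32_t> & word, bool upper, std::uint16_t delta)
{
  std::uint32_t expected = word.load();
  for (;;) {
    const std::uint16_t lane = upper ? static_cast<std::uint16_t>(expected >> 16)
                                     : static_cast<std::uint16_t>(expected & 0xFFFFu);
    const std::uint16_t updated = static_cast<std::uint16_t>(lane + delta);
    // Keep the other lane untouched and replace only the selected one.
    const std::uint32_t desired =
      upper ? ((expected & 0x0000FFFFu) | (static_cast<std::uint32_t>(updated) << 16))
            : ((expected & 0xFFFF0000u) | updated);
    // On failure, compare_exchange_weak reloads `expected` with the current value,
    // so the next iteration recomputes the update from fresh data.
    if (word.compare_exchange_weak(expected, desired)) {
      return lane;
    }
  }
}
```

The retry loop matters because another thread may change the neighbouring lane between the read and the swap. The swap then fails and the update is recomputed instead of overwriting the neighbour's change.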

@KYabuuchi
Contributor

KYabuuchi commented Oct 16, 2023

In my GPU, I could not run it at 1x speed with the default settings, but it surely performs localization! 👏

@SakodaShintaro
Contributor Author

The problem about atomicAdd may be described in the following page.

The 32-bit __half2 floating-point version of atomicAdd() is only supported by devices of compute capability 6.x and higher.

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#atomicadd

It may be possible to speed up the process by changing the following constant from 1024 to 512, but I will try this later as I suspect there are other variables that will need to be changed if this is changed.

@KYabuuchi
Contributor

> The problem about atomicAdd may be described in the following page.
>
> The 32-bit __half2 floating-point version of atomicAdd() is only supported by devices of compute capability 6.x and higher.
> https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#atomicadd

My GPU is a GeForce GTX 1080 Ti with compute capability 6.1, and it seems that atomicAdd() for __half is not available on it.
Now I understand. Thanks for sharing the information. 👍
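
The quoted sentence covers the `__half2` overload. The same section of the CUDA C++ Programming Guide states that the 16-bit `__half` version of `atomicAdd()` requires compute capability 7.x or higher, which is consistent with the build failure on a 6.1 device. A small sketch of that rule follows; the enum and function names are hypothetical helpers, not part of CUDA or of this pull request.

```cpp
// Which built-in atomicAdd() overload is being asked about.
enum class HalfAtomic { kHalf2, kHalf };

// Minimum major compute capability per the CUDA C++ Programming Guide:
// __half2 needs 6.x or higher, __half needs 7.x or higher.
constexpr int min_major_for(HalfAtomic kind)
{
  return kind == HalfAtomic::kHalf2 ? 6 : 7;
}

// True if a device with the given major compute capability has the native overload.
constexpr bool has_native_atomic_add(HalfAtomic kind, int major)
{
  return major >= min_major_for(kind);
}
```

In device code, the same decision is usually made at compile time, for example by guarding a fallback with a check on `__CUDA_ARCH__`.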

> It may be possible to speed up the process by changing the following constant from 1024 to 512, but I will try this later as I suspect there are other variables that will need to be changed if this is changed.

This is the result in my environment.

| MAX_SAMPLE_PER_RAY | Localize Time [ms] |
| --- | --- |
| 1024 | 200 |
| 512 | 140 |
| 256 | 100 |

Exposing MAX_SAMPLE_PER_RAY as a parameter in a YAML file would be convenient for users. 🙏

@SakodaShintaro
Contributor Author

Fixed: sample_num_per_ray can now be set from the YAML file.

| sample_num_per_ray | Average Process Time [ms] | Average Score (the higher the better) |
| --- | --- | --- |
| 1024 (default) | 96.9 | 152.4 |
| 512 | 59.8 | 147.2 |
| 256 | 36.2 | 131.8 |

Accuracy is somewhat reduced, but speed is improved.
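
To put numbers on that trade-off, the sketch below computes the speedup and the score ratio relative to the default setting, using only the three measurements reported above. The struct and function names are hypothetical.

```cpp
#include <array>

// One row of the reported measurements.
struct Measurement
{
  int sample_num_per_ray;
  double time_ms;  // average process time
  double score;    // average score, higher is better
};

// Values copied from the table above; index 0 is the default setting.
constexpr std::array<Measurement, 3> kMeasurements{
  {{1024, 96.9, 152.4}, {512, 59.8, 147.2}, {256, 36.2, 131.8}}};

// How many times faster a setting is than the default.
double speedup_vs_default(const Measurement & m)
{
  return kMeasurements[0].time_ms / m.time_ms;
}

// Fraction of the default score that a setting retains.
double score_ratio_vs_default(const Measurement & m)
{
  return m.score / kMeasurements[0].score;
}
```

With these figures, 512 samples is about 1.6 times faster while retaining about 97% of the default score, and 256 samples is about 2.7 times faster while retaining about 86%.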

@Gatsby23

Wonderful work! Is there a reference paper for this NeRF-based localization method?

@SakodaShintaro
Contributor Author

@Gatsby23
Thank you for your interest!

NeRF Part

The code added in this pull request is primarily based on the implementation of F2-NeRF. You can find more information here: https://totoro97.github.io/projects/f2-nerf/

F2-NeRF can be summarized as Instant-NGP enhanced with spatial partitioning via Octree and Warping.
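
For readers unfamiliar with Instant-NGP, its feature lookup rests on a spatial hash of integer grid coordinates. The sketch below shows the hash as described in the Instant-NGP paper. Whether the code in this pull request uses exactly this function is an assumption that the thread does not confirm, and F2-NeRF uses its own variant.

```cpp
#include <cstdint>

// Spatial hash from the Instant-NGP paper: XOR of the integer grid coordinates,
// each multiplied by a fixed prime (the first prime is 1), modulo the table size.
// Multiplications wrap around in 32-bit unsigned arithmetic.
std::uint32_t spatial_hash(
  std::uint32_t x, std::uint32_t y, std::uint32_t z, std::uint32_t table_size)
{
  constexpr std::uint32_t kPrimeY = 2654435761u;
  constexpr std::uint32_t kPrimeZ = 805459861u;
  return (x ^ (y * kPrimeY) ^ (z * kPrimeZ)) % table_size;
}
```

Each grid vertex maps to a slot in a fixed-size feature table, so memory stays bounded regardless of scene size, at the cost of hash collisions that training has to absorb.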

However, this pull request omits the Octree-based spatial partitioning and Warping, a key innovation of F2-NeRF, due to the following two drawbacks:

  • The initialization process requires about a minute, leading to a time-consuming startup each time.
  • It was challenging to backpropagate gradients to the camera pose for pose optimization.

Therefore, the process implemented here is almost equivalent to that of Instant-NGP.

NeRF-based Localization Part

This pull request supports two methods of using NeRF for self-localization:

  • Scattering self-pose in a Monte Carlo fashion and scoring them by comparing the image reconstructed by NeRF with the actual image.
  • Calculating self-pose through differentiation when reconstructing with NeRF.
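
The first method can be sketched as follows. This is a toy illustration under stated assumptions, not the package's implementation: the pose is reduced to 2D, candidates are drawn with Gaussian noise, and `toy_score` is a hypothetical stand-in for comparing the NeRF-rendered image with the camera image.

```cpp
#include <functional>
#include <random>

struct Pose2D
{
  double x;
  double y;
  double yaw;
};

// Hypothetical stand-in for the image-comparison score: the closer a pose is
// to a fixed "true" pose, the higher (less negative) the score.
double toy_score(const Pose2D & p)
{
  const Pose2D truth{1.0, 2.0, 0.1};
  const double dx = p.x - truth.x;
  const double dy = p.y - truth.y;
  const double dyaw = p.yaw - truth.yaw;
  return -(dx * dx + dy * dy + dyaw * dyaw);
}

// Scatters `num_particles` candidates around `initial`, scores each one with
// `score_fn` (higher is better), and returns the best pose seen, including `initial`.
Pose2D monte_carlo_localize(
  const Pose2D & initial, int num_particles, double pos_sigma, double yaw_sigma,
  const std::function<double(const Pose2D &)> & score_fn, unsigned seed = 0)
{
  std::mt19937 rng(seed);
  std::normal_distribution<double> pos_noise(0.0, pos_sigma);
  std::normal_distribution<double> yaw_noise(0.0, yaw_sigma);

  Pose2D best = initial;
  double best_score = score_fn(initial);
  for (int i = 0; i < num_particles; ++i) {
    const Pose2D candidate{
      initial.x + pos_noise(rng), initial.y + pos_noise(rng), initial.yaw + yaw_noise(rng)};
    const double score = score_fn(candidate);
    if (score > best_score) {
      best_score = score;
      best = candidate;
    }
  }
  return best;
}
```

In a real system each score evaluation requires rendering an image, so the particle count trades accuracy directly against processing time.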

Neither is a complete reproduction of its reference paper; rather, each borrows the paper's ideas.
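
The second method, computing the pose through differentiation, amounts to gradient descent on a loss that depends on the pose. The sketch below illustrates that idea with a central finite-difference gradient and a hypothetical `toy_loss`. A real implementation would obtain gradients by backpropagating a photometric loss through the renderer, which this example does not attempt.

```cpp
#include <array>
#include <cstddef>
#include <functional>

using Pose3 = std::array<double, 3>;  // x, y, yaw

// Hypothetical loss: squared distance to a fixed target pose, standing in for
// the difference between the rendered image and the camera image.
double toy_loss(const Pose3 & p)
{
  const Pose3 target{1.0, 2.0, 0.1};
  double sum = 0.0;
  for (std::size_t i = 0; i < p.size(); ++i) {
    sum += (p[i] - target[i]) * (p[i] - target[i]);
  }
  return sum;
}

// Refines `pose` by gradient descent, estimating the gradient of `loss_fn`
// with central finite differences of step `eps`.
Pose3 refine_pose(
  Pose3 pose, const std::function<double(const Pose3 &)> & loss_fn, int iterations,
  double learning_rate, double eps = 1e-6)
{
  for (int it = 0; it < iterations; ++it) {
    Pose3 grad{};
    for (std::size_t i = 0; i < pose.size(); ++i) {
      Pose3 plus = pose;
      Pose3 minus = pose;
      plus[i] += eps;
      minus[i] -= eps;
      grad[i] = (loss_fn(plus) - loss_fn(minus)) / (2.0 * eps);
    }
    for (std::size_t i = 0; i < pose.size(); ++i) {
      pose[i] -= learning_rate * grad[i];
    }
  }
  return pose;
}
```

Compared with the Monte Carlo approach, this needs far fewer loss evaluations per step but depends on a good initial pose, since gradient descent only finds a nearby minimum.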

I hope this information will help you.

@Gatsby23

That's really impressive, because both of the reference papers work in small spaces. It's really interesting that it can work in a large environment.

@SakodaShintaro
Contributor Author

Thank you very much. However, it can be said that this method is still in the validation stage, as it has not been fully evaluated to see if it can cover such a large area as a huge city.

NeRF and Gaussian Splatting have developed remarkably in recent years, and there is a possibility that their performance will be enhanced by incorporating methods such as neuralsim and DrivingGaussian.

I look forward to further developments in this area.


github-actions bot commented Jun 28, 2024

Thank you for contributing to the Autoware project!

🚧 If your pull request is in progress, switch it to draft mode.


stale bot commented Oct 1, 2024

This pull request has been automatically marked as stale because it has not had recent activity.

@stale stale bot added the status:stale Inactive or outdated issues. (auto-assigned) label Oct 1, 2024
@SakodaShintaro
Contributor Author

In recent years, techniques such as 3D Gaussian Splatting have emerged, and we believe it would be better to start implementing it with offline tools rather than starting with the Autoware runtime, so we are closing this pull request.
