MPPI ARM Binaries Issue in RPi4 #4380
I think some videos here would be more illustrative. I'm not entirely sure I understand what you're describing 😥 How is the path created? Can you reproduce this on the
Not that I think this is the issue, but woof, I'd love to hear how well this actually works on a Jetson Nano. That's got to be eating your CPU alive.
Neither here nor there for the ticket, but I'd be concerned with these settings moving that fast.
Thanks for your answer. One day of troubleshooting later, I found something really strange I would like to share with you. I tried several configurations based on Gazebo, and the result differs depending on the FollowPath plugin I use and on whether I run the navigation nodes on my laptop (Ubuntu 22.04) or on the Jetson Nano (Yocto-based, using kirkstone). To be more accurate:
Steps to reproduce the NOK (not-OK) case:
movie.mp4

For your information,
Based on all these observations, any idea where to explore?
To be honest, we are first trying to make it work; then we will assess performance, CPU usage, how much we need to scale things down, and how it behaves in a production environment. I can come back to you with our conclusions post-assessment.
Same behavior.
Did you try compiling MPPI from source, and do you still have the same crash on the RPi? Getting a backtrace on the crash would be helpful to see what's failing: https://docs.nav2.org/tutorials/docs/get_backtrace.html. We had an issue long ago where binaries would crash due to build flags on the build farm's computers that were incompatible with what normal x86 machines had (#3767), and I'm curious whether the same is happening now for ARM and we need to find which instructions might not exist. Read through that thread in detail for information and the troubleshooting methods we evaluated during it that are helpful. Giving me your

Wrt the 180 deg issue, @pepisg was troubleshooting some Jetson MPPI issue and I don't think he ever sent me his final report or how we could address it. Might be worth putting your heads together, or checking whether this is the same issue he is thinking about. What version of Nav2 are you using when compiling from source? How are you getting the binaries, and what version are those?
Well, the crash vs. the '180' issue are two very different things, so be specific.
Hi! I found a similar problem a while ago while building nav2 from source on iron / ARM: the trajectories generated by the controller looked odd and did not seem to try to follow the path, even without obstacles and with only the PathFollow critic active; the optimal trajectory also did not seem to be sampled from the generated trajectories. I think it's the same problem reported here. I started progressively rolling back changes from #4174 and was able to trace the bug down to the integrateStateVelocities function in optimizer.cpp, particularly to these changes. I ended up rolling back the PR until having more time to dig deeper.
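For context, a heavily simplified sketch of the kind of computation integrateStateVelocities performs (plain C++, no xtensor, and explicitly not the actual nav2 code): the heading is accumulated from the sampled angular velocities first, and the linear velocity is then projected through cos/sin to build the trajectory. Small changes to that ordering or to the cumulative-sum step would alter every sampled trajectory, which is at least consistent with the odd-looking trajectories described above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Simplified illustration (an assumption about the general idea, not the
// nav2 source): integrate one sampled velocity sequence into a trajectory
// the way an MPPI-style optimizer roughly does for a differential drive.
struct Pose { double x, y, yaw; };

std::vector<Pose> integrateVelocities(const std::vector<double>& vx,
                                      const std::vector<double>& wz,  // same length as vx
                                      Pose start, double dt)
{
  std::vector<Pose> traj(vx.size(), start);
  double x = start.x, y = start.y, yaw = start.yaw;
  for (std::size_t i = 0; i < vx.size(); ++i) {
    yaw += wz[i] * dt;                 // accumulate heading first...
    x += vx[i] * std::cos(yaw) * dt;   // ...then project the linear velocity
    y += vx[i] * std::sin(yaw) * dt;
    traj[i] = {x, y, yaw};
  }
  return traj;
}
```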
Raspberry issue
Compiling from source solves the issue on the RPi (I use branch 1.1.14, the version of the latest binaries for Humble).
Here it is. I guess I can't have line numbers because it is based on a binary installation... Don't hesitate to ask if you have an idea on how I can provide additional information.
I don't know if any other test is relevant. The issue seems to be in the "configure" method. Any idea?
Jetson issue
I tried to build nav2_mppi_controller on the Jetson Nano directly without using Yocto, to check if the issue is Yocto-related, and ran into an issue you can find here: mppi-build-jeton-error.txt (the error is huge, so I don't know how to share it another way).
RPi
That looks like the issue from the previous ticket I linked to. Do any important flags look missing between your CPU and the build farm's? https://build.ros2.org/job/Hbin_ujv8_uJv8__nav2_mppi_controller__ubuntu_jammy_arm64__binary/43/consoleFull#console-section-2 It seems like a flag used on the build farm isn't valid for the RPi, just like we were having with AVX before on AMD64. We can remove that build flag and re-release, and that should hopefully be that.

Jetson
I'm not going to dig into custom setups with meta-ros / non-standard rosdep installs of dependencies. There are too many things that can go wrong specific to your situation. @pepisg are you on a Jetson for your issues, or are you on another AMR-based SOM? It would be worth looking into the diff that Pedro sent, though, and seeing if changing those lines back fixes your problem. That would tell us whether this is the same instantiation of the previous issue vs. something specific to your Yocto setup. That's something we can dig into more together.
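Not something from this thread, but one low-tech way to answer the "any important flags look missing" question is to diff the two flag lists directly once they are saved to disk. A minimal sketch; the file names are hypothetical placeholders for the build farm's flag list (from the console log above) and the RPi's `Features` line from /proc/cpuinfo:

```cpp
#include <fstream>
#include <iostream>
#include <set>
#include <string>

// Read whitespace-separated CPU feature flags from a plain text file into a set.
static std::set<std::string> readFlags(const std::string& path)
{
  std::ifstream in(path);
  std::set<std::string> flags;
  std::string tok;
  while (in >> tok) flags.insert(tok);
  return flags;
}

int main()
{
  // Hypothetical file names: paste each machine's flag list into these files.
  const auto farm = readFlags("buildfarm_flags.txt");
  const auto rpi  = readFlags("rpi_flags.txt");

  std::cout << "On the build farm but not on the RPi:\n";
  for (const auto& flag : farm) {
    if (rpi.count(flag) == 0) {
      std::cout << "  " << flag << '\n';
    }
  }
  return 0;
}
```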
@SteveMacenski yeah, I'm on a Jetson AGX.
RPi
Here are the flags that are on the build farm CPU and not on the RPi (all of the flags on the RPi CPU are also on the build farm CPU):
Jetson
I tried to use BUT I solved the issue by removing the nav2-mppi-controller recipe from Yocto (and consequently its dependencies xtl, xtensor and xsimd) and installing everything directly on the generated distro from source, using the right versions of xtl (0.7.2), xsimd (7.6.0) and xtensor (0.23.10). I suspect an issue related to the versions used by Yocto (xtl 0.7.7, xtensor 0.24.7 and xsimd 11.2.0). I will try to use older versions in Yocto and see if that solves the issue (if so, I will create a PR on meta-ros directly).
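A quick way to double-check which xtl/xtensor/xsimd versions a build actually resolved (Yocto sysroot vs. a source install, for example) is to print the version macros from the headers the compiler picks up. A sketch that assumes the usual *_VERSION_MAJOR/MINOR/PATCH macros these libraries define in their config headers; verify the exact macro and header names against your installed copies:

```cpp
#include <iostream>

// Assumed headers/macros: check your installed xtl/xtensor/xsimd copies,
// since the config header layout has shifted between releases.
#include <xtl/xtl_config.hpp>
#include <xtensor/xtensor_config.hpp>
#include <xsimd/xsimd.hpp>

int main()
{
  // Prints the versions the compiler actually resolved, which is useful when
  // a Yocto sysroot and a source install could both be on the include path.
  std::cout << "xtl     " << XTL_VERSION_MAJOR << '.'
            << XTL_VERSION_MINOR << '.' << XTL_VERSION_PATCH << '\n'
            << "xtensor " << XTENSOR_VERSION_MAJOR << '.'
            << XTENSOR_VERSION_MINOR << '.' << XTENSOR_VERSION_PATCH << '\n'
            << "xsimd   " << XSIMD_VERSION_MAJOR << '.'
            << XSIMD_VERSION_MINOR << '.' << XSIMD_VERSION_PATCH << '\n';
  return 0;
}
```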
RPi
@nuclearsandwich I don't suppose you are already aware of any RPi / build-farm-specific problematic interactions in compiler settings? @avanmalleghem It's worth looking over that list (

Jetson
Ok, seems like that's not a problem we can resolve then, and you have your answer on the versions and whatnot to solve that part!
@avanmalleghem any update on the build flags and issues?
To be honest, I don't know how to proceed. I can do so, but I need some guidelines/links to follow.
Some pattern that might help:
A way to speed that up: if you compile with debug flags on the Jetson and transfer the result to the RPi, you can get the exact instruction that is failing with GDB and look up where it comes from. The Nav2 tutorial will get you the first mile with compiling with GDB and getting a backtrace (https://docs.nav2.org/tutorials/docs/get_backtrace.html), and other documentation can show you how to get the instruction that failed in an illegal-instruction segfault in GDB.

I'll say that the flags that imply or specifically mention "simd" in their names make me suspicious. Does the RPi support SIMD? If not in general, that could point to a potential it'll-never-work issue. What compiler flags does the RPi4 have in general (any of those AVX/SIMD ones)?
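For what it's worth, the RPi4's Cortex-A72 cores do have NEON/ASIMD, though not some of the newer extensions; the question is whether the binaries were built to assume instructions the chip lacks. A small aarch64-only sketch (an illustration, not taken from this thread) that prints what the kernel reports at runtime, using the HWCAP_* names from <asm/hwcap.h>:

```cpp
// Runtime check of a few aarch64 SIMD-related features the kernel reports,
// for comparison against what the binaries were compiled to expect.
#include <iostream>

#if defined(__aarch64__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#endif

int main()
{
#if defined(__aarch64__)
  const unsigned long caps = getauxval(AT_HWCAP);
  std::cout << "asimd (NEON):          " << ((caps & HWCAP_ASIMD) ? "yes" : "no") << '\n';
#ifdef HWCAP_ASIMDHP   // only defined in newer kernel headers
  std::cout << "asimdhp (FP16 SIMD):   " << ((caps & HWCAP_ASIMDHP) ? "yes" : "no") << '\n';
#endif
#ifdef HWCAP_ASIMDDP   // only defined in newer kernel headers
  std::cout << "asimddp (dot product): " << ((caps & HWCAP_ASIMDDP) ? "yes" : "no") << '\n';
#endif
#else
  std::cout << "Not an aarch64 build; check the flags in /proc/cpuinfo instead.\n";
#endif
  return 0;
}
```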
Negative. Our ARM builds run on AWS Graviton instances, though, so we don't have any RPi hardware on the official build farm.
Two things:
Thanks Steven, I didn't think you had RPis in the farm, but didn't know if you had some other RPi-specific reported issues with binaries before that rhymed with this. |
Hi, I have the following laptop spec:

We run everything in a nix shell, but build the whole nav2 stack locally from source (with C++17). I can reproduce it with versions 1.1.13, 1.1.14 and 1.1.15 (xsimd-11.1.0, xtensor-0.24.7, xtl-0.7.5 with version 1.1.14).
Just chiming in, as I also see this issue on a Pi4 running 22.04 + Humble: nav2 fails to launch with the same error using MPPI, and controller_server dies on launch. After compiling from source, everything runs fine. Seems like the same issue as #4380.
@aatb-ch happy to have the help - if you look up this thread, I lay out the items needed to debug where this is coming from so we can potentially resolve it for the binaries. I don't have an RPi4 to reproduce with, so someone who has one and needs to use it will need to help here if we want to make any progress :-)
I'm not using an RPi4, but I have one, so I can try to continue troubleshooting this issue.
To be sure I compile with C++17, I build MPPI on the Orin Nano using
How can I compile without atomicity? I tried to, but I can't find the right way to do so. I guess it needs some compile flag, but which one?
This should permanently go away with #4621 once merged, as it completely removes xtensor in favor of Eigen.
Steps to reproduce issue
I use the MPPI controller to navigate with my real robot and observe really strange behavior. For the sake of this issue, I removed the obstacle layer and the velocity smoother, and I send a goal where only linear velocity is needed. It is a differential drive robot.
Here is my nav2 configuration for controller_server: