Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AP_RangeFinder: cleanup rangefinder timeout handling #26528

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

tridge
Copy link
Contributor

@tridge tridge commented Mar 15, 2024

we had inconsistent behaviour across different sensors. This PR moves the timeout logic to the frontend so it is consistent

note also that some drivers set reported distance to zero on timeout. Should we remove that? (eg. mavlink)

this PR was inspired by a log showing the LidarLitev3 (PulsedLight driver) getting "stuck". I think it was giving oscillating values which led the driver to never (or rarely) update. This caused uncontrolled descent/ascent in a copter. That was with 4.3.7 but I think the same vulnerablity is in master

image

there is also a single commit to fix just the pulselightlrf bug, ready for 4.5.x

this PR also fixes quite a few subtle bugs

@tridge tridge added the BUG label Mar 15, 2024
@tridge tridge requested a review from peterbarker March 15, 2024 21:39
Comment on lines 62 to 63
const uint32_t last_update_ms = state.last_reading_ms;
if (AP_HAL::millis() - last_update_ms >= AP_RANGEFINDER_MSP_TIMEOUT_MS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really sure why you're making the temporary variable, here and in other places.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue is that we need to be sure that the last_reading_ms is older than the result of millis(). If we put it on the same line there's a small chance that it could be updated by another thread after we call millis().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm... I thought everything was on the same thread here..... if not this is a good change.

If we areconcerned about thread races here then you'd have to be concerned about handle_msp updating the timestamp, the main thread grabbing the data based on that timestamp and then handle_msp updating the distance...
if you're really unlucky that data might be bad rather than stale...

@@ -150,9 +150,6 @@ void AP_RangeFinder_TOFSenseF_I2C::update(void)
state.distance_m = distance_mm * 0.001f;
new_distance = false;
update_status();
} else if (AP_HAL::millis() - state.last_reading_ms > 300) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure some of the other drivers shouldn't more like this - i.e. removing the whole timeout handling in the driver and letting the frontend take care of what happens next. Can't see why some places zero their fields and other places don't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've gone through the others now and removed their timeout handling in most cases

@tridge tridge force-pushed the pr-rangefinder-timeout branch 2 times, most recently from 8d0cd32 to bd8f0c3 Compare March 18, 2024 19:45
tridge added 4 commits March 19, 2024 14:17
the value from the sensor can oscillate, causing it to stay locked at
the last value due to the broken glitch detection. Instead use a 3
element mode filter
we had inconsistent behaviour across different sensors. This PR moves
the timeout logic to the frontend so it is consistent

note also that some drivers set reported distance to zero on
timeout. Should we remove that? (eg. mavlink)

this PR was inspired by a log showing the LidarLitev3 (PulsedLight
driver) getting "stuck". I think it was giving oscillating values
which led the driver to never (or rarely) update. This caused
uncontrolled descent/ascent in a copter. That was with 4.3.7 but I
think the same vulnerablity is in master
@tridge tridge force-pushed the pr-rangefinder-timeout branch from bd8f0c3 to a51f5c6 Compare March 19, 2024 03:17
@peterbarker peterbarker self-assigned this Sep 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: 4.5.1
Development

Successfully merging this pull request may close these issues.

4 participants