Add validation to initialization configuration commands #27459

lashVN · 2024-07-05T16:33:14Z

Currently, all configuration commands sent to the VN unit are not validated for a response, leading to difficult-to-diagnose errors when a command doesn't go through or is rejected by the VN. All commands send to the unit (particularly setup commands) should have their acceptance confirmation checked before the driver begins to use the data.

IamPete1 · 2024-07-05T22:35:49Z

I'm not a fan of the formatting changes in the last commit. Its significantly worse in some places. It also hides the core functional changes in this PR, and makes the history more complex if we ever need to look back. If your very keen you could revert the cases where its worse and open it in a new PR with only those changes. Then you can run size_compare_branches to prove there is no difference to compiled output.

lashVN · 2024-07-06T00:11:40Z

@IamPete1

Thanks for taking a look. Which spots are you talking about? I generally tried to come up with a clangformat from what I found in the guide and looking around the codebase. Alternatively, if there is a different clangformat I should use, please let me know.

Most of the changes seem to be

Preceeding underscore for private members
Correcting brace newline convention
Pointer alignment
Assignment operator alignment
operator and parenthesis whitespace
Statement newline wrapping (from what I could tell the codebase largely has a 100 character column limit? That's smaller than I tend to prefer, but it looked to be used)
Case label tabbing (happy to revert, just noticed style guide accepts unindented case labels)

IamPete1

Highlighting where some the formatting changes have made the code harder to read.

IamPete1 · 2024-07-06T00:15:05Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

+            state.gyro =
+                Vector3f{pkt1.uncompAngRate[0], pkt1.uncompAngRate[1], pkt1.uncompAngRate[2]};


This is worse.

IamPete1 · 2024-07-06T00:15:25Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

+        state.quat = Quaternion{pkt1.quaternion[3], pkt1.quaternion[0], pkt1.quaternion[1],
+                                pkt1.quaternion[2]};


Again, the wrap here is worse.

IamPete1 · 2024-07-06T00:15:39Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

+        state.location =
+            Location{int32_t(pkt1.positionLLA[0] * 1.0e7), int32_t(pkt1.positionLLA[1] * 1.0e7),
+                     int32_t(pkt1.positionLLA[2] * 1.0e2), Location::AltFrame::ABSOLUTE};


This is worse.

IamPete1 · 2024-07-06T00:15:56Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

+        state.origin =
+            Location{int32_t(pkt2.GPS1posLLA[0] * 1.0e7), int32_t(pkt2.GPS1posLLA[1] * 1.0e7),
+                     int32_t(pkt2.GPS1posLLA[2] * 1.0e2), Location::AltFrame::ABSOLUTE};


This is worse.

IamPete1 · 2024-07-06T00:16:04Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

+        state.quat =
+            Quaternion{pkt.quaternion[3], pkt.quaternion[0], pkt.quaternion[1], pkt.quaternion[2]};


This is worse.

IamPete1 · 2024-07-06T00:16:17Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

+    AP::logger().WriteStreaming(
+        "EAH3", "TimeUS,Temp,Pres,MX,MY,MZ,AX,AY,AZ,GX,GY,GZ,Q1,Q2,Q3,Q4", "sdPGGGoooEEE----",
+        "F000000000000000", "Qfffffffffffffff", AP_HAL::micros64(), pkt.temp, pkt.pressure * 1e3,
+        use_uncomp ? pkt.uncompMag[0] : pkt.mag[0], use_uncomp ? pkt.uncompMag[1] : pkt.mag[1],
+        use_uncomp ? pkt.uncompMag[2] : pkt.mag[2], state.accel[0], state.accel[1], state.accel[2],
+        state.gyro[0], state.gyro[1], state.gyro[2], state.quat[0], state.quat[1], state.quat[2],
+        state.quat[3]);
+


This is much worse.

IamPete1 · 2024-07-06T00:16:58Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

+    const float vel_gate              = 5;
+    const float pos_gate              = 5;
+    const float hgt_gate              = 5;
+    const float mag_var               = 0;


This is worse.

IamPete1 · 2024-07-06T00:17:13Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.h

+        uint8_t term_offset;    // offset within the _term buffer where the next character should be
+                                // placed


This is worse.

IamPete1 · 2024-07-06T00:25:18Z

Thanks for taking a look. Which spots are you talking about? I generally tried to come up with a clangformat from what I found in the guide and looking around the codebase. Alternatively, if there is a different clangformat I should use, please let me know.

We do have a guide, but we don't enforce all of it strictly on new code. Its generally left up to the person writing the code what they prefer. Unless the formatting is particularity offensive then its better to just leave it rather than having lots of code churn. As such not all libraries use the same pattern, underscores for private members for example.

I'm not saying that we would not accept a formatting improvement for this library but it should be in a separate PR to any functional changes.

lashVN · 2024-07-06T00:32:53Z

@IamPete1

Thanks for pointing those out.

A lot of those I definitely agree to be worse. I'm happy to resolve those (locally I'd just disable the line wrapping in my clangformat specifiers) and proceed with the merge. If you'd still prefer though, I can pull the formatting changes altogether and open a new PR for the rest of the formatting changes.

Edit:

Sorry, I just reread your comment and it sounds like it clearly should be option 2.

IamPete1 · 2024-07-06T01:34:39Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

+            if (strncmp(nmea.term, message_to_send, nmea.term_offset) !=
+                0) {  // ignore sync byte in comparison


Suggested change

if (strncmp(nmea.term, message_to_send, nmea.term_offset) !=

0) { // ignore sync byte in comparison

if (strncmp(nmea.term, message_to_send, nmea.term_offset) != 0) {

// ignore sync byte in comparison

IamPete1 · 2024-07-06T01:38:12Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

                return false;
+            } else if (strncmp(nmea.term, "VNERR", nmea.term_offset) == 0) {
+                AP_HAL::panic("VectorNav received VN error response");


Panicing is not a great idea, it effectively just gives up. Would you expect to see this in normal operation? In which case it should trigger a pre-arm. Or is it only if there is some edge case and very unlikely? I which case you could throw a internal error.

Would it be worth trying a reset of the vector nav to see if we could recover?

@IamPete1

The only cases I'd expect to see an error in response is

If something is catastrophically wrong with the VN sensor (hardware failure, etc)

If the code was incorrectly edited and sent a command that isn't acceptable to the sensor

If the AP-configured IMU Rate (via get_rate()) is set too high, we will reply with insufficient baudrate VNERR

I don't think any of those cases would be resolved by resetting the sensor, but I agree especially since (3) is very viable with normal use panicing may not be the not the correct response. Given the failure cases, what would you suggest instead?

OK, in that case, I would print the error specific (GCS_SEND_TEXT) so the user will have some feedback. Then you need to make sure that they can't arm. Hopefully the existing health checks are sufficient, if not then update those so they give a "init failed" error.

Really there are three things that you need to acheave:

Not prevent the rest of the code running.

Give the user a clue so they can track down how to resolve

Prevent the vehicle flying in a unsafe state

Just pushed a change to resolve this issue.

In this change, the system will continue to send the message to the VectorNav every READ_REQUEST_RETRY_MS (500ms). The pre-arm check should continue to show as a failure, because the system will never pass initialization (assuming the VNERR is never resolved) THis is consistent with previous behavior: block initialization and retry until a valid response is received. It will result in the received error message being sent to GCS every 500ms.

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

Expands wait_register_response (and renames to send_command_blocking) to accept any arbitrary command, and uses the new method for all sent commands

…her than just active port This change prevents a commonly seen baudrate overflow error on the unused port. Also pause asynchronous communications during initial configuration.

Print received VNERR response to console and continually retry sending the configuration message, instead of panicing

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

Add clarity to send_command_blocking by improving comment and directly returning instead of breaking

peterbarker

Needs to pass CI. ALSO needs to not access an uninitalised string on failure:

Please also see https://github.com/peterbarker/ardupilot/pull/new/pr/vectornav-run-command-format on how run_command could be done rather more neatly. The method would need to be marked as taking varargs, however. I'm still not really happy with the way that command is passed into decode, but I think you'd agree the varargs message_format is rather nicer.

peterbarker · 2024-07-16T02:59:27Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

            request_sent = now;
        }

        int16_t nbytes = uart->available();
        while (nbytes-- > 0) {
            char c = uart->read();
            if (decode(c)) {
+                if (nmea.error_response && nmea.sentence_done) {


This might be problematic as the driver evolves. There's an assumption in here that whatever is in the pipeline to/from the device is the command we're currently trying to execute.

That may not be the case outside of the initialisation phase of the driver.

@peterbarker

I don't think I understand this comment.

I agree that the driver assumes that a blocking call doesn't have an interrupting call. Is that what you mean?

What I was getting at here is that your UART might contain incoming data that isn't related to the command you just sent. If you parse a message out of that buffer you may attribute it to the command you sent rather than a previous command.

It's not an easy thing to resolve (flushing the input buffer isn't actually sufficient as there may be outstanding commands on the way t into the VectorNav.

If you're not fussed with this I don't particular mind; it's not a major safety issue for the vehicle.

peterbarker · 2024-07-16T03:03:37Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

+        case 1: 
+            if (nmea.error_response) {
+                GCS_SEND_TEXT(MAV_SEVERITY_CRITICAL, "VectorNav received VNERR code: %s", nmea.term);
+            } else if (strncmp(nmea.term, message_to_send + 6,


Suggested change

} else if (strncmp(nmea.term, message_to_send + 6,

} else if (strncmp(nmea.term, MIN(message_to_send + 6, strlen(message_to_send))

I'm not a fan of the structure of this driver, and I haven't even really understood it yet. But the code that's currently in here looks fall-out-of-sky dangerous here. It's probably not possible to get in here with strlen(message_to_send) less than 6, but if we do get here then bad things may happen.

@peterbarker

It should never be possible to hit this point with length < 6 (All responses must begin with "VNXXX," as we validated in the prior term, and to hit the term==1, there has to have been something received after the comma)

I'm also not clear that it would be particularly dangerous, would it not just worst case compare to unutilized memory allocated to message_to_send? Of course I'm assuming message_to_send capacity is greater than 6 + nmea.term_offset

peterbarker · 2024-07-16T03:06:50Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

        default:
-            return false;
+            return true;


This deserves a comment - we ignore any terms past the third?

The prior iteration of this driver only received responses up to 3 terms, meaning it should mark any response it saw above 3 as invalid. This will now receive 5 or 6 term responses, so it shouldn't invalidate the responses longer than 3 terms.

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

"VNERR" does not match beginning of message_to_send, so have to prevent it from returning due to the string compare. Also must prevent exiting the decode before the sentence has completed so that we can go on to print the error code term. Fix bug preventing disabling of ASCII measurements Instead of snprintf the command to message_to_send then executing run_command, allow run_command to accept a string and format spec Add check to ensure message_to_send is greater than length 6 before attempting to read past 6

…d commands Aside from the RRG,1 call, the simulation should reply with exactly what was received as a receipt confirmation to allow the driver to proceed past init

peterbarker

LGTM

peterbarker · 2024-07-16T10:53:12Z

libraries/AP_ExternalAHRS/AP_ExternalAHRS_VectorNav.cpp

            request_sent = now;
        }

        int16_t nbytes = uart->available();
        while (nbytes-- > 0) {
            char c = uart->read();
            if (decode(c)) {
+                if (nmea.error_response && nmea.sentence_done) {


What I was getting at here is that your UART might contain incoming data that isn't related to the command you just sent. If you parse a message out of that buffer you may attribute it to the command you sent rather than a previous command.

It's not an easy thing to resolve (flushing the input buffer isn't actually sufficient as there may be outstanding commands on the way t into the VectorNav.

If you're not fussed with this I don't particular mind; it's not a major safety issue for the vehicle.

libraries/SITL/SIM_VectorNav.h

peterbarker · 2024-07-16T11:01:24Z

@IamPete1 are you happy for this to be merged in its current state?

IamPete1 · 2024-07-16T11:50:22Z

@IamPete1 are you happy for this to be merged in its current state?

Yeah, looks OK.

Move receive_buf to method scope allocation rather than class scope

lashVN closed this Jul 5, 2024

lashVN force-pushed the Add_validation_to_initialization_configuration_commands branch from 70c0c87 to a346323 Compare July 5, 2024 16:35

lashVN reopened this Jul 5, 2024

lashVN marked this pull request as ready for review July 5, 2024 17:51

lashVN force-pushed the Add_validation_to_initialization_configuration_commands branch 2 times, most recently from 18239c8 to 3c8e78e Compare July 5, 2024 22:14

lashVN force-pushed the Add_validation_to_initialization_configuration_commands branch from 3c8e78e to 646f637 Compare July 5, 2024 23:45

IamPete1 reviewed Jul 6, 2024

View reviewed changes

lashVN force-pushed the Add_validation_to_initialization_configuration_commands branch 2 times, most recently from 0f39760 to e918727 Compare July 6, 2024 01:02

IamPete1 reviewed Jul 6, 2024

View reviewed changes

lashVN added 2 commits July 6, 2024 14:59

AP_ExternalAHRS: VectorNav: Add response validation to all sent commands

3207d2f

Expands wait_register_response (and renames to send_command_blocking) to accept any arbitrary command, and uses the new method for all sent commands

AP_ExternalAHRS: VectorNav: Disable ASCII messages on both ports, rat…

d4722b1

…her than just active port This change prevents a commonly seen baudrate overflow error on the unused port. Also pause asynchronous communications during initial configuration.

lashVN force-pushed the Add_validation_to_initialization_configuration_commands branch from e918727 to d4722b1 Compare July 6, 2024 20:00

AP_ExternalAHRS_VectorNav: Handle VNERR responses

884a889

Print received VNERR response to console and continually retry sending the configuration message, instead of panicing

peterbarker reviewed Jul 9, 2024

View reviewed changes

AP_ExternalAHRS_VectorNav: Minor code clarity improvements

51db6eb

Add clarity to send_command_blocking by improving comment and directly returning instead of breaking

peterbarker requested changes Jul 16, 2024

View reviewed changes

lashVN added 2 commits July 16, 2024 00:49

AP_ExternalAHRS_VectorNav: Update VectorNav sim to respond to receive…

a8370f3

…d commands Aside from the RRG,1 call, the simulation should reply with exactly what was received as a receipt confirmation to allow the driver to proceed past init

lashVN force-pushed the Add_validation_to_initialization_configuration_commands branch from e00f620 to a8370f3 Compare July 16, 2024 05:56

peterbarker added the DevCallEU label Jul 16, 2024

peterbarker approved these changes Jul 16, 2024

View reviewed changes

AP_ExternalAHRS_VectorNav: Move allocation to local scope

93673c7

Move receive_buf to method scope allocation rather than class scope

tridge approved these changes Jul 17, 2024

View reviewed changes

tridge merged commit 3f00280 into ArduPilot:master Jul 17, 2024
93 checks passed

lashVN deleted the Add_validation_to_initialization_configuration_commands branch July 29, 2024 19:02

IamPete1 removed the DevCallEU label Jul 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add validation to initialization configuration commands #27459

Add validation to initialization configuration commands #27459

lashVN commented Jul 5, 2024

IamPete1 commented Jul 5, 2024

lashVN commented Jul 6, 2024

IamPete1 left a comment

IamPete1 Jul 6, 2024

IamPete1 Jul 6, 2024

IamPete1 Jul 6, 2024

IamPete1 Jul 6, 2024

IamPete1 Jul 6, 2024

IamPete1 Jul 6, 2024

IamPete1 Jul 6, 2024

IamPete1 Jul 6, 2024

IamPete1 commented Jul 6, 2024

lashVN commented Jul 6, 2024 •

edited

Loading

IamPete1 Jul 6, 2024

lashVN Jul 6, 2024

IamPete1 Jul 6, 2024

lashVN Jul 6, 2024

IamPete1 Jul 8, 2024

lashVN Jul 8, 2024

peterbarker left a comment

peterbarker Jul 16, 2024

lashVN Jul 16, 2024

peterbarker Jul 16, 2024

peterbarker Jul 16, 2024

lashVN Jul 16, 2024 •

edited

Loading

peterbarker Jul 16, 2024

lashVN Jul 16, 2024

peterbarker left a comment

peterbarker Jul 16, 2024

peterbarker commented Jul 16, 2024

IamPete1 commented Jul 16, 2024

		state.gyro =
		Vector3f{pkt1.uncompAngRate[0], pkt1.uncompAngRate[1], pkt1.uncompAngRate[2]};

		state.quat = Quaternion{pkt1.quaternion[3], pkt1.quaternion[0], pkt1.quaternion[1],
		pkt1.quaternion[2]};

		state.quat =
		Quaternion{pkt.quaternion[3], pkt.quaternion[0], pkt.quaternion[1], pkt.quaternion[2]};

		uint8_t term_offset; // offset within the _term buffer where the next character should be
		// placed

		if (strncmp(nmea.term, message_to_send, nmea.term_offset) !=
		0) { // ignore sync byte in comparison

	} else if (strncmp(nmea.term, message_to_send + 6,
	} else if (strncmp(nmea.term, MIN(message_to_send + 6, strlen(message_to_send))

Add validation to initialization configuration commands #27459

Add validation to initialization configuration commands #27459

Conversation

lashVN commented Jul 5, 2024

IamPete1 commented Jul 5, 2024

lashVN commented Jul 6, 2024

IamPete1 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

IamPete1 commented Jul 6, 2024

lashVN commented Jul 6, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

peterbarker left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lashVN Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

peterbarker left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

peterbarker commented Jul 16, 2024

IamPete1 commented Jul 16, 2024

lashVN commented Jul 6, 2024 •

edited

Loading

lashVN Jul 16, 2024 •

edited

Loading