Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

D500 - retry GET_GVD #12805

Closed
wants to merge 2 commits into from

Conversation

Nir-Az
Copy link
Collaborator

@Nir-Az Nir-Az commented Mar 28, 2024

On D500 device, if the device is not ready after USB enumeration it will return NOT_READY error code to GET_GVD command.
This PR add retries when this happens on D500 devices

Note: also done some includes cleanup and renames
Tracked on [RSDEV-1826)]

(cherry picked from commit 8ff27ee0806a2110a771cc40215c05b224a41f03)
@Nir-Az Nir-Az requested a review from maloel March 28, 2024 06:33
@Nir-Az
Copy link
Collaborator Author

Nir-Az commented Mar 28, 2024

Main change is adding the retry mechanism,
Other changes are remove redundant code and rename to const to generic names (remove IVCAM)

{
// If we get an error code that match to the error code defined as require retry,
// we will retry the command until it succeed or we reach a timeout
bool should_retry = retry_error_codes && ! retry_error_codes->empty()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!empty is redundant with find() != end()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

auto data = send( command, &p_response );
if( p_response != hwm_Success )
{
// If we get an error code that match to the error code defined as require retry,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, but I still don't understand WHY we would have to retry a GVD command (specifically, or generally any command).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I explained, this is the flow that was agreed between SW & FW.
Origin from other HW not being ready yet while FW is already ready.
GVD contains other HW information as well, for more info please contact the architect

src/hw-monitor.h Outdated
@@ -1,50 +1,27 @@
// License: Apache 2.0. See LICENSE file in root directory.
// Copyright(c) 2015 Intel Corporation. All Rights Reserved.
// Copyright(c) 2024 Intel Corporation. All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2015-24

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

&& retry_error_codes->find( p_response ) != retry_error_codes->end();
if( should_retry )
{
static constexpr size_t RETRIES = 50;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why we need static here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

auto data = send(command);
hwmon_response p_response = hwmon_response::hwm_Unknown;
auto data = send( command, &p_response );
if( p_response != hwm_Success )
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this if is exra... you can just put the inside block here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought it will be more efficient, but NVM, will change.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if( should_retry )
{
static constexpr size_t RETRIES = 50;
for( int i = 0; i < RETRIES; ++i )
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're gonna check the termination inside, seems extra to do it here. Just do while( true ) and increment inside?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't trust while (true)
In the future someone can change something and it will cause the application to stuck.
Prefer to keep as is.

@@ -435,8 +435,15 @@ namespace librealsense
bool advanced_mode = false;
bool usb_modality = true;
group_multiple_fw_calls(depth_sensor, [&]() {

const int HW_NOT_READY_ERR_CODE = -3;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs explanation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add a comment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@maloel
Copy link
Collaborator

maloel commented Mar 28, 2024

if the device is not ready after USB enumeration

And why would this happen? Doesn't seem valid...
You're also assuming GET_GVD will be the first command that encounters this issue, hence the wait here is valid but not for other commands?

src/hw-monitor.cpp Outdated Show resolved Hide resolved
<< "error in querying GVD, error:"
<< hwmon_error2str( p_response ) );
LOG_WARNING( "GVD not ready - retrying GET_GVD command" );
std::this_thread::sleep_for( std::chrono::milliseconds( 100 ) );
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be before line 247 (which, the first time, has no delay before it)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@Nir-Az
Copy link
Collaborator Author

Nir-Az commented Mar 28, 2024

if the device is not ready after USB enumeration

And why would this happen? Doesn't seem valid... You're also assuming GET_GVD will be the first command that encounters this issue, hence the wait here is valid but not for other commands?

This requirement comes from our SW architect,
Enumerate as fast as possible and return NOT READY to GVD if not all HW is ready.
Probably mostly for HW D555e doesn’t have.
I suggest you check with him
I will add comments

(cherry picked from commit cd3fd76fbdbc87c8da3f9678df307f67f585ed01)
@Nir-Az Nir-Az requested a review from maloel March 28, 2024 10:48
@Nir-Az
Copy link
Collaborator Author

Nir-Az commented Mar 28, 2024

Closed until final solution is defined.

@Nir-Az Nir-Az closed this Mar 28, 2024
@Nir-Az Nir-Az deleted the d500-gvd-error-code branch September 30, 2024 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants