Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't wait for RemoteNode to reach PRE-OPERATIONAL #401

Open
Jef-GB opened this issue Oct 24, 2023 · 11 comments
Open

Can't wait for RemoteNode to reach PRE-OPERATIONAL #401

Jef-GB opened this issue Oct 24, 2023 · 11 comments

Comments

@Jef-GB
Copy link

Jef-GB commented Oct 24, 2023

Problem

The NMT heartbeat implementation does not allow for receiving the bootup status from the node. It would be nice if the bootup status would just be received and stored in the state field. Without this state there is no way of checking if the node is still booting.

Goal (Functional)

My goal is to, via NMT, reset the node and wait for it to reach Pre-Operational again. My node is running hardware that requires some time to setup, this time can differ based on the amount of devices connected to the node what makes it difficult to define a time. While setting up the node is in a BOOTING / INITIALISATION NMT state. After all setups are completed, the node enters PRE-OPERATIONAL what makes it ready to be started and used. I can't define a specific time to wait between resetting and starting again so I would like to wait for the pre-operational state. Only I can not do this due to the booting state being updated to pre-operational.

Solution

Remove this if statement in the callback. I don't know what other implications this might have.

if new_state == 0:

@ventussolus
Copy link
Contributor

ventussolus commented Dec 7, 2023

I’ll admit I’m not familiar with whether or not this proposed change will cause issues for others or cause other issues, but a solution to your existing problem could be to use the existing “add_heartbeat_callback()” function to set up your own callback function. This appears to get called before the state is modified, so you’ll get a notification of booting.

Edit: Per another open issue, there is a typo in that function name to watch out for, it’s actually currently “add_hearbeat_callback”.

@sveinse
Copy link
Collaborator

sveinse commented May 19, 2024

The issue for the typo is #389

@acolomb
Copy link
Collaborator

acolomb commented Jul 4, 2024

The workaround via add_heartbeat_callback() (now spelled correctly in the latest release) is a perfectly valid solution IMHO. I assume you have configured the device to send heartbeats regularly. So you will get some heartbeats with state "INITIALISING" while your setup is not yet finished. Once the received heartbeat actually indicates "PRE-OPERATIONAL", you know it's not the automatic promotion happening to the node.nmt.state property.

It basically all hinges on the fact that your node sends at least one more heartbeat after the initialization, which is not guaranteed by the protocol unless you have configured the Heartbeat Producer Time in advance.

@friederschueler
Copy link
Collaborator

Copied from #402

I see the need to have a "synchronized" state hassle-free. Maybe we can introduce some option for the node, which enables state transitions based on the heartbeat only. Because I think it is a very commong use-case to wait for a device until it finished its internal startup. Currently in my older mc-test projects I use a fixed waiting time, which is bad. In the newer tests, I subscribe to the heartbeat to do another is-ready check, but this would be so much easier if I just could enable this behaviour on the node itself.

@erlend-aasland
Copy link
Contributor

I agree with André that using add_heartbeat_callback() is a usable API for this use-case. A short how-to in the docs should be enough.

@acolomb
Copy link
Collaborator

acolomb commented Jul 10, 2024

I see the need to have a "synchronized" state hassle-free.

The problem is that it won't help unless one has manually convinced the device to send another heartbeat. The only required NMT message is the boot-up message with state INITIALIZING. The further transition to PRE-OPERATIONAL is implicit, that's why the code handles it appropriately. It is perfectly valid to assume it has already occurred (past tense), as the spec says (emphasis mine):

7.2.8.2.3.1 Service boot-up Event
Through this service, the NMT slave indicates that a local state transition occurred from the NMT state Initialisation to the NMT state Pre-operational.

Now if a device sends its boot-up message before in fact being "ready", that's a different problem. I see two ways to mitigate it, either put a delay on the master side, or wait for the next heartbeat which states PRE-OPERATIONAL. But note that the device has no obligation to send another NMT heartbeat object upon the state transition. It will only send another after the Heartbeat Producer Time parameter has been set (and saved) in the device. If it could be enabled immediately via SDO, then we wouldn't have a problem ("device not ready") in the first place. This parameter needs to be chosen to a fixed interval, which is consequently the worst-case time until we get another heartbeat. So instead of doing that and waiting for an appropriate heartbeat, we could just as well choose the first option and hard-code the delay on the master.

This really only concerns broken device behavior and I'm pretty sure there is no more efficient way to work around it. Getting the boot-up message means the device already is in PRE-OPERATIONAL. If it doesn't behave as desired in this state, that doesn't invalidate this fact.

@erlend-aasland
Copy link
Contributor

So instead of doing that and waiting for an appropriate heartbeat, we could just as well choose the first option and hard-code the delay on the master.

Why a delay? What would that solve? How could you even pick a sensible delay value? I would prefer the status quo rather than introducing an arbitrary delay.

@friederschueler
Copy link
Collaborator

This really only concerns broken device behavior and I'm pretty sure there is no more efficient way to work around it.

Ok sounds reasonable. We have a working solution, but we should really provide documentation for it.

I would prefer the status quo rather than introducing an arbitrary delay.

Same, an arbitrary delay is a no-go in a (general) framework.

@erlend-aasland
Copy link
Contributor

I think there is consensus that this is a documentation issue. What remains is a docs PR. I can try to cook up something within a day or two, unless someone else beats me to it.

@acolomb
Copy link
Collaborator

acolomb commented Jul 10, 2024

I'm not arguing for introducing a delay in the canopen library. Instead, the application-level workaround which @friederschueler described above and which I think is common:

Currently in my older mc-test projects I use a fixed waiting time, which is bad.

I had a project where we needed to wait ~2 additional seconds before SDOs would work (on a Nanotec CL4-E controller). That's one reason for me to make sure configuration is saved in the device during commissioning, and doesn't need SDO communication after each power-up.

How could you even pick a sensible delay value?

How would you pick a sensible Heartbeat Producer Time value? I was just saying that without that being set, you won't get any further heartbeat. So the time you pick there is just as arbitrary a minimum delay as if you had added the delay in your application code.

@acolomb
Copy link
Collaborator

acolomb commented Jul 10, 2024

Why a delay? What would that solve?

I think these questions clearly show that some experience with CANopen in general is needed to write a correct documentation entry for this issue. I don't have the time now to write up something, but will happily review a rough draft and offer my experience to beat it into shape.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants