Getting master going offline error #1
Comments
Thank you for reporting this issue. The problem you described is likely caused by system latency or insufficient real-time performance, which affects stable communication between the master and slaves. Even with RTLinux, we recommend running the following command to check whether the maximum scheduling latency reaches the millisecond range:
sudo cyclictest -m -p99 -t1 -i100 -a3
In a similar case with a Jetson Nano (without RTLinux), we achieved 4 hours of stable testing after applying a set of optimizations.
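For reference, the kind of measurement cyclictest performs can be approximated in a few lines of C. The sketch below is not cyclictest itself and is not part of the eRob example code; `max_wakeup_latency_ns` is a hypothetical helper that sleeps to absolute deadlines a fixed interval apart and records how late each wake-up was.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

static int64_t ts_to_ns(const struct timespec *ts) {
    return (int64_t)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

/* Sleep to `loops` absolute deadlines spaced `interval_ns` apart and return
 * the worst observed wake-up lateness in nanoseconds, or -1 on error.
 * Hypothetical helper for illustration only. */
int64_t max_wakeup_latency_ns(int64_t interval_ns, int loops) {
    assert(interval_ns > 0 && loops > 0);

    struct timespec next, now;
    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    int64_t worst = 0;
    for (int i = 0; i < loops; i++) {
        /* Advance the absolute deadline by one interval and normalize. */
        next.tv_sec  += (time_t)(interval_ns / NSEC_PER_SEC);
        next.tv_nsec += (long)(interval_ns % NSEC_PER_SEC);
        if (next.tv_nsec >= NSEC_PER_SEC) {
            next.tv_nsec -= NSEC_PER_SEC;
            next.tv_sec++;
        }

        /* Absolute sleep avoids accumulating drift; retry if interrupted. */
        int rc;
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0 || clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;

        int64_t late = ts_to_ns(&now) - ts_to_ns(&next);
        if (late > worst)
            worst = late;
    }
    return worst;
}
```

Calling `max_wakeup_latency_ns(100000, 100000)` uses a 100 µs interval, like `-i100`, and runs for about ten seconds; dividing the result by 1000 gives microseconds. Without real-time priority and locked memory (the `-p99` and `-m` options of cyclictest) the figure will usually be worse than what cyclictest reports, so treat it only as a rough indication.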
|
and do they all have to be called inside PRE_OP state? Or SAFE_OP state? |
Also, I noticed you're calling ec_send_processdata immediately after ec_receive_processdata. |
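This is an excellent question. Here's what we found: When using ec_configdc(), the slave devices were unable to enter DC mode and therefore could not reach the OP state. We found that calling ecx_dcsync0() in the PRE_OP state, before ec_config_map(), lets them enter DC mode. Thus, the correct sequence should be: ecx_dcsync0(), ec_config_map(), ec_configdc().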
|
We haven't done much testing on this yet. This example is primarily designed to address the issue where the SOEM master cannot bring eRob into the OP state, with simple extensions for enabling and motion functionality. We recommend that you try experimenting with your setup and focus on developing and optimizing the current code on the basis of stable operation of the master. |
If I set the EtherCAT DC cycle time to 1 ms, what is the maximum delay before an eRob unit reports "master go offline"? Can I set the cycle time to 2 ms but still send and receive data every 1 ms to avoid the master-go-offline error? We have 20 devices on the EtherCAT line, 15 of them eRob actuators, but only the eRob units report master go offline. There has got to be some threshold value I can set to avoid this error. It is catastrophic for our system: we can tolerate delays in communication, but we cannot tolerate the master-go-offline error.
On Dec 4, 2024, at 12:34 AM, ZeroErr ***@***.***> wrote:
Also, I noticed you're calling ec_send_processdata immediately after ec_receive_processdata.
Is it possible to do some data processing after ec_receive_processdata and then call ec_send_processdata? Would that cause a problem?
This is an excellent question. Here's what we found:
When using ec_configdc(), we noticed that the slave devices were unable to enter DC mode, which in turn prevented them from reaching the OP state. To investigate this, we analyzed the process of initializing eRob with the TwinCAT master (a standard EtherCAT master) by capturing network packets. During this analysis, we observed that the TwinCAT master writes the value 3 to the register at address 0x0981, and the value of object dictionary 1C32 is set to 2 to indicate the slave has correctly entered DC mode.
Based on these observations, we found that calling ecx_dcsync0() in the PRE_OP state, before ec_config_map(), satisfies these conditions. While ec_configdc() is theoretically meant to be called in the SAFE_OP state to calculate the master’s reference clock, we haven’t identified any significant differences when calling it in PRE_OP or SAFE_OP. You might want to try both and see how it works for your setup.
Thus, the correct sequence should be:
ecx_dcsync0()
ec_config_map()
ec_configdc()
Let us know if you have any further questions or issues.
|
This is a highly technical question. We will conduct experiments and tests based on your issue to try and resolve it. However, we currently do not have a specific solution. Here are our suggestions: Try adjusting the watchdog timeout settings in SOEM. |
How can I adjust the watchdog timeout settings in SOEM?
… On Dec 4, 2024, at 1:32 AM, ZeroErr ***@***.***> wrote:
Try adjusting the watchdog timeout settings in SOEM.
|
The "master station goes offline" problem is a real headache for us; only the eRob units report it. |
You can refer to the development documentation of the SOEM master station. |
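As one possible reading of "watchdog timeout": standard EtherCAT slave controllers (ESCs) expose a watchdog divider at register 0x0400 and a process-data watchdog time at register 0x0420. The sketch below only does the arithmetic for the register value. It assumes the eRob uses a standard ESC register layout, and whether the eRob's master-offline detection is tied to this watchdog at all is not confirmed in this thread. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Standard ESC register addresses (assumed to apply to eRob). */
#define ESC_REG_WD_DIVIDER     0x0400u  /* watchdog divider             */
#define ESC_REG_WD_TIME_SM     0x0420u  /* process-data watchdog time   */
#define ESC_WD_DIVIDER_DEFAULT 2498u    /* default divider: 100 us tick */

/* One watchdog tick in ns: (divider + 2) periods of the 25 MHz ESC
 * clock, 40 ns each. */
static uint32_t wd_tick_ns(uint16_t divider) {
    return ((uint32_t)divider + 2u) * 40u;
}

/* Smallest watchdog count whose timeout is at least `timeout_us`,
 * saturating at the 16-bit register maximum. Hypothetical helper. */
uint16_t wd_counts_for_timeout(uint32_t timeout_us, uint16_t divider) {
    assert(timeout_us > 0);
    uint64_t timeout_ns = (uint64_t)timeout_us * 1000u;
    uint32_t tick = wd_tick_ns(divider);
    uint64_t counts = (timeout_ns + tick - 1u) / tick;  /* round up */
    return counts > 0xFFFFu ? (uint16_t)0xFFFFu : (uint16_t)counts;
}
```

With the default divider, `wd_counts_for_timeout(100000, ESC_WD_DIVIDER_DEFAULT)` gives 1000, which corresponds to the usual 100 ms default. In SOEM 1.x the value could then be written to register 0x0420 of a slave with a register write such as `ec_FPWR`; a slave may still overwrite it from its own start-up configuration, so read the register back to check.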
We plan to use the SOEM master to reproduce the issue you mentioned. If there are any results, we will notify you at the earliest opportunity. |
Currently, we have removed the CPU affinity binding for Thread 1 and Thread 2 in the new program (eRob_test.cpp) to avoid unpredictable scheduling delays. After making this adjustment, we tested the program on the RT Linux system to drive six eRob units, and it has successfully run stably for over one hour. We recommend testing the long-term enabling of eRob first. If the issue of dropping out of OP still occurs, I will further optimize the master program. |
Did you reproduce the "Master go offline problem"? I don't care about dropping out of OP, "master go offline" is a fatal issue for us. |
To be precise, I haven't fully understood what you mean by "master going offline." Could you provide the specific scenarios, the messages printed by the master, and the EtherCAT slave messages during the disconnection? This would help me better reproduce the issue. Previously, I interpreted the master going offline as the same as dropping out of OP state. |
The EtherCAT device sends error code 0xA000 and enters the fault state.
On Dec 6, 2024, at 12:03 AM, ZeroErr ***@***.***> wrote:
Did you reproduce the "master go offline" problem? I don't care about dropping out of OP; "master go offline" is a fatal issue for us. We're using an RT system with a dedicated core assigned only to the EtherCAT update thread, but it drops out immediately if there is even a slight glitch in timing. Once the master station goes offline, there is no way to clear the fault: the actuator stays in fault and won't work.
|
Hey, do you have a screenshot or anything else showing the error you are facing? |
Through my attempts with the SOEM master, I identified several key points for optimization: Thread Isolation and CPU Affinity: Bind the EtherCAT thread to a specific CPU core and isolate the core to reduce interference from network management tasks. |
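A minimal sketch of the CPU-binding step, assuming Linux with glibc. The helper names `pin_current_thread` and `first_allowed_cpu` are made up for this illustration and are not taken from eRob_test.cpp; only `sched_setaffinity` and `sched_getaffinity` are real system calls.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>

/* Pin the calling thread to a single CPU.
 * Returns 0 on success, otherwise the errno value from the kernel. */
int pin_current_thread(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* pid 0 means "the calling thread". */
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return errno;
    return 0;
}

/* Lowest-numbered CPU the calling thread is currently allowed to run on,
 * or -1 if the affinity mask cannot be read. */
int first_allowed_cpu(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}
```

The EtherCAT thread would call `pin_current_thread(n)` once at start-up, with `n` being the core reserved for it. Pinning alone does not keep other tasks off that core; isolating it is a separate step, for example with the `isolcpus=` kernel boot parameter.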
In the latest upload, I have included the optimized PP mode master project, which has undergone multiple one-hour stability tests. Additional considerations have already been mentioned in my previous responses and will not be repeated here. The eRob_eCoder project can be used for testing purposes, but it should not be directly applied to real-world applications to avoid potential risks or unforeseen losses. |
I am running eRobTest on RTLinux.
It connects to the slaves fine, but very quickly it goes into the "master goes offline" error.
Is there an error tolerance I can set on the devices so that they do not report this error?