Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rmw_zenoh_cpp starts slowly due to the router connection during initialization #12

Closed
YuanYuYuan opened this issue Aug 22, 2024 · 5 comments

Comments

@YuanYuYuan
Copy link
Collaborator

Taking rcl/test/test_graph as an example,

  • rmw_zenoh_cpp: 1m14s
  • rmw_cyclonedds_cpp: 8s
@YuanYuYuan
Copy link
Collaborator Author

By reducing the retry interval from 1s to 100ms, the total time of rmw_zenoh_cpp becomes 15s.

I think the remaining questions are

  1. Is 100ms a proper default value?
  2. rmw_zenoh_cpp usually complains "Unable to connect to a Zenoh router" even though the router has been alive for a while. A quick workaround is adding a short sleep like 100ms before we begin checking any connected routers.
[ RUN      ] NodeGraphMultiNodeFixture.test_node_info_clients
2024-08-29T07:46:06.402460Z  INFO ThreadId(02) zenoh::net::runtime: Using ZID: 3d6a93f9c6e11ecd03c40baae8eac912
2024-08-29T07:46:06.403313Z  INFO ThreadId(02) zenoh::net::runtime::orchestrator: Zenoh can be reached at: tcp/[::1]:35069
2024-08-29T07:46:06.405611Z  WARN ThreadId(02) zenoh::net::runtime::orchestrator: Scouting delay elapsed before start conditions are met.
[INFO] [1724917566.405698472] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id fa7db30090db9561f56a0c63cb77f13f.
2024-08-29T07:46:06.514480Z  INFO ThreadId(02) zenoh::net::runtime: Using ZID: efd3a5891210cbe8f4f266dccac1ed48
2024-08-29T07:46:06.515690Z  INFO ThreadId(02) zenoh::net::runtime::orchestrator: Zenoh can be reached at: tcp/[::1]:39389
2024-08-29T07:46:06.517990Z  WARN ThreadId(02) zenoh::net::runtime::orchestrator: Scouting delay elapsed before start conditions are met.
[INFO] [1724917566.518090345] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id fa7db30090db9561f56a0c63cb77f13f.
2024-08-29T07:46:06.651655Z  INFO ThreadId(02) zenoh::net::runtime: Using ZID: a147e25ec3ac45235be75d4014f9a059
2024-08-29T07:46:06.654292Z  INFO ThreadId(02) zenoh::net::runtime::orchestrator: Zenoh can be reached at: tcp/[::1]:37989
2024-08-29T07:46:06.657196Z  WARN ThreadId(02) zenoh::net::runtime::orchestrator: Scouting delay elapsed before start conditions are met.
[ERROR] [1724917566.657438032] [rmw_zenoh_cpp]: Unable to connect to a Zenoh router. Have you started a router with `ros2 run rmw_zenoh_cpp rmw_zenohd`?
[INFO] [1724917566.757634370] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id fa7db30090db9561f56a0c63cb77f13f.
[ERROR] [1724917567.865398794] [rmw_zenoh_cpp]: topic name /test_service not found in topic_map. Report this.
[ERROR] [1724917567.865425930] [rmw_zenoh_cpp]: topic name /test_service not found in topic_map. Report this.
[WARN] [1724917567.865651105] [rmw_zenoh_cpp]: Received liveliness token to remove unknown node /old_node_name from the graph. Ignoring...
[WARN] [1724917567.865866689] [rmw_zenoh_cpp]: Received liveliness token to remove unknown node /old_node_name from the graph. Ignoring...
2024-08-29T07:46:07.867399Z ERROR rx-2 ThreadId(10) zenoh::net::routing::dispatcher::resource: Face{2, 3d6a93f9c6e11ecd03c40baae8eac912} Resource 1 remapped. Remapping unsupported!
[ERROR] [1724917567.870917627] [rmw_zenoh_cpp]: topic name /rosout not found in topic_map. Report this.
[WARN] [1724917567.871392931] [rmw_zenoh_cpp]: Received liveliness token to remove unknown node /test_graph_node from the graph. Ignoring...

@evshary
Copy link

evshary commented Sep 3, 2024

I tried this patch: eclipse-zenoh/zenoh@926208c

In the complete test, most of the complain "Unable to connect to a Zenoh router" disappear now. However, it still takes time to run the test.

  • rmw_cyclonedds_cpp: 8s
  • rmw_zenoh_cpp(w/o patch): 56s
  • rmw_zenoh_cpp(with patch): 56s

Here is the log:

rmw_zenoh_cpp(with patch):

[ RUN      ] TestGraphFixture.test_rcl_get_and_destroy_topic_names_and_types                                                                                                 
[INFO] [1725346848.889051199] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 5369938a20e811c424128c784a33b0db.                                            
[INFO] [1725346849.892166778] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 5369938a20e811c424128c784a33b0db.                                            
[WARN] [1725346850.893782847] [rmw_zenoh_cpp]: Received liveliness token to remove unknown node /old_node_name from the graph. Ignoring...                                   
[       OK ] TestGraphFixture.test_rcl_get_and_destroy_topic_names_and_types (2050 ms)                                                                                       
[ RUN      ] TestGraphFixture.test_rcl_get_service_names_and_types
[INFO] [1725346850.936945583] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 5369938a20e811c424128c784a33b0db.
[INFO] [1725346851.939833963] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 5369938a20e811c424128c784a33b0db.
[WARN] [1725346852.941108863] [rmw_zenoh_cpp]: Received liveliness token to remove unknown node /old_node_name from the graph. Ignoring...
[       OK ] TestGraphFixture.test_rcl_get_service_names_and_types (2005 ms)
[ RUN      ] TestGraphFixture.test_rcl_names_and_types_init
[INFO] [1725346852.942666472] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 5369938a20e811c424128c784a33b0db.
[INFO] [1725346853.945273296] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 5369938a20e811c424128c784a33b0db.
[WARN] [1725346854.946429386] [rmw_zenoh_cpp]: Received liveliness token to remove unknown node /old_node_name from the graph. Ignoring...
[       OK ] TestGraphFixture.test_rcl_names_and_types_init (2005 ms)

rmw_cyclonedds_cpp:

[ RUN      ] TestGraphFixture.test_rcl_get_and_destroy_topic_names_and_types                                                                                                 
[       OK ] TestGraphFixture.test_rcl_get_and_destroy_topic_names_and_types (4 ms)                                                                                          
[ RUN      ] TestGraphFixture.test_rcl_get_service_names_and_types                                                                                                           
[       OK ] TestGraphFixture.test_rcl_get_service_names_and_types (4 ms)                                                                                                    
[ RUN      ] TestGraphFixture.test_rcl_names_and_types_init                                                                                                                  
[       OK ] TestGraphFixture.test_rcl_names_and_types_init (5 ms) 

If update the sleep time here

std::this_thread::sleep_for(std::chrono::seconds(1));

  • rmw_zenoh_cpp(with patch): 13s

I thought the sleep should only occur if unable to connect to Zenoh router. However, some tests fail if doing so. Still investigating it.

@evshary
Copy link

evshary commented Sep 4, 2024

There are 3 issues here:

  • "Unable to connect to a Zenoh router": Fixed by Increase the delay in scouting #16
  • Take too much time while testing: Need to remove sleep
  • Test failure while removing sleep: Need to update liveliness mechanism

@evshary
Copy link

evshary commented Sep 5, 2024

With this patch https://github.com/eclipse-zenoh/zenoh-c/tree/dev/liveliness_subscriber_history, using liveliness subscriber with history works well and we can move the unnecessary sleep into if-condition. This will solve the remaining two issues.

@evshary
Copy link

evshary commented Oct 14, 2024

I think this can be closed now

@evshary evshary closed this as completed Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants