You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When killing (SIGKILL) and restarting a node (or launch file with multiple nodes) quickly the Zenoh queryable can be removed from the newly discovered services. I noticed this by inspecting the zenoh admin space of the zenoh-bridge-ros2dds plugin from the router and client. We connect a zenoh-bridge-ros2dds in client mode to a zenoh-bridge-ros2dds in router mode to use ROS2 over WiFi.
When dynamically killing and spawning nodes their services sometimes fails to be picked up. What happens is the ROS2 srv and cli are being picked up and visible in the adminspace. But the queryable is initially spawned and than destroyed after 10s (from when the previous process was killed). I ivestigated where this 10s was coming from but can't seem to find a parameter to adjust it (tested queries_default_timeout and link lease without success).
I would expect the service to work even though a previously killed process had the exact same service server.
To reproduce
To reproduce we can show this just with a single router and a single node from the demo_nodes_cpp package.
Terminal 1:
ros2 run zenoh_bridge_ros2dds zenoh_bridge_ros2dds -l tcp/localhost:7447 --no-multicast-scouting --rest-http-port 8000
Terminal 2:
ros2 run demo_nodes_cpp add_two_ints_server; ros2 run demo_nodes_cpp add_two_ints_server
I use twice the same command with a ; to run the same command again after the first one is killed with SIGKILL.
Terminal 3:
pkill -SIGKILL -f add_two_ints_server
Result
Open http://localhost:8000/@/**/add_two_ints to monitor if the server and queryable is shown, but in this case the service is being removed after the new node has already started.
Logs
Logs only shown from start of demo node and redacted to only show relevant services and logs
2024-11-06T13:04:07.398437Z DEBUG ThreadId(22) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS Participant 0110fc28cf444780c3d7adb9000001c1)
2024-11-06T13:04:07.400068Z DEBUG ThreadId(22) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110fc28cf444780c3d7adb900000403 from Participant 0110fc28cf444780c3d7adb9000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-06T13:04:07.400174Z DEBUG ThreadId(22) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110fc28cf444780c3d7adb900000504 from Participant 0110fc28cf444780c3d7adb9000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-06T13:04:07.400433Z DEBUG ThreadId(22) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110fc28cf444780c3d7adb900001503 from Participant 0110fc28cf444780c3d7adb9000001c1 on rr/add_two_intsReply with type example_interfaces::srv::dds_::AddTwoInts_Response_ (keyless: true)
2024-11-06T13:04:07.400446Z DEBUG ThreadId(22) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110fc28cf444780c3d7adb900001604 from Participant 0110fc28cf444780c3d7adb9000001c1 on rq/add_two_intsRequest with type example_interfaces::srv::dds_::AddTwoInts_Request_ (keyless: true)
2024-11-06T13:04:07.497810Z DEBUG tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds::discovery_mgr: Received ros_discovery_info from participant 0110fc28cf444780c3d7adb9000001c1 with nodes: [/add_two_ints_server]
2024-11-06T13:04:07.497843Z INFO tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds::discovered_entities: Discovered ROS Node /add_two_ints_server
2024-11-06T13:04:07.497906Z DEBUG tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds::discovered_entities: ROS Node /add_two_ints_server declares a new Writer on rr/add_two_intsReply
2024-11-06T13:04:07.498231Z INFO tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds: Node /add_two_ints_server declares Service Server /add_two_ints: example_interfaces/srv/AddTwoInts - Allowed
2024-11-06T13:04:07.498234Z DEBUG tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds::route_service_srv: Route Service Server (ROS:/add_two_ints <-> Zenoh:add_two_ints): creation with type example_interfaces/srv/AddTwoInts
2024-11-06T13:04:07.498417Z DEBUG tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds::route_service_srv: Route Service Server (ROS:/add_two_ints <-> Zenoh:add_two_ints): (local client_guid=10ec6480a573dd75) id='1.10.b5.29.cb.7a.60.83.28.b7.15.f6.0.0.0.0' => USER_DATA=[99, 108, 105, 101, 110, 116, 105, 100, 61, 32, 49, 46, 49, 48, 46, 98, 53, 46, 50, 57, 46, 99, 98, 46, 55, 97, 46, 54, 48, 46, 56, 51, 46, 50, 56, 46, 98, 55, 46, 49, 53, 46, 102, 54, 46, 48, 46, 48, 46, 48, 46, 48, 59]
2024-11-06T13:04:07.498468Z INFO tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds::routes_mgr: Route Service Server (ROS:/add_two_ints <-> Zenoh:add_two_ints) created
2024-11-06T13:04:07.498474Z DEBUG tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds::route_service_srv: Route Service Server (ROS:/add_two_ints <-> Zenoh:add_two_ints) now serving local nodes {"/add_two_ints_server"}
2024-11-06T13:04:07.498480Z DEBUG tokio-runtime-worker ThreadId(14) zenoh::net::routing::dispatcher::resource: Register resource add_two_ints
2024-11-06T13:04:07.498491Z DEBUG tokio-runtime-worker ThreadId(14) zenoh::net::routing::dispatcher::queries: Face{2, 616c459718f2192e62347ee552a02162} Declare queryable 19 (add_two_ints)
2024-11-06T13:04:07.498505Z DEBUG tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds::route_service_srv: Route Service Server (ROS:/add_two_ints <-> Zenoh:add_two_ints) announce via token @/616c459718f2192e62347ee552a02162/@ros2_lv/SS/add_two_ints/example_interfaces§srv§AddTwoInts
2024-11-06T13:04:07.498507Z DEBUG tokio-runtime-worker ThreadId(14) zenoh::net::routing::dispatcher::token: Face{2, 616c459718f2192e62347ee552a02162} Declare token 20 (@/616c459718f2192e62347ee552a02162/@ros2_lv/SS/add_two_ints/example_interfaces§srv§AddTwoInts)
2024-11-06T13:04:07.498513Z DEBUG tokio-runtime-worker ThreadId(14) zenoh::net::routing::dispatcher::resource: Register resource @/616c459718f2192e62347ee552a02162/@ros2_lv/SS/add_two_ints/example_interfaces§srv§AddTwoInts
2024-11-06T13:04:07.502348Z DEBUG tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds::ros_discovery: Publish update on 'ros_discovery_info' with 8 writers and 7 readers
kill using SIGKILL, killing the process and automatically restarting the node in terminal 2.
2024-11-06T13:04:32.828578Z DEBUG ThreadId(22) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS Participant 0110be83cde1f5d46b537c0f000001c1)
2024-11-06T13:04:32.830165Z DEBUG ThreadId(22) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110be83cde1f5d46b537c0f00000403 from Participant 0110be83cde1f5d46b537c0f000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-06T13:04:32.830290Z DEBUG ThreadId(22) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110be83cde1f5d46b537c0f00000504 from Participant 0110be83cde1f5d46b537c0f000001c1 on ros_discovery_info with type rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_ (keyless: true)
2024-11-06T13:04:32.830925Z DEBUG ThreadId(22) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS publication 0110be83cde1f5d46b537c0f00001503 from Participant 0110be83cde1f5d46b537c0f000001c1 on rr/add_two_intsReply with type example_interfaces::srv::dds_::AddTwoInts_Response_ (keyless: true)
2024-11-06T13:04:32.830937Z DEBUG ThreadId(22) zenoh_plugin_ros2dds::dds_discovery: Discovered DDS subscription 0110be83cde1f5d46b537c0f00001604 from Participant 0110be83cde1f5d46b537c0f000001c1 on rq/add_two_intsRequest with type example_interfaces::srv::dds_::AddTwoInts_Request_ (keyless: true)
2024-11-06T13:04:32.898372Z DEBUG tokio-runtime-worker ThreadId(10) zenoh_plugin_ros2dds::discovery_mgr: Received ros_discovery_info from participant 0110be83cde1f5d46b537c0f000001c1 with nodes: [/add_two_ints_server]
2024-11-06T13:04:32.898384Z INFO tokio-runtime-worker ThreadId(10) zenoh_plugin_ros2dds::discovered_entities: Discovered ROS Node /add_two_ints_server
2024-11-06T13:04:32.898419Z DEBUG tokio-runtime-worker ThreadId(10) zenoh_plugin_ros2dds::discovered_entities: ROS Node /add_two_ints_server declares a new Writer on rr/add_two_intsReply
2024-11-06T13:04:32.898566Z INFO tokio-runtime-worker ThreadId(10) zenoh_plugin_ros2dds: Node /add_two_ints_server declares Service Server /add_two_ints: example_interfaces/srv/AddTwoInts - Allowed
2024-11-06T13:04:32.898567Z DEBUG tokio-runtime-worker ThreadId(10) zenoh_plugin_ros2dds::route_service_srv: Route Service Server (ROS:/add_two_ints <-> Zenoh:add_two_ints) now serving local nodes {"/add_two_ints_server"}
2024-11-06T13:04:32.898569Z DEBUG tokio-runtime-worker ThreadId(10) zenoh::net::routing::dispatcher::queries: Face{2, 616c459718f2192e62347ee552a02162} Declare queryable 43 (add_two_ints)
2024-11-06T13:04:32.898573Z DEBUG tokio-runtime-worker ThreadId(10) zenoh::net::routing::dispatcher::queries: Face{2, 616c459718f2192e62347ee552a02162} Undeclare queryable 19 (add_two_ints)
2024-11-06T13:04:32.898577Z DEBUG tokio-runtime-worker ThreadId(10) zenoh_plugin_ros2dds::route_service_srv: Route Service Server (ROS:/add_two_ints <-> Zenoh:add_two_ints) announce via token @/616c459718f2192e62347ee552a02162/@ros2_lv/SS/add_two_ints/example_interfaces§srv§AddTwoInts
2024-11-06T13:04:32.898581Z DEBUG tokio-runtime-worker ThreadId(10) zenoh::net::routing::dispatcher::token: Face{2, 616c459718f2192e62347ee552a02162} Declare token 44 (@/616c459718f2192e62347ee552a02162/@ros2_lv/SS/add_two_ints/example_interfaces§srv§AddTwoInts)
some delay before the next happens
2024-11-06T13:04:41.498829Z INFO tokio-runtime-worker ThreadId(04) zenoh_plugin_ros2dds::discovered_entities: Undiscovered ROS Node /add_two_ints_server
2024-11-06T13:04:41.499103Z DEBUG tokio-runtime-worker ThreadId(04) zenoh::net::routing::dispatcher::pubsub: Undeclare subscription Face{2, 616c459718f2192e62347ee552a02162}
2024-11-06T13:04:41.505248Z INFO tokio-runtime-worker ThreadId(04) zenoh_plugin_ros2dds: Node /add_two_ints_server undeclares Service Server /add_two_ints: example_interfaces/srv/AddTwoInts - Allowed
2024-11-06T13:04:41.505264Z DEBUG tokio-runtime-worker ThreadId(04) zenoh_plugin_ros2dds::route_service_srv: Route Service Server (ROS:/add_two_ints <-> Zenoh:add_two_ints) now serving local nodes {}
2024-11-06T13:04:41.505274Z DEBUG tokio-runtime-worker ThreadId(04) zenoh_plugin_ros2dds::route_service_srv: Route Service Server (ROS:/add_two_ints <-> Zenoh:add_two_ints) retire
2024-11-06T13:04:41.505303Z DEBUG tokio-runtime-worker ThreadId(04) zenoh::net::routing::dispatcher::queries: Face{2, 616c459718f2192e62347ee552a02162} Undeclare queryable 43 (add_two_ints)
2024-11-06T13:04:41.505338Z DEBUG tokio-runtime-worker ThreadId(04) zenoh::net::routing::dispatcher::token: Face{2, 616c459718f2192e62347ee552a02162} Undeclare token 44 (@/616c459718f2192e62347ee552a02162/@ros2_lv/SS/add_two_ints/example_interfaces§srv§AddTwoInts)
2024-11-06T13:04:41.505345Z INFO tokio-runtime-worker ThreadId(04) zenoh_plugin_ros2dds::routes_mgr: Route Service Server (ROS:/add_two_ints <-> Zenoh:add_two_ints) removed
2024-11-06T13:04:41.598112Z DEBUG tokio-runtime-worker ThreadId(14) zenoh_plugin_ros2dds::ros_discovery: Publish update on 'ros_discovery_info' with 0 writers and 0 readers
repeat kill command from terminal 3 to kill last node.
2024-11-06T13:05:14.929353Z INFO tokio-runtime-worker ThreadId(04) zenoh_plugin_ros2dds::discovered_entities: Undiscovered ROS Node /add_two_ints_server
2024-11-06T13:05:14.929778Z INFO tokio-runtime-worker ThreadId(04) zenoh_plugin_ros2dds: Node /add_two_ints_server undeclares Service Server /add_two_ints: example_interfaces/srv/AddTwoInts - Allowed
System info
Platform: Ubuntu 22.04.5 LTS - 64bit
Zenoh version: tag 1.0.1 (from source)
ROS2 version: humble (binaries)
The text was updated successfully, but these errors were encountered:
The root of the problem is that the bridge has not been designed to support several Nodes with the same name. Internally it uses maps with Node names as indexes.
In you scenario, when the 1st Node is killed, the 2nd Node starts, is discovered by the bridge and replaces the 1st Node in the internal maps. But as the 1st Node was not properly unregistering itself, its disappearance is detected later by the bridge (when CycloneDDS lease expires). And when it does, the bridge removes all about the Node with this name...
A workaround is to use different Node names. But I agree it's not realistic for a crashing Node which is automatically restarted.
The solution is to update the bridge to make it support several Nodes with the same name.
Describe the bug
When killing (SIGKILL) and restarting a node (or launch file with multiple nodes) quickly the Zenoh queryable can be removed from the newly discovered services. I noticed this by inspecting the zenoh admin space of the zenoh-bridge-ros2dds plugin from the router and client. We connect a zenoh-bridge-ros2dds in client mode to a zenoh-bridge-ros2dds in router mode to use ROS2 over WiFi.
When dynamically killing and spawning nodes their services sometimes fails to be picked up. What happens is the ROS2 srv and cli are being picked up and visible in the adminspace. But the queryable is initially spawned and than destroyed after 10s (from when the previous process was killed). I ivestigated where this 10s was coming from but can't seem to find a parameter to adjust it (tested
queries_default_timeout
andlink lease
without success).I would expect the service to work even though a previously killed process had the exact same service server.
To reproduce
To reproduce we can show this just with a single router and a single node from the
demo_nodes_cpp
package.Terminal 1:
Terminal 2:
ros2 run demo_nodes_cpp add_two_ints_server; ros2 run demo_nodes_cpp add_two_ints_server
I use twice the same command with a
;
to run the same command again after the first one is killed withSIGKILL
.Terminal 3:
Result
Open
http://localhost:8000/@/**/add_two_ints
to monitor if the server and queryable is shown, but in this case the service is being removed after the new node has already started.Logs
Logs only shown from start of demo node and redacted to only show relevant services and logs
kill using SIGKILL, killing the process and automatically restarting the node in terminal 2.
some delay before the next happens
repeat kill command from terminal 3 to kill last node.
System info
The text was updated successfully, but these errors were encountered: