Async controllers #3

saikishor · 2024-04-11T08:36:04Z

No description provided.

controller_interface/include/controller_interface/async_function_handler.hpp

controller_interface/test/test_async_function_handler.hpp

controller_interface/include/controller_interface/async_function_handler.hpp

controller_interface/include/controller_interface/controller_interface_base.hpp

controller_interface/test/test_async_function_handler.cpp

controller_interface/include/controller_interface/async_function_handler.hpp

controller_manager/src/controller_manager.cpp

atzaros · 2024-04-20T22:47:35Z

controller_interface/include/controller_interface/async_function_handler.hpp

+  {
+    if (is_running())
+    {
+      async_update_stop_ = true;


Deadlock due to a lost wake-up: stop_async_update() could hang in thread_.join() if lines 177-178 are executed while thread_ is at line 209.

Protect async_update_stop_ with a mutex on both ends.

This is already fixed by adding the mutex on both ends:

ros2_control/controller_interface/include/controller_interface/async_function_handler.hpp

Lines 183 to 194 in 40efb15

    void stop_async_update()
    {
      if (is_running())
      {
        {
          std::unique_lock<std::mutex> lock(async_mtx_);
          async_update_stop_ = true;
        }
        async_update_condition_.notify_one();
        thread_.join();
      }
    }

~~I do not see any mutex protecting this line~~

Ah yes, it works, but I consider it better practice to protect async_update_stop_ with a mutex because it makes the following error harder to commit:

    while (
      (get_state_function_().id() == lifecycle_msgs::msg::State::PRIMARY_STATE_ACTIVE ||
       get_state_function_().id() == lifecycle_msgs::msg::State::TRANSITION_STATE_ACTIVATING) &&
      !async_update_stop_)
    {
      doing_something_while_async_update_stop_is_true();  // <== my future me committing a bug
      {
        std::unique_lock<std::mutex> lock(async_mtx_);
        async_update_condition_.wait(
          lock, [this] { return trigger_in_progress_ || async_update_stop_; });
        ...

My suggestion is:

    std::unique_lock<std::mutex> lock(async_mtx_);
    while (
      (get_state_function_().id() == lifecycle_msgs::msg::State::PRIMARY_STATE_ACTIVE ||
       get_state_function_().id() == lifecycle_msgs::msg::State::TRANSITION_STATE_ACTIVATING) &&
      !async_update_stop_)
    {
      async_update_condition_.wait(
        lock, [this] { return trigger_in_progress_ || async_update_stop_; });
      if (!async_update_stop_)
      {
        async_update_return_ = async_function_(current_update_time_, current_update_period_);
      }
      trigger_in_progress_ = false;
      cycle_end_condition_.notify_all();
    }
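To make the lost wake-up concrete, here is a minimal, self-contained sketch of the stop handshake discussed in this thread. `StoppableWorker` and its members are hypothetical names, not the PR's code. The only point it illustrates is that the stop flag is written under the same mutex the waiter holds while it evaluates its predicate, so the write cannot land between the waiter's check and its block.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the handler; only the stop handshake is modeled.
class StoppableWorker
{
public:
  StoppableWorker() : thread_([this] { run(); }) {}
  ~StoppableWorker() { stop(); }

  void stop()
  {
    if (!thread_.joinable())
    {
      return;  // already stopped
    }
    // Joining from the worker thread itself would deadlock.
    assert(std::this_thread::get_id() != thread_.get_id());
    {
      std::lock_guard<std::mutex> lock(mtx_);
      stop_ = true;  // written under the mutex the waiter uses for its predicate
    }
    cv_.notify_one();
    thread_.join();
  }

  bool stopped() const { return !thread_.joinable(); }

private:
  void run()
  {
    std::unique_lock<std::mutex> lock(mtx_);
    // The predicate is evaluated with mtx_ held, so a stop request is either
    // seen here or its notification arrives while the thread is blocked.
    cv_.wait(lock, [this] { return stop_; });
  }

  std::mutex mtx_;
  std::condition_variable cv_;
  bool stop_{false};    // plain bool: every access happens under mtx_
  std::thread thread_;  // declared last so it starts after the other members exist
};
```

In this sketch, calling `stop()` on a freshly constructed worker returns whether or not the thread has reached `wait()` yet, which is the case that can hang when the flag is set without the mutex.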

atzaros · 2024-04-20T23:09:04Z

controller_interface/include/controller_interface/async_function_handler.hpp

+  // Async related variables
+  std::thread thread_;
+  std::atomic_bool async_update_stop_{false};
+  std::atomic_bool trigger_in_progress_{false};


No atomic is required if line 195 is changed to lock.owns_lock() && !trigger_in_progress_.

For performance reasons, you can now remove atomic from trigger_in_progress_.
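A small sketch of why the `lock.owns_lock() && !trigger_in_progress_` check lets the flag be a plain `bool`: the flag is only read after the lock has been acquired, because `&&` short-circuits. `TriggerGate` and `finish_cycle()` are hypothetical names used for illustration; the real handler also wakes a worker thread, which is left out here.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical, single-purpose model of the trigger check.
class TriggerGate
{
public:
  // Returns true only if the mutex was free and no cycle is pending.
  bool try_trigger()
  {
    std::unique_lock<std::mutex> lock(mtx_, std::try_to_lock);
    // owns_lock() is tested first, so the flag is read only while mtx_ is held.
    if (lock.owns_lock() && !trigger_in_progress_)
    {
      trigger_in_progress_ = true;
      return true;
    }
    return false;
  }

  // Stands in for the worker marking the cycle as done.
  void finish_cycle()
  {
    std::lock_guard<std::mutex> lock(mtx_);
    assert(trigger_in_progress_);  // finishing without a pending cycle is a logic error
    trigger_in_progress_ = false;
  }

private:
  std::mutex mtx_;
  bool trigger_in_progress_{false};  // not atomic: all accesses are under mtx_
};
```

Note that `std::mutex::try_lock` is allowed to fail spuriously, so a caller of such a gate should treat a `false` return as "try again next cycle" rather than as proof that a cycle is running.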

atzaros · 2024-04-21T21:57:00Z

controller_interface/include/controller_interface/async_function_handler.hpp

+              lock, [this] { return trigger_in_progress_ || async_update_stop_; });
+            if (async_update_stop_)
+            {
+              break;


Breaking here without doing anything else makes threads hang if they are blocked in wait_for_trigger_cycle_to_finish(). We must notify these waiters that the cycle has finished.
The steps that lead to this situation are:

th1: wait_for_trigger_cycle_to_finish()

th2: stop_async_update()

th1 could hang, or crash if stop_async_update() was called from ~AsyncFunctionHandler().
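The th1/th2 scenario can be sketched as follows. `CycleWorker` is a hypothetical reduction of the handler, and the "update" is just a counter increment. What it shows is the worker clearing the in-progress flag and notifying cycle waiters on the stop path as well as on the normal path.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical reduction of the handler to its two condition variables.
class CycleWorker
{
public:
  CycleWorker() : thread_([this] { run(); }) {}
  ~CycleWorker() { stop(); }

  // Requests one cycle; refused once the worker has been asked to stop.
  bool trigger()
  {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      if (stop_)
      {
        return false;
      }
      trigger_in_progress_ = true;
    }
    trigger_cv_.notify_one();
    return true;
  }

  void wait_for_cycle_to_finish()
  {
    std::unique_lock<std::mutex> lock(mtx_);
    cycle_end_cv_.wait(lock, [this] { return !trigger_in_progress_; });
  }

  void stop()
  {
    if (!thread_.joinable())
    {
      return;
    }
    assert(std::this_thread::get_id() != thread_.get_id());
    {
      std::lock_guard<std::mutex> lock(mtx_);
      stop_ = true;
    }
    trigger_cv_.notify_one();
    thread_.join();
  }

private:
  void run()
  {
    std::unique_lock<std::mutex> lock(mtx_);
    while (true)
    {
      trigger_cv_.wait(lock, [this] { return trigger_in_progress_ || stop_; });
      if (!stop_)
      {
        ++cycles_;  // stands in for the async update
      }
      trigger_in_progress_ = false;
      cycle_end_cv_.notify_all();  // wake waiters on the stop path too
      if (stop_)
      {
        break;
      }
    }
  }

  std::mutex mtx_;
  std::condition_variable trigger_cv_;
  std::condition_variable cycle_end_cv_;
  bool trigger_in_progress_{false};
  bool stop_{false};
  int cycles_{0};
  std::thread thread_;  // declared last so it starts after the other members exist
};
```

With this shape, a thread blocked in `wait_for_cycle_to_finish()` is released when another thread calls `stop()`, whether or not the pending cycle actually ran.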

atzaros · 2024-04-23T09:42:58Z

controller_interface/include/controller_interface/async_function_handler.hpp

+  // Async related variables
+  std::thread thread_;
+  std::atomic_bool async_update_stop_{false};
+  std::atomic_bool trigger_in_progress_{false};


For performance reasons, you can now remove atomic from trigger_in_progress_.

atzaros · 2024-04-26T10:12:26Z

controller_interface/test/test_async_function_handler.cpp

@@ -125,7 +127,7 @@ TEST_F(AsyncFunctionHandlerTest, check_triggering)
  ASSERT_TRUE(async_class.get_handler().is_initialized());
  ASSERT_TRUE(async_class.get_handler().is_running());
  ASSERT_FALSE(async_class.get_handler().is_stopped());
-  async_class.get_handler().wait_for_trigger_cycle_to_finish();
+  std::this_thread::sleep_for(std::chrono::microseconds(1));


Concurrency tests with arbitrary delays are usually suspicious.

So why did you change that? I guess you observed a deadlock here.

If that is the case, either wait_for_trigger_cycle_to_finish() needs some changes or we need to clarify its behavior (and change the test accordingly).

I had to add this 1 microsecond delay so that the update cycle actually runs. Without it, the update cycle is aborted immediately and the assertion that the counter is 2 fails.

ros2_control/controller_interface/include/controller_interface/async_function_handler.hpp

Lines 228 to 233 in 40efb15

    if (async_update_stop_)
    {
      trigger_in_progress_ = false;
      cycle_end_condition_.notify_one();
      break;
    }
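One possible way to avoid a fixed delay in a test like this is to have the callback itself announce that it ran. The sketch below uses `std::promise` and `std::future` for that; `callback_ran_within` is a hypothetical helper and the lambda merely stands in for the async update, so this is not the PR's test code.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical helper: runs a stand-in callback on another thread and reports
// whether it signalled completion within `timeout`.
inline bool callback_ran_within(std::chrono::milliseconds timeout)
{
  assert(timeout.count() > 0);

  std::promise<void> done;
  std::future<void> done_future = done.get_future();
  int counter = 0;

  std::thread worker([&] {
    ++counter;         // stands in for the async update under test
    done.set_value();  // explicit "the callback ran" signal
  });

  // Blocks until the signal arrives; the timeout only bounds a failing run.
  const bool ran = done_future.wait_for(timeout) == std::future_status::ready;
  worker.join();  // after join, reading counter is race-free
  return ran && counter == 1;
}
```

Unlike a 1 microsecond sleep, the wait here ends as soon as the callback finishes, and a generous timeout does not slow down the passing case.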

atzaros · 2024-04-26T10:25:35Z

controller_interface/test/test_async_function_handler.cpp

+  ASSERT_TRUE(async_class.get_handler().is_initialized());
+  ASSERT_TRUE(async_class.get_handler().is_running());
+  ASSERT_FALSE(async_class.get_handler().is_stopped());
+  async_class.get_handler().wait_for_trigger_cycle_to_finish();


Potential deadlock due to a lost wake-up: there is no guarantee that this thread is already waiting before async_class.thread_ signals.

I thought so too, but after reading around: the condition variable checks the predicate even before waiting. If it is true, it continues directly without any signal. If it is false, it starts to wait for the wake-up signal.

I've just tested adding a 200 ms delay before the wait and the tests pass (no deadlock).

Adding-delays-until-it-works(TM) is a bad approach because it relies heavily on uncontrolled conditions such as computer load, OS (dynamic) scheduler policies, etc.

Basically, your tests suffer from UB.
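The predicate behavior mentioned in the reply can be checked without any delay. The predicate overload of `std::condition_variable::wait` behaves like `while (!pred()) wait(lock);`, so a notification that fires before anyone waits does no harm as long as the state was changed under the same mutex. The function name below is hypothetical, and the sketch says nothing about whether the test's own state changes are made under the mutex.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Returns true if wait() comes back even though notify_one() was called
// before anyone was waiting.
inline bool wait_after_early_notify()
{
  std::mutex mtx;
  std::condition_variable cv;
  bool ready = false;

  {
    std::lock_guard<std::mutex> lock(mtx);
    ready = true;  // state change made under the mutex
  }
  cv.notify_one();  // nobody is waiting yet, so this notification has no effect

  std::unique_lock<std::mutex> lock(mtx);
  // Equivalent to: while (!ready) cv.wait(lock);
  // The predicate is already true, so this returns without blocking.
  cv.wait(lock, [&] { return ready; });
  assert(lock.owns_lock());  // wait() returns with the lock held
  return ready;
}
```

This is deterministic and needs no sleep, which is the kind of check a 200 ms delay cannot provide.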

atzaros · 2024-04-26T10:48:52Z

controller_interface/test/test_async_function_handler.cpp

+    ASSERT_TRUE(async_class.get_handler().is_initialized());
+    ASSERT_TRUE(async_class.get_handler().is_running());
+    ASSERT_FALSE(async_class.get_handler().is_stopped());
+    async_class.get_handler().wait_for_trigger_cycle_to_finish();


Same potential deadlock between trigger() and wait_for_trigger_cycle_to_finish().

Please add some delay in between to check my assumption.

@atzaros I added a delay of about 50 ms right after the trigger, and the tests seem to pass.

atzaros · 2024-04-26T10:50:08Z

controller_interface/test/test_async_function_handler.cpp

+  }
+  async_class.get_handler().stop_async_update();
+
+  // now the async update should be preempted


What do you mean by preemption?

atzaros · 2024-04-26T11:04:51Z

controller_interface/include/controller_interface/async_function_handler.hpp

+   */
+  void join_async_update_thread()
+  {
+    if (is_running())


Calling is_running() and join() from different threads could cause a data race on thread_'s state, for instance when wait_for_trigger_cycle_to_finish() and join_async_update_thread() are called concurrently.

I would document clearly that the member functions of this class that use is_running() and join() must not be called concurrently from different threads.

Do you mean when more than 2 threads are involved?

atzaros · 2024-04-26T11:46:10Z

controller_interface/include/controller_interface/async_function_handler.hpp

+  void wait_for_trigger_cycle_to_finish()
+  {
+    if (is_running())
+    {
+      std::unique_lock<std::mutex> lock(async_mtx_);
+      cycle_end_condition_.wait(lock, [this] { return !trigger_in_progress_; });
+    }
+  }


Now, reading your tests, I understand you want to use this function to detect progress after a trigger has been issued.

I think what you need here is either a heartbeat [1] or a different approach in the tests, using synchronization primitives instead of a plain counter.

[1] This means some refactoring around trigger() and wait_for_trigger_cycle_to_finish(), with the former returning the current heartbeat and the latter checking for any increase.

Synchronization primitives sound like the better idea, right? What would you say?

Synchronization primitives in the tests.
If you go that way, I would remove wait_for_trigger_cycle_to_finish() from the API. It is harmful and, I think, it was created only because you needed it for testing.

Having a measure of progress in the async handlers is useful for the controller manager to diagnose abnormal situations, but it is a nice-to-have.
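The heartbeat idea from footnote [1] could look roughly like this. `HeartbeatWorker`, `wait_for_progress()` and the timeout parameter are assumptions made for illustration; the thread only specifies that trigger() returns the current heartbeat and that the waiting side checks for an increase.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical sketch of a heartbeat-based progress check.
class HeartbeatWorker
{
public:
  HeartbeatWorker() : thread_([this] { run(); }) {}

  ~HeartbeatWorker()
  {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      stop_ = true;
    }
    cv_.notify_all();
    thread_.join();
  }

  // Requests one cycle and returns the heartbeat observed at request time.
  std::uint64_t trigger()
  {
    std::uint64_t seen = 0;
    {
      std::lock_guard<std::mutex> lock(mtx_);
      pending_ = true;
      seen = heartbeat_;
    }
    cv_.notify_all();
    return seen;
  }

  // Returns true once the heartbeat has advanced past `seen`, false on timeout.
  bool wait_for_progress(std::uint64_t seen, std::chrono::milliseconds timeout)
  {
    assert(timeout.count() >= 0);
    std::unique_lock<std::mutex> lock(mtx_);
    return cv_.wait_for(lock, timeout, [&] { return heartbeat_ > seen; });
  }

private:
  void run()
  {
    std::unique_lock<std::mutex> lock(mtx_);
    while (true)
    {
      cv_.wait(lock, [this] { return pending_ || stop_; });
      if (stop_)
      {
        return;
      }
      pending_ = false;
      ++heartbeat_;      // one completed cycle
      cv_.notify_all();  // wake anyone waiting for progress
    }
  }

  std::mutex mtx_;
  std::condition_variable cv_;  // shared by the worker and progress waiters
  bool pending_{false};
  bool stop_{false};
  std::uint64_t heartbeat_{0};
  std::thread thread_;  // declared last so it starts after the other members exist
};
```

A caller would keep the value returned by `trigger()` and pass it to `wait_for_progress()`. Because the check is "greater than the value I saw", it does not matter whether the cycle finished before or after the caller started waiting.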

jordan-palacios reviewed Apr 15, 2024

saikishor force-pushed the async_controllers branch 2 times, most recently from 4fede49 to a21ef6f Compare April 15, 2024 21:22

jordan-palacios reviewed Apr 18, 2024

saikishor marked this pull request as ready for review April 18, 2024 13:25

atzaros suggested changes Apr 21, 2024

atzaros reviewed Apr 21, 2024

atzaros suggested changes Apr 21, 2024

atzaros suggested changes Apr 26, 2024

saikishor force-pushed the async_controllers branch from 40efb15 to 0d04cd3 Compare May 5, 2024 09:38

saikishor force-pushed the async_controllers branch from b5de991 to f7d015d Compare August 28, 2024 11:25

saikishor added 19 commits September 26, 2024 22:55

Add first version of the async controllers

e8130a4

removed extra notify one

062a445

added async_update_ready_ to use with conditional variable to avoid spurious wakeups

fd270ca

call notify_one after setting the async_update_ready

703f91f

add precommit changes

c09339f

add missing this in the conditional variable callback

1f58a2d

change atomic bool to normal bool

c2f7192

add wait_for_update_to_finish method

4255cc3

add notify_one in different places to wait properly to finish update cycle

9f7c0b4

added unique lock and check with try_lock for triggering calls

faba7cd

use try_to_lock for better checking of owning the lock or not

4ffa885

add current update time and period to be able to update them

f575440

update documentation of the async method

4348019

added some comments about the issue with the get_state method

a2469ca

use atomic_bool

ffaad19

minor change in the logic

b7586a2

initialize the duration to fix the compilation error

0772c49

Move the main async logic into a separate method

bf2a2d3

Add async function handler to handle parsed functions

89e6a7b

saikishor and others added 21 commits September 26, 2024 22:55

Use separated arguments for better clarity

7f2aa64

Move the lock within a scope to avoid exclusive unlock within the thread

679fbf9

notify other waiting threads before stopping

d1ac8df

Scope the lock to avoid manual unlock in the trigger_async_update method

4e3dbb1

Update async_update_stop_ within the scope of the lock to avoid missing notify

30be210

Add new conditional variable to have unexpected behavior when working between threads

e54e613

Changes for the starting thread upon activation

47f6b02

remove unused notify at the end of the thread

8525725

simplify the logic inside the thread

4eb74cc

Co-authored-by: atzaros <[email protected]>

add pre-commit formatting changes

a131029

remove the AsyncControllerThread integration to replace it with AsyncFunctionHandler

b4b2afa

move the async function handler to the realtime_tools and integrate from it

ee64bc7

start the async thread when configuring the controller

469682a

remove the get_state bind for API change

f3b5d43

add new API naming changes

408780c

added thread priority argument to be able to set the scheduler priority

8f10adc

stop the thread on cleanup of the controller

a40b005

wait for cycle to finish in the switch controllers before deactivating the controller

aa9a81e

fix the thread_priority parameter declaration type

023a204

change conditioning to not trigger logging for a failed return as well

3bd19bf

Add tests for the async controller

ebb69c5

saikishor force-pushed the async_controllers branch from 6c80b9e to ebb69c5 Compare September 26, 2024 20:55

saikishor and others added 8 commits October 17, 2024 10:40

[Spawner] Add support for wildcard entries in the controller param files (ros-controls#1724)

83fff77

Update controller_interface/src/controller_interface_base.cpp

e38f567

Co-authored-by: Bence Magyar <[email protected]>

Merge branch 'master' into async_controllers

cbc3638

Update controller_interface/include/controller_interface/controller_interface_base.hpp

00228c4

Fixing pre-commit.

df9378f

skip triggeting update cycle before deactivating the async controller

dd89180

fix the testing for the recent change in the deactivation scheme

5a85a24

Add controller update stats to print every 20 seconds when there is a missed trigger

75754de
