Gazebo freezes when YARP_CLOCK is set #526

Closed
randaz81 opened this issue Jan 11, 2021 · 9 comments · Fixed by #537

Comments

@randaz81
Member

randaz81 commented Jan 11, 2021

If I set the env variable:
YARP_CLOCK=/clock
and then I start Gazebo with:
gazebo -s libgazebo_yarp_clock.so
then the system immediately freezes (the gazebo gui does not appear).

So, currently, the only way to work with simulated time is to run all processes with YARP_CLOCK=/clock, with the sole exception of Gazebo, which must have this environment variable unset.
Thus the environment variable YARP_CLOCK cannot be put into a .bashrc file.

randaz81 changed the title from "Gazebo stucks when YARP_CLOCK is set" to "Gazebo freezes when YARP_CLOCK is set" on Jan 11, 2021
@traversaro
Member

Back in 2015/2016 this was not the case, but at some point it started happening. The probable cause is a yarp::os::Time::delay call in some callback of Gazebo's physics update, which causes a deadlock: yarp::os::Time::delay waits for Gazebo time to pass before returning, but Gazebo's physics time cannot advance until the physics update completes, and the update is blocked by yarp::os::Time::delay.
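
The circular wait described above can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: `SimClock`, `simulatedDelay`, and the `whileWaiting` hook are hypothetical names, not YARP or Gazebo APIs, and the poll limit stands in for a wait that would otherwise never end.

```cpp
#include <functional>

// Hypothetical stand-in for the simulated time published by the physics engine.
struct SimClock {
    double now = 0.0;
};

// Model of a delay that waits on simulated time rather than wall time.
// whileWaiting represents whatever else makes progress in the process while
// the caller polls. Returns true if the delay elapsed, false if maxPolls was
// exhausted (a real blocking delay would hang instead of returning).
inline bool simulatedDelay(SimClock& clock, double seconds, int maxPolls,
                           const std::function<void(SimClock&)>& whileWaiting) {
    const double deadline = clock.now + seconds;
    for (int i = 0; i < maxPolls; ++i) {
        if (clock.now >= deadline) {
            return true;
        }
        whileWaiting(clock);
    }
    return clock.now >= deadline;
}

// Case 1: the delay runs inside the physics update, so nothing advances time.
inline bool delayInsidePhysicsUpdate() {
    SimClock clock;
    return simulatedDelay(clock, 0.01, 1000, [](SimClock&) { /* physics is blocked */ });
}

// Case 2: the delay runs elsewhere while the physics keeps stepping.
inline bool delayWhilePhysicsRuns() {
    SimClock clock;
    return simulatedDelay(clock, 0.01, 1000, [](SimClock& c) { c.now += 0.001; });
}
```

In this model `delayInsidePhysicsUpdate()` never sees time advance and gives up, while `delayWhilePhysicsRuns()` completes, which is the difference between the deadlock and the normal case.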

Is this happening in the case of empty world?

@randaz81
Member Author

Yes, it freezes just with:
gazebo -s libgazebo_yarp_clock.so
So the world is empty and the only plugin loaded is the clock

@traversaro
Member

Then it is possible that there is something going on with the Network constructors or the checkNetwork method?

@drdanz
Member

drdanz commented Jan 13, 2021

I don't remember changing anything...
A bisect is probably the best way to find where this issue was introduced

@traversaro
Member

> I don't remember changing anything...
> A bisect is probably the best way to find where this issue was introduced

I am afraid that bisecting will not help, as gazebo-yarp-plugins will not compile against a YARP older than 2018/2019. It is probably easier to check where the program is blocked after it is launched via gazebo -s libgazebo_yarp_clock.so.

@xEnVrE
Contributor

xEnVrE commented Jan 13, 2021

To add some context, when launching Gazebo with that configuration, the terminal shows:

[INFO] |yarp.os.Port| Port /iiticublap202/gzserver/44642/clock:i active at tcp://192.168.1.5:10005/
[INFO] |yarp.os.Network| Success: port-to-port persistent connection added.
[ERROR] |yarp.os.NetworkClock| Cannot find time port "/clock" or a time topic "/clock@"
[INFO] |yarp.os.Time| Waiting for clock server to start broadcasting data ...
[INFO] |yarp.os.Time| Waiting for clock server to start broadcasting data ...
[INFO] |yarp.os.Time| Waiting for clock server to start broadcasting data ...
[INFO] |yarp.os.Time| Waiting for clock server to start broadcasting data ...
[INFO] |yarp.os.Time| Waiting for clock server to start broadcasting data ...
[INFO] |yarp.os.Time| Waiting for clock server to start broadcasting data ...
[INFO] |yarp.os.Time| Waiting for clock server to start broadcasting data ...
[INFO] |yarp.os.Time| Waiting for clock server to start broadcasting data ...
(...)

It seems that this is happening in

Time::useNetworkClock(const std::string& clock, const std::string& localPortName)

and is part of the initialization of the object yarp::os::Network in

m_network = new yarp::os::Network();

The yarp::os::Network object will try to read from a network clock that will never become available, because the configuration of the clock plugin itself gets stuck in GazeboYarpClock::Load.

The sequence of calls should be the following:

yarp::os::Network::Network()
yarp::os::Network::init()
yarp::os::Network::init(yarp::os::YARP_CLOCK_DEFAULT)
yarp::os::NetworkBase::initMinimum(yarp::os::YARP_CLOCK_DEFAULT)
yarp::os::NetworkBase::yarpClockInit(yarp::os::YARP_CLOCK_DEFAULT) (where the env variable YARP_CLOCK is detected)
Time::useNetworkClock("YARP_CLOCK")
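
The decision made near the end of this call chain can be modeled as follows. This is a simplified sketch, not YARP's implementation: `ClockKind`, `ClockChoice`, and `resolveDefaultClock` are hypothetical names, and the fallback to the system clock when YARP_CLOCK is unset or empty is an assumption.

```cpp
#include <string>

// Hypothetical model types; these are not part of the YARP API.
enum class ClockKind { System, Network };

struct ClockChoice {
    ClockKind kind;
    std::string portName;  // only meaningful for ClockKind::Network
};

// Model of the "default clock" branch. envValue is what
// std::getenv("YARP_CLOCK") would return (nullptr when the variable is unset).
// Assumption: an unset or empty variable selects the system clock.
inline ClockChoice resolveDefaultClock(const char* envValue) {
    if (envValue == nullptr || *envValue == '\0') {
        return {ClockKind::System, ""};
    }
    return {ClockKind::Network, envValue};
}
```

With YARP_CLOCK=/clock the model selects the network clock on port /clock, which is the branch that then waits for a clock server that has not started yet.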

@traversaro
Member

traversaro commented Jan 13, 2021

Thanks @xEnVrE, after your comment I am able to connect the dots. The regression was probably caused by robotology/yarp#1277, which introduced the early clock initialization. Before that, the network clock was lazily initialized, so the plugin was able to load correctly even if YARP_CLOCK was set to /clock.

@traversaro
Member

traversaro commented Jan 13, 2021

A possible solution is to switch to the system clock after yarp::os::Network has been initialized, and to switch back to the default clock once /clock has been created, by calling yarp::os::NetworkBase::yarpClockInit(YARP_CLOCK_DEFAULT). This should work fine unless another Gazebo plugin calls one of those methods and perturbs the process-global YARP clock, but I think that is OK.
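
The ordering proposed here can be compared with the failing one in a small mock. This is a sketch only: `MockProcess`, `loadEagerly`, and `loadWithTemporarySystemClock` are hypothetical names that model the process-global clock state, and none of them are YARP or gazebo-yarp-plugins functions.

```cpp
enum class ClockMode { System, Network };

// Hypothetical mock of the process-global clock state; not the YARP API.
struct MockProcess {
    bool clockPortOpen = false;
    ClockMode mode = ClockMode::System;

    // Selecting the network clock succeeds only if the clock port already
    // exists. Returning false models the endless
    // "Waiting for clock server to start broadcasting data" loop.
    bool selectClock(ClockMode requested) {
        if (requested == ClockMode::Network && !clockPortOpen) {
            return false;
        }
        mode = requested;
        return true;
    }

    void openClockPort() { clockPortOpen = true; }
};

// Failing order: the network clock is requested before the plugin opens /clock.
inline bool loadEagerly(MockProcess& p) {
    if (!p.selectClock(ClockMode::Network)) {
        return false;  // the plugin would be stuck here
    }
    p.openClockPort();
    return true;
}

// Proposed order: system clock first, open /clock, then switch back.
inline bool loadWithTemporarySystemClock(MockProcess& p) {
    if (!p.selectClock(ClockMode::System)) {
        return false;
    }
    p.openClockPort();
    return p.selectClock(ClockMode::Network);
}
```

In the mock, the eager order fails because the port that would provide the time does not exist yet, while the temporary system clock lets the port be created before the network clock is selected.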

traversaro added a commit that referenced this issue Feb 11, 2021
Fix #526

Add also regression test for the issue
@traversaro
Member

A fix for this issue is provided in #537.
