This repository has been archived by the owner on Jan 30, 2020. It is now read-only.

Fleet starts units before required ones are loaded after a reboot #1003

Closed
digital-wonderland opened this issue Oct 26, 2014 · 1 comment · Fixed by #1134

Comments

@digital-wonderland

I'm trying to run an Elasticsearch cluster with fleet. Each [email protected] unit has -data and -discovery sidekicks, and the main unit declares a Requires= on both of them.

To get it up and running, I first load all necessary units and then start the [email protected] units, which in turn start their respective sidekicks.

This works nicely, except after a reboot, when fleet tries to start the main service before its sidekicks are loaded:

Oct 24 21:32:51 core-02 fleetd[591]: INFO server.go:148: Starting server components
Oct 24 21:32:54 core-02 fleetd[591]: INFO manager.go:218: Writing systemd unit [email protected] (1343b)
Oct 24 21:32:54 core-02 fleetd[591]: INFO manager.go:142: Instructing systemd to reload units
Oct 24 21:32:54 core-02 fleetd[591]: INFO reconcile.go:274: AgentReconciler completed task: type=LoadUnit [email protected] reason="unit scheduled here but not loaded"
Oct 24 21:32:54 core-02 fleetd[591]: ERROR manager.go:80: Failed to trigger systemd unit [email protected] start: Unit [email protected] failed to load: No such file or directory.
Oct 24 21:32:54 core-02 fleetd[591]: INFO reconcile.go:274: AgentReconciler completed task: type=StartUnit [email protected] reason="unit currently loaded but desired state is launched"
Oct 24 21:32:59 core-02 fleetd[591]: INFO manager.go:218: Writing systemd unit [email protected] (370b)
Oct 24 21:32:59 core-02 fleetd[591]: INFO manager.go:142: Instructing systemd to reload units
Oct 24 21:32:59 core-02 fleetd[591]: INFO manager.go:218: Writing systemd unit [email protected] (515b)
Oct 24 21:32:59 core-02 fleetd[591]: INFO reconcile.go:274: AgentReconciler completed task: type=LoadUnit [email protected] reason="unit scheduled here but not loaded"
Oct 24 21:32:59 core-02 fleetd[591]: INFO manager.go:142: Instructing systemd to reload units
Oct 24 21:32:59 core-02 fleetd[591]: INFO reconcile.go:274: AgentReconciler completed task: type=LoadUnit [email protected] reason="unit scheduled here but not loaded"

To reproduce this, follow the instructions from here. After the cluster is up, restart one of the hosts.

My uneducated guess is that, on startup, fleet should first load all units that are scheduled for a host before trying to start any of them.
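A minimal sketch of that ordering, to make the idea concrete. This is not fleet's actual reconciler code: the `Task` type, the `order_tasks` helper, and the unit names are all hypothetical, and only the task kinds `LoadUnit` and `StartUnit` are taken from the log above. It assumes the pending tasks for a host are available as one list before any of them is executed.

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    kind: str  # "LoadUnit" or "StartUnit", as seen in the fleetd log
    unit: str  # hypothetical unit name, for illustration only


# Lower number runs first, so every load precedes every start.
PRIORITY = {"LoadUnit": 0, "StartUnit": 1}


def order_tasks(tasks: List[Task]) -> List[Task]:
    """Put all loads ahead of all starts.

    sorted() is stable, so tasks of the same kind keep their original
    relative order.
    """
    return sorted(tasks, key=lambda t: PRIORITY[t.kind])


# Same interleaving as in the log: the main unit is loaded and started
# before its sidekicks have been loaded.
tasks = [
    Task("LoadUnit", "main.service"),
    Task("StartUnit", "main.service"),
    Task("LoadUnit", "main-data.service"),
    Task("LoadUnit", "main-discovery.service"),
]

ordered = order_tasks(tasks)
for t in ordered:
    print(t.kind, t.unit)
```

With this ordering, the start of the main unit is only triggered once both sidekick unit files have been written, so systemd should be able to resolve the Requires= dependencies at that point.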

This also seems similar to #997, except that here the sidekicks are only loaded by fleet, not started (the Requires= in the main unit takes care of starting them).

@bcwaldon
Contributor

bcwaldon commented Mar 2, 2015

Please see #1134
