Fleet unit uploaded, yet missing #1127

yaronr · 2015-02-11T08:21:53Z

Hi

CoreOS 584.0.0

fleetctl destroy wordpress-sidekick.service
fleetctl start wordpress-sidekick.service

or the full sequence: destroy, submit, load, start.

Expected result: the unit runs
Actual result: the unit ends up in the failed state

Logs from fleet:

Feb 11 08:17:16 ip-10-0-4-135.ec2.internal fleetd[613]: ERROR manager.go:147: Failed to trigger systemd unit wordpress-sidekick.service stop: Unit wordpress-sidekick.service not loaded.
Feb 11 08:17:16 ip-10-0-4-135.ec2.internal fleetd[613]: INFO manager.go:275: Removing systemd unit wordpress-sidekick.service
Feb 11 08:17:16 ip-10-0-4-135.ec2.internal fleetd[613]: INFO reconcile.go:321: AgentReconciler completed task: type=UnloadUnit job=wordpress-sidekick.service reason="unit loaded but not scheduled here"
Feb 11 08:18:26 ip-10-0-4-135.ec2.internal fleetd[613]: INFO manager.go:262: Writing systemd unit wordpress-sidekick.service (740b)
Feb 11 08:18:26 ip-10-0-4-135.ec2.internal fleetd[613]: INFO reconcile.go:321: AgentReconciler completed task: type=LoadUnit job=wordpress-sidekick.service reason="unit scheduled here but not loaded"
Feb 11 08:18:58 ip-10-0-4-135.ec2.internal fleetd[613]: ERROR manager.go:136: Failed to trigger systemd unit wordpress-sidekick.service start: Unit wordpress-sidekick.service failed to load: No such file or directory.
Feb 11 08:18:58 ip-10-0-4-135.ec2.internal fleetd[613]: INFO reconcile.go:321: AgentReconciler completed task: type=StartUnit job=wordpress-sidekick.service reason="unit currently loaded but desired state is launched"
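
For readers following the log above: the unit is unloaded, rewritten and loaded at 08:18:26, and the start then fails at 08:18:58 even though the LoadUnit task completed. A minimal Python sketch for extracting that timeline from journal lines in this format follows. The regular expression and the name `fleet_timeline` are assumptions derived only from the lines quoted in this issue; this is not a fleet tool.

```python
import re

# Layout assumed from the lines quoted in this issue:
# "<Mon DD HH:MM:SS> <host> fleetd[<pid>]: <LEVEL> <file>.go:<line>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ fleetd\[\d+\]: "
    r"(?P<level>[A-Z]+) (?P<src>\S+\.go:\d+): (?P<msg>.*)$"
)


def fleet_timeline(lines, unit):
    """Return (timestamp, level, message) tuples for fleetd lines that mention `unit`."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and unit in match.group("msg"):
            events.append((match.group("ts"), match.group("level"), match.group("msg")))
    return events


# Two lines copied from the log above, used as sample input.
SAMPLE = [
    "Feb 11 08:18:26 ip-10-0-4-135.ec2.internal fleetd[613]: INFO manager.go:262: "
    "Writing systemd unit wordpress-sidekick.service (740b)",
    "Feb 11 08:18:58 ip-10-0-4-135.ec2.internal fleetd[613]: ERROR manager.go:136: "
    "Failed to trigger systemd unit wordpress-sidekick.service start: "
    "Unit wordpress-sidekick.service failed to load: No such file or directory.",
]

for ts, level, msg in fleet_timeline(SAMPLE, "wordpress-sidekick.service"):
    print(ts, level, msg)
```

Filtering the result on `level == "ERROR"` isolates the failed start, which makes it easier to compare timestamps against the preceding "Writing systemd unit" line.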

unit file details:
Wants=etcd.service
After=etcd.service

BindsTo=wordpress.service
After=wordpress.service

Restart=always

(wordpress and etcd units are up and running)

bcwaldon · 2015-02-11T20:08:26Z

Likely related to #900

tom-pryor · 2015-02-19T16:37:01Z

I can't actually get any fleet units to run in my Vagrant environment.

I run:

fleetctl start syslog

Listing units:

core@core-01 ~ $ fleetctl list-units
UNIT            MACHINE                     ACTIVE      SUB
syslog.service  0a805687.../172.17.8.103    inactive    dead

Error in log:

core-03 fleetd[915]: ERROR manager.go:136: Failed to trigger systemd unit syslog.service start: Unit syslog.service failed to load: No such file or directory.

After SSHing into core-03, I can see that syslog.service is present in /run/fleet/units, but even starting it manually fails:

core@core-03 ~ $ sudo systemctl start syslog
Failed to start syslog.service: Unit syslog.service failed to load: No such file or directory.

tom-pryor · 2015-02-19T17:11:44Z

Looking at commit 4c23412, if I add NeedDaemonReload=true then the unit runs fine. @jonboulle could you please clarify when a reload is necessary?

robszumski · 2015-02-19T18:08:05Z

I've seen this pop up several times on IRC in the past few days. I think we need to look at reverting this change.

sylus · 2015-02-20T22:03:57Z

Just confirming this as well. My fleet units worked on 522.6, but I just updated to 598.0 (to test SMB support) and they no longer work; they just report "failed to load: No such file or directory".

Looking through the issue queue, I see a few other issues that probably describe the same problem.

Is this the definitive issue that is tracking this regression?

guruvan · 2015-02-20T23:09:05Z

@robszumski we'll try NeedDaemonReload=true and see how that works for now, but this is a really debilitating issue. I've already noted it on a couple of other issues.

bdehamer · 2015-02-23T16:30:58Z

Seeing something similar to this issue on 557.2.0. I've got a few units that will fail to start about 50% of the time. I've got a script that demonstrates the issue pretty consistently:

https://gist.github.com/bdehamer/9c5303f7ef9d463a9134

This script loops, destroying, submitting, loading, and starting the units over and over (you need the Fleet API bound to TCP port 49153 for the script to work). Things usually run fine the first time through, but fail on the second iteration with a not-found error:

core@core-01 ~/dev $ fleetctl status DB.service
● DB.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead) since Mon 2015-02-23 16:29:25 UTC; 10s ago
 Main PID: 5605 (code=exited, status=0/SUCCESS)
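
For anyone who wants to try the same cycle without the HTTP API, here is a rough Python sketch that drives the `fleetctl` CLI instead. It is not the gist's script: the function names, the `settle` delay, and the whitespace-based parsing of `fleetctl list-units` output are all assumptions made for illustration.

```python
import subprocess
import time


def cycle_commands(unit_file, unit_name):
    """The four fleetctl steps described in this thread, in order."""
    return [
        ["fleetctl", "destroy", unit_name],
        ["fleetctl", "submit", unit_file],
        ["fleetctl", "load", unit_name],
        ["fleetctl", "start", unit_name],
    ]


def parse_list_units(output):
    """Map unit name -> (ACTIVE, SUB), assuming the four-column layout shown earlier."""
    states = {}
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 4:
            states[fields[0]] = (fields[2], fields[3])
    return states


def run_cycles(unit_file, unit_name, cycles=5, settle=10, run=subprocess.run):
    """Repeat destroy/submit/load/start and count cycles where the unit is not active."""
    failures = 0
    for _ in range(cycles):
        for cmd in cycle_commands(unit_file, unit_name):
            run(cmd, check=False)  # keep going even if one step reports an error
        time.sleep(settle)  # hypothetical grace period before checking state
        listing = run(["fleetctl", "list-units"], capture_output=True, text=True)
        active, _sub = parse_list_units(listing.stdout).get(unit_name, ("missing", ""))
        if active != "active":
            failures += 1
    return failures
```

Calling `run_cycles("DB.service", "DB.service")` on a machine with `fleetctl` configured would return how many of the five cycles left the unit in a non-active state. The `run` parameter exists only so the loop can be exercised with a stub instead of the real CLI.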

bcwaldon · 2015-02-23T17:45:19Z

I'm actively investigating this bug.

Quick note here: NeedDaemonReload is a property that systemd exposes over its D-Bus interface; it is not something a user can set in a unit file.
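
To illustrate the distinction, the property can be read (not set) on an affected machine with `systemctl show`. The sketch below is a hedged Python example: the helper names `parse_show` and `unit_needs_attention` are hypothetical, and treating "LoadState is not-found or NeedDaemonReload is yes" as the signal worth checking is an assumption, not something confirmed in this thread.

```python
import subprocess


def parse_show(output):
    """Turn the KEY=VALUE lines printed by `systemctl show` into a dict."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def unit_needs_attention(unit, run=subprocess.run):
    """True if systemd reports the unit as not-found or as needing a daemon reload."""
    result = run(
        ["systemctl", "show", "-p", "LoadState", "-p", "NeedDaemonReload", unit],
        capture_output=True,
        text=True,
    )
    props = parse_show(result.stdout)
    return props.get("LoadState") == "not-found" or props.get("NeedDaemonReload") == "yes"
```

If this returns True while the unit file is present under /run/fleet/units, running `sudo systemctl daemon-reload` makes systemd re-read unit files from disk. Whether that is a safe stopgap for this particular bug is an assumption, so treat it as something to test rather than a recommended fix.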

yaronr · 2015-02-25T14:05:07Z

@bcwaldon any updates?
I'm seeing this all the time.
Is there at least a workaround?

Thanks

akaspin · 2015-02-25T19:32:55Z

Bump.

rufman · 2015-02-25T19:34:34Z

I would also be interested in a workaround. It seems like v0.8 doesn't have this issue (or it's not as pervasive).

guruvan · 2015-02-25T22:13:22Z

@bcwaldon This is something my crew needs to fix in the next week - we've got production systems and this is a sleep-loser because of constant service outages. A workaround would be fine for the next couple weeks while y'all figure this out - "CrashWhenAdminAwake=true" would also be good. ;) - I'm a little wary of just writing a script to blanket-restart all these services as they get stopped by fleet.

bcwaldon · 2015-03-02T20:49:30Z

Please see #1134

bcwaldon added the bug label Feb 11, 2015

bcwaldon added this to the v0.10.0 milestone Feb 21, 2015

bcwaldon mentioned this issue Feb 27, 2015

Ordered task execution #1134

Merged

bcwaldon closed this as completed in #1134 Feb 27, 2015

bcwaldon modified the milestones: v0.9.1, v0.10.0 Apr 14, 2015
