ImagePartitionAction: retry losetup.Attach()
losetup.Attach() can fail due to concurrent attaches in other processes
as seen in go-debos#522.

The problem is a race condition between finding a free loop device
and attaching the image.

Now that we have go-losetup v2, which does report the error, we can do
what util-linux does
(https://github.com/util-linux/util-linux/blob/4c4b248c68149089c8be2f830214bb2be693307e/sys-utils/losetup.c#L662)
and retry on failure.

I only sleep for 200 ms as opposed to 1 second as in
https://github.com/go-debos/debos/blob/78aad24dc068ec2aac0355c165f760b953379b8f/actions/image_partition_action.go#L668
because the race condition should immediately resolve without waiting
at all.

I still sleep for 200 ms as this is what util-linux does to
prevent spinning (util-linux/util-linux@3ff6fb8).

Fixes: go-debos#522
Jakob Unterwurzacher committed Dec 2, 2024
1 parent 78aad24 commit ba76904
Showing 1 changed file with 11 additions and 1 deletion.
actions/image_partition_action.go (11 additions, 1 deletion)
@@ -459,7 +459,17 @@ func (i *ImagePartitionAction) PreNoMachine(context *debos.DebosContext) error {

 	img.Close()
 
-	i.loopDev, err = losetup.Attach(imagePath, 0, false)
+	// losetup.Attach() can fail due to concurrent attaches in other processes
+	retries := 60
+	for t := 1; t <= retries; t++ {
+		i.loopDev, err = losetup.Attach(imagePath, 0, false)
+		if err == nil {
+			break
+		}
+		log.Printf("Setup loop device: try %d/%d failed: %v", t, retries, err)
+		time.Sleep(200 * time.Millisecond)
+	}
+
 	if err != nil {
 		return fmt.Errorf("Failed to setup loop device")
 	}