Skip to content

Commit

Permalink
Fix/mounting squashfs extract (#514)
Browse files Browse the repository at this point in the history
* refactor: simplify unpackOne, drop unused param to ExtractSingleSquash

unpackOne required the caller to provide it with a boolean 'isSquashfs'
which then made each caller have to consider the layer type.

Update unpackOne to take a Descriptor and let it do the
determination of how to unpack the layer in a single place.

The result of calling unpackLayer is either error, or the contents
available at the provided path.  The caller does not have to check
if the content is already present.

Also here, drop the 'storage' parameter to ExtractSingleSquash
that had become unused.

Signed-off-by: Scott Moser <[email protected]>

* fix: Update ExtractSingleSquash, adding policy.

Overall, there are 3 "good things" done by this change:
1. Fix bug in the current code which tries mounting with each option
   every time.  The problem with doing that is really that the kernel
   mount option didn't work very well.  It would fail with "already
   mounted", and then squashfuse would end up getting used to mount over
   the top.

2. Fix race conditions in the current code.

   overlay.Unpack starts a thread pool and tries to unpack all layers
   at once.  That is fine if there are no duplicate layers.  But
      if there are duplicate layers used by a stacker.yaml file, then
   there were races on extraction.  The end result really was that things
   would get mounted more than once.

   Example stacker that shows this:

    l1:
      from:
        type: docker
        url: docker://busybox:latest
      run: |
        echo build layer 1

    l2:
      from:
        type: docker
        url: docker://busybox:latest
      run: |
        echo build layer 1

  There, the busybox layer would get extracted multiple times.

  The code here has a single lock on ExtractSingleSquash, it would
  be better to have lock being taken per extractDir.

3. Allow the user to control the list of extractors.

   If they knew that they could not use kernel mounts (or could, but
   didn't want to) or wanted to use unsquashfs they can now do that.

   STACKER_SQUASHFS_EXTRACT_POLICY=kmount stacker build ..

   or

   STACKER_SQUASHFS_EXTRACT_POLICY="squashfuse kmount" stacker build ...

   This adds a SquashExtractor interface, with 3 implementers
   (KernelExtractor, SquashFuseExtractor, UnsquashfsExtractor).

   A ExtractPolicy is basically a list of Extractors to try.
   The first time ExtractPolicy is used it will try each of the Extractors
   in order.  It then stores the result in .Extractor and uses that
   subsequently.

Signed-off-by: Scott Moser <[email protected]>

* fix: Improve and add some debug messages in overlay storage.

This seeks to improve some of the existing debug messages and add some
additional debug in the overlay storage.

Signed-off-by: Scott Moser <[email protected]>

---------

Signed-off-by: Scott Moser <[email protected]>
  • Loading branch information
smoser authored Oct 12, 2023
1 parent 4016f63 commit 565b032
Show file tree
Hide file tree
Showing 4 changed files with 390 additions and 124 deletions.
21 changes: 3 additions & 18 deletions pkg/overlay/overlay.go
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,6 @@ import (
ispec "github.com/opencontainers/image-spec/specs-go/v1"
"github.com/pkg/errors"
"golang.org/x/sys/unix"
"stackerbuild.io/stacker/pkg/mount"
"stackerbuild.io/stacker/pkg/squashfs"
"stackerbuild.io/stacker/pkg/types"
)

Expand Down Expand Up @@ -168,23 +166,10 @@ func (o *overlay) snapshot(source string, target string) error {
for _, m := range ovl.Manifests {
manifest = m
}
cacheDir := path.Join(o.config.StackerDir, "layer-bases", "oci")
ociDir := path.Join(o.config.StackerDir, "layer-bases", "oci")
for _, layer := range manifest.Layers {
if !squashfs.IsSquashfsMediaType(layer.MediaType) {
continue
}
digest := layer.Digest
contents := overlayPath(o.config.RootFSDir, digest, "overlay")
mounted, err := mount.IsMountpoint(contents)
if err == nil && mounted {
// We have already mounted this atom
continue
}
if hasDirEntries(contents) {
// We have done an unsquashfs of this atom
continue
}
if err := unpackOne(cacheDir, contents, digest, true); err != nil {
err := unpackOne(layer, ociDir, overlayPath(o.config.RootFSDir, layer.Digest, "overlay"))
if err != nil {
return errors.Wrapf(err, "Failed mounting %#v", layer)
}
}
Expand Down
118 changes: 69 additions & 49 deletions pkg/overlay/pack.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ import (
"path/filepath"
"runtime"
"strings"
"sync"
"time"

"github.com/klauspost/pgzip"
Expand All @@ -29,6 +30,8 @@ import (
"stackerbuild.io/stacker/pkg/types"
)

var tarEx sync.Mutex

func safeOverlayName(d digest.Digest) string {
// dirs used in overlay lowerdir args can't have : in them, so lets
// sanitize it
Expand Down Expand Up @@ -57,36 +60,19 @@ func (o *overlay) Unpack(tag, name string) error {

pool := NewThreadPool(runtime.NumCPU())

for _, layer := range manifest.Layers {
digest := layer.Digest
contents := overlayPath(o.config.RootFSDir, digest, "overlay")
if squashfs.IsSquashfsMediaType(layer.MediaType) {
// don't really need to do this in parallel, but what
// the hell.
pool.Add(func(ctx context.Context) error {
return unpackOne(cacheDir, contents, digest, true)
})
} else {
switch layer.MediaType {
case ispec.MediaTypeImageLayer:
fallthrough
case ispec.MediaTypeImageLayerGzip:
// don't extract things that have already been
// extracted
if _, err := os.Stat(contents); err == nil {
continue
}

// TODO: when the umoci API grows support for uid
// shifting, we can use the fancier features of context
// cancelling in the thread pool...
pool.Add(func(ctx context.Context) error {
return unpackOne(cacheDir, contents, digest, false)
})
default:
return errors.Errorf("unknown media type %s", layer.MediaType)
}
seen := map[digest.Digest]bool{}
for _, curLayer := range manifest.Layers {
// avoid calling unpackOne twice for the same digest
if seen[curLayer.Digest] {
continue
}
seen[curLayer.Digest] = true

// copy layer to avoid race on pool access.
l := curLayer
pool.Add(func(ctx context.Context) error {
return unpackOne(l, cacheDir, overlayPath(o.config.RootFSDir, l.Digest, "overlay"))
})
}

pool.DoneAddingJobs()
Expand Down Expand Up @@ -154,6 +140,9 @@ func ConvertAndOutput(config types.StackerConfig, tag, name string, layerType ty
return err
}

log.Debugf("new oci layer %s [%s] created from path %s as part of %s:%s",
desc.Digest, layerType, overlayPath(config.RootFSDir, theLayer.Digest), name, tag)

// slight hack, but this is much faster than a cp, and the
// layers are the same, just in different formats
err = os.Symlink(overlayPath(config.RootFSDir, theLayer.Digest), overlayPath(config.RootFSDir, desc.Digest))
Expand Down Expand Up @@ -232,7 +221,8 @@ func (o *overlay) initializeBasesInOutput(name string, layerTypes []types.LayerT
return err
}
} else {
log.Debugf("converting between %v and %v", sourceLayerType, layerType)
log.Debugf("creating oci image %s (type=%s) by converting %s (type=%s)",
layerType.LayerName(name), layerType, sourceLayerType.LayerName(name), sourceLayerType)
err = ConvertAndOutput(o.config, cacheTag, name, layerType)
if err != nil {
return err
Expand Down Expand Up @@ -447,6 +437,7 @@ func generateLayer(config types.StackerConfig, oci casext.Engine, mutators []*mu
return false, errors.Wrapf(err, "couldn't make new layer overlay dir")
}

log.Debugf("renaming %s -> %s", dir, path.Join(target, "overlay"))
err = os.Rename(dir, path.Join(target, "overlay"))
if err != nil {
if !os.IsExist(err) {
Expand Down Expand Up @@ -477,6 +468,7 @@ func generateLayer(config types.StackerConfig, oci casext.Engine, mutators []*mu
for _, desc := range descs[1:] {
linkPath := overlayPath(config.RootFSDir, desc.Digest)

log.Debugf("link %s -> %s", linkPath, target)
err = os.Symlink(target, linkPath)
if err != nil {
// as above, this symlink may already exist; if it does, we can
Expand Down Expand Up @@ -548,7 +540,7 @@ func repackOverlay(config types.StackerConfig, name string, layerTypes []types.L
mutators = append(mutators, mutator)
}

log.Debugf("Generating overlay_dirs layers")
log.Debugf("Generating overlay_dirs layers for %s", name)
mutated := false
for i, layerType := range layerTypes {
ods, ok := ovl.OverlayDirLayers[layerType]
Expand Down Expand Up @@ -655,29 +647,57 @@ func repackOverlay(config types.StackerConfig, name string, layerTypes []types.L
return ovl.write(config, name)
}

func unpackOne(ociDir string, bundlePath string, digest digest.Digest, isSquashfs bool) error {
if isSquashfs {
return squashfs.ExtractSingleSquash(
path.Join(ociDir, "blobs", "sha256", digest.Encoded()),
bundlePath, "overlay")
// unpackOne - unpack a single layer (Descriptor) found in ociDir to extractDir
//
// The result of calling unpackOne is either error or the contents available
// at the provided extractDir. The extractDir should be either empty or
// fully populated with this layer.
func unpackOne(l ispec.Descriptor, ociDir string, extractDir string) error {
// population of a dir is not atomic, at least for tar extraction.
// As a result, we could hasDirEntries(extractDir) at the same time that
// something is un-populating that dir due to a failed extraction (like
// os.RemoveAll below).
// There needs to be a lock on the extract dir (scoped to the overlay storage backend).
// A sync.RWMutex would work well here since it is safe to check as long
// as no one is populating or unpopulating.
if hasDirEntries(extractDir) {
// the directory was already populated.
return nil
}

oci, err := umoci.OpenLayout(ociDir)
if err != nil {
return err
if squashfs.IsSquashfsMediaType(l.MediaType) {
return squashfs.ExtractSingleSquash(
path.Join(ociDir, "blobs", "sha256", l.Digest.Encoded()), extractDir)
}
defer oci.Close()
switch l.MediaType {
case ispec.MediaTypeImageLayer, ispec.MediaTypeImageLayerGzip:
tarEx.Lock()
defer tarEx.Unlock()

compressed, err := oci.GetBlob(context.Background(), digest)
if err != nil {
return err
}
defer compressed.Close()
oci, err := umoci.OpenLayout(ociDir)
if err != nil {
return err
}
defer oci.Close()

uncompressed, err := pgzip.NewReader(compressed)
if err != nil {
compressed, err := oci.GetBlob(context.Background(), l.Digest)
if err != nil {
return err
}
defer compressed.Close()

uncompressed, err := pgzip.NewReader(compressed)
if err != nil {
return err
}

err = layer.UnpackLayer(extractDir, uncompressed, nil)
if err != nil {
if rmErr := os.RemoveAll(extractDir); rmErr != nil {
log.Errorf("Failed to remove dir '%s' after failed extraction: %v", extractDir, rmErr)
}
}
return err
}

return layer.UnpackLayer(bundlePath, uncompressed, nil)
return errors.Errorf("unknown media type %s", l.MediaType)
}
Loading

0 comments on commit 565b032

Please sign in to comment.