
Process multiple targets in single action call and support S3 backend #10

Closed
strophy opened this issue Oct 10, 2023 · 4 comments · Fixed by #25
Labels
enhancement New feature or request

Comments

strophy commented Oct 10, 2023

Hi @AkihiroSuda thank you for picking up maintenance of this important action!

We have added two features on a fork over at https://github.com/dcginfra/buildkit-cache-dance, and I wonder if you would be interested in PRs adding these features to v2 of the action, now that its use is recommended in the official Docker documentation. We have two main changes:

  • Process multiple cache mounts in a single pass by specifying an ID for each mount
  • Support AWS S3 as an alternative cache storage backend

The changes require the user's Dockerfile to be modified with cache IDs like this:

FROM ubuntu:22.04
RUN \
  --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-cache \
  --mount=type=cache,target=/var/lib/apt,sharing=locked,id=apt-lib \
  apt-get update && apt-get install -y gcc

And the action is called something like this:

- name: inject cache mounts into docker
  uses: reproducible-containers/buildkit-cache-dance@mount-id-example
  with:
    mounts: |
      apt-cache
      apt-lib

The main change is in the Dancefile, which is generated on the fly with as many mounts and copy operations as necessary. There is no need to pass the cache-source and cache-target separately anymore because the cache is identified by its unique ID instead, like this:

- name: Prepare list of cache mounts for Dancefile
  uses: actions/github-script@v6
  id: mounts
  with:
    script: |
      const mountIds = `${{ inputs.mounts }}`.split(/[\r\n,]+/)
        .map((mount) => mount.trim())
        .filter((mount) => mount.length > 0);
      
      const cacheMountArgs = mountIds.map((mount) => (
        `--mount=type=cache,sharing=shared,id=${mount},target=/cache-mounts/${mount}`
      )).join(' ');
      
      const s3commands = mountIds.map((mount) => (
        `aws s3 sync --no-follow-symlinks --quiet s3://${{inputs.bucket}}/cache-mounts/${mount} /cache-mounts/${mount}`
      )).join('\n');

      core.setOutput('cacheMountArgs', cacheMountArgs);
      core.setOutput('s3commands', s3commands);
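For reference, the parsing and flag-generation logic in the step above can be exercised outside of github-script as plain Node.js. This is a standalone sketch of the same split/trim/filter chain and `--mount` string construction; the function names are mine, not part of the action:

```javascript
// Parse a newline/comma-separated list of cache mount IDs, as the
// github-script step above does with `inputs.mounts`.
function parseMounts(raw) {
  return raw.split(/[\r\n,]+/)
    .map((mount) => mount.trim())
    .filter((mount) => mount.length > 0);
}

// Build the BuildKit cache-mount flags for the generated Dancefile.
function buildCacheMountArgs(mountIds) {
  return mountIds.map((mount) =>
    `--mount=type=cache,sharing=shared,id=${mount},target=/cache-mounts/${mount}`
  ).join(' ');
}

const ids = parseMounts('apt-cache\napt-lib\n');
console.log(ids); // [ 'apt-cache', 'apt-lib' ]
console.log(buildCacheMountArgs(ids));
```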

- name: Inject cache data into buildx context
  shell: bash
  run: |
    docker build ${{ inputs.cache-source }} --file - <<EOF
    FROM amazon/aws-cli:2.13.17
    COPY buildstamp buildstamp
    RUN ${{ steps.mounts.outputs.cacheMountArgs }} <<EOT
        echo -e '${{ steps.mounts.outputs.s3commands }}' | sh && \
        chmod 777 -R /cache-mounts || true
    EOT
    EOF

The code is currently still written in JS, and is quite tightly bound to S3 (since that is what we need) but I'd love to see features like this supported in the maintained version of the action, since there has been a lot of discussion about this (as I'm sure you're aware). Thoughts?

@AkihiroSuda AkihiroSuda added the enhancement New feature or request label Oct 10, 2023
@AkihiroSuda
Member

Thanks for the proposal, SGTM

  • How will “mounts” work with actions/cache?
  • Do we really need to execute the awscli inside Dockerfile?
  • Probably, composite actions such as github-script cannot be used: Avoid composite action #4

@strophy
Author

strophy commented Oct 10, 2023

  • I'm not sure about this, but I think we can call the GH cache API directly? The action would therefore require two inputs:
    • list of mount ids
    • (optional) cache backend (default to using GHA cache, if using S3 then bucket name is also needed)
  • Executing the cache call directly inside the Dockerfile results in a significant speedup for large caches by removing one of the copy operations, and it uses less disk space because the cache no longer needs to be stored in an intermediate step: the path cache mount -> runner local storage -> external cache becomes cache mount -> external cache directly
  • Yes, this would need to be rewritten in bash

We could probably even go a step further for point 1 and implement Apache OpenDAL as the backend, immediately adding support for a wide range of cloud storage. See https://github.com/everpcpc/actions-cache for an existing implementation of this.
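The two-input interface described in point 1 could look roughly like the following sketch. The names (`backend`, `bucket`) and the GHA-side command are hypothetical placeholders, not an existing API; only the S3 branch mirrors the `s3commands` mapping quoted above:

```javascript
// Hypothetical backend selection for the proposed action inputs:
// backend defaults to the GHA cache; 's3' additionally requires a bucket.
function buildSyncCommands(mountIds, backend = 'gha', bucket) {
  if (backend === 's3') {
    if (!bucket) throw new Error('S3 backend requires a bucket name');
    return mountIds.map((id) =>
      `aws s3 sync --no-follow-symlinks --quiet s3://${bucket}/cache-mounts/${id} /cache-mounts/${id}`
    );
  }
  // The GHA cache backend would go through the cache API rather than a CLI;
  // this placeholder string only marks which mount would be restored.
  return mountIds.map((id) => `restore-gha-cache cache-mounts/${id}`);
}

console.log(buildSyncCommands(['apt-cache'], 's3', 'my-bucket'));
```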

@AkihiroSuda
Member

AkihiroSuda commented Oct 10, 2023

OpenDAL

What about rclone?
https://github.com/rclone/rclone
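If rclone were used instead of the AWS CLI, the per-mount sync commands in the github-script step could be generated the same way as `s3commands`. This sketch assumes an rclone remote has already been configured; the remote name `s3remote` and the function name are placeholders:

```javascript
// Generate one `rclone sync` invocation per cache mount, analogous to the
// s3commands mapping quoted earlier. `remote` must be a pre-configured
// rclone remote (e.g. one backed by S3, GCS, or any other rclone backend).
function buildRcloneCommands(mountIds, remote, bucket) {
  return mountIds.map((id) =>
    `rclone sync ${remote}:${bucket}/cache-mounts/${id} /cache-mounts/${id}`
  ).join('\n');
}

console.log(buildRcloneCommands(['apt-cache', 'apt-lib'], 's3remote', 'my-bucket'));
```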

@strophy
Author

strophy commented Oct 10, 2023

rclone looks perfect!
