automate peerpods image upload & use that in Terraform configs #931

Merged on Oct 21, 2024 (7 commits)

Conversation

Freax13
Contributor

@Freax13 Freax13 commented Oct 15, 2024

This PR adds a script/just target for automating the image upload of peerpod VM images and adjusts our Terraform configs to use the image id of the uploaded image.

@Freax13 added the "no changelog" label (PRs not listed in the release notes) on Oct 15, 2024
We don't want to hard-code this.
@katexochen
Member

I think the caching isn't correct the way it is currently implemented: if I upload /nix/store/AAA-image-podvm and then /nix/store/BBB-image-podvm, the cache will contain two files, each named after one of the Nix store paths, both containing the same image reference. If I revert the change from BBB back to AAA and want to redeploy, the upload will check the cache, find a file for the path AAA with the image reference, and assume the image is already uploaded, but that reference now actually points to the image from the BBB path.
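The failure mode described in this comment can be reproduced with a small simulation. This is only a sketch under stated assumptions: the dictionaries stand in for the on-disk cache and the Azure gallery, and `FIXED_REFERENCE` and `upload` are hypothetical names, not the PR's actual code.

```python
# Simulates the cache described above: one cache entry per store path,
# but a single fixed image reference that every upload overwrites.
FIXED_REFERENCE = "example-gallery/podvm/0.1.0"  # hypothetical reference

gallery = {}  # image reference -> store path currently stored under it
cache = {}    # store path -> image reference recorded at upload time


def upload(store_path):
    """Upload the image unless the cache already has an entry for this path."""
    if store_path in cache:
        return cache[store_path]           # cache hit: the upload is skipped
    gallery[FIXED_REFERENCE] = store_path  # overwrites the previous image
    cache[store_path] = FIXED_REFERENCE
    return FIXED_REFERENCE


upload("/nix/store/AAA-image-podvm")
upload("/nix/store/BBB-image-podvm")
ref = upload("/nix/store/AAA-image-podvm")  # cache hit, nothing is uploaded

# The cache claims AAA is uploaded, but the reference holds the BBB image.
stale = gallery[ref] != "/nix/store/AAA-image-podvm"
print(stale)
```

The cache key changes with the store path while the reference does not, so a cache hit says nothing about what is currently stored under that reference.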

@Freax13
Contributor Author

Freax13 commented Oct 17, 2024

I think the caching isn't correct the way it is currently implemented: if I upload /nix/store/AAA-image-podvm and then /nix/store/BBB-image-podvm, the cache will contain two files, each named after one of the Nix store paths, both containing the same image reference. If I revert the change from BBB back to AAA and want to redeploy, the upload will check the cache, find a file for the path AAA with the image reference, and assume the image is already uploaded, but that reference now actually points to the image from the BBB path.

Good catch, I'll integrate the store hash into the image reference to avoid this.

@katexochen
Member

Good catch, I'll integrate the store hash into the image reference to avoid this.

Not sure that's easily possible. :( Azure's image infra is crap. Image versions are restricted to semver, but having a separate gallery for each image requires additional time for creation during upload.

@Freax13
Contributor Author

Freax13 commented Oct 17, 2024

Good catch, I'll integrate the store hash into the image reference to avoid this.

Not sure that's easily possible. :( Azure's image infra is crap. Image versions are restricted to semver, but having a separate gallery for each image requires additional time for creation during upload.

Turns out the limits on the version are pretty lax, so we can just use 0.0.<unix timestamp> :)
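A minimal sketch of the `0.0.<unix timestamp>` scheme, for illustration only. The function names are hypothetical, and the validity check assumes that each version component must be a number that fits in a signed 32-bit integer; that limit is an assumption here and should be checked against Azure's documentation.

```python
import re
import time

INT32_MAX = 2**31 - 1  # assumed upper bound for each version component


def image_version(now=None):
    """Return an image version of the form 0.0.<unix timestamp>."""
    timestamp = int(time.time() if now is None else now)
    return f"0.0.{timestamp}"


def is_valid_version(version):
    """Check for three dot-separated numbers, each within the assumed bound."""
    parts = version.split(".")
    if len(parts) != 3:
        return False
    return all(
        re.fullmatch(r"\d+", part) is not None and int(part) <= INT32_MAX
        for part in parts
    )


# A fixed timestamp from 17 Oct 2024 keeps the example reproducible.
version = image_version(now=1729152000)
print(version, is_valid_version(version))
```

Because every upload gets a fresh version, a cached reference for one store path can no longer end up pointing at an image built from another. Under the assumed 32-bit bound, the scheme stops producing valid versions once the timestamp exceeds 2147483647 (in 2038).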


In the future, we'll also want to upload images to that resource group,
and we want to do that before running Terraform because Terraform takes
the image id as an input. Instead create the cluster manually.

This script:
1. Runs nix to build an image.
2. Uploads it to Azure.
3. Writes the image id to a file that's automatically loaded by
   Terraform.

Note that we skip uploading if we already have a cached image id.

Use a newer uplosi that has fixes for private galleries.

Uploading the image will populate the image_id variable that's used by
Terraform to create the cluster.

We can derive the values for these variables from the values configured
in justfile.env.

This resource group is only used for the non-peerpods AKS resource
group. The peerpods cluster doesn't need to read from this resource
group.
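The upload script's three steps (build, upload unless cached, write the image id where Terraform picks it up) can be sketched as follows. This is an illustrative sketch, not the PR's script: `upload_image`, `build`, `upload`, the cache layout, and the file paths are all hypothetical. Only the `image_id` variable name comes from the commit messages above. The sketch relies on Terraform automatically loading variable files whose names end in `.auto.tfvars.json`.

```python
import json
from pathlib import Path


def upload_image(build, upload, cache_dir, tfvars_path):
    """Build the image, upload it unless cached, and record its id.

    `build` returns the store path of the built image and `upload` takes
    that path and returns an image id; both are hypothetical stand-ins
    for the nix build and the Azure upload.
    """
    store_path = build()
    # One cache file per store path, named after the store path's basename.
    cache_file = Path(cache_dir) / Path(store_path).name
    if cache_file.exists():
        image_id = cache_file.read_text().strip()  # cached: skip the upload
    else:
        image_id = upload(store_path)
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_text(image_id + "\n")
    # Write the id as a JSON variables file for Terraform to load.
    Path(tfvars_path).write_text(json.dumps({"image_id": image_id}) + "\n")
    return image_id
```

Calling it twice with the same store path uploads only once, and the variables file is rewritten on every run so that it always reflects the image id for the current build.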
@Freax13
Contributor Author

Freax13 commented Oct 18, 2024

@burgerdev @3u13r can one of you review this PR?

@Freax13
Contributor Author

Freax13 commented Oct 21, 2024

I have two approvals, but Paul's last review requested changes, so I can't merge this. Given that Paul is on vacation until Wednesday, and has said that he's fine with merging this if there are approvals by other reviewers, I'll temporarily disable the merge requirement to merge this PR.

@Freax13 Freax13 merged commit dc30318 into main Oct 21, 2024
9 checks passed
@Freax13 Freax13 deleted the tom/image-upload branch October 21, 2024 08:42