The attestationconfigapi broke today because of an updated version number that Azure had not rolled out globally on its SNP machines. Since Azure doesn't publicly post the version numbers, we read the versions from VM instances and adopt new versions after 2 weeks.
The problem with the current approach is that a newer version observed in a single run might still not be globally rolled out after 2 weeks. Instead, we should consider all runs within a given time window, find the minimal version value, and use it as the reference.
This more conservative approach should work more reliably.
Currently:
the fetcher returns the latest version value that is older than 2 weeks
Proposed change(s)
use the configapi as the source of truth: only upload valid version values to the endpoint
the version fetcher simply fetches the latest value without any filter logic
delegate the version logic to the uploader (reporter)
the reporter (as part of E2E verify) caches observed version numbers in a separate directory and updates the API's latest version value with the following logic: find the minimal version within the last 3 weeks and upload it if it is newer than the current value
this change makes it easy to patch the version manually without having to consider the timestamp and date selection logic
the window for the latest version decision is defined by a number of most recent versions (default 15) instead of a time frame
NEW
fix the E2E for the attestationconfigapi
fix the broken verify Bazel target
use the test CDN for fetching latest in the CLI instead of fetching from prod during the test
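The reporter's selection rule from the proposed changes can be sketched roughly as follows. This is an illustration, not the code in this PR: the `Version` and `Observation` types, the `selectUpdate` function, the component-wise minimum, and the definition of "newer" (no component lower, at least one higher) are all assumptions made for the example. The window of 15 observations and the minimum of 3 cached values come from the description.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Version is a hypothetical stand-in for the SNP version numbers read from a VM.
type Version struct {
	Bootloader, TEE, SNP, Microcode uint8
}

// Observation is one cached version reading from a single E2E run (hypothetical type).
type Observation struct {
	Seen    time.Time
	Version Version
}

const (
	windowSize      = 15 // number of most recent observations considered
	minObservations = 3  // with fewer cached observations, no update is proposed
)

func minU8(a, b uint8) uint8 {
	if a < b {
		return a
	}
	return b
}

// minVersion takes the minimum per component (an assumption about what "minimal version" means).
func minVersion(a, b Version) Version {
	return Version{
		Bootloader: minU8(a.Bootloader, b.Bootloader),
		TEE:        minU8(a.TEE, b.TEE),
		SNP:        minU8(a.SNP, b.SNP),
		Microcode:  minU8(a.Microcode, b.Microcode),
	}
}

// isNewer reports whether candidate differs from current and is not lower in any component.
func isNewer(candidate, current Version) bool {
	return candidate != current &&
		candidate.Bootloader >= current.Bootloader &&
		candidate.TEE >= current.TEE &&
		candidate.SNP >= current.SNP &&
		candidate.Microcode >= current.Microcode
}

// selectUpdate returns the version to upload and true, or a zero Version and false
// if the cache is too small or the window minimum is not newer than current.
func selectUpdate(cache []Observation, current Version) (Version, bool) {
	if len(cache) < minObservations {
		return Version{}, false
	}
	// Sort a copy, newest first, and keep only the most recent windowSize entries.
	sorted := append([]Observation(nil), cache...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Seen.After(sorted[j].Seen) })
	if len(sorted) > windowSize {
		sorted = sorted[:windowSize]
	}
	lowest := sorted[0].Version
	for _, o := range sorted[1:] {
		lowest = minVersion(lowest, o.Version)
	}
	if !isNewer(lowest, current) {
		return Version{}, false
	}
	return lowest, true
}

func main() {
	now := time.Now()
	current := Version{Bootloader: 3, TEE: 0, SNP: 8, Microcode: 115}
	cache := []Observation{
		{Seen: now.Add(-72 * time.Hour), Version: Version{3, 0, 8, 116}},
		{Seen: now.Add(-48 * time.Hour), Version: Version{3, 0, 8, 115}}, // one run still on the old version
		{Seen: now.Add(-24 * time.Hour), Version: Version{3, 0, 8, 116}},
	}
	v, ok := selectUpdate(cache, current)
	fmt.Println(v, ok)
}
```

With this rule, a single run that still reports the old version holds back the update, which is the conservative behavior the description asks for.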
IMPORTANT: for backward compatibility any manual version updates need to have a timestamp older than 2 weeks but newer than the current latest until v2.12 is released
DISCUSS:
the minimal length of cached version numbers is set to 3, which means that no updates will be considered in the next 3 weeks. Is this okay?
should we cache in a separate bucket so that it is not part of the CDN?
we invalidate the cache with every cache upload. Do we care about this inefficiency, or do we want to implement a separate S3 client without the CDN cache?
I only skimmed the PR. This seems to implement the minimum selection. But there's another required change, which is to get the LaunchTCB from the SNP report instead of using the (presumably) CurrentTCB from the MAA token. If you think this will considerably change this PR, you should add it to this PR. Otherwise, you can either do it in this PR or another one as you prefer.