Add support for testing aws-hosted-cp
#280
Force-pushed from 35fca5e to 8e3465d.
Force-pushed from ce30914 to 5596db9.
I was able to fix the nuke behavior so we now reliably clean up EIPs, which helps the [...] I just need to figure out #290 and this should be done. It's definitely ready for review.
Force-pushed from 01732f0 to 7da3be0.
Currently failing in deletion validation and reproducing the VPC dependency violation bug. I'll probably comment those tests out with a FIXME because they'll never pass otherwise.
We already have bug #152 for AWS VPC.
Thanks for adding it. I knew of it but was on my phone and couldn't throw it on here when I typed that comment.
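The dependency violation tracked in #152 comes down to ordering: a VPC cannot be removed while resources that depend on it still exist. A minimal Python sketch of that ordering idea follows. The dependency map is an assumption for illustration (it uses the ELB and EIP mentioned in this thread), and it is not how the provider or the cleanup script is implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key can only be deleted after every
# resource in its set is gone. The entries are illustrative assumptions.
BLOCKED_BY = {
    "vpc": {"elb", "eip"},
    "elb": set(),
    "eip": set(),
}


def deletion_order(blocked_by):
    """Return resource names ordered so that blockers come before dependents."""
    # TopologicalSorter treats each value set as the predecessors of its key,
    # so static_order() yields the blockers first.
    return list(TopologicalSorter(blocked_by).static_order())


order = deletion_order(BLOCKED_BY)
print(order)  # "vpc" is always last; the ELB and EIP come before it
```

If anything in the blocker set is created outside the tooling's knowledge (for example a load balancer created by a controller inside the cluster), it never appears in the map, which is one plausible way to end up with a deletion that cannot complete.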
Force-pushed from bedae9a to 2148a61.
Force-pushed from 2148a61 to 3530a70.
I've removed the CCM tests for the hosted-cp temporarily so we can test cluster deletions via CI. I intend to bring them back before merge.
Force-pushed from 7c6d4d9 to 1292207.
Force-pushed from 6f69f86 to 37c26c5.
It appears cancelled tests will not run the cleanup resources job, which is concerning: our test could get cancelled by a subsequent PR push, and then the AWS resources would never be cleaned up. I'll need to look into the best way to handle this case since the [...]
I think I have an idea of how to fix this: I can make the [...]
Force-pushed from aae6aa2 to 6a3d3db.
I've squashed and pushed all of the changes since no one has reviewed this yet. In summary, the squashed commit resolves all of the outstanding issues and fixes the cancellation issue. We just need @slysunkin's merge in #242 so I can rebase, and that should fix the deletions that are stuck for both hosted and standalone templates ... hopefully. Looking forward to that green checkmark.
I put the other make steps for [...]
Force-pushed from 6a3d3db to ee368eb.
I pulled @slysunkin's changes in and CI is running, hopefully things are good 🤞
Force-pushed from ee368eb to 2da5382.
Last test failed in secret generation; that's a new one.
It's consistently failing here after a 2nd run, so it's not a flake. :( I was able to get past this phase of validation prior to the rebase, so I'm not sure what's changed, but the control plane isn't becoming ready now. k0s etcd is stuck in [...]
The ebs-csi daemonset is up and running on the standalone cluster where this is deploying, and the csi-driver validation phase completed successfully: [...]
I'm seeing [...]
The [...]
I'm going to try setting [...] per https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/charts/aws-ebs-csi-driver/values.yaml#L371C1-L372C1
Force-pushed from 3ccb94c to d590bfb.
Force-pushed from ee759b8 to 6e10aa9.
Almost there. Seeing a failure to delete standalone now with: [...]
This is with the ExternalGC feature gate active. The ELB that was created to validate CCM, which is given a public IP, is what's holding up deletion this time. We see the cleanup script delete it here: [...]
Force-pushed from 2568013 to 0ef71dc.
I've commented out the deletion test for standalone for a couple of reasons: [...]
* Break KubeClient helpers into provider specific file.
* Try to simplify the validation process for lots of different providers with different requirements.
* Finish aws-hosted-cp test and add comments through test to make it easier to understand.
* Use GinkgoHelper across e2e tests, populate hosted vars from AWSCluster.
* No longer rely on local registry for images in test/e2e.
* Support OS for awscli install.
* Prepend hostname to collected log artifacts.
* Support no cleanup of provider specs, differentiate ci cluster names.
* Add docs on running tests, do not wait for all providers if configured.
* Reinstantiate resource validation map on each instance of validation.
* Enable the external-gc feature via annotation, featureGate bool. (Closes: #152)
* Bump aws-*-cp templates to 0.1.3
* Bump cluster-api-provider-aws template to 0.1.2
* Improve test logging to log template name and validation phase.
* Bump k0s version to v1.30.4+k0s.0, set CCM nodeSelector to null for aws-hosted-cp. (Closes: #290)
* Break cleanup into separate job so that it is unaffected by concurrency group cancellations.
* Make dev-aws-nuke target less PHONY.
* Only build linux/amd64 arch since CI does not need arm.

Signed-off-by: Kyle Squizzato <[email protected]>
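The "Reinstantiate resource validation map on each instance of validation" item is easier to picture with a small model. The following is a minimal Python sketch of the general idea, not the project's code (the e2e tests themselves use Ginkgo, so they are written in Go). It assumes that validation removes entries from a map as each check passes; every name here (`CHECKS`, `new_validation_map`, `validate_once`) is hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical validator: returns None once the resource is ready,
# otherwise a short message describing what is still pending.
Validator = Callable[[], Optional[str]]

# Illustrative checks only; these are not the project's validators.
CHECKS: Dict[str, Validator] = {
    "machines": lambda: None,
    "ccm": lambda: None,
}


def new_validation_map() -> Dict[str, Validator]:
    """Build a fresh map so each validation run starts with every check."""
    return dict(CHECKS)


def validate_once(pending: Dict[str, Validator]) -> List[str]:
    """Run each pending check, removing the ones that pass from `pending`."""
    errors = []
    for name in list(pending):  # copy the keys because we delete while looping
        message = pending[name]()
        if message is None:
            del pending[name]
        else:
            errors.append(f"{name}: {message}")
    return errors


first_run = new_validation_map()
validate_once(first_run)           # every check passes, so first_run is drained
second_run = new_validation_map()  # a new run still sees both checks
```

Under that assumption, sharing one map across runs would leave the second run with nothing to check, so it would pass without validating anything. Building the map per run avoids that.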
Force-pushed from 0ef71dc to 7ec7fca.
Signed-off-by: Kyle Squizzato <[email protected]>
Signed-off-by: Kyle Squizzato <[email protected]>
Add support for testing `aws-hosted-cp`
Note: This PR was intentionally created on a repository branch instead of my fork so we can iterate on the e2e testing workflow and then change it to `pull_request_target` for merge.
Closes: #152 (AWS provider can't delete VPC if dependencies are present)
Closes: #290 (Nodes bootstrapped via aws-hosted-cp get stuck with `node.cloudprovider.kubernetes.io/uninitialized` taint)
Closes: #212