Skip to content
This repository has been archived by the owner on Dec 16, 2024. It is now read-only.

gh-628 fix deletion of dnsrecord and certificates on deletion of gateway target for dnspolicy and tlspolicy #635

Merged
merged 2 commits into from
Nov 1, 2023

Conversation

laurafitzgerald
Copy link
Contributor

@laurafitzgerald laurafitzgerald commented Oct 20, 2023

Addressed #628

Verification

DNSRecord

Follow the steps described in the issue, the dnsrecord should be deleted.

Health Check Probes

kubectl create secret generic probe-headers --from-literal=test=dGVzdAo= -n multi-cluster-gateways

Follow the steps describe in the issue with the following adjustments.
Add the following block to your dnspolicy

healthCheck:
      additionalHeadersRef:
        name: probe-headers
      allowInsecureCertificates: true
      endpoint: /
      expectedResponses:
      - 200
      - 201
      - 301
      failureThreshold: 1
      port: 443
      protocol: https

there should be one dnshealthcheck probe present in the cluster now
e.g.

kc get dnshealthcheckprobes -A        

NAMESPACE                NAME                        HEALTHY   LAST CHECKED
multi-cluster-gateways   172.31.200.0-prod-web-api   true      3s

Delete the dnspolicy, the healthcheck probe should be deleted.

Certificates

  • create gateway
  • label gateway
  • create tlspolicy
  • verify certificates are created
  • delete gateway
  • verify certificates are deleted.

@laurafitzgerald laurafitzgerald temporarily deployed to e2e-internal October 20, 2023 10:23 — with GitHub Actions Inactive
@laurafitzgerald laurafitzgerald temporarily deployed to e2e-internal October 20, 2023 10:42 — with GitHub Actions Inactive
Copy link
Member

@mikenairn mikenairn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a couple of small changes that i think we need to make sure we aren't deleting the wrong resources when processing the gateways that have an invalid policy ref.

…n of gateway target for dnspolicy and tlspolicy
@laurafitzgerald
Copy link
Contributor Author

laurafitzgerald commented Oct 25, 2023

The e2e test failure here is intermittent and appears to be not related to the change.
The failure happens on error [debug] Cert error: 'wildcard hostname not found in the certificate via get request' which is in relation to tlspolicy and secret creation. The first two failures were on the AWS suite and the third is on GCP suite. Locally this test is passing for me on AWS backed cluster.

EDIT: e2e test are passing now. Seems it was intermittent/unrelated.

Aside from this feedback has been addressed and verification steps are updated.

I will keep investigating the failure but it seems intermittent and unrelated.

cc @mikenairn

@laurafitzgerald laurafitzgerald temporarily deployed to e2e-internal October 25, 2023 13:02 — with GitHub Actions Inactive
func (r *TLSPolicyReconciler) deleteGatewayCertificates(ctx context.Context, gateway *gatewayv1beta1.Gateway, tlsPolicy *v1alpha1.TLSPolicy) error {
return r.deleteUnexpectedGatewayCertificates(ctx, []*certmanv1.Certificate{}, gateway, tlsPolicy)
func (r *TLSPolicyReconciler) deleteGatewayCertificates(ctx context.Context, expectedCerts []*certmanv1.Certificate, gateway *gatewayv1beta1.Gateway, tlsPolicy *v1alpha1.TLSPolicy) error {
return r.deleteCertificatesWithLabels(ctx, expectedCerts, commonTLSCertificateLabels(client.ObjectKeyFromObject(gateway), client.ObjectKeyFromObject(tlsPolicy)), tlsPolicy.Namespace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason you have removed the deleteUnexpectedGatewayCertificates method here, but chose to keep deleteUnexpectedGatewayProbes in the dnspolicy_healthchecks reconcile?

Copy link
Contributor Author

@laurafitzgerald laurafitzgerald Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking at the time was to bring it in line with the DNSRecords reconciliation, but looking at it now it makes sense to keep it. I'm bringing the three more in line with each other so I'll return the unexpectedcertificates to match the unexpectedhealthcheckprobes. Doesn't seem a need for it in the dnsrecords reconciliation though so I wont add it there. Commit coming soon.

@mikenairn
Copy link
Member

@laurafitzgerald Looks good to me. Left a couple of comments, will let you reply to those before sticking the lgtm on it.

@laurafitzgerald laurafitzgerald temporarily deployed to e2e-internal October 31, 2023 12:39 — with GitHub Actions Inactive
return fmt.Errorf("error creating and updating expected certificates for gateway %v: %w", gw.Gateway.Name, err)
}
if err := r.deleteUnexpectedCertificates(ctx, expectedCertificates, gw.Gateway, tlsPolicy); err != nil {
return fmt.Errorf("error removing unexpected certificate for gateway %v: %w", gw.Gateway.Name, err)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally i kind of preferred the pattern this (and DNSRecord) was using of having a reconcileGateway<Resource> method that deals with reconciling resources for a given Gateway instead of having it broken up inside the main reconcile<Resource> method.

  • reconcile<Resource> - Reconcile resources(create/update/delete) for the current policy
  • reconcileGateway<Resource> - Reconcile resources(create/update/delete) for the current policy and gateway
  • delete<Resource> - Delete resources for the current policy
  • deleteGateway<Resource> - Delete resources for the current policy and gateway

I'd say if we are trying to make these controllers consistent (looks like you have copied the pattern from health checks) we should actually update the DNSRecord one to deal with unexpected gateway resources in the same way as the other two.

Since none of that is in the scope of this bug fix i'd say we can come back to this later.

@mikenairn
Copy link
Member

/lgtm
/approve

Copy link
Contributor

openshift-ci bot commented Nov 1, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: laurafitzgerald, mikenairn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot merged commit 76148fd into Kuadrant:main Nov 1, 2023
9 checks passed
@mikenairn mikenairn mentioned this pull request Nov 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants