-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement RFC8555 (ACME spec) § 8.2 #168
Comments
Hey all :) |
We need to be able to cancel future retries if a client request to validate comes in while we'd otherwise be sleeping. As far as I can tell, that pushes things toward the timer approach so we can cancel pending retries and start over when new requests come in. If that's overly complicated, I guess sleeping routines could check whether they've been canceled when they wake up or something, but after 3 minutes of thought, I think the timer approach is still preferable. cc @maraino or @dopey for a second opinion. |
Makes sense - in that case I'll utilize the timer approach. Will also modify |
I'm picking this up in #242. |
Hi, any updates on this? Some of our ACME challenges occasionally get stuck in |
Hey @tomasfreund we ended up tabling this without merging. Then the implementer moved on to a different project, and we've pretty much let it sit since then because there hasn't been much interest from the community. To my recollection, my main issue with this PR (in the state that we left it) was that it was "complicated". It required state to be saved between executions of I'll bring this up at our weekly triage meeting, but my guess is that we don't have bandwidth to prioritize this issue at this time, given that the workaround is probably just to retry the request. |
If you're okay with this only being guaranteed to work in non "HA" contexts (single instance of step-ca) then you should be able to take out the ordinal (implication being that any updates to the retry state would need to be transactional and handled by the backing store and the transaction would need to span the length of the validation attempt) or just leave it in and pin it to Beyond that, I believe the remaining work is just one more pass to address outstanding comments on the PR (there are some minor but blocking logic updates needed) and then a pass on the tests to be more comprehensive. I believe I updated the existing tests which cover the changes made in the PR and they generally pass. However, the PR also involves converting some cases where the server would have returned a 4xx into a 2xx + error. I discovered there was no existing coverage for those specific scenarios and recall we wanted to add some coverage as well as run a smoke test using existing clients to make sure they properly handle the updated behavior. That's about all I remember after a quick skim of #242. |
Agree w/ everything @dcow said above. I think we weren't sure at the time, and still aren't, if we wanted folks trying to use an HA setup to need extra configuration. We're no further along in answering that question. |
Hey @tomasfreund we had the opportunity to discuss this issue a bit this morning and came away with a few questions / comments.
|
Hi, we are using cert-manager which tries to verify the DNS but our dns-01 challenges still get stuck sometimes. |
Section 8.2 of the ACME spec details exactly how client and server retry should be handled during a challenge validation. We should implement this part of the spec. Namely, retry state needs to be included in the challenge resource, client requests should be throttled, and we should set appropriate retry-after headers on the challenge resource response and include error information about the result of each attempted validation.
More details:
#162 (comment)
The text was updated successfully, but these errors were encountered: