STCOR-895 wait a loooong time for a "stale" rotation request #1547
Conversation
As part of the RTR lifecycle, we write a rotation timestamp to local storage when the process starts and then remove it when it ends. This is a cheap way of making the rotation request visible across tabs, because all tabs read the same shared storage.

To avoid the problem of a cancelled request leaving cruft in storage, we inspect that timestamp and consider a request "stale" if it's too old. That was the problem here: our "too old" timeout was too short. On a busy server, on a slow connection, or on a client far from its host (say, in New Zealand), two seconds was not long enough. The rotation request would still be active when stripes considered it "stale", allowing a second request to go through. But since the first request was just slow, not dead, the backend treated the second one as a token-replay attack, immediately terminating all active sessions for that user account.

Thus, waiting longer is a quick fix. A more detailed approach to tracking the rotation request is described in the comments for RTR_MAX_AGE.

Refs STCOR-895
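A minimal sketch of the cross-tab guard described above, under stated assumptions: RTR_MAX_AGE is named in the PR, but its 30-second value here, the storage key, and the isRotating/rotateTokens helpers are illustrative, not the actual stripes-core implementation.

```js
// Illustrative key name; the real stripes-core key may differ.
const RTR_ROTATION_KEY = 'rtr-rotation-started';

// The quick fix in this PR: treat an in-flight rotation as "stale" only after
// a much longer interval than the original two seconds. 30s is an assumption.
const RTR_MAX_AGE = 30 * 1000;

// A rotation is considered in-flight if a timestamp exists and is recent;
// a very old timestamp is assumed to be cruft from a cancelled request.
const isRotating = () => {
  const started = localStorage.getItem(RTR_ROTATION_KEY);
  if (!started) return false;
  return (Date.now() - Number(started)) < RTR_MAX_AGE;
};

// Hypothetical rotation wrapper: write the timestamp so every tab can see the
// request is active, perform the rotation, then clear the flag when it ends.
const rotateTokens = async (fetchRotation) => {
  if (isRotating()) return; // another tab is already rotating; do nothing

  localStorage.setItem(RTR_ROTATION_KEY, String(Date.now()));
  try {
    await fetchRotation(); // the actual refresh request to the backend
  } finally {
    localStorage.removeItem(RTR_ROTATION_KEY); // rotation finished or failed
  }
};
```

With the longer window, a rotation that is merely slow is far less likely to be declared stale and duplicated by a second tab, which is what tripped the backend's token-replay protection and terminated the user's sessions.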
aidynoJ approved these changes Oct 15, 2024
zburke added a commit that referenced this pull request Oct 15, 2024
As part of the RTR lifecycle, we write a rotation timestamp to local storage when the process starts and then remove it when it ends. This is a cheap way of making the rotation request visible across tabs, because all tabs read the same shared storage. To avoid the problem of a cancelled request leaving cruft in storage, we inspect that timestamp and consider a request "stale" if it's too old. That was the problem here: our "too old" timeout was too short; on a busy server, or on a slow connection, or on a client far from its host (say, in New Zealand), two seconds was not long enough. The rotation request would still be active when stripes considered it "stale", allowing a second request to go through. But since the first request was just slow, not dead, the backend treated the second one as a token-replay attack, immediately terminating all active sessions for that user account. Thus, waiting longer is a quick fix. A more detailed approach to tracking the request is described in the code-comments attached to #1547. Refs STCOR-895
zburke added a commit that referenced this pull request Oct 21, 2024
As part of the RTR lifecycle, we write a rotation timestamp to local storage when the process starts and then remove it when it ends. This is a cheap way of making the rotation request visible across tabs, because all tabs read the same shared storage. To avoid the problem of a cancelled request leaving cruft in storage, we inspect that timestamp and consider a request "stale" if it's too old. That was the problem here: our "too old" timeout was too short; on a busy server, or on a slow connection, or on a client far from its host (say, in New Zealand), two seconds was not long enough. The rotation request would still be active when stripes considered it "stale", allowing a second request to go through. But since the first request was just slow, not dead, the backend treated the second one as a token-replay attack, immediately terminating all active sessions for that user account. Thus, waiting longer is a quick fix. A more detailed approach to tracking the request is described in the code-comments attached to #1547. Refs STCOR-895
zburke added a commit that referenced this pull request Oct 22, 2024
As part of the RTR lifecycle, we write a rotation timestamp to local storage when the process starts and then remove it when it ends. This is a cheap way of making the rotation request visible across tabs, because all tabs read the same shared storage. To avoid the problem of a cancelled request leaving cruft in storage, we inspect that timestamp and consider a request "stale" if it's too old. That was the problem here: our "too old" timeout was too short; on a busy server, or on a slow connection, or on a client far from its host (say, in New Zealand), two seconds was not long enough. The rotation request would still be active when stripes considered it "stale", allowing a second request to go through. But since the first request was just slow, not dead, the backend treated the second one as a token-replay attack, immediately terminating all active sessions for that user account. Thus, waiting longer is a quick fix. A more detailed approach to tracking the request is described in the code-comments attached to #1547. Refs STCOR-895 (cherry picked from commit b2083cc)
zburke added a commit that referenced this pull request Oct 30, 2024
As part of the RTR lifecycle, we write a rotation timestamp to local storage when the process starts and then remove it when it ends. This is a cheap way of making the rotation request visible across tabs, because all tabs read the same shared storage. To avoid the problem of a cancelled request leaving cruft in storage, we inspect that timestamp and consider a request "stale" if it's too old. That was the problem here: our "too old" timeout was too short; on a busy server, or on a slow connection, or on a client far from its host (say, in New Zealand), two seconds was not long enough. The rotation request would still be active when stripes considered it "stale", allowing a second request to go through. But since the first request was just slow, not dead, the backend treated the second one as a token-replay attack, immediately terminating all active sessions for that user account. Thus, waiting longer is a quick fix. A more detailed approach to tracking the rotation request is described in the comments for RTR_MAX_AGE. Refs STCOR-895 (cherry picked from commit cc8ef65)