Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of resumeSession failures #2010

Merged
merged 13 commits into from
Jan 5, 2024

Conversation

estrattonbailey
Copy link
Member

@estrattonbailey estrattonbailey commented Jan 2, 2024

Motivation: flaky networks currently result in the app nuking JWTs, effectively logging the user out. We want to log out the user if the session is invalid, but for other unknowns, we can keep it around on-device until we get a legit response from our server that says otherwise.

This PR adds an additional AtpSessionEvent event, network-error, and emits that event from resumeSession if we catch an unexpected exception.

@estrattonbailey estrattonbailey force-pushed the eric/resume-session-handling branch 3 times, most recently from 50f56fb to 37bb367 Compare January 2, 2024 23:20
@estrattonbailey estrattonbailey changed the title Specifically emit expired event for resumption failures Add additional AtpSessionEvent for unknown exceptions Jan 2, 2024
@estrattonbailey estrattonbailey force-pushed the eric/resume-session-handling branch from bfd9b1b to 1d03455 Compare January 4, 2024 18:53
@estrattonbailey estrattonbailey changed the title Add additional AtpSessionEvent for unknown exceptions Improve handling of resumeSession failures Jan 4, 2024
*
* Everything else is unexpected.
*/
if (['InvalidDID'].includes(e.error)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused what circumstance this happens in?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just handling the logic that was already here, where a did doesn't match the requested session did. I honestly don't know why this would happen.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah cool cool 👍

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and i guess if that happens, then somethings gone wonky & the appropriate action would be to just log the user out - hence the expired


if (e instanceof XRPCError && e.error) {
/*
* `ExpiredToken` and `InvalidToken` are handled in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this._refreshSession implicitly called by the retry logic from calling getSession?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little confusing, but we override the static AtpAgent.fetch here so that if we get an ExpiredToken we can try to refresh the session. So calling getSession actually calls this._fetch here, which calls this._refreshSession.

But yeah, basically expired will be emit from that handling + afaict we don't get a ExpiredToken error from getSession.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sweet checks out 👍

Comment on lines 193 to 195
} else if (['InternalServerError'].includes(e.error)) {
this._persistSession?.('network-error', undefined)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression that network errors (e.g. user does not have connectivity to the service) would have status ResponseType.Unknown, and may not have an error property.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I think I see! These 500s are treated akin to a network error. That makes sense since it's another kind of temporary unavailability, and we don't necessarily want to log the user out. Other status codes for temporary unavailability such as 502 and 503 might be handled similarly.

Tiny P.S.: I think that checking the status property of an XRPCError for this kind of thing might be more reliable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

network-error is just a name Paul suggested for this. This particular branch is tested in agent.test.ts, where we do get an XRPCError called InternalServerError.

Basically we're trying to handle any not-already-handled knowns with expired, and leave unknowns (like a network timeout that doesn't even hit our server) as network-error.

Copy link
Collaborator

@dholms dholms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx for talking us through this - makes sense to me 👍

I agree with Devin that status code checks are probably more reliable than error message checks. But it seems like any unknown error gets turned into network-error regardless, so it probably works either way

packages/api/src/agent.ts Outdated Show resolved Hide resolved
@estrattonbailey estrattonbailey merged commit 1406773 into main Jan 5, 2024
10 checks passed
@estrattonbailey estrattonbailey deleted the eric/resume-session-handling branch January 5, 2024 01:51
@github-actions github-actions bot mentioned this pull request Jan 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants