Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement OAuth device flow for GitHub logins. #168

Merged
merged 11 commits into from
Sep 12, 2024
Merged

Conversation

plietar
Copy link
Member

@plietar plietar commented Aug 30, 2024

GitHub-based logins to Packit rely on users specifying a personal access token when adding the location. This has a number of usability and security disadvantages.

Users need to manually take steps in order to create a PAT, and they need to make sure they have selected the right scope. It is quite likely users would be tempted to create tokens with way more scopes than is necessary.

The configured PAT would be stored in the orderly configuration, which many users share and keep on network drives that are accessible to everyone.

By using an OAuth flow to generate our own tokens, we remove the extra steps the user needed to take and we can make sure it has just the right scopes needed. Additionally the token cache implemented by httr2 is located outside of the orderly repository, in the user's home directory.

The device flow works by showing the user an 8 character code in the console, and a link to https://github.com/login/device. The user needs to type in the code and approve the login. Meanwhile, the orderly2 client polls the GitHub API until the authentication has been approved.

There are other OAuth flows we could have used, in particular the authorization code flow (with PKCE). In this case, a link is printed to the console and the user needs to follow it and approve the login. They are then redirected to a localhost server, which has been started by orderly2. This removes the need to need with the 8 character code of the device flow, but it is a bit finicky and assumes orderly2 is running on the same machine as the user's browser, which may not be the case when runing over SSH. For this reason, the device flow is preferred.

It is still possible to specify a PAT as before, in which case the OAuth flow is skipped. This can be useful in non-interactive situations, such as a CI pipeline.

Note that there are still fundamental issues with this model. In particular, the Github tokens are not bound to the application. This means that a malicious application could manage to get a user to sign in to it, obtain a access token for the user, and use this to login to Packit. From just a GitHub read:org scope, the app is able to escalate to full impersonation of the user on the Packit server.

Additionally it still tightly couples the orderly2 client with GitHub as an authentication method, prohibiting the use of other authentication methods, such as basic username / password.

Ideally we would remove the use of Github Access tokens as a way to login to Packit. Instead, Packit should itself implement an OAuth2 Authorization Server. orderly2 would perform the device flow directly against Packit, and directly retrieve a Packit JWT in exchange. During the device flow, when the user is directed to the Packit website to enter the user code, Packit may require the user to login. When this happens, a new, parallel OAuth flow would take place, this time with GitHub as the OAuth AS and Packit as the OAuth client (using the authorization code flow). When the nested flow is complete and login is succesful, the user would be prompted to type in the code and complete the flow initiated by orderly2.

With this approach, orderly2 has no involvement with GitHub at all, and Packit can use another authentication method (or even no authentication). This would be completely transparent to the orderly2 client. It also provides tighter binding between GitHub and Packit, as Packit fetches the access token directly from the former, and cannot be supplied a token from an arbitrary application.

@plietar plietar requested a review from richfitz August 30, 2024 13:44
@plietar plietar changed the base branch from mrc-5724-upgrade-httr2 to main August 30, 2024 15:46
@plietar plietar force-pushed the mrc-5725-device-flow branch from 509a1a1 to abb6332 Compare August 30, 2024 15:48
@plietar
Copy link
Member Author

plietar commented Sep 2, 2024

Tests are broken on R-devel, see #169

@plietar plietar force-pushed the mrc-5725-device-flow branch 4 times, most recently from 6ec61a3 to 322d425 Compare September 6, 2024 15:30
Copy link
Member

@richfitz richfitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

working for me with both oauth and tokens - some comments included

R/location.R Outdated Show resolved Hide resolved
Comment on lines +2 to +4
# Surprisingly, we don't actually need the Client ID here to match the one
# used by Packit. It should be fine to hardcode a value regardless of which
# server we are talking to.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed a surprise. Hopefully nothing that will bite us in future!

##' a GitHub token with `read:org` scope). Later we'll expand this
##' as other authentication modes are supported.
##' a GitHub token with `read:org` scope). If missing or NULL, orderly2 will
##' perform an interactive authentication against GitHub to obtain one.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like the docs need regenerating, this is not reflected in the .Rd files atm

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think that we might also add a cache arg here to turn oauth caching on or off:

However, there’s no way to prevent other R code from using httr2 to access them, so if you do choose to cache tokens, you should inform the user and give them the ability to opt-out

(from https://httr2.r-lib.org/articles/oauth.html)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, I think we should write (at some point) wrappers like orderly2::orderly_location_add_packit etc that have sensibly documented pages, and possibly just drop the generic interface entirely

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added a save_token argument.

)
}

do_oauth_device_flow <- function(base_url) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunately a httr2 thing, but I was confused the first time seeing the prompt:

ℹ Logging in to https://packit.dide.ic.ac.uk/malariaverse-sitefiles/
→ Copy 18C0-2805 and paste when requested by the browser
Press <enter> to proceed:

At this point I thought that I was waiting for a browser window to open, then I tried clicking the link. Finally I pressed enter and it launched the browser. It might be worth here (or around l39) posting a small hint about how the login process will go?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is quite annoying. It also doesn't work if R can't open a browser, eg. over SSH.
I'll fix something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright so if I override interactive temporarily, the behaviour is a lot less confusing. I've changed it, let me know if this helps.

plietar and others added 9 commits September 12, 2024 15:28
GitHub-based logins to Packit rely on users specifying a personal access
token when adding the location. This has a number of usability and
security disadvantages.

Users need to manually take steps in order to create a PAT, and they
need to make sure they have selected the right scope. It is quite likely
users would be tempted to create tokens with way more scopes than is
necessary.

The configured PAT would be stored in the orderly configuration, which
many users share and keep on network drives that are accessible to
everyone.

By using an OAuth flow to generate our own tokens, we remove the extra
steps the user needed to take and we can make sure it has just the right
scopes needed. Additionally the token cache implemented by httr2 is
located outside of the orderly repository, in the user's home directory.

The device flow works by showing the user an 8 character code in the
console, and a link to https://github.com/login/device. The user needs
to type in the code and approve the login. Meanwhile, the orderly2 client
polls the GitHub API until the authentication has been approved.

There are other OAuth flows we could have used, in particular the
authorization code flow (with PKCE). In this case, a link is printed to
the console and the user needs to follow it and approve the login. They
are then redirected to a localhost server, which has been started by
orderly2. This removes the need to need with the 8 character code of the
device flow, but it is a bit finicky and assumes orderly2 is running on
the same machine as the user's browser, which may not be the case when
runing over SSH. For this reason, the device flow is preferred.

It is still possible to specify a PAT as before, in which case the OAuth
flow is skipped. This can be useful in non-interactive situations, such
as a CI pipeline.

Note that there are still fundamental issues with this model. In
particular, the Github tokens are not bound to the application. This
means that a malicious application could manage to get a user to sign in
to it, obtain a access token for the user, and use this to login to
Packit. From just a GitHub `read:org` scope, the app is able to escalate
to full impersonation of the user on the Packit server.

Additionally it still tightly couples the orderly2 client with
GitHub as an authentication method, prohibiting the use of other
authentication methods, such as basic username / password.

Ideally we would remove the use of Github Access tokens as a way to
login to Packit. Instead, Packit should itself implement an OAuth2
Authorization Server. orderly2 would perform the device flow directly
against Packit, and directly retrieve a Packit JWT in exchange. During
the device flow, when the user is directed to the Packit website to
enter the user code, Packit may require the user to login. When this
happens, a new, parallel OAuth flow would take place, this time with
GitHub as the OAuth AS and Packit as the OAuth client (using the
authorization code flow). When the nested flow is complete and login is
succesful, the user would be prompted to type in the code and complete
the flow initiated by orderly2.

With this approach, orderly2 has no involvement with GitHub at all, and
Packit can use another authentication method (or even no authentication).
This would be completely transparent to the orderly2 client. It also
provides tighter binding between GitHub and Packit, as Packit fetches
the access token directly from the former, and cannot be supplied a
token from an arbitrary application.
Co-authored-by: Rich FitzJohn <[email protected]>
@plietar plietar force-pushed the mrc-5725-device-flow branch from ea536e6 to dc2d5c5 Compare September 12, 2024 15:25
@plietar plietar force-pushed the mrc-5725-device-flow branch from dc2d5c5 to 8590d00 Compare September 12, 2024 15:25
@plietar plietar requested a review from richfitz September 12, 2024 15:52
@plietar plietar merged commit 3843be9 into main Sep 12, 2024
10 checks passed
@plietar plietar deleted the mrc-5725-device-flow branch September 12, 2024 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants