-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement OAuth device flow for GitHub logins. #168
Conversation
509a1a1
to
abb6332
Compare
|
6ec61a3
to
322d425
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
working for me with both oauth and tokens - some comments included
# Surprisingly, we don't actually need the Client ID here to match the one | ||
# used by Packit. It should be fine to hardcode a value regardless of which | ||
# server we are talking to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is indeed a surprise. Hopefully nothing that will bite us in future!
##' a GitHub token with `read:org` scope). Later we'll expand this | ||
##' as other authentication modes are supported. | ||
##' a GitHub token with `read:org` scope). If missing or NULL, orderly2 will | ||
##' perform an interactive authentication against GitHub to obtain one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the docs need regenerating, this is not reflected in the .Rd files atm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you think that we might also add a cache
arg here to turn oauth caching on or off:
However, there’s no way to prevent other R code from using httr2 to access them, so if you do choose to cache tokens, you should inform the user and give them the ability to opt-out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, I think we should write (at some point) wrappers like orderly2::orderly_location_add_packit
etc that have sensibly documented pages, and possibly just drop the generic interface entirely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, added a save_token
argument.
R/location_packit.R
Outdated
) | ||
} | ||
|
||
do_oauth_device_flow <- function(base_url) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunately a httr2 thing, but I was confused the first time seeing the prompt:
ℹ Logging in to https://packit.dide.ic.ac.uk/malariaverse-sitefiles/
→ Copy 18C0-2805 and paste when requested by the browser
Press <enter> to proceed:
At this point I thought that I was waiting for a browser window to open, then I tried clicking the link. Finally I pressed enter and it launched the browser. It might be worth here (or around l39) posting a small hint about how the login process will go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is quite annoying. It also doesn't work if R can't open a browser, eg. over SSH.
I'll fix something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright so if I override interactive temporarily, the behaviour is a lot less confusing. I've changed it, let me know if this helps.
GitHub-based logins to Packit rely on users specifying a personal access token when adding the location. This has a number of usability and security disadvantages. Users need to manually take steps in order to create a PAT, and they need to make sure they have selected the right scope. It is quite likely users would be tempted to create tokens with way more scopes than is necessary. The configured PAT would be stored in the orderly configuration, which many users share and keep on network drives that are accessible to everyone. By using an OAuth flow to generate our own tokens, we remove the extra steps the user needed to take and we can make sure it has just the right scopes needed. Additionally the token cache implemented by httr2 is located outside of the orderly repository, in the user's home directory. The device flow works by showing the user an 8 character code in the console, and a link to https://github.com/login/device. The user needs to type in the code and approve the login. Meanwhile, the orderly2 client polls the GitHub API until the authentication has been approved. There are other OAuth flows we could have used, in particular the authorization code flow (with PKCE). In this case, a link is printed to the console and the user needs to follow it and approve the login. They are then redirected to a localhost server, which has been started by orderly2. This removes the need to need with the 8 character code of the device flow, but it is a bit finicky and assumes orderly2 is running on the same machine as the user's browser, which may not be the case when runing over SSH. For this reason, the device flow is preferred. It is still possible to specify a PAT as before, in which case the OAuth flow is skipped. This can be useful in non-interactive situations, such as a CI pipeline. Note that there are still fundamental issues with this model. In particular, the Github tokens are not bound to the application. This means that a malicious application could manage to get a user to sign in to it, obtain a access token for the user, and use this to login to Packit. From just a GitHub `read:org` scope, the app is able to escalate to full impersonation of the user on the Packit server. Additionally it still tightly couples the orderly2 client with GitHub as an authentication method, prohibiting the use of other authentication methods, such as basic username / password. Ideally we would remove the use of Github Access tokens as a way to login to Packit. Instead, Packit should itself implement an OAuth2 Authorization Server. orderly2 would perform the device flow directly against Packit, and directly retrieve a Packit JWT in exchange. During the device flow, when the user is directed to the Packit website to enter the user code, Packit may require the user to login. When this happens, a new, parallel OAuth flow would take place, this time with GitHub as the OAuth AS and Packit as the OAuth client (using the authorization code flow). When the nested flow is complete and login is succesful, the user would be prompted to type in the code and complete the flow initiated by orderly2. With this approach, orderly2 has no involvement with GitHub at all, and Packit can use another authentication method (or even no authentication). This would be completely transparent to the orderly2 client. It also provides tighter binding between GitHub and Packit, as Packit fetches the access token directly from the former, and cannot be supplied a token from an arbitrary application.
Co-authored-by: Rich FitzJohn <[email protected]>
ea536e6
to
dc2d5c5
Compare
dc2d5c5
to
8590d00
Compare
GitHub-based logins to Packit rely on users specifying a personal access token when adding the location. This has a number of usability and security disadvantages.
Users need to manually take steps in order to create a PAT, and they need to make sure they have selected the right scope. It is quite likely users would be tempted to create tokens with way more scopes than is necessary.
The configured PAT would be stored in the orderly configuration, which many users share and keep on network drives that are accessible to everyone.
By using an OAuth flow to generate our own tokens, we remove the extra steps the user needed to take and we can make sure it has just the right scopes needed. Additionally the token cache implemented by httr2 is located outside of the orderly repository, in the user's home directory.
The device flow works by showing the user an 8 character code in the console, and a link to https://github.com/login/device. The user needs to type in the code and approve the login. Meanwhile, the orderly2 client polls the GitHub API until the authentication has been approved.
There are other OAuth flows we could have used, in particular the authorization code flow (with PKCE). In this case, a link is printed to the console and the user needs to follow it and approve the login. They are then redirected to a localhost server, which has been started by orderly2. This removes the need to need with the 8 character code of the device flow, but it is a bit finicky and assumes orderly2 is running on the same machine as the user's browser, which may not be the case when runing over SSH. For this reason, the device flow is preferred.
It is still possible to specify a PAT as before, in which case the OAuth flow is skipped. This can be useful in non-interactive situations, such as a CI pipeline.
Note that there are still fundamental issues with this model. In particular, the Github tokens are not bound to the application. This means that a malicious application could manage to get a user to sign in to it, obtain a access token for the user, and use this to login to Packit. From just a GitHub
read:org
scope, the app is able to escalate to full impersonation of the user on the Packit server.Additionally it still tightly couples the orderly2 client with GitHub as an authentication method, prohibiting the use of other authentication methods, such as basic username / password.
Ideally we would remove the use of Github Access tokens as a way to login to Packit. Instead, Packit should itself implement an OAuth2 Authorization Server. orderly2 would perform the device flow directly against Packit, and directly retrieve a Packit JWT in exchange. During the device flow, when the user is directed to the Packit website to enter the user code, Packit may require the user to login. When this happens, a new, parallel OAuth flow would take place, this time with GitHub as the OAuth AS and Packit as the OAuth client (using the authorization code flow). When the nested flow is complete and login is succesful, the user would be prompted to type in the code and complete the flow initiated by orderly2.
With this approach, orderly2 has no involvement with GitHub at all, and Packit can use another authentication method (or even no authentication). This would be completely transparent to the orderly2 client. It also provides tighter binding between GitHub and Packit, as Packit fetches the access token directly from the former, and cannot be supplied a token from an arbitrary application.