Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Canonicalized email is returned as the 'sub' instead of the original email. #165

Closed
colemickens opened this issue Jun 22, 2018 · 10 comments · Fixed by #181 or portier/portier.github.io#39

Comments

@colemickens
Copy link
Contributor

I tried to log in as <first>.<last>@gmail.com, but was instead authenticated as <first><last>@gmail.com.

I can't tell from @callahad's comment here if the "identical string" returned should be the original or the canonicalized version.

In my head, I can argue for both approaches. Just thought I'd ask and see if this is indeed intentional.

Thanks.

@stephank
Copy link
Member

This is something of a discussion we're having right now.

The original intention was that your email provider could tell Portier and the site you're logging into what the canonical version of the email address is. Now GMail doesn't do Portier just yet, so we do it for them, and normalize by removing dots.

Rolling this out, we discovered we broke a bunch of accounts, because people previously had accounts for [email protected], but were being treated as a new user [email protected] after the changes. And if we want email providers to integrate with Portier, things will break even more as email addresses suddenly change.

So we're considering what to do here. I was personally thinking we should revert the change, and stick to our own, documented normalization only. That's basically lowercasing the email address (accounting for non-latin and international domain names.)

I think we also have to figure out how to deal with this kind of breakage. Maybe we need to version the API?

@onli
Copy link
Member

onli commented Jun 24, 2018

Maybe we need to provide everything we have there:

  1. The original email as entered by the user
  2. The email as normalized by portier, with that being the one set as sub and email token field
  3. The email address as normalized by the IdP, if that is returned

We already return multiple emails in

pub fn create_jwt(app: &Config, email: &str, email_addr: &EmailAddress, aud: &str, nonce: &str) -> String {
, including the email_original and email_verified, so maybe this is just a matter of clearing up the protocol, the docs and adding a comment at that function into the broker code?

@mhofman
Copy link

mhofman commented Sep 13, 2018

I'm gonna continue here the conversation regarding identity handle I had with @onli in [https://github.com/portier/portier.github.io/issues/13#issuecomment-420946785](issue #13), since this is a better fitting issue.

When I tested Portier with an email in the form of [email protected], portier ended up showing I was logged in as [email protected], which I then learned is by design.

Well, design. It is by design, but it actually caused enough issues that we though about reverting it back, at least to only the normalization portier is doing. Though that's more about the . in the address, less about the +, that portier would so far remove as well, with the current code. But it's the same issue.

I assume you'd agree with that change.

I do as well, but it would mean quite some work for me making sure users of my services using portier don't get locked out of their account.

In my opinion it all comes down to why the email resolution was done in the first place? The design docs specifically mentions discovering a different email than the one entered by the user, and re-validating that one if it is different.

What was the reasoning behind this? Was it to make sure to find the "canonical" email of the user, which you expect to be stable in case the user ends up entering another email that ties to the same account?

I think that's where things get tricky. You have no assurance in OpenID Connect that the email claim of the user will not change over time. What happens in that case? Right now you'd just end up login in the user as a different identity.
To solve that problem, as suggested above, you have to disable this discovery and simply use the email provided by the user. It's probably fine as you do still verify that the user is in control of that email address since the authentication succeeds with it.

IMO, that's one of the reason why using the email as the account handle can be problematic.
It's a trade-off of stability (and potentially security) vs portability.

Which brings me to the other part of the discussion in the other issue:

The email identifier shouldn't be used as a stable handle. The reason is that if the domain expires and gets re-registered, the new owner would be able to steal the identity.

No, the new owner just would be the new owner of the identity. That's unavoidable imho. The only way around that is adding meatspace authentication factors, 2-factor or passwords again, bound to the service or broker itself. Binding it to the broker would require state and destroy the federalization, so that's no option. A site wanting to change what is an identity - which is who controls the email address right now - would need to introduce 2FA on its own. If wanting to address that concern, Portier should be extended to support that in its auth flow - the RP providing the second factor, Portier doing the rest of the work.

Here I wholly disagree. The concept of identity is tied to the user (or other logical entity). The authority / identity provider is in charge of verifying the identity of the the user, by whatever means it sees fit (2FA, password, other out-of-band verification). It then makes and assertion to the Relying Party (Portier broker in this case) that the user coming in is known as a certain handle. That handle (subject in the authentication) uniquely identifies that user. This is where the authentication should stop, the broker or relying party shouldn't have to perform any further authentication of the user's identity.

If the user/entity behind the identifier changes, so should the handle. Changing ownership of the domain would cause a new identity provider to be used, and thus a new subject. In Portier you end up putting back the identifier/email as a handle, and basically hide this change.

To be honest, a system based on opaque handles is probably incompatible with what Portier is trying to achieve, especially when mixing it with the built-in SMTP authentication fallback. Adding an identity provider to a domain that previously would authenticate through email verification, is basically the same as changing ownership of the domain, and there is no way to prove continued ownership of the identity.

But if you say that Portier is a way to easily verify ownership of an email identifier by a user, then not using opaque handles is fine. It's all about expectations. I think no one should consider Portier as a full identity provider, nor an identity broker. It can only make assertions about the ownership of an email address. In which case, removing the canonization of the email makes total sense.

@stephank
Copy link
Member

What was the reasoning behind this? Was it to make sure to find the "canonical" email of the user, which you expect to be stable in case the user ends up entering another email that ties to the same account?

That was the idea, yes. It was also a bad idea for when a users' provider 'upgrades' from SMTP fallback to implementing Portier IdP. Suddenly the IdP may start doing normalisation and break existing users' logins.

Portier specifies both sides (RP -> Broker & Broker -> IdP) in a single flow, so ideally the solution we come up with solves the problem for both.

So I think we should simply enforce servers (as both Broker & IdP) do what @onli suggested above:

  • email containing the original input normalised according to our spec only, no additional processing. This should then be the handle the client uses for their account.
  • email_original containing the original input exactly, which the client can optionally use for, e.g., session lookups. Its absence means it exactly matches email.
  • email_canonical which the server may optionally provide, but should not be used instead of one of the above. Its purpose is unspecified, and its absence also does not indicate anything specific.

I think that's where things get tricky. You have no assurance in OpenID Connect that the email claim of the user will not change over time. What happens in that case? Right now you'd just end up login in the user as a different identity.

The intention of OpenID Connect compliance was to ease adoption through libraries, not to guarantee any sort of compliance with existing providers. Portier tries to make stronger guarantees than OpenID Connect. But I don't think we can make the claim that any current OpenID Connect provider will just work.

But if you say that Portier is a way to easily verify ownership of an email identifier by a user, then not using opaque handles is fine. It's all about expectations. I think no one should consider Portier as a full identity provider, nor an identity broker. It can only make assertions about the ownership of an email address. In which case, removing the canonization of the email makes total sense.

I think that's correct. Portier just wants to be better email verification. I think a lot of currently in-the-wild services break the same way Portier does when domain/email ownership changes, so I personally at least never tried to tackle that issue with Portier.

@stephank
Copy link
Member

stephank commented Mar 20, 2019

This is still a pain point for us, so I was looking at this again. Even though I'm kind of waiting on Rust async/await to pick up Portier work again, I'm willing to work on this short term, because it's been incredibly annoying.

One of the things about email_canonical is that it requires an additional verification step, complicating the broker and protocol. This is because it allows to IdP to change the address, making us do discovery again:

// If the IdP actually changed the address, verify that it controls the new one.
if email_addr != data.email_addr {
let f = webfinger::query(&ctx.app, &email_addr).and_then(move |links| {

So I'm tempted to completely drop it. Doubly so because I believe it'll have very limited usefulness in practice.

The other issue I have is how we should communicate this transition. Even without the above, email will change again, and break logins for a whole bunch of people. I'm open to suggestions on this.

I'm going to try this on a branch and schedule a deploy/transition of our private broker at $work.

@stephank
Copy link
Member

stephank commented Mar 22, 2019

We've deployed this at $work, so will see if there are any issues. I believe there will be no impact for us, but that's because we only have private apps, with no registration. Some points from our analysis:

  • Client libraries don't really follow protocol at this moment.

    • The Node.js & PHP libraries use nonce+email as the session identifier, but ignore email_original and don't normalise themselves. The result is that any normalisation (by the broker) breaks login. ('Invalid nonce' error)
      EDIT: New releases fix this for Node.js & PHP.
    • The Python library uses only nonce as a session identifier and will accept any sub, so any normalisation by the broker will work.
    • We didn't investigate other libraries, but a quick search shows none under the Portier umbrella use email_original yet. (The proper session identifier is nonce+email_original.)
    • We should probably change the spec and standardise on sub instead of email for better OpenID Connect compatibility.
      EDIT: Or maybe not? sub could be used for a more natural identifier if the Portier spec is overlapped with something else, like added on top of an existing OpenID Connect implementation? We should at least add a sub requirement to the spec for compatibility, but maybe libraries shouldn't use it.
      EDIT 2: I think we pretty much standardised on email everywhere now.
  • Unlike the previous breakage where we introduced this normalisation, we are now going to do less normalisation (than ever). There are a bunch of scenarios that can result because of this change. Given an example address [email protected]:

    • Users with an existing account that were aware it is currently normalised to [email protected] can continue logging in using [email protected]. There is no lockout, at least, unlike the previous breakage.
    • Users with an existing account that are not aware it is currently normalised to [email protected] will not be able to login using [email protected]. On a public site, they may be asked to register again. On a private site, they may be told their account doesn't exist.
    • Implementations like our Node.js & PHP libraries currently fail on a login attempt like [email protected], because they can't deal with any kind of normalisation yet. There's a chance users have already been trained to login with [email protected].
      (EDIT: New releases fix this for Node.js & PHP.)
    • Implementations like our Node.js & PHP libraries will no longer break with login attempts like [email protected] that are no longer changed with normalisation, but will continue to break with login attempts like [email protected] that are still changed.
      (EDIT: New releases fix this for Node.js & PHP.)

@onli
Copy link
Member

onli commented Mar 22, 2019

Sounds like this is definitely an improvement. I wasn't aware of the Node & PHP clients breaking.

We haven't had many reports about people using portier in the wild, which makes me assume that pipes.digital is one of its biggest users. And that site is still small enough that I'm comfortable handling the support requests, or rather maybe fixing pooential issues beforehand.

@ootada
Copy link

ootada commented May 28, 2019

Php client v0.2.1 is unable to auth "[email protected]" and fails with "fail: Invalid or expired nonce".
Is there any plans to fix this?

@onli
Copy link
Member

onli commented Jun 4, 2019

@stephank did you see this?

@onli onli closed this as completed in #181 Jun 4, 2019
@stephank
Copy link
Member

@ootada While this PR fixes that, we haven't deployed it to the public broker yet.

@onli When do you think we should deploy this? It's especially users with a dot in a gmail address that will notice this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
5 participants