-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DecodedURL.to_uri is inconsistent with DecodedURL.normalize, .child, etc #144
Comments
Yeah the wrapped surface area is not the clearest. I'm curious, what were you planning on doing with the URL once you got it? The docstring of
But why one wouldn't go straight to |
Perhaps that's an argument to deprecate It would obliquely dodge the compatibility jam by removing the API instead of changing it in a way that's hard to reconcile. |
@mahmoud
It converts to a URL object rather than text or bytes directly because you still might want to manipulate it in that state, or to inspect its differences. I like what @wsanchez is saying here though, in the sense that we should probably have better names for these. At the time, neck-deep in protocol specifications, I am struggling to come up with something even marginally succinct, and I hate these names, but just as a starting point for discussion: |
It's true that I think this is a low-urgency issue precisely because 99.9% of uses of |
Interesting. Well, my first instinct is to add an |
I was going to suggest |
And still deprecate |
I like that, too. |
Just wanted to point out that we should not have both "as_text" and "to_text" :). And we need more than a text-based manipulation here, the thing the methods do (give you a This is an interesting bikeshed, though. Another possible color: |
Or ... |
|
Apparently the RFC calls them |
(okay okay I'm actually done for now) |
The RFC names make more sense than the status quo to me |
Ooops, sorry about that. Tab order strikes again. |
@glyph confused the hell out of me above. Oh, my bad. No, I don't was to add I mean to propose that we keep (Which is to say that I think "text" is a fine synonym for "Unicode".) |
Ah. Whew.
I'm fairly certain that the equivalent of the RFC's
I think the idea of the twisted-style names was for compatibility only. I would not suggest adding both on any new functionality.
Normally I'd agree but there are a number of confounding nuances here. One is that we aren't talking about literal Python types, but rather encoding notations, and also that "unicode" is specifically in contrast to "ascii" in this case... which is also text. Just different text. |
Re-reading this I realized I didn't quite answer @mahmoud's question above:
When you issue a request for GET /foo%E2%87%A7bar/
Host: xn--mgba3a4fra.example.com This requires that the HTTP protocol implementation disassemble the URL into its constituent parts, then decode various different bits of it as ASCII; the hostname, the path, the query, and not the fragment. And of course, to issue a matching DNS query and TLS certificate validation. But only the On the flip side, if the HTTP implementation wants to take those bytes and give you the answer to "what URL was requested", it has to do this same process in reverse. Now, arguably, the server should hand you a totally pristine URL object, as unmodified as possible from the wire; but you may want to read some decoded unicode stuff out of its constituent parts as well, so the There's a case to be made, I guess, that P.S.: While I was attempting to construct an example URI to comment on this issue, I happened across this web page, which causes a lot of problems with Hyperlink due to this bug. Hyperlink's behavior is arguably even worse than IDNA's, since we let you construct the "invalid" (not really invalid) URL, but then we don't let you convert it. |
I was pretty surprised to discover just now that
DecodedURL.to_uri
returns aURL
object, rather than aDecodedURL
. Given that almost all the other methods wrap the passed-through object, why does this one not?I'm not sure how we get out of the compatibility jam if we decide this is wrong, but it seems wrong to me.
The text was updated successfully, but these errors were encountered: