Define hosts' public suffix and registrable domain. #391
Changes from 3 commits
e679f89
cbf9063
6ea048d
cc03e7b
bd35e7e
2be718c
8828de3
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -272,6 +272,94 @@ for further processing. | |
U+0020 SPACE, U+0023 (#), U+0025 (%), U+002F (/), U+003A (:), U+003F (?), U+0040 (@), U+005B ([), | ||
U+005C (\), or U+005D (]). | ||
|
||
<p>A <a for=/>host</a>'s <dfn for=host export>public suffix</dfn> is the portion of a | ||
<a for=/>host</a> which is controlled by a registrar, public or otherwise. To obtain | ||
<var>host</var>'s <a for=host>public suffix</a>, run these steps: | ||
|
||
<ol> | ||
<li><p>If <var>host</var> is not a <a>domain</a>, then return null. | ||
|
||
<li><p>Return the <a for=host>public suffix</a> obtained by executing the | ||
<a href="https://publicsuffix.org/list/">algorithm</a> defined by the Public Suffix List on | ||
<var>host</var>. [[!PSL]] | |
</ol> | ||
|
||
<p>A <a for=/>host</a>'s <dfn for=host export>registrable domain</dfn> is a <a>domain</a> that could | ||
be registered at a registry. To obtain <var>host</var>'s <a for=host>registrable domain</a>, run | ||
"is its public suffix including one domain label preceding its public suffix". Again, boring, but factual?
So a given host may have multiple public suffixes expressed within it. Perhaps: From a spec question, what do you want this definition to entail for the That is,
I seem to recall that different platform features interpret that differently (navigation vs cookies, for example)
I wasn't really aware of this case. Do you know why they interpret it differently? I guess we want consistent answers with cookies, WebAuthn, etc. If by navigation you mean the address bar, it seems consistency with that would not matter that much.
@mikewest wrote that the registrable domain would be null in such a case (we have github.io as example).
I’d need to reaudit the Chrome code to figure out which cases are web visible. The results differ in this case based on whether or not you include private suffixes and whether you treat wildcards as implicit entries of the parent. Chrome and FF differ on the latter, and the former is specified by the caller. |
||
these steps: | ||
|
||
<ol> | ||
<li><p>If <var>host</var>'s <a for=host>public suffix</a> is null or <var>host</var>'s | ||
<a for=host>public suffix</a> <a lt=concept-host-equals>equals</a> <var>host</var>, then return | ||
null. | ||
|
||
<li><p>Return the <a for=host>registrable domain</a> obtained by executing the | ||
<a href="https://publicsuffix.org/list/">algorithm</a> defined by the Public Suffix List on | ||
<var>host</var>. [[!PSL]] | |
</ol> | ||
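The two sets of steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the Public Suffix List algorithm itself: `RULES` is a hypothetical four-entry stand-in for the real list, wildcard and exception rules are omitted, and the helper names (`is_domain`, `public_suffix`, `registrable_domain`) are illustrative only. Hosts are assumed to be already parsed, that is, lowercase and in A-label form.

```python
import ipaddress
from typing import Optional

# Hypothetical, tiny stand-in for the Public Suffix List (assumption).
# The real list is far larger and also has wildcard and exception rules.
RULES = {"com", "io", "github.io", "xn--kgbechtv"}


def is_domain(host: str) -> bool:
    """Rough check: treat empty hosts and IP address literals as non-domains."""
    if not host:
        return False
    try:
        ipaddress.ip_address(host.strip("[]"))
    except ValueError:
        return True
    return False


def public_suffix(host: str) -> Optional[str]:
    """Return the longest rule matching the end of host, or None for non-domains."""
    if not is_domain(host):
        return None
    labels = host.split(".")
    # Try the longest candidate first, so the rule with the most labels wins.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in RULES:
            return candidate
    # No rule matched: fall back to the implicit "*" rule (the last label).
    return labels[-1]


def registrable_domain(host: str) -> Optional[str]:
    """Return the public suffix plus the one label preceding it, or None."""
    suffix = public_suffix(host)
    if suffix is None or suffix == host:
        return None
    suffix_label_count = suffix.count(".") + 1
    labels = host.split(".")
    return ".".join(labels[-(suffix_label_count + 1):])
```

With these assumed rules, `registrable_domain("www.example.com")` evaluates to `"example.com"` and `registrable_domain("github.io")` evaluates to `None`, since the host equals its own public suffix.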
|
||
<div class=example id=example-host-psl> | ||
<table> | ||
<tr> | ||
<th>Host input | ||
<th>Public suffix | ||
<th>Registrable domain | ||
<tr> | ||
<td><code>com</code> | ||
<td><code>com</code> | ||
<td> | ||
We should change this to
I would suggest
Why? We don't use that convention anywhere.
Example tables like this are special. We omit the quotes, substitute strings for structs, and use other conventions meant for visual clarity and not consistency. I certainly don't think we should capitalize "null" here, and I think italicizing it so that it's clear it's not just a registrable domain named Shrug. Just a thought. |
||
<tr> | ||
<td><code>example.com</code> | ||
<td><code>com</code> | ||
<td><code>example.com</code> | ||
<tr> | ||
<td><code>www.example.com</code> | ||
<td><code>com</code> | ||
<td><code>example.com</code> | ||
<tr> | ||
<td><code>sub.www.example.com</code> | ||
<td><code>com</code> | ||
<td><code>example.com</code> | ||
<tr> | ||
<td><code>EXAMPLE.COM</code> | ||
This is not a host, but input to the host parser.
I think it's helpful to point out that no matter how folks spell the URL, it's going to be normalized. Perhaps shifting this table to include a URL rather than a host would make that point, especially for the punycode bits?
I think it's fine to just list hosts, but we should label it "host input" or some such, to not confuse it with host as a concept, which is already parsed and normalized. |
||
<td><code>com</code> | ||
<td><code>example.com</code> | ||
<tr> | ||
<td><code>github.io</code> | ||
<td><code>github.io</code> | ||
<td> | ||
<tr> | ||
<td><code>whatwg.github.io</code> | ||
<td><code>github.io</code> | ||
<td><code>whatwg.github.io</code> | ||
<tr> | ||
Isn't this row duplicated? The previous one looks the same. |
||
<td><code>whatwg.github.io</code> | ||
<td><code>github.io</code> | ||
<td><code>whatwg.github.io</code> | ||
<tr> | ||
<td><code>إختبار</code> | ||
Same as above. And also applies below. |
||
<td><code>xn--kgbechtv</code> | |
<td> | ||
<tr> | ||
<td><code>example.إختبار</code> | ||
<td><code>xn--kgbechtv</code> | |
<td><code>example.xn--kgbechtv</code> | |
So one of the things is the PSL doesn't specify whether or not it returns U-Label or A-Label (that's left to the implementation). I'm curious the documentation here for the A-Label - is this an expectation of the contract? That is, are you trying to show that either U-Label or A-Label can be returned regardless of U-Label or A-Label input, or are you trying to state that A-Labels should be the consistent return?
Currently we don't rely on this anywhere (assuming it's consistent to be one or the other, is that at least required?), but A-label seems preferable as that'd be consistent with how the platform exposes URLs and origins overall. I suspect this will only matter if we add an API, but it really depends on whether PSL dependencies keep getting added or not. |
||
<tr> | ||
<td><code>sub.example.إختبار</code> | ||
<td><code>xn--kgbechtv</code> | |
<td><code>example.xn--kgbechtv</code> | |
</table> | ||
</div> | ||
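The "host input" column in the table holds strings that have not yet been through the host parser, which is why `EXAMPLE.COM` and the Arabic-script rows produce lowercase A-label results. A rough sketch of that normalization follows. It assumes Python's built-in `idna` codec, which implements IDNA 2003 rather than the processing the URL Standard's host parser performs, so results can differ for some inputs; `normalize_host_input` is a hypothetical helper name.

```python
def normalize_host_input(host_input: str) -> str:
    """Lowercase the input and convert non-ASCII labels to A-labels.

    Assumption: uses Python's built-in "idna" codec (IDNA 2003), which only
    approximates what a URL host parser does.
    """
    # The codec leaves pure-ASCII labels untouched, so lowercase explicitly.
    return host_input.lower().encode("idna").decode("ascii")


# The Arabic-script label from the table, written with escapes to avoid
# bidirectional-text rendering surprises in source code.
ARABIC_TEST_LABEL = "\u0625\u062e\u062a\u0628\u0627\u0631"

print(normalize_host_input("EXAMPLE.COM"))
print(normalize_host_input("example." + ARABIC_TEST_LABEL))
```

The output of such a step, not the raw input, is what the public suffix and registrable domain steps operate on.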
|
||
<p>Two <a for=/>hosts</a>, <var>A</var> and <var>B</var>, are said to be | |
<dfn for=host export>same site</dfn> with each other if either of the following statements is true: | |
|
||
<ul class=brief> | ||
<li><p><var>A</var> <a lt=concept-host-equals>equals</a> <var>B</var> | ||
<li><p><var>A</var>'s <a for=host>registrable domain</a> is <var>B</var>'s | ||
<a for=host>registrable domain</a> and is not null. | ||
</ul> | ||
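The same site check above can be sketched as follows. This is an illustrative sketch only: the registrable domain lookup is passed in as a callback, and `DEMO_REGISTRABLE_DOMAINS` is a hypothetical table standing in for a real Public Suffix List backed implementation.

```python
from typing import Callable, Optional


def same_site(
    a: str,
    b: str,
    registrable_domain: Callable[[str], Optional[str]],
) -> bool:
    """Return True if hosts a and b are same site with each other."""
    # First statement: the hosts are equal.
    if a == b:
        return True
    # Second statement: the registrable domains match and are not null.
    domain_a = registrable_domain(a)
    return domain_a is not None and domain_a == registrable_domain(b)


# Hypothetical lookup table (assumption), mirroring rows of the example table.
DEMO_REGISTRABLE_DOMAINS = {
    "com": None,
    "example.com": "example.com",
    "www.example.com": "example.com",
    "github.io": None,
    "whatwg.github.io": "whatwg.github.io",
}

print(same_site("www.example.com", "example.com", DEMO_REGISTRABLE_DOMAINS.get))
print(same_site("whatwg.github.io", "github.io", DEMO_REGISTRABLE_DOMAINS.get))
```

Note that the equality check comes first, so a host whose registrable domain is null (such as `github.io` in this table) is still same site with itself.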
|
||
|
||
<h3 id=idna>IDNA</h3> | ||
|
||
|
Instead of "controlled by a registrar, public or otherwise" we could say "included on the Public Suffix List". This is boring, but factual and correct as I understand it.
I think that works (as boring as it is)