Add versioning information to "public_suffix_list.dat" file #1808
Comments
You can do the same thing in git with There are no tags so |
@smarnach is this possible? |
Git doesn't ship with an $id$ equivalent feature. Instead, you are encouraged to leverage SHAs generated by Git itself. In order to embed external information, like the SHA or any other ID, we would need to pre-process the file before it is committed. This is generally the responsibility of a CI/pipeline that we don't have. I am not inclined to add such complexity in the file itself when this is within the repo, as it would be redundant since we can leverage git. Ideally, the tagging should happen in the pipeline that processes the list for distribution at https://publicsuffix.org/list/public_suffix_list.dat. Although these days I even question whether we still need such a distribution mechanism, or whether we should instead just rely on Git hosting. For consumers that need/want version tagging, the current solution would be to switch towards pulling the list directly from the repo. I've actually been doing it for years in the library I maintain, here's an example: https://github.com/weppos/publicsuffix-go/blob/a20f9abcc222b049ef9b7a28845bac88e0155ae3/publicsuffix/generator/gen.go#L24-L49 |
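For readers who want to try pulling the list directly from the repo at a known commit, here is a minimal sketch. It assumes the file is reachable through GitHub's raw content host under the `publicsuffix/list` repository. The helper names `pinned_list_url` and `fetch_pinned_list` are hypothetical, and this is not the code from the generator linked above.

```python
import re
import urllib.request

# Assumption: GitHub serves raw repository files at this host and path layout.
RAW_URL = "https://raw.githubusercontent.com/publicsuffix/list/{sha}/public_suffix_list.dat"


def pinned_list_url(sha: str) -> str:
    """Build the raw-file URL for one specific commit of the list."""
    # Require a full 40-character SHA so the URL can never follow a moving branch.
    if not re.fullmatch(r"[0-9a-f]{40}", sha):
        raise ValueError("expected a full 40-character lowercase commit SHA")
    return RAW_URL.format(sha=sha)


def fetch_pinned_list(sha: str, timeout: float = 30.0) -> bytes:
    """Download the list exactly as it was at the given commit."""
    with urllib.request.urlopen(pinned_list_url(sha), timeout=timeout) as resp:
        return resp.read()
```

Recording the SHA next to the downloaded file gives a consumer the version tag without any change to the file itself.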
I believe that the .dat file instructs that it only be pulled from the
publicsuffix.org URL in order to utilize CDN/cloud services.
Taking note here of the value of this suggestion, I wonder if we couldn't
add automation that adds a date to the file itself in plaintext within the
initial header comment section when merging.
I believe this would be valuable towards Universal Acceptance.
As an example, a date would make it abundantly clear to someone how stale
their list is if they incorporate it in a static manner.
Looking at #1807 as an example:
WhatsApp would know more clearly that they have an 8-year-old copy of the
PSL in use, from 2015.
…On Tue, Aug 1, 2023, 1:44 AM Simone Carletti ***@***.***> wrote:
Git doesn't ship with an $id$ equivalent feature. Instead, you are
encouraged to leverage SHAs generated by Git itself.
In order to embed an external information, like the SHA or any other ID,
we would need to pre-process the file before being committed. This is
generally the responsibility of a CI/pipeline that we don't have.
I am not inclined to add such complexity in the file itself when this is
within the repo, as it would be redundant since we can leverage git.
Ideally, the tagging should happen in the pipeline that processes the list
for distribution at
https://publicsuffix.org/list/public_suffix_list.dat
Although these days I even question whether we still need such
distribution mechanism and we shouldn't instead just rely on Git hosting.
For consumers that need/want version tagging the current solution would be
to switch towards pulling the list directly from the repo. I've actually
been doing it for years in the library I maintain, here's an example:
https://github.com/weppos/publicsuffix-go/blob/a20f9abcc222b049ef9b7a28845bac88e0155ae3/publicsuffix/generator/gen.go#L24-L49
|
Cloud Storage returns the date the list was last modified in the Last-Modified header, so anyone is free to post-process the file when downloading it via the CDN. It would also be easy to modify the deployment workflow to include the date in the file when uploading the data. From an operational point of view, I don't have any concerns about doing this, so it's up to you to make the call here, @weppos and @dnsguru. I'm happy to make the required changes if you want me to. |
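As a rough illustration of the post-processing mentioned above, the sketch below reads the `Last-Modified` header with a HEAD request and converts it to a UTC timestamp. The function names are hypothetical, and the sketch assumes the CDN answers HEAD requests with that header.

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

PSL_URL = "https://publicsuffix.org/list/public_suffix_list.dat"


def parse_last_modified(value: str) -> datetime:
    """Convert an HTTP date such as 'Tue, 01 Aug 2023 08:44:00 GMT' to UTC."""
    return parsedate_to_datetime(value).astimezone(timezone.utc)


def fetch_last_modified(url: str = PSL_URL, timeout: float = 30.0) -> Optional[datetime]:
    """Return the server-reported modification time, or None if the header is absent."""
    request = urllib.request.Request(url, method="HEAD")  # headers only, no body
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        value = resp.headers.get("Last-Modified")
    return parse_last_modified(value) if value else None
```

A consumer could store the returned timestamp alongside its local copy and use it later to judge how stale that copy is.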
I specifically pointed out that it does indeed do precisely this. It's part of the It doesn't affect git clones, although you could invoke that machinery pretty easily:
|
Because the gTLD list from ICANN's JSON has a timestamp in it, and that's the most often updated element, I'd assert that "Solution Exists" if one were to track that as the last date. It does not account for deltas that occur between auto-pulls from ICANN, but due to the frequency of those, and their priority of processing ahead of subdomain projects, this works itself out relatively well. |
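A small sketch of tracking that timestamp from a local copy follows. It assumes the gTLD section of the file is introduced by a comment of the form `// List of new gTLDs imported from <url> on <ISO timestamp>Z`. That wording is an assumption about the file layout, not something quoted in this thread, and `gtld_import_time` is a hypothetical helper.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Assumed layout of the comment that introduces the ICANN gTLD section.
GTLD_LINE = re.compile(
    r"^// List of new gTLDs imported from \S+ on (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z\s*$",
    re.MULTILINE,
)


def gtld_import_time(psl_text: str) -> Optional[datetime]:
    """Return the gTLD import timestamp found in the list text, if any."""
    match = GTLD_LINE.search(psl_text)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    return stamp.replace(tzinfo=timezone.utc)  # the trailing Z marks UTC
```

As noted above, this only reflects the last ICANN import, so changes merged between imports would not move the date.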
In reviewing #1855 / #1856 - in order to avoid confusion about versions in security reports, which would further drain disposable volunteer resources in hunting them down, we may want to tie doing these things together:
I have seen salient arguments for doing both and also for doing neither, but it seems a datestamp would be a prerequisite should we decide to implement a security policy. |
Would you be interested in an implementation of the git-archive side of this on the theory that it causes no harm to have this literal text in the file:
and under some conditions, at least, it would be a benefit since it would actually contain:
|
We have updated the deployment pipeline to include version information like this:
I think this should make string comparison on the version do the right thing (@danderson wdyt?), and if somebody wants to know the actual commit ID in the repo, that is available as well. Please let me know if you see issues with this or can think of a use case that this doesn't solve! If nothing comes up, we will try to roll this out next week. |
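To illustrate why plain string comparison can work for a version stamp, here is a minimal sketch. The `// VERSION: ` header name and the zero-padded `YYYY-MM-DD_HH-MM-SS_UTC` layout are assumptions made for this example, since the actual pipeline output is not reproduced in this thread. The point is only that a fixed-width, most-significant-field-first timestamp sorts lexicographically in chronological order.

```python
from typing import Dict, Optional

VERSION_PREFIX = "// VERSION: "  # assumed header name, for illustration only


def read_version(text: str) -> Optional[str]:
    """Return the version string from the leading comment block, if present."""
    for line in text.splitlines():
        if line.startswith(VERSION_PREFIX):
            return line[len(VERSION_PREFIX):].strip()
        if line and not line.startswith("//"):
            break  # first rule reached: the header comment block is over
    return None


def newest(copies: Dict[str, str]) -> Optional[str]:
    """Pick the name of the copy with the greatest version string."""
    versions = {}
    for name, text in copies.items():
        version = read_version(text)
        if version is not None:
            versions[name] = version
    # Plain string max() is enough because the assumed format is fixed-width.
    return max(versions, key=versions.get) if versions else None
```

With a stamp of this shape, finding the most current of several copies on disk reduces to reading one header line per file.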
…ages - log for processed public suffix list - MD5 and SHA-512 - number of bytes, lines and rules - commit date and git hash, cf. publicsuffix/list#1808
It would be nice to have some sort of (automatic) versioning information directly inside the "public_suffix_list.dat" file. Currently it is practically impossible to determine which file is the most current from a set of multiple "public_suffix_list.dat" on disk. This probably also could be useful for libpsl to determine what the "latest" is.
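With CVS or SVN we could add
// $Id$
as the first line of the file and the problem would solve itself (svn may need a propset depending on the configuration). The source control system would then automatically insert the current version and/or date during checkout. (I'm not too familiar with git and whether it has a similar feature.)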
as the first line of the file and the problem would solve itself (svn may need a propset depending on the configuration). The source control system would then automatically insert current version and/or date during the checkout (I'm not too familiar with git and if it has a similar feature or not.)The text was updated successfully, but these errors were encountered: