[new PEP] Use SPDX license expressions in Core package metadata #2
This was originally at python#1148 and is now closed.
Thanks for putting this together! The basic concept seems sound to me, but I think we should tread very softly on the deprecation side of things, since it doesn't gain consumers much (since they still need to deal with old releases that don't have this new field), but imposes a cost on all publishers, even those that aren't publishing projects that get consumed by large organisations.
pep-9999.rst
Outdated
The use of license-related classifiers in this field will be deprecated in the
future and its documentation has been updated accordingly. Tools are encouraged
to provide a warning when this field is used with license-related classifiers.
I don't like the idea of deprecating these fields, as SPDX is driven by the needs of large consumer organisations, and if they're not offering to pay project maintainers to update their licensing metadata (e.g. through a Tidelift subscription or a consulting contract), then it isn't reasonable to push new work onto publishers solely for the benefit of these organisations.
Instead, I'd prefer to see these sections say something along these lines:
License:
If the License-Expression field is present, publishing tools MUST NOT also populate the License field. However, for compatibility with existing publishing and installation processes, the License field SHOULD continue to be accepted if the License-Expression field is absent. Publishing tools MAY infer License-Expression from the provided License information if they are able to do so unambiguously.
Classifiers:
If the License-Expression field is present, publishing tools MUST NOT also provide any licensing related Classifiers entries. However, for compatibility with existing publishing and installation processes, licensing related Classifiers entries SHOULD continue to be accepted if the License-Expression field is absent. Publishing tools MAY infer License-Expression from the provided Classifiers entries if they are able to do so unambiguously.
However, no new licensing related classifiers will be added, with anyone requesting them being directed to use the License-Expression field instead.
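The rules suggested above could be checked by a publishing tool roughly as follows. This is only a minimal sketch, not text from the PEP: the function name `check_license_fields` and the dict-based view of the metadata are hypothetical, and `License-Expression` is the field name proposed in this thread.

```python
def check_license_fields(metadata):
    """Return a list of problems found in the license-related fields.

    `metadata` is a hypothetical dict view of the core metadata, e.g.
    {"License-Expression": "MIT", "Classifier": ["License :: ..."]}.
    """
    problems = []
    if not metadata.get("License-Expression"):
        # Legacy fields stay acceptable when License-Expression is absent.
        return problems
    if metadata.get("License"):
        problems.append(
            "License MUST NOT be set together with License-Expression"
        )
    # Core metadata spells the repeated classifier field "Classifier".
    license_classifiers = [
        c for c in metadata.get("Classifier", [])
        if c.startswith("License ::")
    ]
    if license_classifiers:
        problems.append(
            "license-related classifiers MUST NOT be set together "
            "with License-Expression"
        )
    return problems
```

A project that only sets the legacy `License` field passes unchanged, which is the low-impact behaviour argued for here.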
Note that omitting the deprecations doesn't actually make things any more complicated for metadata consumers, since they have to deal with at least the 1,428,826 project releases already on PyPI anyway, and none of those will have the License-Expression field.
The important part of this PEP is providing a way for folks that already care to be unambiguous about their licensing, and to offer a low impact migration path if they want to send PRs to other open source projects that they would also like to see clarify their licensing.
I'll second @ncoghlan.
Let's not make existing packages incompatible with Metadata 2.2 and instead add License-Expression as an opt-in unambiguous alternative to existing mechanisms. Having it be exclusive to License Classifiers and "License" metadata is a nice approach for that.
@ncoghlan you wrote:
I don't like the idea of deprecating these fields, as SPDX is driven by the needs of large consumer organisations,
and:
The important part of this PEP is providing a way for folks that already care to be unambiguous about their licensing, and to offer a low impact migration path if they want to send PRs to other open source projects that they would also like to see clarify their licensing.
Agreed.
Note that I think the use of SPDX here is not driven exclusively by the needs of large consumer organisations. Small development teams, authors and FOSS supporters all benefit from improved clarity in licensing.
But to your point, yes, it could be a new burden, so the next push no longer uses a License-Expression field but instead re-purposes the existing License field and provides immediate compatibility with v2.1 without any change.
Beyond this I have integrated your comments in the latest push.
@pombredanne I don't generally agree that there are significant benefits to regular people to mandate SPDX. Additionally, I don't think it's worth it to repurpose the existing License field.
If people want to fill in this field, then adding a SPDX-License-Expression tag makes sense.
Though, I'm tempted to suggest that you make this a versioned tag (like SPDX-3.0-License-Expression) because SPDX does not remain consistent on license tags.
@Conan-Kudo you wrote in #2 (comment):
@pombredanne Between the times that the expression grammar was changed (+ -> WITH, / -> OR, etc.),
From what I can see there never was any such change: the expression grammar was introduced in SPDX 2.0 and there are no material changes in version 2.1. I extracted the texts from each version for the Expression spec part and posted the two versions there https://gist.github.com/pombredanne/49b2b8699d15403bec21030fe359c797/revisions
- + is not replaced by WITH: both were always there.
- / was never a substitute for OR: you may be referring to the way this has been used in Rust Cargo manifests... but that's the Rust way and this was never specified in the SPDX specs.
the stupid case sensitivity issue (and vs AND)
Being a Pythonista I am with you there: everything should be in proper lower case and snake case! :D
That said that's minor since this is only a canonical form issue. In practice you can ignore case when parsing (e.g. being lenient on inputs) and output something with the proper case. The license-expression library does exactly this FWIW, so I do not see this as a problem.
and the change of the GNU license tags (GPL-2.0 -> GPL-2.0-only, GPL-3.0+ -> GPL-3.0-or-later, etc.),
I am siding with you there too. And I voiced my concerns publicly when that change happened in the fall of 2017. I was against it, but I was not in the majority so I accepted the community consensus. You should have voiced your concerns in public then ... I would have much appreciated the support. In the end the SPDX community showed deference to rms and the request of the FSF to use these updated ids for their own licenses. Even though I was against it (especially since I was in the middle of a major licensing clarification work with the linux kernel maintainers), when a license author (and someone as prominent in the community as rms himself) comes and kindly asks for changes on how their things are named, I think it is OK to make the change. If you still disagree, please bring your concern to the SPDX mailing list, as well as to rms and the FSF, but that ship has sailed now IMHO.
I've lost faith in the SPDX organization to keep this stable and reliable.
I do not recall that you joined the discussions at SPDX back then (but I have a crass memory, so forgive me if this is wrong), and that's really something that I would have appreciated as we are thinking alike on many of these topics and you seem to care about these.
That said, things are versioned and evolve in SPDX, they are not set in stone! The same way I expect software to evolve with bug fixes and new features. To the credit of the volunteers caring for the SPDX spec and license list, things are versioned and they are trying hard to keep backward compatibility AND ensure that new and retired license ids are never reused and have a clear mapping. I think that's quite OK in general. I am not sure what else I could
openSUSE, Rust, and the Linux kernel all implement SPDX license identifiers differently based on when they pulled the rules. And that's the most frustrating part of it all! Suddenly many things failed validation when they used to pass because the tools and data were updated to invalidate them.
I can understand your frustration, but the SPDX spec and ids list are versioned for a good reason: to cope with changes. You cannot blame tools that may not work yet with newer versions of the specs or that stop supporting previous versions. This is no different from code: there are at times some major changes and they may break compatibility. That said, tools can handle that alright: for instance the license-expression library that I maintain is perfectly happy with past, current and future versions of the SPDX license list as well as mixing all versions together (and FWIW with any list of license symbols you can feed it with), so this means that it is possible for a tool to deal with updates in an orderly way.
That said, if I dive in the specifics:
- For openSUSE, I can see the guidelines here https://en.opensuse.org/openSUSE:Packaging_guidelines#Licensing and they may be a tad underspecified as this references SPDX license "short names" as opposed to an "identifier" but that's quite minor.
There is also a mapping table at https://docs.google.com/spreadsheets/d/14AdaJ6cmU0kvQ4ulq9pWpjdZL5tkR03exRSYJmPGdfs/pub which looks to me as bringing order by mapping several legacy openSUSE ids to a single SPDX id, such as for AGPL... and I see that as a good thing. Now you seem much closer than I am to openSUSE so you likely know better.
AGPL-3.0-only Affero GPL
AGPL-3.0-only AGPL-3.0
AGPL-3.0-only AGPLv3
AGPL-3.0-or-later AGPLv3+
AGPL-3.0-or-later AGPL-3.0+
AGPL-3.0-or-later SUSE-AGPL-3.0+
- Rust is very clear on what they use and that sounds very clean and consistent to me.
@wking @dwijnand If you could comment on the Rust/Cargo adventure with SPDX license expressions that would be awesome!
See https://doc.rust-lang.org/cargo/reference/manifest.html#package-metadata
And https://github.com/rust-lang/cargo/blob/fe0e5a48b75da2b405c8ce1ba2674e174ae11d5d/src/doc/src/reference/manifest.md#L254
And https://github.com/rust-lang/cargo/blame/fe0e5a48b75da2b405c8ce1ba2674e174ae11d5d/src/doc/src/reference/manifest.md#L254
# This is an SPDX 2.1 license expression for this package. Currently
# crates.io will validate the license provided against a whitelist of
# known license and exception identifiers from the SPDX license list
# 2.4. Parentheses are not currently supported.
#
# Multiple licenses can be separated with a `/`, although that usage
# is deprecated. Instead, use a license expression with AND and OR
# operators to get more explicit semantics.
license = "..."
- For the Linux kernel, I have been on the front line as I helped Greg Kroah-Hartman, Thomas Gleixner and Kate Stewart to streamline the kernel licensing by using SPDX expressions in source code and the FSFE reuse conventions. The bulk of the initial work took place in fall 2017 just when the FSF requested the GPL id change @ SPDX... this has been handled nicely and without many hiccups by pinning the version of the licenses being used. There are tons of discussions on lkml on this and many patches since thousands of files were changed, but tell me if you want some pointers on specifics.
That said you may be referring to things like this thread https://lists.opensuse.org/opensuse-factory/2018-02/threads3.html https://lists.opensuse.org/opensuse-factory/2018-02/msg00464.html
As I mentioned here, I was on the front line when that happened and I am with you on the issue. I explained here why this happened with rms and the FSF. And I was against it, but that was not the consensus. It would have been great for folks from openSUSE to chime in at the time.
I don't want that for Python, at all.
I highly respect your opinions there: since you care and want something different for Python, may I kindly suggest that you submit a PR on top of this PR/branch with your suggested updates or specific comments to evolve it?
Or if you think this PEP is completely off base and wrong, would you mind starting a concurrent PEP with an alternative proposal so that we can all discuss and review both proposals to work out something constructive?
@pombredanne I think it's perfectly fine to allow people to put SPDX expressions in License, but adding a field that can be optionally used that must be SPDX compliant means that if that field is detected, software can guarantee they can process it that way.
We could be explicit in the packaging documentation about which exact version of the SPDX license list we support at any time. Would this work out for you?
I'm aware that the identifiers list is actually versioned. What I'm saying is that this PEP should specify a version and have a local copy of it, and updating the version of the identifiers should require a revision to this PEP.
EDIT: You did in fact ask me this, and the answer is yes, it would work for me. I'd prefer a local copy embedded because historically it's been a pain to request specific versions of the identifiers, and having a local copy avoids that problem. I want updating to new SPDX identifier versions to require PEP updates.
And for what it's worth, I have been subscribed to the main SPDX mailing list (I subscribed when I was more interested in migrating Fedora to SPDX identifiers as part of my Fedora Rust SIG work), and I voiced my concern about the change when it happened as well. That was the straw that broke the camel's back for me, as the discussion did not resolve the issue well for me. I would have probably been less annoyed if it was easy to directly request specific versions of the identifier list for machine parsing.
@Conan-Kudo thanks for your answer details... I was afk for a while and I am back.
You wrote in #2 (comment):
I want updating to new SPDX identifier versions to require PEP updates.
That's reasonable (this is more or less what we ended up doing for the Linux kernel: we store the list of valid identifiers in the kernel doc)... but at the same time, this means an update to a foundational document (the metadata doc/PEP) which is fairly significant and not to be done lightly.
To me the key thing would be how often would this possibly happen in the future? The rate at which new FOSS and related licenses evolve is slow enough. Here are some anecdotes:
- SPDX adds about 20 new licenses per year
- In ScanCode this is more like 1 or 2 per week.
That said, the newly added licenses are mostly either old, seldom used licenses that were not yet "discovered" or new licenses that are not much used as of now. So in 99% of the cases the new licenses could be qualified as exotic.
Therefore, I think we could freeze a version of the SPDX license list in the PEP, and the need for an update should be rather rare (maybe once in three years or so).
@Steap @ncoghlan @pfmoore @cjerdonek what would you think about this? This would mean being strict in this section: https://github.com/pombredanne/spdx-pypi-pep/pull/2/files#diff-7a25ca1769914c1141cb5c63dc781f32R223 and specifying that we use a defined version of the list and that adopting a future version would require an update to the metadata doc and version.
- The positive: there is no ambiguity about which licenses ids are supported
- The negative: adopting a new version of the license list every other year would require a new PEP, which is a disruption but that would be every couple years or so only.
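Freezing the list could look like the following on the tool side. This is a rough sketch under stated assumptions: the constant names and the helper `is_supported_license_id` are hypothetical, version 3.6 is simply the list version mentioned in the draft, and the set holds only a handful of identifiers where a real copy would embed the complete list.

```python
# Assumption: illustrative names and a tiny excerpt of the license list.
PINNED_SPDX_LIST_VERSION = "3.6"
PINNED_LICENSE_IDS = frozenset({
    "Apache-2.0",
    "BSD-3-Clause",
    "GPL-2.0-only",
    "GPL-2.0-or-later",
    "MIT",
})


def is_supported_license_id(license_id):
    """Check one identifier against the frozen, locally stored list."""
    return license_id in PINNED_LICENSE_IDS
```

With this shape, adopting a newer list means changing both the version constant and the embedded identifiers in one reviewed update, which matches the "update requires a PEP revision" idea.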
@Conan-Kudo you wrote in #2 (comment)
I think it's perfectly fine to allow people to put SPDX expressions in License, but adding a field that can be optionally used that must be SPDX compliant means that if that field is detected, software can guarantee they can process it that way.
My personal inclination, as documented in the current version, is to avoid field inflation and reuse the license field. There is no ambiguity at all about whether this contains a parsable SPDX license expression or not. Since the field is in use and a warning would be issued, it provides the proper gentle nagging that will help authors evolve towards more accurate license documentation. A field that's new and optional is likely to have a lower impact and create a bigger disruption:
We have today two fields used for license (and this is confusing). And we would go to three fields all optional if we add a new one, a likely source of more confusion for authors IMHO.
With that said, if there is a consensus to use a separate field, I will update the draft to use that instead.
@Steap @ncoghlan @pfmoore @cjerdonek @pradyunsg what's your last take on this topic?
This and the freezing of the list of licenses discussed in #2 (comment) are IMHO the last two objections/concerns to address and resolve before moving this PEP to an official draft.
Let's do these discussions in dedicated issues for each of these points? It'd be weird to discuss these inline on a PR.
Also requested the same at https://discuss.python.org/t/improving-license-clarity-with-better-package-metadata/2154/64.
@ncoghlan @pradyunsg Thank you ++ for your review.
I have pushed an updated version
pep-9999.rst
Outdated
* Updated the documentation of two fields: ``License`` and ``Classifiers``


License Expression Library Reference implementation
opinion: It is important to separate the discussion of the standard from the discussion about the implementation.
This section (and the discussion of a library being developed for licensing) should probably be dropped from the PEP -- it's not relevant to the metadata update and going into too much detail of how-to-do-this, unnecessarily "fixing" what we would be doing here. eg: the library could well be implemented independently and not be a part of the pypa organization here.
@pradyunsg you wrote:
opinion: It is important to separate the discussion of the standard from the discussion about the implementation.
This section (and the discussion of a library being developed for licensing) should probably be dropped from the PEP -- it's not relevant to the metadata update and going into too much detail of how-to-do-this, unnecessarily "fixing" what we would be doing here. eg: the library could well be implemented independently and not be a part of the pypa organization here.
This makes sense, though I added the section about a reference implementation based on a comment from @pfmoore
FWIW, the library already exists and is not part of PyPA.
I went back and forth and ended up keeping this as a reworded reference implementation section for now. That said, I am quite happy to remove that section if there is a consensus that it does not belong there.
Note to self: distutils docs define the
and also this interesting bit referencing the English spelling of
There is also the "UNKNOWN" business... See https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Lib/distutils/dist.py#L1189 I am not sure "unknown" is still a thing with newer packaging tools though it likely is still used at least by setuptools based on https://github.com/pypa/setuptools/blob/375138c7a477278ee7bcc5e4d78cbe243ef5c008/setuptools/monkey.py#L104
Some typos I spotted on a skim. :)
I like the direction this has taken!
pep-9999.rst
Outdated
- making the existing `License` and new `License-File` fields mandatory
including stricter enforcement in tools and Pypi publishing.

- restricting the upload of packages to the public Pypi index to the packaes
packaes -> packages
Worth noting: SPDX license list has metadata for whether the license is approved by OSI and FSF.
Fixed in 86eb7a8
pep-9999.rst
Outdated
- SPDX license ids are becoming a de-facto way to reference common licenses
everywhere, whether or not a license expression syntax is used. But they often
need to be supplemented with extra license ids or conventions to accept extra
or generic licenses such as "Proprietary" or "Public domain" not tracked by
- Public domain should be argued based on https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files
- I haven't ever seen a good rationale for why Proprietary needs to be there. It feels that in medium-size companies the legal department tells the tech department things, but nothing concrete.
- There's also "Add “NONE” to the license expression syntax" (spdx/spdx-spec#49), i.e. you could say NONE to mean there is no license (and again, no lawyer has ever explained to me how "All rights reserved" differs from license = NONE in this technical context).
- I don't see how LicenseRef-PublicDomain or LicenseRef-Proprietary are any worse. They are however valid SPDX license expressions, so generic tools/libraries will understand them.
- Also, if there's some actual proprietary license, not "All rights reserved", then LicenseRef-OurCompanyLicense is a valid license expression and more correct and descriptive.
LicenseRef-OurCompanyLicense is valid license expression
I spent a few minutes looking at the SPDX site and couldn't find any confirmation of this (not saying you're incorrect, but rather that the information is hard to find) and there's no way I'd have known to even look for an expression like this if all I knew was "I need to record that the package I'm publishing is licensed under our company license".
My concern here is that if it's too hard for people to find a reasonable thing to put in this metadata, they'll end up not bothering, and just supplying a LICENSE.txt file, or if the field is made mandatory we'll get endless support requests on the packaging tracker.
Look at appendix iv in https://spdx.org/spdx-specification-21-web-version
Some examples:
LicenseRef-23
LicenseRef-MIT-Style-1
DocumentRef-spdx-tool-1.2:LicenseRef-MIT-Style-2
Unfortunately, indeed https://spdx.org/ids-how doesn't mention the LicenseRef and DocumentRef use cases directly. That's something to be raised with SPDX people.
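To make the shape of these references concrete, here is a small sketch that recognises them. It follows my reading of the grammar in appendix IV of the SPDX 2.1 specification (an id string is one or more letters, digits, "-" or "."); the helper name `is_license_ref` is hypothetical, and the check says nothing about whether such a reference is meaningful outside a full SPDX document.

```python
import re

# idstring per my reading of SPDX 2.1 appendix IV:
# one or more letters, digits, "-" or ".".
_IDSTRING = r"[A-Za-z0-9.\-]+"
_LICENSE_REF = re.compile(
    rf"(?:DocumentRef-{_IDSTRING}:)?LicenseRef-{_IDSTRING}"
)


def is_license_ref(symbol):
    """Return True if `symbol` is shaped like a LicenseRef reference."""
    return _LICENSE_REF.fullmatch(symbol) is not None
```

All three examples listed above are accepted by this check, while a bare word such as Proprietary is not.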
@pfmoore if a company went to the cost of writing its own proprietary license, it's not hard for them to figure out what SPDX expression they should use for it.
I repeat: SPDX is a universal standard, having a Python-specific deviation won't really help anyone.
@phadej FWIW, I happen to be one of the SPDX co-founders though I speak here exclusively with my Python hat on and not on behalf of SPDX.
LicenseRef-XXX license ids are only valid within the context of a full SPDX document and not outside in a solo expression like we are considering here. So using LicenseRef-Proprietary is no more valid than using Proprietary in this context, and the latter is much simpler to write down, remember and specify.
There are some ongoing discussions at SPDX to define license id "namespaces" to cope with this, but this is not fully there yet. In the meantime, there are not many other ways than to use expressions with extra non-SPDX-listed ids (which is what npm, Suse and ClearlyDefined do for now).
@pf_moore you wrote:
My concern here is that if it's too hard for people to find a reasonable thing to put in this metadata, they'll end up not bothering, and just supplying a LICENSE.txt file, or if the field is made mandatory we'll get endless support requests on the packaging tracker.
Exactly: which is why I am not making anything mandatory and I find that using a generic Proprietary for anything off SPDX is simpler.
pep-9999.rst
Outdated
'''''''''''''''''''''''''

A `License Expression` is a string using the SPDX license expression syntax as
documented in the SPDX specification [#spdx]_ using either Version 2.1
I'd explicitly discourage use of the + combinator. Recent license lists added GPL-2.0-or-later and GPL-2.0-only identifiers, and deprecated GPL-2.0. I don't remember proper rationale, but that change removed most needs for +.
I stand corrected: https://spdx.org/ids-how it's an issue with FSF license only.
The + is mostly abandoned at this stage and the main area where this was showing up was for GNU licenses and as you rightly pointed out SPDX changed these to -only and -or-later based on requests from FSF.
I don't remember proper rationale, but that change removed most needs for +.
That was a request from rms and the FSF. See https://www.gnu.org/licenses/identify-licenses-clearly.html and then https://www.fsf.org/blogs/rms/rms-article-for-claritys-sake-please-dont-say-licensed-under-gnu-gpl-2
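A tool that wants to accept the older spellings could rewrite them along these lines. This is a sketch only: the helper name `modernize_gnu_id` is hypothetical, the pattern covers just the GPL, LGPL and AGPL families, and treating a bare version such as GPL-2.0 as "-only" follows the renaming described in this thread (GPL-2.0 -> GPL-2.0-only, GPL-3.0+ -> GPL-3.0-or-later), not an author's actual intent.

```python
import re

# Matches deprecated GNU identifiers such as "GPL-2.0", "LGPL-2.1+"
# or "AGPL-3.0"; already-suffixed ids like "GPL-2.0-only" do not match.
_DEPRECATED_GNU_ID = re.compile(r"^((?:A|L)?GPL-\d+\.\d+)(\+?)$")


def modernize_gnu_id(license_id):
    """Rewrite a deprecated GNU id to its -only / -or-later form."""
    match = _DEPRECATED_GNU_ID.match(license_id)
    if match is None:
        return license_id  # not a deprecated GNU id, leave untouched
    base, plus = match.groups()
    return f"{base}-or-later" if plus else f"{base}-only"
```

Identifiers outside these families pass through unchanged, so the helper can be applied to every symbol of an expression.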
pep-9999.rst
Outdated
- any SPDX-listed license short-form identifiers that are published in the
SPDX License List [#spdxlist]_ using either Version 3.6 of this list or any
later compatible version. Note that the SPDX working group never removes any
license identifiers: instead they may only one as obsolete.
When we (Haskell/Hackage) took SPDX into use for the license field, we didn't include any identifiers already deprecated in the first version we used (IIRC license list 3.0).
Fortunately for us, suffix-less GPL-2.0 was already deprecated. So one has to explicitly write GPL-2.0-only or GPL-2.0-or-later.
pep-9999.rst
Outdated
with type, file an text keys. This is mandatory unless there is a LICENSE or
LICENCE fie provided.

- Haskell Cabal [#cabal]_ specifies a single string with a list of accepted
This is incorrect. The license field is documented at https://cabal.readthedocs.io/en/latest/developing-packages.html#pkg-field-license
- it's not a list, it's a proper SPDX License Expression (with an additional NONE)
Cabal used to have its own short list of licenses, but we moved to SPDX because of:
- more licenses
- expressions to combine them
Thank you ++ for your review!
I fixed this in 3290f56
I also added your name to the Acknowledgement section
pep-9999.rst
Outdated
When processing the `License` field to determine if it contains a valid license
expression, tools:

- MUST ignore the case of the `License` field.
What's the rationale to this?
I mean, the choice is irrelevant as there are no ambiguities.
Anecdotally all miss-cases cases I'm aware of were actual mistakes, so IMO one COULD report warning if non-canonical casing is used.
spdx/spdx-spec#63 is a related issue.
> What's the rationale for this?
Accept anything even if there is a case change, since the case really does not matter.
> Anecdotally, all the mis-cased cases I'm aware of were actual mistakes,
Indeed!
> so IMO one COULD report a warning if non-canonical casing is used.
Good point: reporting a warning is a good way to handle this.
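For illustration only, the behaviour agreed on here (accept any casing, but warn when it is not canonical) could look roughly like the sketch below. The identifier table and the function name `normalize_license_id` are hypothetical and are not taken from the PEP or from any existing tool; a real tool would load the full SPDX license list.

```python
import warnings

# Hypothetical, tiny subset of canonical SPDX license identifiers.
CANONICAL_IDS = ("MIT", "Apache-2.0", "GPL-2.0-only", "GPL-2.0-or-later")

# Lookup table keyed by the lowercased identifier.
_BY_LOWER = {spdx_id.lower(): spdx_id for spdx_id in CANONICAL_IDS}


def normalize_license_id(value):
    """Return the canonical spelling of a single identifier, ignoring case.

    Emits a UserWarning when the input differs from the canonical casing and
    raises ValueError when the identifier is not in the table.
    """
    stripped = value.strip()
    canonical = _BY_LOWER.get(stripped.lower())
    if canonical is None:
        raise ValueError(f"unknown license identifier: {value!r}")
    if stripped != canonical:
        warnings.warn(f"non-canonical casing {value!r}; use {canonical!r}")
    return canonical
```

With this table, the deprecated suffix-less `GPL-2.0` is rejected rather than silently accepted, which matches the Hackage experience mentioned earlier in this review.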
pep-9999.rst
Outdated
Several package authors have expressed difficulty and/or frustrations with the
possibilities to express licensing in package metadata. This also applies to
Liux distribution packagers. This has triggered several license-related
s_Liux_GNU/Linux_
Could we also mention *BSD? And since this is also an issue with Macports, maybe we could find a more "generic" term. How about "package maintainers in various operating systems"?
no package uses them in PyPI as of the writing of this PEP.

The remainder of the `Classifiers` using a `License::` prefix map to a simple
single license expression using the corresponding SPDX license identifiers.
I wrote https://framagit.org/upt/upt-pypi/blob/master/upt_pypi/licenses.py#L15 . Should we provide a ready-to-use mapping in an annex?
@Steap that would be great. Would you mind updating the draft directly and adding yourself as a co-author? Your call... but that would be great!
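As a rough idea of what such a ready-to-use mapping could look like, here is a small sketch. The handful of entries below is an illustrative subset chosen for this example; it is not the mapping from upt-pypi or from the PEP, and the helper name `spdx_ids_from_classifiers` is hypothetical. Classifiers that do not map to exactly one identifier (for instance the unversioned Apache one) are reported separately rather than guessed.

```python
# Illustrative subset only; NOT the mapping from upt-pypi or from the PEP.
CLASSIFIER_TO_SPDX = {
    "License :: OSI Approved :: MIT License": "MIT",
    "License :: OSI Approved :: ISC License (ISCL)": "ISC",
    "License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0)": "MPL-2.0",
    "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)":
        "GPL-3.0-or-later",
}


def spdx_ids_from_classifiers(classifiers):
    """Split classifiers into (mapped SPDX ids, unmapped license classifiers).

    Classifiers that are not license-related are ignored.
    """
    mapped, unmapped = [], []
    for classifier in classifiers:
        if not classifier.startswith("License ::"):
            continue
        spdx_id = CLASSIFIER_TO_SPDX.get(classifier)
        if spdx_id is None:
            unmapped.append(classifier)
        else:
            mapped.append(spdx_id)
    return mapped, unmapped
```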
:::::::::::::::::::::::::::

The License-File is a string that is a package-root relative path to a license
file. The license file content __must__ be UTF-8-encoded text.
Counter-proposal: it is a simple filename for a license file in the dist-info directory.
I see license as part of the metadata, not the program code, so it seems it would be better to have all the info inside dist-info.
(Does your program want to access the license file to display it? Use functions in importlib.metadata.)
On the mailing list, it was suggested to reuse the rules for RECORD; with that, it would be a path relative to the site-packages directory, which makes it easy to point to project-0.42.dist-info/license.txt.
@merwok that's exactly what should be written here. You are entirely correct, and this is actually a bug that I am about to fix in a few:
- wheel accepts root-relative paths in license_files. For instance:
    [metadata]
    license_files=
        LICENSE
        foo/bar/NOTICE
- this will produce a dist-info with these two files:
    LICENSE
    NOTICE
So the behaviour is what you are advocating for. And there is one oddity you made me discover in wheel!
When we have this setup.cfg:
    [metadata]
    license_files=
        LICENSE
        etc/LICENSE
the built wheel will contain only one LICENSE file, with the content of etc/LICENSE, i.e. the last reference to a filename wins.
I think this is reasonable behaviour and it could be mentioned in the PEP for reference.
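To make the "last reference wins" effect concrete, here is a minimal model of the flattening described in this comment. It is only a sketch of the observed behaviour, not wheel's actual code; the function name `flatten_license_files` is made up for the example, and paths are assumed to use forward slashes.

```python
import posixpath


def flatten_license_files(license_files):
    """Map each dist-info file name to the source path that ends up in it.

    Source paths are reduced to their base name, so a later entry with the
    same base name replaces an earlier one.
    """
    flattened = {}
    for source in license_files:
        flattened[posixpath.basename(source)] = source
    return flattened
```

Called with `["LICENSE", "etc/LICENSE"]`, the result has a single `LICENSE` key whose value is `etc/LICENSE`, which mirrors what was observed in the built wheel.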
@merwok this also highlights another issue. With this setup.cfg:
    [metadata]
    license_files=
        LICENSE
        license
a wheel will have both:
    LICENSE
    license
That's on a POSIX filesystem that's case-sensitive for paths.
But if that wheel were installed on a case-insensitive FS, such as on Windows and more recently macOS APFS, one of the two files may get overwritten, which is IMHO not an acceptable outcome for METADATA. IMHO we need to specify that each License-File entry must be unique ignoring case. That will be for tools to honor (and for wheel to be fixed accordingly).
@agronholm and @pfmoore what's your take on this?
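A check along the lines proposed here ("unique ignoring case") could be sketched as follows. The helper name `find_case_collisions` is hypothetical, and `str.casefold()` is used as one reasonable approximation of case-insensitive comparison; actual filesystems apply their own folding rules, so a tool may need to be stricter.

```python
from collections import defaultdict


def find_case_collisions(license_files):
    """Return groups of entries that differ only by case.

    Each group lists the original spellings in input order; an empty result
    means every entry is unique when case is ignored.
    """
    groups = defaultdict(list)
    for path in license_files:
        groups[path.casefold()].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

A build tool could refuse to produce a wheel whenever this returns a non-empty list.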
Note that "the paths of license files in the source tree" is different information from "the paths of the license files copied to the dist-info directory". Hence, while setup.cfg has license_files, it is not the same information as will be stored in the dist-info metadata.
There is one dist-info file that’s not defined by PEP 376: entry_points.txt
https://packaging.python.org/specifications/entry-points/#file-format
It is not referenced from METADATA but is present in RECORD, so maybe it’s precedent enough!
(the naming doesn’t use the all-caps style because the spec was retrofitted from a setuptools invention)
Good point, it works as a precedent for shipping license files - but as you say, it doesn't indicate how we reference other files from metadata. Also (and this is something else the proposal here needs to consider) what if someone tries to ship a license file called entry-points.txt? Yes, I know it's a silly thing to do, but standards need to cover edge cases...
I think if we're going to standardise shipping license files (as opposed to the original scope of this PEP which was just about specifying what license was in use), we probably need to reserve a namespace in .dist-info - say that all licenses must go under .dist-info/licenses, or something.
On a procedural note, by the way, this discussion is getting too complex to be handled just on the tracker, it should be part of the main discussion thread. @pombredanne could you summarise the discussion so far, and post that summary to the Discourse thread to allow others to comment? Thanks.
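As a sketch of the reserved-namespace idea, a tool could compute install locations as shown below. The `licenses` directory name comes from the suggestion above; keeping the source-relative subdirectories and rejecting paths that escape the namespace are assumptions made for this example, and `license_target` is a hypothetical helper, not part of any existing tool.

```python
import posixpath


def license_target(dist_info_dir, source):
    """Return where a license file would be stored under <dist-info>/licenses/.

    The source-relative path is preserved, so a file named METADATA or
    entry_points.txt cannot clobber the real metadata files.
    """
    normalized = posixpath.normpath(source)
    escapes = normalized == ".." or normalized.startswith("../")
    if posixpath.isabs(normalized) or escapes or normalized == ".":
        raise ValueError(f"invalid license file path: {source!r}")
    return posixpath.join(dist_info_dir, "licenses", normalized)
```

Under these assumptions, `LICENSE` and `etc/LICENSE` land at different paths, so the "last reference wins" collision no longer occurs either.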
@pfmoore Let me summarize all the discussion over the weekend and start a new Discourse topic.
> what if someone tries to ship a license file called entry-points.txt
FWIW, you can do that alright, but it is even worse than that: I can overwrite dist-info/METADATA with an arbitrary file by adding a METADATA entry to the license_files setup.cfg section :|
This is a wheel bug alright; I will tackle it as such, and it is not a topic for this PEP.
What is a topic is that License-File as currently specified is NOT a solution, for sure.
> Counter-proposal: it is a simple filename for a license file in the dist-info directory.
I actually think the PEP's current approach is much better than including the license file like this.
A case that comes to mind is a Sphinx theme I'm working on, which is gonna be under the MIT license and vendors material-icons with its own, different license in the same directory (copyrighted to Google). Referencing the specific files in the distribution is much better IMO, since I would like my metadata to not indicate that the entire project is copyrighted by Google. :)
I think what's in the PEP right now is a much more capable mechanism to handle such instances.
It's not that simple. Some licences are applied by using a comment header and attaching multiple files (e.g. LGPL needs a file with full GPL text and a file with full LGPL text, some other licenses need LICENSE and NOTICE). While merging the text to one file is technically possible, it's non-standard.
I prefer Rejected Idea 1 over what the PEP currently proposes, but other than that, I think this looks basically ready to go into the PEPs repository. :)
@pombredanne Nudge. Consider going ahead and submitting this PEP (see PEP 1 for the details like sponsorship). We can definitely iterate on this as it moves forward even after the PEP is submitted, but it'll be nice to have a PEP number to link to from discussions about licensing in Python packages. :-)
ee50c6f to 1ab29fb
You might also want to update the base branch here. :)
Signed-off-by: Philippe Ombredanne <[email protected]>
pypa/packaging-problems#41 Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Reported-by: Aliaksei Urbanski @Jamim Signed-off-by: Philippe Ombredanne <[email protected]>
- Refactor intro with new and improved abstract, scope, non-scope, motivation and rationale sections
- Add new Backwards Compatibility, Security and How to Teach sections
- Move Reference Implementation out of appendix as its own section
- Add new Rejected ideas section
- Add new License Expression example using setuptools in Appendix
Reported-by: Chris Jerdonek @cjerdonek
Signed-off-by: Philippe Ombredanne <[email protected]>
Reported-By: Pradyun Gedam <[email protected]> Signed-off-by: Philippe Ombredanne <[email protected]> Co-Authored-By: Pradyun Gedam <[email protected]>
Reported-By: Pradyun Gedam <[email protected]> Signed-off-by: Philippe Ombredanne <[email protected]> Co-Authored-By: Pradyun Gedam <[email protected]>
Reported-By: Pradyun Gedam <[email protected]> Signed-off-by: Philippe Ombredanne <[email protected]> Co-Authored-By: Pradyun Gedam <[email protected]>
Reported-By: Pradyun Gedam <[email protected]> Signed-off-by: Philippe Ombredanne <[email protected]> Co-Authored-By: Pradyun Gedam <[email protected]>
Reported-by: Nick Coghlan @ncoghlan Signed-off-by: Philippe Ombredanne <[email protected]>
Reported-by: Nick Coghlan @ncoghlan Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
The case does not matter, but there is a canonical case: if the case is not the standard canonical case, tools should issue a warning. Reported-by: Oleg Grenrus @phadej Signed-off-by: Philippe Ombredanne <[email protected]>
Reported-by: Oleg Grenrus @phadej Signed-off-by: Philippe Ombredanne <[email protected]>
Cabal uses both expressions and license files as proposed in this PEP Reported-by: Oleg Grenrus @phadej Signed-off-by: Philippe Ombredanne <[email protected]>
Reported-by: Oleg Grenrus @phadej Signed-off-by: Philippe Ombredanne <[email protected]>
This helps ensure that the expression is fully parseable by a conforming license expression processor Reported-by: Oleg Grenrus @phadej Reported-by: Nick Coghlan @ncoghlan Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Reported-by: Nick Coghlan @ncoghlan Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Reported-by: Pradyun Gedam @pradyunsg Signed-off-by: Philippe Ombredanne <[email protected]> Co-authored-by: Pradyun Gedam <[email protected]>
Use latest SPDX spec 2.2 and SPDX license list 3.10 Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
1ab29fb to 5722ba0
Reported-by: Miro Hrončok @hroncok Signed-off-by: Philippe Ombredanne <[email protected]>
The belated PEP PR has been submitted at last at python#1625!
This is a PR for a new PEP to improve the license information in Core metadata which is now more than ready to enter
The discussion has been taking place at https://discuss.python.org/t/improving-license-clarity-with-better-package-metadata/2154
This has been triggered by several recent or older discussions in particular:
Signed-off-by: Philippe Ombredanne [email protected]