ISO Schematron iso-schematron.rng license issue #65

Open
hjoukl opened this issue Jul 13, 2023 · 9 comments
Comments

@hjoukl

hjoukl commented Jul 13, 2023

Hi,

pardon me if this isn't the proper place to report such an issue - what would be?

The renowned lxml Python XML library has used the "skeleton" Schematron implementation to provide ISO Schematron support since ~2009.

Lately, Fedora[1] and RHEL[2], and probably soon SUSE, have been stripping lxml's ISO Schematron parts due to the license in iso-schematron.rng (https://github.com/lxml/lxml/blob/4bfab2c821961fb4c5ed8a04e329778c9b09a1df/src/lxml/isoschematron/resources/rng/iso-schematron.rng) "being unclear and potentially non-Free"[3].

Note: This is lxml's vendored copy of the formerly available RelaxNG schema for schematron, added to lxml years ago. Looks like it's still available in its compact form in this repo linked on schematron.com: https://github.com/Schematron/schema/blob/655a641bb8fe21ec4fa7b1a82498c43bc70a3bf0/schematron.rnc, with the same license header.

See the lxml mailing list (https://mail.python.org/archives/list/[email protected]/message/XZZAANG3Y2EMTVTQ66AH7WKB7N4VILUP/) and issue tracker (https://bugs.launchpad.net/lxml/+bug/2024343) reports on this situation.

lxml tries to mitigate that by optionally running without support for RelaxNG-validation of the schematron schema in use.
I.e. you can now remove the iso-schematron.rng file with the "offending" license and still run lxml.isoschematron functionality, albeit without validating the schematron schema itself.
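
A minimal sketch of what that looks like from the caller's side, assuming lxml 4.9.3 or later. The `validate_schema` keyword and the `schematron_schema_valid_supported` module flag are named here as I understand them from the lxml changelog; please verify both against your installed version. The helper names `schema_validation_supported` and `make_validator` are hypothetical and not part of lxml.

```python
def schema_validation_supported() -> bool:
    """Report whether lxml can RelaxNG-validate a Schematron schema.

    Returns False if lxml is missing, predates the flag, or was packaged
    without iso-schematron.rng.
    """
    try:
        from lxml import isoschematron
        # Assumed flag name (lxml >= 4.9.3); missing attribute means "no".
        return bool(isoschematron.schematron_schema_valid_supported)
    except (ImportError, AttributeError):
        return False


def make_validator(schema_tree):
    """Build a Schematron validator, checking the schema itself only if possible."""
    # Imported lazily so the helper above still works when lxml is absent.
    from lxml import isoschematron

    return isoschematron.Schematron(
        schema_tree,
        # Assumed keyword (lxml >= 4.9.3): skip the schema-for-Schematron check
        # when the RelaxNG file has been stripped by the distributor.
        validate_schema=schema_validation_supported(),
    )


if __name__ == "__main__":
    print("schema-for-Schematron validation available:", schema_validation_supported())
```

With the RelaxNG file removed, documents are still validated against the Schematron rules; only the check that the Schematron schema itself is well-formed Schematron is skipped.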

Is there any chance to get this dependency properly re-licensed (or the license text reworded unambiguously), i.e. with a license acceptable to Linux distributors such as Fedora and RHEL, who ship lxml in their distributions?

I wouldn't even know who'd be in the position to do this, if possible. The original author? The ISO org?

Could one reimplement the schematron schema from scratch (if one had access to the standards documents, which aren't publicly available any more, without buying from ISO)? Or maybe there's an alternative open source schema-for-schematron out there somewhere, e.g. an XSD?

Any insights appreciated.

Best regards,
Holger

EDIT: Correct formatting to properly show footnotes with links.

Footnotes

  1. https://src.fedoraproject.org/rpms/python-lxml/c/9d95f5a04edc386313fa854541971b3af07bcae1

  2. https://gitlab.com/redhat/centos-stream/rpms/python3.11-lxml/-/commit/7f6d5f61df3d053b7cc392f03b12f059fb97a4a3

  3. https://gitlab.com/fedora/legal/fedora-license-data/-/issues/154

@hjoukl hjoukl changed the title from "ISO Schematron license issue" to "ISO Schematron iso-schematron.rng license issue" on Jul 13, 2023
@tgraham-antenna
Member

This probably is the best place for you to report an issue with the license text.

I'm not involved with ISO, but I think there's zero chance that ISO will revise the version before last of the standard to update the license text. You might, however, influence the license wording in the next version, if the WG so decides.

https://github.com/Schematron/schema/blob/655a641bb8fe21ec4fa7b1a82498c43bc70a3bf0/schematron.rnc is a copy of the 2020 version of the schema, with a correction added by the ISO editor. The Schematron organisation hosts unofficial copies of the Schematron schemas so that people don't have to copy-and-paste from their PDFs of the Schematron standard (from an idea by @susi-wunsch at Schematron/schematron#15 (comment)). After all, permission is granted to "distribute free of charge".


You could generate a schema by using a utility such as trang to generate a schema from a bunch of Schematron documents and then clean that up a little based on your understanding of the standard.


The comment at https://gitlab.com/fedora/legal/fedora-license-data/-/issues/154#note_1273444092 includes "while the license does not require inclusion of the license in copies", but the license includes "The following permission notice and disclaimer shall be included in all copies of this XML schema ("the Schema"), and derivations of the Schema:"

@tgraham-antenna
Member

You could generate a schema by using a utility such as trang to generate a schema from a bunch of Schematron documents and then clean that up a little based on your understanding of the standard.

There are also utilities (Oxygen XML Editor has one) that can generate sample documents from a schema. To make sure that you have all of the elements and attributes, you could generate documents from the schema, then generate a schema from the documents.

Generated schemas tend to have loose content models and have lots of CDATA attributes, so they tend to need fix-up to be more useful.
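
As a rough illustration of the "infer a schema from documents" idea, here is a small standard-library sketch that only collects which child elements and attributes were observed for each Schematron element. It is not what trang does internally, and the sample document, the function names, and the output shape are all made up for this example. It also shows why inferred models are loose: they record what was seen, not what the standard allows or requires.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample input; a real run would use many Schematron documents.
SAMPLE = """\
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern id="p1">
    <rule context="book">
      <assert test="title">A book needs a title.</assert>
      <report test="@draft">Draft book found.</report>
    </rule>
  </pattern>
</schema>
"""


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def inventory(documents):
    """Collect observed child elements and attributes per element name."""
    children = defaultdict(set)
    attributes = defaultdict(set)
    for text in documents:
        root = ET.fromstring(text)
        for elem in root.iter():  # includes the root element itself
            name = local_name(elem.tag)
            attributes[name].update(elem.attrib)
            children[name].update(local_name(child.tag) for child in elem)
    return children, attributes


children, attributes = inventory([SAMPLE])
for name in sorted(children):
    print(name, "children:", sorted(children[name]), "attributes:", sorted(attributes[name]))
```

Nothing here captures ordering, cardinality, or attribute datatypes, which is the manual fix-up work mentioned above.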

@hjoukl
Author

hjoukl commented Jul 13, 2023

Thanks for sharing your valuable insights! Really appreciated.

I'm not involved with ISO, but I think there's zero chance that ISO will revise the version before last of the standard to update the license text. You might, however, influence the license wording in the next version, if the WG so decides.

I feared so. ;-)

https://github.com/Schematron/schema/blob/655a641bb8fe21ec4fa7b1a82498c43bc70a3bf0/schematron.rnc is a copy of the 2020 version of the schema, with a correction added by the ISO editor.
[...] After all, permission is granted to "distribute free of charge".

That's basically lxml's interpretation too - we can provide iso-schematron.rng since lxml distributes it free of charge.
Which seems to be the exact problem for the Linux distros since they regularly (and rightfully) charge for their distribution and support.

Re generating the schema-for-schematron: interesting idea. I do have plenty of Oxygen XML experience from times past, mainly working with XSDs. So that might indeed be a way to kickstart an alternative schema. Still, I suppose you'd need to fine-tune it manually and would need access to the standards PDFs for this.

Thanks again,
Holger.

@rjelliffe
Member

rjelliffe commented Jul 14, 2023 via email

@rjelliffe
Member

rjelliffe commented Jul 14, 2023 via email

@hjoukl
Author

hjoukl commented Jul 14, 2023

Thanks for chiming in @rjelliffe!

The problem is not the license, but the ignorance of the original reviewer, AFAICS. It would be better to stop the problem at source. Regards Rick

Probably. I'm not a lawyer though, and common sense seems not to be generally applicable when it comes to legal matters.

I take it by original reviewer you mean the Fedora reviewer (lawyer?) who qualified the license as being ambiguous ("susceptible of at least four interpretations") and failing Fedora license criteria?

The license is exactly the same as the standard SGML license as used by public entity sets WITHOUT PROBLEM for 35 years. Are they going to remove all SGML and XML distros for the same reason?

I just noticed that the iso-schematron.rng originally included in lxml was a different/previous version. The older version carried a different license notice:

"(c) International Organization for Standardization 2005.
Permission to copy in any form is granted for use with conforming
SGML systems and applications as defined in ISO 8879,
provided this notice is included in all copies."

Might that be the standard SGML license you're referring to, probably with an updated copyright year?
It looks identical to the one contained in Berners-Lee's HTML IETF RFC (https://www.ietf.org/rfc/rfc1866.txt, e.g. section 9.7.2).
And who'd have thought I'd have to dig around there. :-)

I take it at some point in time lxml upgraded iso-schematron.rng to a later version, from the 2016 schematron standard (in commit lxml/lxml@92901bd).

So that might mean that the original license was changed for the 2016 schematron version (by ISO?).
Makes me wonder if "copy in any form" allows for modification (copy in modified form?). If that was the case maybe the 2016/2020 schematron changes/upgrades could be sat on top of that.

But I'm very much out of my depth wrt licensing here.

Best regards,
Holger

@AndrewSales
Collaborator

So that might mean that the original license was changed for the 2016 schematron version (by ISO?).

Just to follow up, I believe the only change was to the date.

@hjoukl
Author

hjoukl commented Jul 31, 2023

Just to follow up, I believe the only change was to the date.

Quoting the different license texts here:

The originally lxml-included iso-schematron.rng (a) carried this license notice:

<!--
         (c) International Organization for Standardization 2005. 
        Permission to copy in any form is granted for use with conforming 
        SGML systems and applications as defined in ISO 8879, 
        provided this notice is included in all copies.
-->

Whereas the current version from the 2016 schematron standard ((b) introduced in commit lxml/lxml@92901bd) has this:

<!-- Copyright © ISO/IEC 2015 -->
<!--
  The following permission notice and disclaimer shall be included in all
  copies of this XML schema ("the Schema"), and derivations of the Schema:
  
  Permission is hereby granted, free of charge in perpetuity, to any
  person obtaining a copy of the Schema, to use, copy, modify, merge and
  distribute free of charge, copies of the Schema for the purposes of
  developing, implementing, installing and using software based on the
  Schema, and to permit persons to whom the Schema is furnished to do so,
  subject to the following conditions:
  
  THE SCHEMA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
  THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
  OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
  ARISING FROM, OUT OF OR IN CONNECTION WITH THE SCHEMA OR THE USE OR
  OTHER DEALINGS IN THE SCHEMA.
  
  In addition, any modified copy of the Schema shall include the following
  notice:
  
  "THIS SCHEMA HAS BEEN MODIFIED FROM THE SCHEMA DEFINED IN ISO/IEC 19757-3,
  AND SHOULD NOT BE INTERPRETED AS COMPLYING WITH THAT STANDARD".
-->

Which is very much the same as the license header in https://github.com/Schematron/schema/blob/655a641bb8fe21ec4fa7b1a82498c43bc70a3bf0/schematron.rnc (c):

# Copyright © ISO/IEC 2017
# The following permission notice and disclaimer shall be included in all 
# copies of this XML schema ("the Schema"), and derivations of the Schema: 

# Permission is hereby granted, free of charge in perpetuity, to any 
# person obtaining a copy of the Schema, to use, copy, modify, merge and 
# distribute free of charge, copies of the Schema for the purposes of 
# developing, implementing, installing and using software based on the 
# Schema, and to permit persons to whom the Schema is furnished to do so, 
# subject to the following conditions: 

# THE SCHEMA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL 
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
# OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
# ARISING FROM, OUT OF OR IN CONNECTION WITH THE SCHEMA OR THE USE OR 
# OTHER DEALINGS IN THE SCHEMA. 

# In addition, any modified copy of the Schema shall include the following 
# notice: 

# "THIS SCHEMA HAS BEEN MODIFIED FROM THE SCHEMA DEFINED IN ISO/IEC 19757 3, 
# AND SHOULD NOT BE INTERPRETED AS COMPLYING WITH THAT STANDARD".

So indeed (b) and (c) have the same license text, apart from the copyright year. But the iso-schematron.rng version included in lxml initially (a) has a different license text, which seems to be the "standard SGML license".
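
For anyone who wants to repeat the comparison mechanically, here is a small sketch using Python's `difflib`. It strips the XML and RNC comment markers, collapses whitespace, and reports only the lines that differ. The two inputs are just the opening lines of headers (b) and (c) as quoted above, not the full license texts, and the function names are illustrative.

```python
import difflib

# Opening lines of header (b), from iso-schematron.rng, as quoted in this thread.
RNG_HEADER = """\
<!-- Copyright © ISO/IEC 2015 -->
<!--
  The following permission notice and disclaimer shall be included in all
  copies of this XML schema ("the Schema"), and derivations of the Schema:
-->
"""

# Opening lines of header (c), from schematron.rnc, as quoted in this thread.
RNC_HEADER = """\
# Copyright © ISO/IEC 2017
# The following permission notice and disclaimer shall be included in all
# copies of this XML schema ("the Schema"), and derivations of the Schema:
"""


def normalize(header: str) -> list[str]:
    """Remove comment markers, collapse whitespace, and drop empty lines."""
    lines = []
    for raw in header.splitlines():
        text = raw.replace("<!--", "").replace("-->", "").strip()
        text = text.lstrip("#")
        text = " ".join(text.split())
        if text:
            lines.append(text)
    return lines


def changed_lines(a: str, b: str) -> list[str]:
    """Return only the removed ('- ') and added ('+ ') lines of the diff."""
    diff = difflib.ndiff(normalize(a), normalize(b))
    return [line for line in diff if line.startswith(("- ", "+ "))]


for line in changed_lines(RNG_HEADER, RNC_HEADER):
    print(line)
```

Running the same helper over the complete headers from the actual files would be the real check; this excerpt only demonstrates the normalization step that makes an XML comment and an RNC comment comparable.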

@AndrewSales
Collaborator

AndrewSales commented Jul 31, 2023

I can see that the 2016 text differs from what appeared in the first edition of the ISO standard.

IIRC, I took over as project editor in 2016 when the text was at FDIS (Final Draft International Standard) stage and the earliest draft I worked on already had this text in place.
