Create tool model files from spdx-3-model files #708

Closed
wants to merge 5 commits into from

@davaya davaya commented Jun 26, 2023

Adds two scripts:

  1. load_model.py parses model files from the spdx-3-model repo and saves the full model as a single modelTypes.json file. This script is complete and may be generally useful.
  2. make-model-classes.py reads modelTypes.json and creates class files from it. This is currently a skeleton that generates only boilerplate class files, to illustrate what needs to be developed.

@goneall
Member

goneall commented Jun 26, 2023

Thanks @davaya - Can you attach a sample output modelTypes.json file? I may attempt something similar for the Java tools and would like to see an example.

@davaya
Author

davaya commented Jun 26, 2023

@goneall it's a straight translation of the markdown to JSON (skipping the summary/description text), plus four derived metadata properties, each prefixed with an underscore, to make it easier to process.

The ad-hoc "External properties restrictions" heading is included verbatim, with no attempt to munge it into an identifier. No other such headings exist in the model but if they did they would also be preserved.

https://gist.github.com/davaya/9f931c4b4cb36aa9dead21b4ee45b4a5
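
As an illustration of how a consumer might use the underscore convention described above, here is a minimal sketch that separates model content from derived metadata. The entry layout and the names `_file` and `_profile` are hypothetical placeholders, not the actual property names in modelTypes.json; only the underscore prefix and the "External properties restrictions" heading come from this thread.

```python
def split_derived(entry: dict) -> tuple[dict, dict]:
    """Split one snapshot entry into (model content, derived metadata).

    Assumes derived properties are exactly the keys starting with "_".
    """
    content = {k: v for k, v in entry.items() if not k.startswith("_")}
    derived = {k: v for k, v in entry.items() if k.startswith("_")}
    return content, derived


# Hypothetical entry: the keys "_file" and "_profile" are made up for this example.
entry = {
    "Metadata": {"name": "Element"},
    "Properties": {"spdxId": {"type": "xsd:anyURI"}},
    "External properties restrictions": {},  # ad-hoc heading kept verbatim
    "_file": "Element.md",
    "_profile": "Core",
}

content, derived = split_derived(entry)
print(sorted(derived))
```

Because headings are preserved verbatim, a key such as "External properties restrictions" stays in the content half untouched.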

@armintaenzertng
Collaborator

Thanks for the work, David! However, I don't think we should keep a parser of the spec md-files in the python-tools repo. This is likely to go out of sync eventually. The spec-parser repo already takes care of this kind of parsing and should be the go-to for such endeavors, I think.
There is also a PR ready to be merged that generates a JSON file similar to yours from the model. Please have a look and see whether it would satisfy your expectations! :)

@davaya
Author

davaya commented Jun 27, 2023

It did feel odd to have code where I put it since there was no previous example - I just wanted to make the parser available somewhere since it can read directly from the GitHub repo, it does zero manipulation of the model data (it just checks that the directory structure is as expected, then sucks in whatever it finds), and the class generator needs something concrete to work from. Plus it's short - 120 lines total, 30 of which are the GitHub virtual filesystem. Hopefully it will meet the expectations of anyone who needs an up-to-the-minute model capture.

I'll move this to the spec-parser repo for now, though having a whole repo for one simple task seems like overkill. Parsing is just a prelude to doing something useful, not an end unto itself. @zvr's md format is brilliant - simple enough even a caveman can parse it :-)

This repo will need a class file generator that's completely driven by the model, not crafted by hand. With a human in the loop, the class files will almost certainly get out of sync with the single source of truth. That generator code will need a home, and when it has one, the spec parser will be just a tiny part of its workflow.

@armintaenzertng
Collaborator

We opened an issue on the auto-generation of the model from the spec: #716.
@davaya: Are you currently working on this topic? If not, we may take this on.

@davaya
Author

davaya commented Aug 3, 2023

@armintaenzertng, @goneall I am currently working on it in the Playground, and am very happy to see #716 and #738 - that's a great step forward. Please copy or ignore any of the playground code as you see fit.

I'm tired of arguing about whether the spec-parser repo is the one and only thing allowed to read model files, that isn't the open source way. I happen to think my parser is much simpler and thus much easier to verify for correctness than the spec-parser repo, but the code is only a suggestion, nobody is forced to use it or even look at it.

I don't see "External properties restrictions" as adding any value over putting the restrictions directly into the property being restricted, but since they were added to the model files, the playground class generator now supports doing restrictions both ways.

The playground class generator pre-processes class inheritance so that each type explicitly shows every property it can have. With that representation, the model can be implemented even in a language that doesn't support classes. But I'm not sure any languages like that even exist any more, so Python and all other tool languages would presumably implement the model using OOP classes that do inherit properties. The full property list is a convenience for developing and documenting the model, not a prescription for implementing it.
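
To make the inheritance pre-processing concrete, here is a minimal sketch of flattening, under stated assumptions. It is not the playground code: the function name `flatten_properties`, the `parent`/`properties` keys, and the toy hierarchy (loosely named after SPDX types, with simplified properties) are all assumptions made for illustration.

```python
def flatten_properties(classes: dict) -> dict:
    """Return {class name: all properties}, inherited ones listed first.

    `classes` maps a name to {"parent": name or None, "properties": {...}}.
    This input shape is an assumption for the sketch.
    """
    flat = {}

    def resolve(name, seen=()):
        if name in flat:
            return flat[name]
        if name in seen:
            raise ValueError(f"inheritance cycle at {name}")
        cls = classes[name]
        props = {}
        parent = cls.get("parent")
        if parent is not None:
            # Copy the parent's full list first, so own properties come last
            # and override on a name clash.
            props.update(resolve(parent, seen + (name,)))
        props.update(cls.get("properties", {}))
        flat[name] = props
        return props

    for name in classes:
        resolve(name)
    return flat


# Toy hierarchy, simplified for illustration and not copied from the model.
classes = {
    "Element": {"parent": None, "properties": {"spdxId": "anyURI", "name": "string"}},
    "Artifact": {"parent": "Element", "properties": {"builtTime": "DateTime"}},
    "File": {"parent": "Artifact", "properties": {"contentType": "MediaType"}},
}

flat = flatten_properties(classes)
print(list(flat["File"]))
```

The input is left unmodified, so the same data can still drive a generator that emits classes which inherit properties in the usual OOP way.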

If you use some of its ideas and also manage to get them merged into spec-parser, that would be a wonderful thing. The JSON snapshot produced by the playground parser omits nothing from the model files (except summary and description content, which could easily be put back in), so the spec-parser repo code could get everything that exists in the markdown files from a JSON snapshot file. But I'm getting a "not invented here" vibe against even looking at PRs for the spec-parser, much less considering technical merit, completeness or correctness. Maybe you'll have better luck.

@goneall
Member

goneall commented Aug 3, 2023

> I'm tired of arguing about whether the spec-parser repo is the one and only thing allowed to read model files, that isn't the open source way. I happen to think my parser is much simpler and thus much easier to verify for correctness than the spec-parser repo, but the code is only a suggestion, nobody is forced to use it or even look at it.

My only concern with supporting multiple parsers for the markdown files is that it will require "standardizing" the markdown and may restrict our ability to quickly change the files. The reason we chose some months ago to standardize on the OWL / SHACL rather than the markdown was that the OWL / SHACL is already a well-established, stable spec.

If we're willing to allow rapid change of the markdown files and not require standardizing either the markdown or the intermediate JSON files, I'm flexible.

@davaya
Author

davaya commented Aug 3, 2023

The playground parser doesn't know anything about the SPDX model; the only things it knows are filesystem directories, markdown section headers, level 1 list items, and level 2 list items. As long as those things are "standard", the playground parser will read the entire model file hierarchy and produce a JSON snapshot that reflects that structure. Creating a Datatypes folder didn't require changing the parser; the folder is copied into the JSON data like the other directories.

If the markdown files started using some new markdown feature like level 3 list items, or changed the filesystem hierarchy to be more than 3 layers deep (currently model/profile/category/file.md) then the parser would need to be tweaked.
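
A model-agnostic parser of the kind described above can be sketched in a few lines. This is an illustrative sketch, not the playground parser: the function name `parse_model_markdown`, the two-space indent for level 2 items, the `key: value` item syntax, and the sample text are all assumptions about the file layout.

```python
import json
import re

HEADER = re.compile(r"^(#+)\s+(.*\S)\s*$")
ITEM = re.compile(r"^(\s*)-\s+(.*\S)\s*$")


def parse_model_markdown(text: str) -> dict:
    """Parse section headers and up to two levels of list items into a dict.

    Free text (e.g. summary and description paragraphs) is skipped, so those
    sections come out as empty dicts.
    """
    doc = {}
    section = None  # dict for the section currently being filled
    current = None  # dict for the current level 1 item, if it takes children
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            section = doc.setdefault(m.group(2), {})
            current = None
            continue
        m = ITEM.match(line)
        if not m or section is None:
            continue  # not a list item, or an item before any header
        indent, body = len(m.group(1)), m.group(2)
        key, sep, value = body.partition(":")
        key, value = key.strip(), value.strip()
        if indent == 0:
            if sep:  # "- key: value"
                section[key] = value
                current = None
            else:  # "- key", expected to be followed by level 2 items
                current = section[key] = {}
        elif indent == 2 and current is not None:
            current[key] = value  # "  - key: value"
        else:
            # e.g. a level 3 item: the parser would need to be extended
            raise ValueError(f"unsupported list nesting: {line!r}")
    return doc


# Sample text approximating a model file; the exact content is an assumption.
SAMPLE = """\
# Element

## Summary

Base domain class.

## Metadata

- name: Element
- SubclassOf: none

## Properties

- spdxId
  - type: xsd:anyURI
  - minCount: 1
"""

print(json.dumps(parse_model_markdown(SAMPLE), indent=2))
```

Nothing in the sketch names a class, property, or profile, which is why new folders or new headings pass through unchanged, while a level 3 list item is rejected rather than silently dropped.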

The playground has a JSON file containing an old (July 19) model commit. All of the markdown files should in principle be re-creatable from the snapshot. There isn't currently any code to recreate the model markdown tree, but round tripping is the way to test whether something that should be true actually works - run it both ways and do a dir diff between original and reconstructed model.
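
The "dir diff" half of such a round-trip test could look like the sketch below, which uses the standard library's `filecmp.dircmp`. The function name `tree_differences` and the labels in its output are assumptions; the code that would rebuild the markdown tree from the snapshot does not exist yet and is not shown.

```python
import filecmp


def tree_differences(original, rebuilt) -> list:
    """Recursively list paths that differ between two directory trees.

    Note: dircmp compares files "shallowly" by default (type, size, mtime)
    and only reads contents when those signatures differ.
    """
    diffs = []

    def walk(cmp, prefix=""):
        for name in cmp.left_only:
            diffs.append(f"only in original: {prefix}{name}")
        for name in cmp.right_only:
            diffs.append(f"only in rebuilt: {prefix}{name}")
        for name in cmp.diff_files + cmp.funny_files:
            diffs.append(f"differs: {prefix}{name}")
        for name, sub in cmp.subdirs.items():
            walk(sub, f"{prefix}{name}/")

    walk(filecmp.dircmp(original, rebuilt))
    return sorted(diffs)
```

An empty list from `tree_differences(model_dir, rebuilt_dir)` (both paths hypothetical) would mean the snapshot round-trips; since the current snapshot skips summary and description text, a real run would be expected to report differences until that content is put back in.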

That means that the model-parser code that generates OWL/SHACL would get the same information from a JSON snapshot that it would get from parsing the markdowns, and that rapid changes to the markdown format would be reflected in the snapshot format.

@maxhbr
Member

maxhbr commented Mar 22, 2024

This is stale, and https://github.com/JPEWdev/shacl2code is expected to replace it

@maxhbr maxhbr closed this Mar 22, 2024