Create tool model files from spdx-3-model files #708
# Conflicts: # src/spdx_tools/spdx3/make-model-classes.py
Thanks @davaya - can you attach a sample output?
@goneall it's a straight translation of the markdown to JSON (skipping the summary/description text), plus four derived metadata properties prefixed with an underscore to make it easier to process. The ad-hoc "External properties restrictions" heading is included verbatim, with no attempt to munge it into an identifier. No other such headings exist in the model, but if they did they would also be preserved. https://gist.github.com/davaya/9f931c4b4cb36aa9dead21b4ee45b4a5
Thanks for the work, David! However, I don't think we should keep a parser of the spec md files in the python-tools repo. It is likely to go out of sync eventually. The spec-parser repo already takes care of this kind of parsing and should be the go-to for such endeavors, I think.
It did feel odd to have code where I put it since there was no previous example - I just wanted to make the parser available somewhere since it can read directly from the GitHub repo, it does zero manipulation of the model data (it just checks that the directory structure is as expected, then sucks in whatever it finds), and the class generator needs something concrete to work from. Plus it's short - 120 lines total, 30 of which are the GitHub virtual filesystem. Hopefully it will meet the expectations of anyone who needs an up-to-the-minute model capture. I'll move this to the spec-parser repo for now, though having a whole repo for one simple task seems like overkill. Parsing is just a prelude to doing something useful, not an end unto itself. @zvr's md format is brilliant - simple enough even a caveman can parse it :-) This repo will need a class file generator that's completely driven by the model, not crafted by hand. Putting a human in the loop will almost certainly let the classes get out of sync with the single source of truth. That generator code will need a home, and when it has one, the spec parser will be just a tiny part of its workflow.
@armintaenzertng, @goneall I am currently working on it in the Playground, and am very happy to see #716 and #738 - that's a great step forward. Please copy or ignore any of the playground code as you see fit. I'm tired of arguing about whether the spec-parser repo is the one and only thing allowed to read model files; that isn't the open source way. I happen to think my parser is much simpler and thus much easier to verify for correctness than the spec-parser repo, but the code is only a suggestion; nobody is forced to use it or even look at it. I don't see "External properties restrictions" as adding any value over putting the restrictions directly into the property being restricted, but since they were added to the model files, the playground class generator now supports doing restrictions both ways. The playground class generator pre-processes class inheritance to ensure that types explicitly show all the properties that they can have. That representation means the model can be implemented even in a language that doesn't support classes. But I'm not sure any languages like that even exist any more, so Python and all other tool languages would presumably implement the model using OOP classes that do inherit properties. The full property list is a convenience for developing and documenting the model, not a prescription for implementing it. If you use some of its ideas and also manage to get them merged into spec-parser, that would be a wonderful thing. The JSON snapshot produced by the playground parser omits nothing from the model files (except summary and description content, which could easily be put back in), so the spec-parser repo code could get everything that exists in the markdown files from a JSON snapshot file. But I'm getting a "not invented here" vibe against even looking at PRs for the spec-parser, much less considering technical merit, completeness or correctness. Maybe you'll have better luck.
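To illustrate the inheritance pre-processing described above, here is a minimal sketch of flattening inherited properties into each type. It is not the playground code: the input shape (a dict of type names, each with hypothetical `parent` and `properties` keys) and the function name `flatten_properties` are assumptions made for this example only.

```python
def flatten_properties(model):
    """Return {type name: all properties, inherited ones first}.

    `model` is assumed (hypothetically) to look like:
        {"Artifact": {"parent": "Element", "properties": {"originatedBy": {...}}}}
    """
    flat = {}

    def resolve(name, seen=()):
        if name in flat:
            return flat[name]
        if name in seen:
            raise ValueError(f"inheritance cycle at {name}")
        entry = model[name]
        props = {}
        parent = entry.get("parent")
        if parent is not None:
            # Copy the parent's (already flattened) properties first.
            props.update(resolve(parent, seen + (name,)))
        # Properties declared on the type itself override inherited ones.
        props.update(entry.get("properties", {}))
        flat[name] = props
        return props

    for type_name in model:
        resolve(type_name)
    return flat


# Illustrative data only; not taken from a model snapshot.
example_model = {
    "Package": {"parent": "Artifact", "properties": {"packageVersion": {}}},
    "Artifact": {"parent": "Element", "properties": {"originatedBy": {}}},
    "Element": {"parent": None, "properties": {"spdxId": {}}},
}
flattened = flatten_properties(example_model)
```

With the sample data, `flattened["Package"]` lists `spdxId`, `originatedBy` and `packageVersion`, so each type shows every property it can have without the reader walking the class hierarchy.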
My only concern with supporting multiple parsers from the markdown files is it will require "standardizing" the markdown and may restrict our ability to quickly change the files. The reason some months ago we chose to standardize on the OWL / SHACL rather than the markdown was that OWL / SHACL is already a well-established, stable spec. If we're willing to allow rapid change of the markdown files and not require standardizing either the markdown or the intermediate JSON files, I'm flexible.
The playground parser doesn't know anything about the SPDX model; the only things it knows are filesystem directories and markdown section headers, level 1 list items and level 2 list items. As long as those things are "standard", the playground parser will read the entire model. If the markdown files started using some new markdown feature like level 3 list items, or changed the filesystem hierarchy to be more than 3 layers deep (currently model/profile/category/file.md), then the parser would need to be tweaked. The playground has a JSON file containing an old (July 19) model commit. All of the markdown files should in principle be re-creatable from the snapshot. There isn't currently any code to recreate the model markdown tree, but round tripping is the way to test whether something that should be true actually works - run it both ways and do a dir diff between original and reconstructed model. That means that the model-parser code that generates OWL/SHACL would get the same information from a JSON snapshot that it would get from parsing the markdown files, and that rapid changes to the markdown format would be reflected in the snapshot format.
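As a rough illustration of how little such a parser needs to know, here is a minimal sketch that recognizes only section headers, level 1 list items and level 2 list items, and ignores everything else (including summary/description text). It is not the playground parser: the function name, the output shape and the sample text are assumptions for this example, and the sample only approximates the layout of a model file.

```python
import re

HEADING = re.compile(r"^(#+)\s+(.*\S)\s*$")
LIST_ITEM = re.compile(r"^(\s*)-\s+(.*\S)\s*$")


def parse_model_markdown(text):
    """Return {section heading: {level-1 item: value or {level-2 key: value}}}."""
    sections = {}
    section = None   # dict for the heading currently being filled
    item = None      # dict for the level-1 item currently being filled
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            section = sections.setdefault(heading.group(2), {})
            item = None
            continue
        entry = LIST_ITEM.match(line)
        if not entry or section is None:
            continue  # free text (summary, description) is skipped
        indent, body = entry.groups()
        key, _, value = body.partition(":")
        if not indent:
            if value.strip():
                # Level 1 "key: value" item, e.g. metadata.
                section[key.strip()] = value.strip()
                item = None
            else:
                # Level 1 name that owns the level 2 items below it.
                item = section.setdefault(key.strip(), {})
        elif item is not None:
            item[key.strip()] = value.strip()
    return sections


# Hypothetical sample, loosely shaped like a model markdown file.
SAMPLE = """\
# Element

## Summary

Free text that the sketch skips.

## Metadata

- name: Element
- Instantiability: Abstract

## Properties

- spdxId
  - type: xsd:anyURI
  - minCount: 1
"""

parsed = parse_model_markdown(SAMPLE)
```

A round-trip check along the lines suggested above would write `parsed` back out as markdown and diff it against the original tree; that reverse step is not shown here.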
This is stale, and https://github.com/JPEWdev/shacl2code is expected to replace it.
Adds two scripts:
load_model.py parses model files from the spdx-3-model repo and saves the full model as a single modelTypes.json file. This script is complete and may be generally useful.
make-model-classes.py reads modelTypes.json and creates class files from it. This is currently a skeleton that generates only boilerplate class files, to illustrate what needs to be developed.
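For readers wondering what a model-driven class file might look like, here is a minimal sketch of rendering one dataclass from one type entry. It is not the code in make-model-classes.py: the function name `render_class`, the property spec shape (a hypothetical `minCount` key) and the choice to type every field as `str` are assumptions for illustration, and the sketch assumes property names are valid Python identifiers.

```python
def render_class(name, properties):
    """Return Python source for a dataclass with one field per property.

    `properties` is assumed (hypothetically) to map a property name to a
    spec dict such as {"minCount": "1"}; a missing minCount means optional.
    """
    required = [p for p, spec in properties.items()
                if int(spec.get("minCount", 0)) >= 1]
    optional = [p for p in properties if p not in required]
    lines = [
        "from dataclasses import dataclass",
        "from typing import Optional",
        "",
        "",
        "@dataclass",
        f"class {name}:",
    ]
    # Required fields must come first: dataclasses reject a field without
    # a default after one that has a default.
    lines += [f"    {p}: str" for p in required]
    lines += [f"    {p}: Optional[str] = None" for p in optional]
    if not properties:
        lines.append("    pass")
    return "\n".join(lines) + "\n"


# Illustrative input only; not read from modelTypes.json.
source = render_class(
    "Package",
    {"packageVersion": {}, "spdxId": {"minCount": "1"}},
)
```

The returned `source` string could be written to a file per type. Because it is derived entirely from the input dict, regenerating after a model change needs no hand editing.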