Extract metadata from a rdzip.
vitals grabs metadata using "facets", which are functions defined mostly in the facets directory. Each facet function is responsible for a little bit of metadata. vitals calls these facet functions and builds a final metadata object out of the result.
An important note is that vitals is deterministic based on the rdlevel. That is, it will always give the exact same result if you give it the same rdzip. This means that all information must originate from the rdzip itself, so for instance vitals determines the created date by the created date on the zip file, and not when it reads the file.
Facet functions have access to the following info:
- The json representation of the rdlevel created by parse
- The zipfile (a python
zipfile.ZipFile
object) - The file, as bytes
- An optional TOML object which may be in the level
The TOML object is parsed based on the first Comment event in the rdlevel that has the text #orchard
in it. It's
mainly used for overriding metadata where the facet heuristics may be incorrect. I haven't publicised this
functionality yet, so no levels use it as of now.
Refer to installation steps in the root-level README.
For testing, you will need to pull in the test rdzips which are stored in Git LFS:
git lfs pull
python -m orchard.vitals <path to rdzip>
pytest orchard/vitals