Schema refactor #92
Comments
More related to this than worth spawning a new issue: I'd like to revisit / upvote a conversation about how identifiers / named entities are referenced in JAMS. For example, I'd like to tag a single annotation as being produced by some unique identifier, such that I can search a collection for all annotations performed by the same entity (human or algorithm). We've got the annotator dict, but it's a little too unconstrained to encourage any convention.
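To make the use case concrete, here is a minimal sketch of that kind of search over plain dictionaries. The `identifier` key inside the annotator dict is an assumption made up for illustration (it is exactly the convention that does not exist yet), and the records are toy data rather than real JAMS annotations.

```python
from collections import defaultdict

# Toy annotation records. The "identifier" key in the annotator dict is
# hypothetical; the current schema does not define or require it.
annotations = [
    {"namespace": "beat",
     "annotation_metadata": {"annotator": {"identifier": "annotator-01"}}},
    {"namespace": "chord",
     "annotation_metadata": {"annotator": {"identifier": "algo-x"}}},
    {"namespace": "segment",
     "annotation_metadata": {"annotator": {"identifier": "annotator-01"}}},
    {"namespace": "beat",
     "annotation_metadata": {"annotator": {}}},  # no identifier recorded
]


def index_by_annotator(records):
    """Group records by annotator identifier, skipping records without one."""
    index = defaultdict(list)
    for record in records:
        annotator = record.get("annotation_metadata", {}).get("annotator", {})
        identifier = annotator.get("identifier")
        if identifier is not None:
            index[identifier].append(record)
    return dict(index)


by_annotator = index_by_annotator(annotations)
```

Without an agreed-upon key, each collection would need its own lookup logic, which is the gap being described here.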
I'm not sure that fits under the scope of JAMS per se; remember the headaches about filenames in #5? We eventually decided that that's better handled at the application level -- for better or worse. I suspect that indexing annotation sources will have similar difficulties. OTOH, if we do want to add support for foreign-key indexing (for tracks, annotators, etc), maybe it's worth reopening that discussion?
Could we simply add a new identifier field in the annotator dictionary that
I don't want to necessarily tell users what the namespace should be, but I
Maybe go rosetta-style? Let identifiers be a list of strings of the form That will at least validate for syntax. If you want semantic validation, that's up to a separate indexing structure that should live outside of jams. For example, the SALAMI annotators could be identified by
Rehashing #40 after a conversation with @ejhumphrey
There are good arguments for splitting the JAMS schema into smaller pieces that can be shared and repurposed. Specifically, a database (e.g., a MongoDB key-value store) for managing JAMS collections could be more reasonably structured (and more easily searchable) if it contained individual annotation objects (indexed by track id) rather than full JAMS objects.
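As a rough illustration of the storage layout described above, the sketch below splits JAMS-like documents into standalone annotation records keyed by track id. An in-memory dict stands in for the database, and the track ids, field values, and the `split_annotations` helper are all hypothetical.

```python
from collections import defaultdict


def split_annotations(track_id, jam):
    """Turn one JAMS-like document into standalone annotation records."""
    return [
        {"track_id": track_id, "annotation": annotation}
        for annotation in jam.get("annotations", [])
    ]


# Toy collection of full JAMS-like objects (illustrative data only).
collection = {
    "track-001": {
        "file_metadata": {"title": "A"},
        "annotations": [
            {"namespace": "beat", "data": []},
            {"namespace": "chord", "data": []},
        ],
    },
    "track-002": {
        "file_metadata": {"title": "B"},
        "annotations": [{"namespace": "beat", "data": []}],
    },
}

# In-memory stand-in for the database: track id -> annotation records.
store = defaultdict(list)
for track_id, jam in collection.items():
    store[track_id].extend(split_annotations(track_id, jam))

# A query over annotations no longer needs to load whole JAMS objects.
beat_records = [
    record
    for records in store.values()
    for record in records
    if record["annotation"]["namespace"] == "beat"
]
```

Each record carries its own track id, so it stays meaningful outside the file it came from.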
I propose that we refactor the jams schema so that annotations can exist independently of the JAMS file format. Of course, the JAMS file format will still use annotation definitions, so there should be no observable difference in the way JAMS files work*; put another way, the API for JAMS files stays the same, and all the changes would be under the hood.
Digging in a bit more, the current schema looks like:
and the refactored schema might look like:
What do folks think?
To make this happen, we'd have to get a better handle on json-schema inheritance, but I think it's totally possible.
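One way this could work is to define the annotation schema on its own and have the file schema point at it with a JSON Schema `$ref`. The sketch below uses heavily simplified, hypothetical schemas (not the real JAMS ones) and a small hand-written resolver for local references, just to show that the shared definition expands to the same thing in both places. In practice a validator library would handle `$ref` resolution itself.

```python
# Hypothetical, simplified schemas for illustration only; the field names
# do not reproduce the actual JAMS schema.
ANNOTATION_SCHEMA = {
    "type": "object",
    "properties": {
        "namespace": {"type": "string"},
        "data": {"type": "array"},
    },
    "required": ["namespace", "data"],
}

JAMS_FILE_SCHEMA = {
    "type": "object",
    "definitions": {"annotation": ANNOTATION_SCHEMA},
    "properties": {
        "file_metadata": {"type": "object"},
        "annotations": {
            "type": "array",
            # The file format reuses the standalone annotation definition.
            "items": {"$ref": "#/definitions/annotation"},
        },
    },
}


def resolve_local_refs(node, root):
    """Return a copy of `node` with local '#/...' references inlined.

    Simplified: no JSON Pointer escaping, no remote references, and no
    protection against cyclic references.
    """
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/"):
            target = root
            for key in ref[2:].split("/"):
                target = target[key]
            return resolve_local_refs(target, root)
        return {key: resolve_local_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_local_refs(value, root) for value in node]
    return node


expanded = resolve_local_refs(JAMS_FILE_SCHEMA, JAMS_FILE_SCHEMA)
```

With this layout the annotation schema can be used directly to validate individual records, while JAMS files keep validating against the file schema as before.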