Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/fdo profiles #70

Merged
merged 12 commits into from
Jan 18, 2024
Merged

Feature/fdo profiles #70

merged 12 commits into from
Jan 18, 2024

Conversation

southeo
Copy link
Contributor

@southeo southeo commented Dec 14, 2023

Adds fdo profiles from rfc documents

Includes request schemas (i.e. what is to be sent to the PID API) and the schema of the profile.

@southeo southeo changed the base branch from master to feature/schema-library December 14, 2023 11:09
Base automatically changed from feature/schema-library to master December 14, 2023 15:54
Copy link
Contributor

@juliancervos juliancervos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just finished reviewing the schemas, and I have to say: great work! I think everything looks very good and I couldn't fint any major problems, or anything important that was missing or I think should be changed.

There were a couple of aspects that I caught my attention (also left a couple of comments for these):

  • Most descriptions were quite detailed and gave a good explanation of the property, but there are a few that could use a longer text, especially whenever there is an enumeration to choose from (to give guidance on what value to pick).For example,"primarySpecimenObjectIdType" from data-model/fdo-profiles/0.1.0/digital-specimen/schema/digital-specimen-request.json, "sourceDataStandard" from data-model/fdo-profiles/0.1.0/mapping/schema/mapping.json or "dcterms:subject" from data-model/fdo-profiles/0.1.0/media-object/schema/media-object-request.json. Still, since this is mostly for documentation purposes, I understand it doesn't break anything and therefore it's not a priority.
  • Another aspect that might change how things behave are required properties.
    • A particular case of this is properties that are required depending on specific cases. I left a comment for "specimenObjectIdAbsenceReason" where you could enforce a dependent requirement at validation that currently is only stated in the description. I think this situation might come up other times on different schemas.
    • More broadly, I noticed required properties were not always what I expected. For example, in some schemas the "...Id" and "...IdType" are required, while in others it's only of them, or none (data-model/fdo-profiles/0.1.0/tombstone/schema/tombstone.json, where I was expecting "tombstonePids"). I think this is mostly because I'm not aware of the reasoning behind the required properties for each schema, how things work internally, the differences between FDO profiles and digital object schemas, etc. So there's probably a reason for things being the way they are now that I don't see, but I just wanted to bring it up to your attention in case something needs to be changed.

In any case, this PR looks like a colossal amount of work, so my compliments again and keep up the good work!

"Resolvable",
"Local"
],
"description": "Nature of institutional identifier",
Copy link
Contributor

@juliancervos juliancervos Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the description for "primarySpecimenObjectIdType" could be a bit longer to offer guidance about which value to choose, in line with the properties "primarySpecimenObjectId" or "normalisedSpecimenObjectId".

Something similar happens in other related properties, like "primaryMediaObjectIdType" from data-model/fdo-profiles/0.1.0/media-object/schema/media-object-request.json. There are some more details there, but I think it could still be clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, added more info and updated primaryMediaObjectIdType

},
"specimenObjectIdAbsenceReason": {
"type": "string",
"description": "*Not currently supported.* Either this attribute or primarySpecimenObjectId needs to be filled, if both are filled the FDO record is invalid. Max 255 chars. Note that absence of the ID poses a challenge on avoiding duplicate digital specimen IDs.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think what's said in this description about the presence of this property could also be enforced in the schema? By reflecting that either "primarySpecimenObjectId" or "specimenObjectIdAbsenceReason" must be present in the required properties. For example, by replacing lines 230-233 with something like:

  "required": [
    "specimenHost"
  ],
  "oneOf": [
        {
            "required": [
                "primarySpecimenObjectId"
            ]
        },
        {
            "required": [
                "specimenObjectIdAbsenceReason"
            ]
        }
  ]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea, but I'm hesitant to implement it. 2 issues:

  1. We don't have a workflow defined for when a specimen doesn't have a specimen object id. The handle API (which is the primary use-case for this schema right now) would crash.
  2. We use a tool called jsonschema2pojo to create java objects from json schemas. Unfortunately, jsonschema2pojo doesn't currently support the "oneOf" element, and would negatively affect the java objects derived from the schema.

It would be better practice to use the oneOf element here, but until at least issue #1 is resolved and #2 has a decent workaround, I don't want to implement this element.

PS Check out schemas.dissco.tech, it serves the fdo profiles in this repo as jsons to the applications that need them 😎

@southeo
Copy link
Contributor Author

southeo commented Jan 8, 2024

Hi Juilian,

Thanks for the review! I've added more detailed descriptions and replied to your comments.

As for tombstonePids, this is for PIDs of other resources we might want to link to in the tombstone record. I've added more information that (hopefully) clarifies this. It might make sense to have this be an array of objects, like:

tombstonePids: [
  {
     id: 20.600/123
     relationship: isObsoletedBy
}]

Will bring forward this suggestion

@southeo southeo requested a review from sharifX January 8, 2024 14:16
"$comment": "Schema for tombstoning requests to DiSSCo PID API",
"tombstoneText": {
"type": "string",
"description": "Free text describing why the object was archived",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Free text describing why the object was archived". Shouldn't it be "Free text describing why the object was tombstoned"?

# Conflicts:
#	json-examples-and-schemas/digital-specimen-object/basic-json-example2.md
@southeo southeo requested a review from sharifX January 16, 2024 12:25
Copy link
Contributor

@sharifX sharifX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change and updates look ok.

@southeo southeo merged commit 9185241 into master Jan 18, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants