Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major refactoring for upstream changes / blocks #92

Merged
merged 4 commits into from
May 14, 2024

Conversation

Lestropie
Copy link
Collaborator

Replaces #90.

Resolving the BEP against the absence of both complex inheritance and suffix distinction is not as simple as removing complex inheritance and swapping suffices; attempting to do so just breaks everything around it. But given prior precedent I'm pessimistic about my ability to convey this convincingly. So I'm better off making the changes necessary to conform to the main structural constraints (no IP changes, single fixed suffix) that will both retain compatibility with extension beyond the most trivial of diffusion models, and hopefully not be objectionable on other bases.

I've ended up making a much broader scope of changes than just "doing #90 differently", as there are too many moving parts to handle in isolation. I'll try to break up my description here in terms of aspects that address those core structural constraints, vs. other things that to greater or lesser necessity got changed along the way. I can't finish this off right now as I have other urgent tasks, but hopefully this is enough to elicit opinions.

Primary structure

  • One sidecar file per model parameter data file; no advanced inheritance principle required (and no smuggling in complex inheritance by other means).
  • Distinguish between "model metadata" and "parameter metadata".
    (Serves similar purpose to the former MFP / MDP distinction, but applies exclusively in the context of metadata, not image data)
  • Duplicate model metadata across all sidecars.
  • Use metadata sub-dictionary to separate metadata applicable to a model as a whole from metadata applicable to specific parameters.

Other changes

  • More extensive use of metadata sub-dictionaries.
    In particular, those metadata fields specific to the interpretation of data across a NIfTI image axis as encoding some anisotropic information have been collated into their own sub-dictionary.
  • More explicit about image axes used for orientation encoding vs. bootstrapping
    Having to try to make the bedpostx example more mature, and thinking about future orientation encoding types, I deemed it necessary to be explicit about which image axes encode orientation information vs. different bootstrap realisations of a model. There are some images that have orientation information but no bootstrapping, some that have bootstrap realisations but no orientation information. Then, in the future, there are orientation encoding types that will require multiple image axes (I'm thinking SHARD and DSI PDFs). So I'm not convinced that saying "bootstrapping comes after orientation stuff" is adequate, and would prefer to be explicit about both.
  • Removal of fields about which gradients / shells were utilised in the model fit (DWI directions utilised in model #48).
    I don't think it makes sense to specify which volumes / shells were utilised, but not what image data were used, especially when the former was already incomplete. I tried adding the latter, but it started to look an awful lot like provenance.
  • Replace "data representations" / "orientation representation" with "orientation encoding" ("OrientationRepresentation" rename #91).
  • Non-negativity is not a constraint of a model as a whole; it is a constraint that may be applied to individual parameters of that model. Therefore I've stored it on an individual parameter basis, rather than being a single field applicable to the model as a whole.
  • Removed reserved labels of models and parameters.
    I think these do more harm than good; it makes the whole ecosystem inflexible, might be misinterpreted as not supporting anything outside of that scope, and runs into a weird situation where the corresponding entities are neither arbitrary labels nor define a finite restricted set of options. If the demonstrative examples are moved out of the document, then their scope could be increased to exemplify some of these.
    The counter to this would be that by controlling the entity labels, specific parametric maps generated by one pipeline would be possible to robustly identify by another. While I'm myself interested in this, I'm not sure that partial control of entity labels solves that problem without consequence. Elsewhere there's discouragement of placing too much automated interpretation on entity labels (eg. _dir- to discover phase encoding information). And for such abbreviated labels, hard to dictate that a dataset would be non-conformant if some App / user were to store an image with a two-character label that encodes a parameter other than the one that the specification attributes to that two-character label.

Outstanding questions

  • For FSL bedpostx orientations when stored in spherical coordinates:
    • Is the interpretation of polar angles conformant with the ISO specification here, or is a transformation required?
    • Are they truly defined with respect to the image axes, or do they conform to the same convention as bvecs, which has a flip along the first axis in the case of a positive determinant?
  • For spherical deconvolution with multiple kernels, should metadata encoding the response functions be defined only with respect to the parameter corresponding to each, or should all response functions be defined within the scope of model metadata?
  • Demonstrative examples section becomes even longer when you've got metadata duplicated across each JSON contents. It's now more than half the document. I would prefer to defer this entirely to an external repository where exemplar data are explicitly stored. This would also better facilitate demonstrating scenarios where there are multiple possible solutions; eg. bedpostx concatenating all stick orientations / volume fractions into images versus having each parameter for each stick stored as its own image.
  • For metadata, possible that software implementation / model description / description of individual encoded parameter could all have their own URLs.

- Reject suffices "mfp" (or earlier "_model") / "mfp" in favour of "_dwimap". Instead adopt distinction of "model metadata" vs. "parameter metadata".
- Reject use of advanced inheritance in favour of listing all relevant metadata for each data file in the sidecar JSON.
- Greater use of sub-dictionaries in JSON files to assist in separating metadata relevant to a model as a whole vs. only that particular parameter of the model.
Copy link
Collaborator

@arokem arokem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work on this. I think that it presents a good and viable way forward. I think that I'll defer discussion on some of the finer points for a PR against the main spec (as suggested in #24 (comment)). For now, if I understand correctly, the main question to present to others is whether we can reach broad consensus that duplication of metadata across files is the "least of evils" among our various viable options. At the risk of stating the obvious, the main tension here is that this in contradiction to the RECOMMENDED corollary 2 of the inheritance principle, but in compliance with the more strictly defined MUST NOT rule 4.

src/derivatives/05-diffusion-derivatives.md Outdated Show resolved Hide resolved
src/derivatives/05-diffusion-derivatives.md Outdated Show resolved Hide resolved
src/derivatives/05-diffusion-derivatives.md Outdated Show resolved Hide resolved

- WOULD SPLITTING STICK COMPONENTS ACROSS NIFTIS
REQUIRE A NEW ENTITY BY WHICH TO INDEX THEM?
OR JUST GIVE THEM EG. `_param-spherical1`, `_param-spherical2`?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't used bedpostx in a really long while, and even then only very superficially. I would love to get input from users of the tool who can tell us more about how they anticipate using these files and what they'd prefer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps defer dedicated discussions to #93 / #94.

@arokem
Copy link
Collaborator

arokem commented May 10, 2024

Any objections to merging this into #24 so that we can fix up things there and keep moving towards a PR on the bids-standard/bids-specification? If I get no objections by mid next week, I will go on ahead with that and we can follow up with more PRs on top of this one (specifically to address the "notes" and "todo" sections of current PR).

@Lestropie
Copy link
Collaborator Author

I'm eager to merge it to start chipping away at other things. I've been writing code to verify appropriate interpretation of fibre orientations (not at the scope of #23, mostly trying to get an answer to #96), and have already identified one aspect where the spec will need to be augmented to support FSL data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants