
Motion trajectories & pose extracted data for animal beh research #2057

Open
yarikoptic opened this issue Feb 18, 2025 · 11 comments

@yarikoptic
Collaborator

yarikoptic commented Feb 18, 2025

Your idea

Popular relevant toolkits used

Even though that section was primarily aiming at hardware-based motion tracking of humans (AFAIK; e.g. point-light and other systems), I think it would be great to review the Motion-BIDS data type and see how it could/should be extended to capture the results of processing from SLEAP, DeepLabCut, or any other similar tool. We might need to work out a BEP and also see if some common, tool-agnostic format could be established.

Also attn @bids-standard/bep032

Related issues

@talmo

talmo commented Feb 18, 2025

Relevant:

@MMathisLab

Thanks @yarikoptic! DeepLabCut uses HDF5, and has an NWB export for saving analyzed data.

@yarikoptic
Collaborator Author

yarikoptic commented Feb 18, 2025

Thank you @talmo and @MMathisLab. So both SLEAP and DeepLabCut have exports to ndx-pose. I didn't check, though, whether that representation is general enough that e.g. they could potentially read it back (I see there is "load" functionality within https://github.com/DeepLabCut/DLC2NWB)?

@niksirbi's movement has an open issue

I wonder if someone could digest for me a summary of how much of the data types across those 4 toolkits ndx-pose covers. (The situation reminded me of NIfTI in neuroimaging, originally envisioned as an "exchange format" among the then more specialized DICOM, AFNI BRIK/HEAD, FreeSurfer, etc.)

@talmo

talmo commented Feb 18, 2025

I think NWB through ndx-pose covers most of it. We wanted to still add support for embedding images with compression, but that's on the backburner.

I believe there might still be a field or two that we're missing for total completeness (and there are some serious performance issues with pynwb), but otherwise it's nearly there.

The movement library is probably the most complete and functional, but it is focused on final tracked inference outputs (i.e., contiguous frames), while ndx-pose supports both the training data (random individual frames, doesn't make sense to organize by video/time) and inference outputs (strictly timeseries per video).

We're still adding proper support for the training data types from NWB into sleap-io, but otherwise it's a good base. We support the roundtrip, but it may lose some metadata that's useful in certain cases.
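The distinction drawn above between training data (independently labelled frames) and inference output (contiguous per-video time series) can be sketched as two data structures. This is a purely illustrative sketch; the class and field names are my own, not taken from sleap-io, movement, or ndx-pose:

```python
from dataclasses import dataclass

# Hypothetical sketch: training data is a bag of independently
# annotated frames, while inference output is a contiguous
# per-video trajectory. Names here are illustrative only.

@dataclass
class LabeledFrame:
    """One manually annotated frame, sampled at random from a video."""
    video: str                   # source video path/identifier
    frame_idx: int               # frame number within that video
    keypoints: dict              # keypoint name -> (x, y) in pixels

@dataclass
class PoseTrack:
    """Tracked inference output: one keypoint trajectory per video."""
    video: str
    keypoint: str
    xy: list                     # (x, y) per frame, contiguous
    confidence: list             # per-frame prediction score

train_example = LabeledFrame("mouse.mp4", 1042, {"nose": (12.0, 34.5)})
track_example = PoseTrack("mouse.mp4", "nose",
                          [(12.0, 34.5), (12.8, 35.0)], [0.98, 0.95])
```

The key consequence for a standard is that `LabeledFrame`s cannot be organized by time, so a format built purely around timeseries cannot hold them.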

@MMathisLab

(I see there is "load" functionality within https://github.com/DeepLabCut/DLC2NWB)?

  • Yep! :D

But my preference for a data standard is HDF5 due to interoperability between fields (NWB is very "neuro"-focused). We also support human motion data in .json files and COCO formats.

@niksirbi

Just chiming in to confirm some things that have already been said about movement:

  • We currently support loading pose tracks (tracked keypoints over time) from:

  • Support for ndx-pose is underway with this PR. I keep getting distracted by other priorities, but we plan to finalise and merge this within ~1 month.

  • Less relevant for this discussion, we also support tracked bounding boxes, currently from only one source format.

The above list is likely to grow, depending on user demand and the availability of suitable contributors. So far, we've been neuro-centric, but that is changing, so we're keen to also support formats that are common in other fields.

As @talmo said, movement solely deals with predicted (inference output) and tracked data, not with the training datasets.

Regarding ndx-pose, I think it indeed covers most neuroscience use-cases, but @MMathisLab is absolutely right about the “neuro” focus. Machine-learning-based motion tracking is used by many fields other than neuroscience, e.g. ethology, conservation biology, biomechanics, animal welfare, veterinary medicine, sports science, etc. Some of these fields do not follow the “subject-session” paradigm that’s prominent in neuroscience, and have probably never heard of NWB or BIDS. Moreover, in some cases it’s not uncommon to track tens or hundreds of individuals at once (e.g. to look at collective behaviour), and it’s hard to imagine how that data would fit into ndx-pose or BIDS.

But perhaps that’s fine; neuro people are likely to prefer ndx-pose, especially if they are planning to eventually analyse the motion tracks in conjunction with neural recordings. BIDS + NWB doesn’t need to (and cannot) cater to everyone. I’m just noting that motion tracking in general (including every field that employs these methodologies) is in need of a standardised format, and ndx-pose probably won't be it.

FYI @roaldarbol's animovement is an R package with similar scope to movement, and it already supports a bunch of formats. The developer, Mikkel, has some nice ideas on formatting motion tracking data as a tidy dataframe that could be saved in a variety of formats (e.g. parquet or csv).
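The tidy-dataframe idea mentioned above can be sketched as a long-format table with one row per (frame, individual, keypoint) observation, which serializes naturally to CSV or parquet. The column names below are my own illustration, not a schema published by animovement or movement:

```python
import csv
import io

# Hypothetical tidy ("long") layout for multi-animal pose tracks.
# One row per observation; scales to hundreds of individuals
# without changing the schema. Column names are illustrative.
rows = [
    {"frame": 0, "individual": "mouse1", "keypoint": "nose", "x": 12.0, "y": 34.5, "confidence": 0.98},
    {"frame": 0, "individual": "mouse2", "keypoint": "nose", "x": 80.1, "y": 22.3, "confidence": 0.91},
    {"frame": 1, "individual": "mouse1", "keypoint": "nose", "x": 12.8, "y": 35.0, "confidence": 0.95},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["frame", "individual", "keypoint", "x", "y", "confidence"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue().splitlines()[0])  # header row
```

A long format like this sidesteps the multi-subject problem noted above: adding an individual adds rows, not files or columns.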

Cross-linking some relevant discussions happening elsewhere:

@rly

rly commented Feb 19, 2025

I’m just noting that motion tracking in general (including every field that employs these methodologies) is in need of a standardised format, and ndx-pose probably won't be it.

I totally agree. ndx-pose is neuro-focused, but that might be OK for BIDS purposes. It looks like Motion-BIDS was driven by human motion capture data. I believe the required fields in Motion-BIDS are also required in ndx-pose, but many of the recommended fields are not relevant or not captured by the tools, as far as I know.

Indeed, ndx-pose (and NWB) does not support handling multiple subjects in a nice way -- each subject's pose data should be in its own NWB file with its individual subject information, even if the data from multiple subjects were acquired and processed together. But I think BIDS also does not currently support handling multiple subjects in a session nicely, though BIDS 2.0 might change that (ses-X/sub-Y organization instead of sub-Y/ses-X).

neuroconv supports converting outputs from DLC, SLEAP, and LightningPose to ndx-pose. Once movement supports writing ndx-pose NWB files (which I 100% support!), then I think any formats that can be read by movement should also be convertible to ndx-pose NWB files through movement.

If you want to extend Motion-BIDS to support single-subject animal motion data, I think allowing NWB files with ndx-pose data makes sense, and would be better than supporting the unique output formats of N different tools. Note that there is no way to tell whether an NWB file has ndx-pose data just by the file name. You would have to open the file. In addition, ndx-pose NWB files could contain other very large related datasets from other modalities. Are symlinks allowed so that a single NWB file can be used in multiple places in the BIDS structure?

Alternatively, you could not extend BIDS support to allow another file type, and instead someone could write a tool to convert data from ndx-pose, DLC, SLEAP, etc. to the existing Motion-BIDS standard of JSON and TSV files. This could be built into movement or neuroconv. This would avoid complicating the reading of motion data in different formats for BIDS users. However, I think NWB users and tools would prefer storing pose data in NWB with ndx-pose.
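The conversion route suggested above would ultimately emit Motion-BIDS-style TSV data plus a JSON sidecar. A rough sketch of the file contents is below; the sidecar fields and conventions shown are assumptions to be checked against the actual Motion-BIDS specification, not a verified implementation:

```python
import csv
import io
import json

# Rough sketch: serialize a single tracked keypoint as a
# Motion-BIDS-style TSV + JSON sidecar pair. Field names and
# conventions here are assumptions, not spec-verified.
frames = [(12.0, 34.5), (12.8, 35.0), (13.1, 35.2)]  # nose (x, y) per frame

tsv = io.StringIO()
writer = csv.writer(tsv, delimiter="\t")
for x, y in frames:          # BIDS data TSVs for motion carry no header row;
    writer.writerow([x, y])  # channels would be described in a channels.tsv

sidecar = json.dumps({
    "SamplingFrequency": 30,            # video frame rate in Hz (assumed)
    "TrackingSystemName": "DeepLabCut", # source tool (illustrative)
    "MotionChannelCount": 2,            # x and y for one keypoint
}, indent=2)
print(tsv.getvalue().splitlines()[0])
```

Such a converter could live in movement or neuroconv, as suggested above, keeping BIDS readers free of per-tool format handling.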

@talmo

talmo commented Feb 19, 2025

(image: relevant)


I also want to put another plug in to not forget the training data format support. The needs of data used for training these motion capture models are pretty different from the inference outputs.

If our goal is to store enough metadata to make these kinds of data more reusable and reproducible, we can't ignore the fact that they're produced by machine learning systems. These are pieces of software that depend not only on statically accessible code, but also on the data that was used to train them -- which is arguably considerably more important than the specific tool that was used.

Just like you would store all the hyperparameters that a classical piece of software (or data-generating hardware) used to produce a dataset, it is essential that we treat the training data that is part and parcel of the operation of a machine learning system just as seriously.

@robertoostenveld
Collaborator

Just zooming out a bit: I wonder how data used in animal behavioral research is fundamentally different from data when the animal is Homo Sapiens. The consideration is similar to animal MRI versus human MRI.

As before with microscopy, for the micro electrophysiology BEP032 we are recognizing the similarities between micro electrodes in human and other animals. The specific difference that we identified to be accommodated with BEP032 is more between macro (already supported with iEEG) and micro (now being added).

Has anyone already tried to represent animal motion capture data in the existing standard specification? I realize that page refers to human motion, but if that were search-and-replaced with "human and animal", might that not work? Probably not, but I wonder what the hypothetical attempt would reveal...

@MMathisLab

Looking at the specs, it's extremely comparable with https://github.com/AdaptiveMotorControlLab/DLC2Kinematics and our output H5 files. This package takes in the outputs of DeepLabCut and outputs velocity, angles, quaternions, etc. :)

And yes -- humans are technically animals! :)

@yarikoptic
Collaborator Author

Thanks everyone for sharing information/views/wisdom! It might be great to create a "table" expressing what is supported by each of the existing standards (in particular relative to BIDS Motion) and tools (which can read/write which standard).

Meanwhile, just to answer @rly question:

Are symlinks allowed so that a single NWB file can be used in multiple places in the BIDS structure?

They are not, since symlinks are not generally available across file systems. FWIW, within BIDS there are a number of places in the metadata (typically in .json sidecars) where BIDS URIs are supported to reference other files.
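For anyone unfamiliar with the mechanism mentioned above: a BIDS URI is a string of the form `bids:<dataset-name>:<relative-path>`, where an empty dataset name refers to the current dataset. A minimal parsing sketch (the grammar shown is my reading of the convention; check the BIDS specification for the authoritative definition):

```python
# Minimal sketch of splitting a BIDS URI of the form
# "bids:<dataset>:<relative/path>"; an empty <dataset> means
# the current dataset. Consult the BIDS spec for the real grammar.
def parse_bids_uri(uri: str):
    scheme, dataset, relpath = uri.split(":", 2)
    if scheme != "bids":
        raise ValueError(f"not a BIDS URI: {uri!r}")
    return dataset, relpath

# Hypothetical example path, for illustration only:
dataset, relpath = parse_bids_uri("bids::sub-01/motion/sub-01_task-walk_motion.tsv")
# dataset == "" (current dataset); relpath is dataset-relative
```

References like this are portable across file systems, which is exactly what symlinks are not.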
