Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nwb interface: allow full paths in __getitem__ #351

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

magland
Copy link
Collaborator

@magland magland commented Oct 4, 2024

I would like to be able to reference NWB items by their full paths.

For example in this NWB file

There is acquisition/position_sensor0 and acquisition/position_sensor1

In existing package, I can do nwbp['position_sensor0'] but I cannot do nwbp['acquisition/position_sensor0']

This PR modifies the NWB interface to allow both.

Why do I want this?

I am creating an AI assistant built into neurosift, and I am teaching it to load NWB into pynapple objects. Even if I tell it to not use the full path of the objects, it often tries to do that anyway. I think it reflects that using full paths is a natural thing to do.

I also think it is important to allow more than one units tables in the same NWB file. Right now, everything gets mapped to the "units" key. But I think that can be a separate PR/issue.

If you are interested in trying out the assistant, open the above neurosift link, click on the chat icon in the upper right and prompt something like

"Load the position sensors using pynapple, including showing how to load this NWB file"

(relevant to a discussion I was having with @bendichter)

@magland magland requested a review from gviejo as a code owner October 4, 2024 13:05
@magland
Copy link
Collaborator Author

magland commented Oct 4, 2024

And here's an example where the full paths are important

https://neurosift.app/?p=/nwb&url=https://api.dandiarchive.org/api/assets/ffcb1836-587e-42f4-887b-50b02948b779/download/&dandisetId=000623&dandisetVersion=0.240227.2023

/processing/behavior/EyeTracking/SpatialSeries

The name "SpatialSeries" is generic, and there could easily be two objects in the NWB with that name. So I'm wondering if the data should always be stored with the full path. This would also solve the problem with multiple units tables.

@BalzaniEdoardo
Copy link
Collaborator

Hi Jeremy,

Thanks for the PR! I personally agree with your point about having the full path represented, since the same label could be attached to the same object;

I am trying out your branch, and I have a few comments:

  • With this feature we should change the __repr__ of the NWBFile to reflect the folder structure. Right now it would show the final object name;
  • We should add a test with a small nwb, with multiple objects with the same names on different paths, making sure that are loaded and displayed correctly.
  • Right now both nwb['acquisition/position_sensor0'] and nwb['position_sensor0'] work, but what happens when there are multiple "position_sensor0" fields?

Additional thoughts:

We can facilitates accessing the content if the NWBFile by reflecting the hierarchical structure of it,

>>> # original nwb loading
>>> nwb = nap.NWBFile(io.read())

>>> # this creates an nwb with root acquisition (does this makes sense?)
>>> sub_nwb = nwb['acquisition/']
>>> sub_nwb # __repr__ returns a more readable content (less nested) 

@BalzaniEdoardo
Copy link
Collaborator

BalzaniEdoardo commented Oct 4, 2024

Adding some extra thoughts/suggestions on the repr/key access;

To keep things simple for the user, we could think of representing only the minimal amount of path required to identify the object:

nwb["folder1/folder2/object_name"]
nwb["folder1/folder3/object_name"]

will result in a key on the __repr__ table of: "folder2/object_name" and "folder3/object_name".

Objects can be accessed with the partial or full path, and we can additionally store the full path as a read only property.

@gviejo
Copy link
Contributor

gviejo commented Oct 7, 2024

Hi Jeremy, I think it's doable. The only thing is that giving the full path every time would probably make the NWB table of pynapple unreadable for the user.

The workaround would be to do first what Edoardo said. If you have a NWB file with :

  • /folder1/folder2/object_name
  • /folder1/folder3/object_name

Pynapple should print :

| folder2/object_name | Object type |
| folder3/object_name | Object type |

So then you can do:

nwb["folder2/object_name"]
nwb["folder3/object_name"]

But then you should also be able to do :

nwb["/folder1/folder2/object_name"]
nwb["/folder1/folder3/object_name"]

even if it's not in the __repr__ of the NWBFile object as long as you know the correct path.

To summarize, the representation of the object should alwayys show the minimal version of the key. But you can also index with the full path. I hope it makes sense.

@magland
Copy link
Collaborator Author

magland commented Oct 7, 2024

Thanks @BalzaniEdoardo and @gviejo

What you say makes sense. Personally I would like everything to be done by full path because that's the most explicit. But I understand wanting to reduce how much information is shown to the user.

@gviejo
Copy link
Contributor

gviejo commented Oct 11, 2024

No problem. Let us know if you need help for finishing the PR.

@gviejo
Copy link
Contributor

gviejo commented Nov 6, 2024

Hi Jeremy, I wanted to finish your PR. I sent you an invitation for the pynapple repo? Since this PR is now behind especially for the documentation, would you be ok to close it and reopen a branch within the pynapple-org? I can then push on it with your modifications to the NWB class. Otherwise I can just open a branch myself and start from your modifications. Please let me know.

@magland
Copy link
Collaborator Author

magland commented Nov 6, 2024

Hi Jeremy, I wanted to finish your PR. I sent you an invitation for the pynapple repo? Since this PR is now behind especially for the documentation, would you be ok to close it and reopen a branch within the pynapple-org? I can then push on it with your modifications to the NWB class. Otherwise I can just open a branch myself and start from your modifications. Please let me know.

Hi @gviejo
I created a new PR here: #364

And put in the modifications... although I haven't tested it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants