DynamicTables are one of the major non-spec (or I guess informal-spec) constructs, with most of the meat in the implementation rather than the schema.
They exist because we always need additional columns in our data that don't exist in the schema, and the typical schema makes it really hard to add new values. We won't have that problem anymore though: if someone wants to make a table, they just inherit from the table, add some new attributes, generate the model, boom, done.
Unhelpfully, there is an absolute ocean of unspecified behavior that has been crammed into DynamicTable and its related classes, so we have to spec and accommodate it.
This is the first of probably several tracking issues for handling DynamicTables and related constructs. First, let's just handle the most basic usage.
This issue is incomplete and will mostly serve as a place for notes and to track progress.
Implementation
DynamicTables consist of a set of VectorData and optionally VectorIndex datasets, which are the table columns
VectorData are implicitly a 1-4 dimensional array of any type, so each cell in a table can be a whole array
To support ragged arrays (in this case meaning that the table has equal length columns, but each of the cells may be a different length) in hdf5 datasets, we unravel the array and store it alongside a VectorIndex, which is a vector of ints marking where each cell ends in the flattened array (cell i spans index[i-1] up to but not including index[i], with an implicit 0 before the first cell).
VectorIndexes are supposed to have an explicit target, but often don't, and the relationship is encoded by the implicit _index naming convention
at access time, the VectorIndex class is silently substituted for the VectorData class (eg. type(nwbfile.units['spike_times']) == VectorIndex) in spite of the schema
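For notes, a minimal sketch of the ragged-array encoding described above, in plain Python lists rather than hdf5 datasets. The function names are made up for illustration and are not hdmf or pynwb API. It assumes the index holds the exclusive end offset of each cell, which is my understanding of what the reference implementation does; if an index stores starts instead, the slicing shifts by one.

```python
from itertools import accumulate


def flatten_ragged(cells):
    """Unravel variable-length cells into a flat data list plus an index."""
    data = [value for cell in cells for value in cell]
    # Assumption: index[i] is the exclusive end offset of cell i in `data`.
    index = list(accumulate(len(cell) for cell in cells))
    return data, index


def get_cell(data, index, i):
    """Recover cell i from the flat data using the index."""
    start = 0 if i == 0 else index[i - 1]
    return data[start:index[i]]


def find_index(columns, name):
    """Resolve a column's index via the implicit `<name>_index` convention."""
    return columns.get(f"{name}_index")


# Three cells of different lengths, including an empty one.
spike_times = [[0.1, 0.2, 0.3], [], [1.5, 2.5]]
data, index = flatten_ragged(spike_times)
columns = {"spike_times": data, "spike_times_index": index}

# Round trip: every cell comes back out as it went in.
restored = [
    get_cell(columns["spike_times"], find_index(columns, "spike_times"), i)
    for i in range(len(index))
]
```

The number of rows in the table is the length of the index, not the length of the flattened data, which is why the index (and not the data) is the thing that has to match the other columns.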
Approach
nwb_language.py classes
Mixin for DynamicTable that emulates the VectorIndex behavior by wrapping all model fields with a __getitem__ method, allows extra fields.
model validator that ensures equal length columns
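A rough sketch of what the mixin and validator could look like, assuming pydantic v2. The names `DynamicTableMixin`, `Units`, and `quality` are placeholders and not the actual nwb_language.py classes, and this simplifies "wrapping all model fields" down to a single `__getitem__` on the table. Ragged cells are stored as nested lists rather than going through a VectorIndex.

```python
from typing import List

from pydantic import BaseModel, ConfigDict, model_validator


class DynamicTableMixin(BaseModel):
    # Extra fields are allowed so columns outside the schema can be added.
    model_config = ConfigDict(extra="allow")

    def _columns(self) -> dict:
        """Declared fields plus extras, keeping only list-valued ones."""
        values = {name: getattr(self, name) for name in type(self).model_fields}
        values.update(self.model_extra or {})
        return {k: v for k, v in values.items() if isinstance(v, list)}

    @model_validator(mode="after")
    def check_equal_length(self):
        """Every column must have the same number of rows."""
        lengths = {name: len(col) for name, col in self._columns().items()}
        if len(set(lengths.values())) > 1:
            raise ValueError(f"columns must be equal length, got {lengths}")
        return self

    def __getitem__(self, key):
        # A string selects a column, an int selects a row across all columns.
        if isinstance(key, str):
            return self._columns()[key]
        if isinstance(key, int):
            return {name: col[key] for name, col in self._columns().items()}
        raise TypeError(f"unsupported key type: {type(key).__name__}")


class Units(DynamicTableMixin):
    id: List[int]
    spike_times: List[List[float]]  # ragged: each cell is its own list


# `quality` is not declared on the model and comes in as an extra column.
units = Units(id=[0, 1], spike_times=[[0.1, 0.2], [1.5]], quality=["good", "noise"])
row = units[1]
```

Extras are not type-validated by pydantic here, so the equal-length check is the only guarantee the sketch gives for columns added outside the schema.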
nwb -> linkml translation
drop VectorIndexes and just make a field for the VectorData corresponding to its dtype and dims/shape spec (are there cases we can't do this?)
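As a note on the translation step, a hedged sketch of dropping the index columns. The input is a simplified stand-in for the dataset specs of a table, not the real adapter input, and `columns_to_fields` is a hypothetical helper. It keeps one field per VectorData and only records whether a matching `_index` column existed.

```python
def columns_to_fields(datasets):
    """Collapse VectorData/VectorIndex pairs into one field per column."""
    names = {d["name"] for d in datasets}
    fields = {}
    for d in datasets:
        if d["neurodata_type_inc"] == "VectorIndex":
            # The index is dropped; raggedness is recorded on the data field.
            continue
        fields[d["name"]] = {
            "dtype": d.get("dtype", "any"),
            # Relies on the implicit `<name>_index` naming convention.
            "ragged": f"{d['name']}_index" in names,
        }
    return fields


# Simplified, made-up specs for illustration only.
datasets = [
    {"name": "spike_times", "neurodata_type_inc": "VectorData", "dtype": "float64"},
    {"name": "spike_times_index", "neurodata_type_inc": "VectorIndex"},
    {"name": "id", "neurodata_type_inc": "VectorData", "dtype": "int"},
]
fields = columns_to_fields(datasets)
```

This only handles the one-index case. It does not cover a column with more than one level of index, or an index with an explicit target that doesn't follow the naming convention, which may be where the "cases we can't do this" live.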
pydantic model generation
insert custom language adapters into dynamictable model
ensure sufficient metadata is present to be able to invert models to schema