Allow recompute via `_make_file` func #1093
base: master
Conversation
I ran into some issues here, being unable to replicate the DJ-stored file hash with seemingly matched contents. Need to do more testing on whether or not memory pointers are factored into the hash. If so, we may need to store a hash of the object computed by the specific table, and adjust the DJ-stored file hash on recreation.
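For context on the memory-pointer concern: Python's default object repr embeds the memory address, so anything that hashes a repr will vary between processes, while a hash over content bytes stays stable. A minimal illustration, not from the PR:

```python
import hashlib

class Obj:
    pass

# Default repr embeds the object's memory address, so hashing a repr
# gives different results in different processes:
print(repr(Obj()))  # e.g. '<__main__.Obj object at 0x7f9c2c1d3a90>'

# A hash over the raw content bytes is deterministic across runs:
print(hashlib.md5(b"identical file contents").hexdigest())
```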
I think the overall structure looks pretty good. For comparing arrays, remind me why something like `np.allclose` wouldn't work well?
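For reference, `np.allclose` compares values within a floating-point tolerance, which an exact byte-level hash cannot do; a minimal illustration:

```python
import numpy as np

a = np.array([0.1, 0.2, 0.3])
b = a + 1e-12  # tiny float drift, as might appear on recompute

print(np.allclose(a, b))           # True: tolerance-based comparison
print(a.tobytes() == b.tobytes())  # False: any hash of these bytes differs
```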
```python
raise FileNotFoundError(
    f"Found {len(query)} files for: {analysis_nwb_file_name}"
)
return f"{analysis_dir}/" + query.fetch1("filepath")
```
It would probably be better to make this less file-system dependent by using pathlib.
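A minimal sketch of the suggested rewrite, assuming the `analysis_dir` and `query` names from the diff above:

```python
from pathlib import Path

# Sketch only: `analysis_dir` and `query` are the names from the diff above.
file_path = str(Path(analysis_dir) / query.fetch1("filepath"))
```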
```python
)

external_tbl = schema.external["analysis"]
file_path = (
```
This is another place where pathlib would help.
```diff
@@ -317,6 +317,8 @@ def make(self, key):
         recording = self._get_filtered_recording(key)
         recording_name = self._get_recording_name(key)

+        # recording_dir = Path("/home/cbroz/wrk/temp_ssr0/")
```
Remove before merge
```diff
@@ -0,0 +1,257 @@
+"""This schema is used to manage files for recompute.
```
This module could have a more descriptive name. `recompute.py` perhaps?
```python
self.file.close()

def remove_version(self, key: str) -> bool:
    version_pattern = (
```
I do worry a bit about these regexes failing. Versioning is hopefully pretty stable. Just adding a reminder here to warn the lab before this gets merged.
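For illustration only, a hypothetical version pattern with an explicit failure mode; the actual regex in the PR is elided above, and this assumes 'MAJOR.MINOR.PATCH'-style strings:

```python
import re

# Hypothetical pattern; the actual regex in the PR is elided above.
version_pattern = re.compile(r"\d+\.\d+\.\d+")

def remove_version(name: str) -> str:
    """Strip a 'MAJOR.MINOR.PATCH' substring, failing loudly if absent."""
    match = version_pattern.search(name)
    if match is None:
        # Fail loudly rather than silently mangling the name if the
        # versioning scheme ever changes.
        raise ValueError(f"No version found in: {name}")
    return name.replace(match.group(), "")
```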
Description
This PR adds the ability to recompute when fetching a non-existent file from `v1.SpikeSortingRecording` via `fetch_nwb`, adding some infrastructure to support expanding this tool in the future. To facilitate this, I add a `_make_file` func that other tables can implement in the future.
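A hypothetical sketch of the hook pattern this describes; aside from `_make_file` and `fetch_nwb`, the names are illustrative, not taken from the diff:

```python
class RecomputeMixin:
    """Illustrative mixin; real integration details are in the PR diff."""

    def _make_file(self, key):
        """Recompute and write the file for `key`; tables override this."""
        raise NotImplementedError

    def fetch_nwb(self, key):
        try:
            return self._open_file(key)  # normal fetch path (illustrative)
        except FileNotFoundError:
            self._make_file(key)  # recompute the missing file, then retry
            return self._open_file(key)
```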
I have tested these hashers for...

- `v1.SpikeSortingRecording`
- `v0.SpikeSortingRecording`
- `Raw` (ongoing, so far 50%)

Checklist:

- [ ] Updated `CITATION.cff`
- [ ] Added `alter` snippet for release notes.
- [ ] Updated `CHANGELOG.md` with PR number and description.
with PR number and description.Question
I found at least one file/directory where hashes mismatched across environments. What should we do about these? Do we need to restrict our deletions to only items replicable with an up-to-date Spyglass? Should I start building the infrastructure to compile this list? What do we want to do with files that don't replicate? Do we need to then do some experimentation with dependency versions? This might point us to adding conda env dumps to some part of this process.
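A minimal sketch of capturing such an environment dump, assuming `conda` is on the path:

```python
import subprocess

# Hypothetical sketch: capture a conda env dump alongside the stored hash so
# non-replicating files can be traced back to dependency versions.
env_dump = subprocess.run(
    ["conda", "env", "export"], capture_output=True, text=True, check=True
).stdout
```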