-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multifile scanimage #297
Multifile scanimage #297
Conversation
src/roiextractors/extractors/tiffimagingextractors/scanimagetiffimagingextractor.py
Outdated
Show resolved
Hide resolved
This looks great @pauladkisson, thank you! I can confirm it works with the data from Dombeck; I'm just wondering now whether it is possible to speed up the initialisation of the "parent" extractor. We're parsing the metadata for each file when initialising the individual extractors, but they're all time same except for these keys:
I think
for the first TIF file, but let me know what you think about this! |
Can we add some tests, even if they're just manually formed out of symlink copies of the current single file ScanImageTiff files we currently have? To showcase and verify the naming pattern we expect from this multifile procedure generated by the ScanImage software (in particular, the naming pattern of the files providing the ordering of sequences is something we expect to follow a stringent pattern, presumably compatible with natsort) |
And yes, consideration of expedited metadata parsing should be considered for speed. For a very long imaging session using this strategy, parsing metadata for each subfile can be expensive |
…it to all other files
Added tests and option to skip metadata i/o for multifile extractors. @weiglszonja lmk what you think! |
Sorry for the delay @pauladkisson, is this example data dual or single channel? Can we add test for dual channel multifile as well? I checked with the data I have from the Dombeck lab (see this note): from roiextractors.extractors.tiffimagingextractors.scanimagetiffimagingextractor import \
ScanImageTiffSinglePlaneMultiFileImagingExtractor
extractor1 = ScanImageTiffSinglePlaneMultiFileImagingExtractor(folder_path="/Volumes/LaCie/CN_GCP/Dombeck/2620749R2_231211", extract_all_metadata=False, file_pattern="2620749R2_231211_00001*.tif", channel_name="Channel 1", plane_name="0")
video1 = extractor1.get_video(start_frame=0, end_frame=100)
extractor2 = ScanImageTiffSinglePlaneMultiFileImagingExtractor(folder_path="/Volumes/LaCie/CN_GCP/Dombeck/2620749R2_231211", extract_all_metadata=False, file_pattern="2620749R2_231211_00001*.tif", channel_name="Channel 2", plane_name="0")
video2 = extractor1.get_video(start_frame=0, end_frame=100)
from numpy.testing import assert_array_equal
# This should not be True
assert_array_equal(video1, video2) However looks like the same video is returned for both channels, also looking at these plots: ![]() However when I'm using ![]() |
Hey @weiglszonja, thanks for taking a look. It shouldn't matter whether the example data is single channel or multi channel, since ScanImageTiffSinglePlaneImagingExtractor takes care of all of the indexing logic. What would matter is if these files are split midcycle, which is typical of the file name pattern |
Hey @weiglszonja, looks like there is a typo in your example code. You have ...
video2 = extractor1.get_video(start_frame=0, end_frame=100)
... But it should be ...
video2 = extractor2.get_video(start_frame=0, end_frame=100)
... I'm not seeing any problems with the multifile imaging extractors on my end: file_path = '/Volumes/T7/CatalystNeuro/NWB/Dombeck/raw_data/2620749R2_231211_00001_00001.tif'
imaging_extractor = ScanImageTiffSinglePlaneImagingExtractor(
file_path=file_path,
channel_name='Channel 1',
plane_name='0',
)
video_ch1_single = imaging_extractor.get_video(end_frame=10)
imaging_extractor = ScanImageTiffSinglePlaneImagingExtractor(
file_path=file_path,
channel_name='Channel 2',
plane_name='0',
)
video_ch2_single = imaging_extractor.get_video(end_frame=10)
imaging_extractor = ScanImageTiffSinglePlaneMultiFileImagingExtractor(
folder_path='/Volumes/T7/CatalystNeuro/NWB/Dombeck/raw_data',
file_pattern='2620749R2_231211_00001_*.tif',
channel_name='Channel 1',
plane_name='0',
)
video_ch1_multi = imaging_extractor.get_video(end_frame=10)
imaging_extractor = ScanImageTiffSinglePlaneMultiFileImagingExtractor(
folder_path='/Volumes/T7/CatalystNeuro/NWB/Dombeck/raw_data',
file_pattern='2620749R2_231211_00001_*.tif',
channel_name='Channel 2',
plane_name='0',
)
video_ch2_multi = imaging_extractor.get_video(end_frame=10)
print(f'ch1 multi == ch1 single: {np.array_equal(video_ch1_multi, video_ch1_single)}') # True
print(f'ch2 multi == ch2 single: {np.array_equal(video_ch2_multi, video_ch2_single)}') # True
print(f'ch1 single == ch2 single: {np.array_equal(video_ch1_single, video_ch2_single)}') # False
print(f'ch1 multi == ch2 multi: {np.array_equal(video_ch1_multi, video_ch2_multi)}') # False |
Ah, my bad then. Sorry @pauladkisson, and thank you for checking. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #297 +/- ##
==========================================
+ Coverage 79.25% 79.59% +0.34%
==========================================
Files 39 39
Lines 3100 3147 +47
==========================================
+ Hits 2457 2505 +48
+ Misses 643 642 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Fixes #296