Issue with Frame Cache Writing using Multiprocessing #608
Comments
@darrencpagan I assume you mean these lines? The threads are supposed to finish before the |
What is the best way to transfer a fairly large raw image file along with the scripts? I can share it via OneDrive or another cloud service, or would email be better? |
@darrencpagan I can access it on |
I wrote a simple tester for the parallel frame cache writing. I'll include it below. It creates an imageseries with a lot of frames but small image shape, so that it runs fast. It writes it to a frame cache then reads it and compares to the original. I wasn't able to break it. It runs like this:
|
I've shared a OneDrive folder with you both containing the script and data that cause the problem. The frame cache saves only 3 of 3601 frames. One last piece of information: I'm working on a computer with 80 workers (40 cores running 2 threads each). |
Thank you for providing the example. I was able to reproduce the issue and determine the cause. Because this is a raw image series, it must be read in sequence. An exception was being raised because the indices in the imageseries were not being accessed in order (via the threadpool). However, because we were not evaluating the results of the threadpool calls, the exception was never surfaced. Running it serially fixed the issue because that ensured that the indices were being accessed in order. We will add some logic to fix this issue for writing a frame cache from a raw image series. We will also modify the code so that if an exception occurs, it is propagated and visible to the user. Thank you for reporting this issue, @darrencpagan! |
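For anyone hitting something similar, here is a minimal standard-library sketch of that failure mode. It is not hexrd code: `write_frame` and the two `write_all_*` helpers are hypothetical stand-ins, and the artificial error for indices 3 and up only mimics the out-of-order read failure. The point is that `ThreadPoolExecutor.map` re-raises a worker's exception only when its results are iterated.

```python
from concurrent.futures import ThreadPoolExecutor


def write_frame(i, store):
    """Hypothetical stand-in for reading frame i and caching it."""
    if i >= 3:
        # Mimics the error raised when a raw series is read out of order.
        raise RuntimeError(f"frame {i} could not be read")
    store[i] = f"frame-{i}"


def write_all_unchecked(n):
    """Results of map() are never consumed, so errors stay hidden."""
    store = {}
    with ThreadPoolExecutor(max_workers=4) as ex:
        ex.map(lambda i: write_frame(i, store), range(n))
    return store


def write_all_checked(n):
    """Consuming the iterator re-raises the first worker exception."""
    store = {}
    with ThreadPoolExecutor(max_workers=4) as ex:
        list(ex.map(lambda i: write_frame(i, store), range(n)))
    return store
```

In this sketch, `write_all_unchecked(10)` returns quietly with only three frames stored, while `write_all_checked(10)` raises `RuntimeError`.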
I can see it now. Got it. Thanks for figuring out the issue. I'll keep an eye out for the fix. |
This ensures that for multithreaded situations (such as writing out a frame cache), `__getitem__` will be thread-safe. The change ensures that the frame will be obtained immediately after seeking, and no other thread can seek until the frame is obtained. When I tested the example Darren gave us (#608) on the master branch with no multithreading, it ran in 5m 36.826s. With this change, and using multithreading, it ran in 5m 12.811s. So the multithreading produced a minor speed increase. Fixes: #608 Signed-off-by: Patrick Avery <[email protected]>
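The locking idea in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual hexrd change: `LockedRawSeries`, `make_series`, and `read_all_threaded` are hypothetical names, and an in-memory `io.BytesIO` buffer stands in for the raw image file.

```python
import io
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedRawSeries:
    """Hypothetical raw-frame reader with a lock around seek + read."""

    def __init__(self, fileobj, frame_size, nframes):
        self._f = fileobj
        self._frame_size = frame_size
        self._nframes = nframes
        self._lock = threading.Lock()

    def __len__(self):
        return self._nframes

    def __getitem__(self, i):
        # Hold the lock so no other thread can seek between our
        # seek() and read() calls on the shared file object.
        with self._lock:
            self._f.seek(i * self._frame_size)
            return self._f.read(self._frame_size)


def make_series(nframes=200, frame_size=8):
    """Build a fake raw file where frame i is filled with byte i % 256."""
    data = b"".join(bytes([i % 256]) * frame_size for i in range(nframes))
    return LockedRawSeries(io.BytesIO(data), frame_size, nframes)


def read_all_threaded(series, workers=8):
    """Read every frame through a thread pool, returned in index order."""
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(series.__getitem__, range(len(series))))
```

Because the seek and the read happen under one lock, the reads are effectively serialized, which is consistent with the speedup from multithreading being small.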
@darrencpagan This is now fixed in the master branch (as of #611), and should be in the prerelease in about an hour. |
I do not think the multiprocessing map function in the imageseries frame-cache writing option properly waits for all processes to finish the operation that generates the sparse matrices from the data.
We are trying to write frame caches from raw image data with 3600 frames; however, the output sparse matrices only have three frames. When I modified the code to run sequentially on a single processor, everything worked. I looked online and found conflicting information about whether the thread executors wait for all processes to finish.
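On the question of waiting: in the Python standard library, leaving a `with ThreadPoolExecutor(...)` block calls `shutdown(wait=True)`, so all submitted tasks have finished by then. A small sketch to check this behavior (`slow_square` and `run_and_check` are hypothetical names used only for illustration, and this says nothing about how the frame-cache writer itself is structured):

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    """Simulate a task that takes a little while to complete."""
    time.sleep(0.01)
    return x * x


def run_and_check(n=20):
    with ThreadPoolExecutor(max_workers=4) as ex:
        futures = [ex.submit(slow_square, i) for i in range(n)]
    # The with block has exited, so shutdown(wait=True) has returned
    # and every future should already be complete.
    done_flags = [f.done() for f in futures]
    results = [f.result() for f in futures]
    return done_flags, results
```

Calling `result()` on each future is also what surfaces any exception raised inside a worker.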