Issue with Frame Cache Writing using Multiprocessing #608

Closed
darrencpagan opened this issue Jan 22, 2024 · 8 comments · Fixed by #611

Comments

@darrencpagan
Contributor

I do not think the multiprocessing map call in the imageseries frame-cache writer waits for all workers to finish generating the sparse matrices from the data.

We are trying to write frame caches from raw image data with 3600 frames, but the output sparse matrices contain only three frames. When I modified the code to run sequentially on a single processor, everything worked. I looked online and found conflicting information about whether the thread executor implementation waits for all workers to finish.

@psavery
Collaborator

psavery commented Jan 22, 2024

@darrencpagan I assume you mean these lines?

The threads are supposed to finish before the with block is exited. I just tried this with a 1440 frame example, and it appeared that all frames were written. Are you able to provide me an example script/data, as we might be doing something slightly different?
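The waiting behaviour mentioned here can be checked with the standard library alone. The following is a minimal sketch, not the hexrd code: `slow_task` and `finished` are hypothetical names chosen for illustration. Leaving the `with` block calls `shutdown(wait=True)` on the executor, so every submitted task has completed by then, even if the iterator returned by `map()` is never consumed.

```python
import time
from concurrent.futures import ThreadPoolExecutor

finished = []  # indices of completed tasks (hypothetical bookkeeping)


def slow_task(i):
    # Stand-in for per-frame work; not the hexrd implementation.
    time.sleep(0.01)
    finished.append(i)  # list.append is atomic in CPython


with ThreadPoolExecutor(max_workers=4) as executor:
    # map() submits every task immediately; the results are ignored here.
    executor.map(slow_task, range(20))

# The with block has exited, so all 20 tasks have run to completion.
completed = sorted(finished)
print(len(completed))
```

So a short output is unlikely to come from the executor returning early; it points at the tasks themselves failing.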

@darrencpagan
Contributor Author

What is the best way to transfer a fairly large raw image file along with the scripts? I can share it via OneDrive or another cloud service, or would email be better?

@psavery
Collaborator

psavery commented Jan 22, 2024

@darrencpagan I can access it on classe if you put it there (if you do, send me a message on Slack with the file path). Otherwise, a OneDrive or Google Drive link shared with my email will be fine!

@donald-e-boyce
Collaborator

I wrote a simple tester for the parallel frame cache writing. I'll include it below. It creates an imageseries with a lot of frames but small image shape, so that it runs fast. It writes it to a frame cache then reads it and compares to the original. I wasn't able to break it. It runs like this:

(hexrd-dev) (MBP: frame-cache-parallel) 1928. python test_fcp.py -nw 8
number of wokers:  8
saving file
comparing imageseries
- lengths match
- shapes match
- all frames match
compare: done
(hexrd-dev) (MBP: frame-cache-parallel) 1929. 

test_fcp.py.txt

@darrencpagan
Contributor Author

I've shared a OneDrive with you both with the script and data that causes a problem. The frame-cache only saves 3 of 3601 frames.

One last piece of information: I'm working on a computer with 80 workers (40 cores running 2 threads each).

@psavery
Collaborator

psavery commented Jan 22, 2024

Thank you for providing the example. I was able to reproduce the issue and determine the cause.

Because this is a raw image series, it must be read in sequence. An exception was being raised because the thread pool was accessing the indices in the imageseries out of order.

However, because we were not evaluating the results of the map(), the exception was not propagated, so you wouldn't see it.

Running it serially fixed the issue because that ensured that the indices were being accessed in order.
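The silent failure described above can be reproduced with a small standard-library sketch. `read_frame` is a hypothetical stand-in for a reader that fails on some indices, not hexrd's reader. `Executor.map()` stores any exception raised by a worker in the corresponding future, and it is re-raised in the caller only when that result is retrieved from the returned iterator.

```python
from concurrent.futures import ThreadPoolExecutor


def read_frame(i):
    # Hypothetical reader that fails for every frame past the third.
    if i >= 3:
        raise ValueError(f"cannot read frame {i}")
    return i


# Case 1: results never consumed, so worker exceptions stay hidden.
with ThreadPoolExecutor(max_workers=4) as executor:
    executor.map(read_frame, range(10))
silent = True  # reached without any exception being raised

# Case 2: consuming the iterator re-raises the first failure, in index order.
caught = None
try:
    with ThreadPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(read_frame, range(10)))
except ValueError as exc:
    caught = str(exc)

print(caught)
```

Wrapping the call in `list(...)`, or iterating over the results, is one simple way to make such errors visible to the user.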

We will add some logic to fix this issue for writing a frame cache from a raw image series. And we will also modify the code to ensure that if an exception occurs, it will be propagated so that it will be visible to the user.

Thank you for reporting this issue, @darrencpagan!

@darrencpagan
Contributor Author

I can see it now. Got it. Thanks for figuring out the issue. I'll keep an eye out for the fix.

psavery added a commit that referenced this issue Jan 25, 2024
This ensures that for multithreaded situations (such as writing out
a frame cache), `__getitem__` will be thread-safe. The change ensures
that the frame will be obtained immediately after seeking, and no other
thread can seek until the frame is obtained.

When I tested the example Darren gave us (#608) on the master branch with
no multithreading, it ran in 5m 36.826s. With this change, and using
multithreading, it ran in 5m 12.811s. So the multithreading produced
a minor speed increase.

Fixes: #608

Signed-off-by: Patrick Avery <[email protected]>
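
The idea in this commit message, holding a lock so that the seek and the read happen as one step, can be sketched as follows. This is an illustration under assumptions, not the code from #611: `LockedFrameReader`, `FRAME_SIZE`, and the in-memory byte buffer are all hypothetical, with each "frame" being a 4-byte integer.

```python
import io
import threading
from concurrent.futures import ThreadPoolExecutor

FRAME_SIZE = 4  # bytes per frame in this toy format (assumption)


class LockedFrameReader:
    """Hypothetical reader: one shared file handle, fixed-size frames."""

    def __init__(self, fileobj, nframes):
        self._file = fileobj
        self._nframes = nframes
        self._lock = threading.Lock()

    def __len__(self):
        return self._nframes

    def __getitem__(self, i):
        # Seek and read under one lock so that no other thread can move
        # the file position between the two calls.
        with self._lock:
            self._file.seek(i * FRAME_SIZE)
            return self._file.read(FRAME_SIZE)


nframes = 500
data = b"".join(i.to_bytes(FRAME_SIZE, "little") for i in range(nframes))
reader = LockedFrameReader(io.BytesIO(data), nframes)

with ThreadPoolExecutor(max_workers=8) as executor:
    # list() consumes the results, so any worker exception would surface.
    frames = list(executor.map(reader.__getitem__, range(nframes)))

decoded = [int.from_bytes(f, "little") for f in frames]
print(decoded[:5])
```

Because every read is serialized by the lock, threads only help with work done outside `__getitem__`, which is consistent with the modest speed-up reported above.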
@psavery
Collaborator

psavery commented Jan 25, 2024

@darrencpagan This is now fixed in the master branch (as of #611), and should be in the prerelease in about an hour.
