Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use larger static buffer for key send and send in chunks. #27

Merged
merged 1 commit into from
Oct 1, 2015

Conversation

andreyto
Copy link
Contributor

That fixes a bug when using large chunks would lock the
worker indefinitely, e.g.

y = foreach(x=runif(100000),.combine = c,.options.redis=list(chunkSize=10000)) %dopar% {x}

was locking before (RHEL 6.5, R 3.2.0), and now it does not.
The reason for the lock was because the key was constructed as concatenation
of task indices, and truncated by nsprintf.
This might also solve issue #11.

That fixes a bug when using large chunks would lock the
worker indefinitely, e.g.
```
y = foreach(x=runif(100000),.combine = c,.options.redis=list(chunkSize=10000)) %dopar% {x}
```
was locking before (RHEL 6.5, R 3.2.0), and now it does not.
The reason for the lock was because the key was constructed as concatenation
of task indices, and truncated by nsprintf.
This might also solve issue bwlewis#11.
@bwlewis
Copy link
Owner

bwlewis commented Oct 1, 2015

This is a great catch, thanks!

Longer term, this has got me thinking that the way we encode the tasks is bad, I think a new approach is warranted, maybe a lookup table or something.

Thanks again, for this.

bwlewis added a commit that referenced this pull request Oct 1, 2015
Use larger static buffer for key send and send in chunks.
@bwlewis bwlewis merged commit fb0f526 into bwlewis:master Oct 1, 2015
@andreyto andreyto deleted the at_chunks branch October 1, 2015 14:03
@andreyto
Copy link
Contributor Author

andreyto commented Oct 1, 2015

Yes, it was my thought too, but that encoding was assumed in quite a few places throughout both master and worker R code, and I did not want to upset things too much. The C code would need some more checking for various error conditions, in my opinion. Maybe you could use some library like 0MQ that takes care of robustness? Thanks for the great package with some unique design features! I really like the ability of having dynamic membership of the worker pool - it is so easy to feed the workers through some array job on the cluster. And using a message broker makes deployment very flexible. For example, I can run the master on my laptop in RStudio, and have the heavy lifting done in foreach loop on the Linux cluster.
Andrey

bwlewis added a commit that referenced this pull request Oct 28, 2015
Andrey Tovchigrechko pointed out in #27 that the fault tolerance task encoding is bad. This commit improves that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants