
Remote workers #11

Open
Qwasser opened this issue Aug 28, 2014 · 6 comments
Qwasser commented Aug 28, 2014

Hello!
I launched some remote workers on several machines using the script from the manual. Then I started a script with a lot of tasks (the bootstrap example from the manual). After completing about 10000 tasks, the remote workers stopped getting jobs. This happens every time after some number of tasks when I use remote workers. If I start local workers, everything is OK.
I am running Redis under Windows. The Redis version is 2.6.14, and the R version is 3.0.3. I run the remote worker script through Rscript.exe.


bwlewis commented Sep 14, 2014

Hi, sorry for the delay; I was on vacation! We're working through a bunch of issues with rredis and doRedis right now. There will be updates on GitHub and submitted to CRAN for both packages this weekend.

andreyto added a commit to andreyto/doRedis that referenced this issue Sep 30, 2015
This fixes a bug where using large chunks would lock the
worker indefinitely, e.g.
```
y = foreach(x = runif(100000), .combine = c, .options.redis = list(chunkSize = 10000)) %dopar% { x }
```
was locking before (RHEL 6.5, R 3.2.0), and now it does not.
The lock occurred because the key was constructed as a concatenation
of task indices and was truncated by nsprintf.
This might also solve issue bwlewis#11.

hifin commented Feb 18, 2016

Hello. Here's another question related to remote workers. Right now, if I want to run parallel jobs on several computers, I have to open an R session on each remote computer and run an R script containing the following two lines:
```
require(doRedis)
startLocalWorkers(n = number_of_local_workers, queue = job_queue_name, host = host_IP)
```

One way to avoid doing this every time is to keep using the same job queue name and avoid calling removeQueue (so the remote workers stay alive). However, I'm not sure whether this is good practice, or whether there is a better way to run parallel jobs.
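For reference, the master-side counterpart of that pattern looks roughly like the sketch below. The queue name, host address, and workload are illustrative assumptions, and a running Redis server with workers already listening is assumed; this is not a verified recipe.
```
# Master-side sketch (names are illustrative): reuse one long-lived
# queue so workers started once keep serving later jobs.
library(doRedis)
registerDoRedis(queue = "job_queue_name", host = "10.0.0.1")

result <- foreach(i = 1:100, .combine = c) %dopar% sqrt(i)

# Deliberately skip removeQueue("job_queue_name") here so the remote
# workers stay blocked on the queue waiting for the next job; call
# removeQueue only when tearing the whole setup down.
```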


bwlewis commented Feb 18, 2016

Check out the scripts directory (at least if you're running Linux), especially on GitHub (a substantially revised/improved version is not yet on CRAN).

There is a doRedis service script for Linux, including a version that works nicely on Amazon.

I'm in the slow process of revising the docs and making a tutorial about that...


hifin commented Feb 18, 2016

Thank you! How about Windows? I thought about this after posting the question here. Using the PsExec tools and R's system() function may do the job, but I haven't tried it yet.
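As a rough illustration of that idea (untested; the host name, paths, and flags below are assumptions rather than a verified recipe), launching a worker script on a remote Windows machine via PsExec from R might look like:
```
# Hypothetical sketch: start remote workers on a Windows host with
# PsExec, driven from R via system(). Host and paths are made up.
cmd <- paste(
  "psexec \\\\remote-host -d",          # -d: don't wait for the process to exit
  "\"C:\\R\\bin\\Rscript.exe\"",
  "\"C:\\workers\\start_workers.R\""    # a script that calls startLocalWorkers()
)
system(cmd)
```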


bwlewis commented Feb 18, 2016

Well, there is this ancient project that was used at Montefiore on Windows systems:

https://github.com/bwlewis/doRedisWindowsService

It has not been updated since 2011, but someday it should be revised to match the corresponding Linux service implementations... I'm not sure if or when I can get around to that, though; Windows is a system I almost never use.


richett commented Apr 28, 2016

I'm trying to do the same as hifin and Qwasser above on Windows, and I wondered whether either of you has settled on a preferred approach to starting remote workers and keeping them alive by not removing the queue?

Bryan, I wonder if you could advise whether I'm likely to run into any problems with
a) reusing a queue, and
b) calling registerDoRedis with the same queue name without having removed it previously.

I'm currently using a combination of PsExec and Task Scheduler on startup, but I'm finding the connections not reliable enough for production at the moment, possibly due to the multiple registerDoRedis calls.
