-
+
This package will allow you to send function calls as jobs on a computing cluster with a minimal interface provided by the Q
function:
-# load the library and create a simple function
-library(clustermq)
-fx = function(x) x * 2
-
-# queue the function call on your scheduler
-Q(fx, x=1:3, n_jobs=1)
-# list(2,4,6)
+
# load the library and create a simple function
+library(clustermq)
+fx = function(x) x * 2
+
+# queue the function call on your scheduler
+Q(fx, x=1:3, n_jobs=1)
+# list(2,4,6)
Computations are done entirely on the network and without any temporary files on network-mounted storage, so there is no strain on the file system apart from starting up R once per job. All calculations are load-balanced, i.e. workers that get their jobs done faster will also receive more function calls to work on. This is especially useful if calls take different amounts of time to return, or if one worker is under high load.
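The load-balancing idea can be illustrated with a small base R simulation. This is only a sketch of pull-based scheduling in general, not clustermq's internals: the function `simulate_pull` and its arguments are hypothetical names, and the per-call durations are made up.

```r
# Hypothetical simulation: each call goes to whichever worker is idle first
simulate_pull = function(n_calls, secs_per_call) {
    free_at = rep(0, length(secs_per_call))   # time at which each worker becomes idle
    handled = integer(length(secs_per_call))  # number of calls each worker processed
    for (i in seq_len(n_calls)) {
        w = which.min(free_at)                # the next idle worker takes the call
        free_at[w] = free_at[w] + secs_per_call[w]
        handled[w] = handled[w] + 1L
    }
    handled
}

# worker 1 needs 1 second per call, worker 2 needs 2 seconds per call
handled = simulate_pull(n_calls=30, secs_per_call=c(1, 2))
handled
```

In this sketch the faster worker ends up processing more of the calls, which is the behaviour described above.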
Browse the vignettes here:
@@ -98,22 +98,14 @@
Installation
-
First, we need the ZeroMQ system library. This is probably already installed on your system. If not, your package manager will provide it:
-
# You can skip this step on Windows and macOS, the package binary has it
-# On a computing cluster, we recommend to use Conda or Linuxbrew
-brew install zeromq # Linuxbrew, Homebrew on macOS
-conda install zeromq # Conda, Miniconda
-sudo apt-get install libzmq3-dev # Ubuntu
-sudo yum install zeromq-devel # Fedora
-pacman -S zeromq # Arch Linux
-
Then install the clustermq
package in R from CRAN:
-
+
Install the clustermq
package in R from CRAN (including the bundled ZeroMQ system library):
+
Alternatively, you can use the remotes
package to install directly from GitHub:
-
-# install.packages('remotes')
-remotes::install_github('mschubert/clustermq')
-# remotes::install_github('mschubert/clustermq', ref="develop") # dev version
+
+# install.packages('remotes')
+remotes::install_github('mschubert/clustermq')
+# remotes::install_github('mschubert/clustermq', ref="develop") # dev version
Schedulers
@@ -157,31 +149,31 @@ Usage
export
- A named list of objects to export to the worker environment
The documentation for other arguments can be accessed by typing ?Q
. Examples of using const
and export
would be:
+
+# adding a constant argument
+fx = function(x, y) x * 2 + y
+Q(fx, x=1:3, const=list(y=10), n_jobs=1)
-# adding a constant argument
-fx = function(x, y) x * 2 + y
-Q(fx, x=1:3, const=list(y=10), n_jobs=1)
-
-# exporting an object to workers
-fx = function(x) x * 2 + y
-Q(fx, x=1:3, export=list(y=10), n_jobs=1)
+
# exporting an object to workers
+fx = function(x) x * 2 + y
+Q(fx, x=1:3, export=list(y=10), n_jobs=1)
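As a rough mental model of the difference between const and export, the following base R sketch mimics both without a cluster. It is only an illustration and not how clustermq implements them: `worker_env` and the `lapply` loops are hypothetical stand-ins for a worker.

```r
# const: the value is passed to every call as an extra function argument
fx_const = function(x, y) x * 2 + y
res_const = lapply(1:3, function(x) do.call(fx_const, list(x=x, y=10)))

# export: the value is placed in the environment the function looks names up in
fx_export = function(x) x * 2 + y
worker_env = new.env()                  # hypothetical stand-in for a worker environment
assign("y", 10, envir=worker_env)
environment(fx_export) = worker_env
res_export = lapply(1:3, fx_export)

identical(res_const, res_export)        # compares the two result lists
```

In the first case `y` is a formal argument of the function; in the second it is a free variable that has to exist wherever the function is evaluated.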
clustermq
can also be used as a parallel backend for foreach
. As this is also used by BiocParallel
, we can run those packages on the cluster as well:
+
-
+library(BiocParallel)
+register(DoparParam()) # after register_dopar_cmq(...)
+bplapply(1:3, sqrt)
More examples are available in the user guide.
Comparison to other packages
There are some packages that provide high-level parallelization of R function calls on a computing cluster. We compared clustermq
to BatchJobs
and batchtools
for processing many short-running jobs, and found it to have approximately 1000x less overhead cost.
-
@@ -268,12 +254,12 @@
Dev status
diff --git a/news/index.html b/news/index.html
index a1f22e4..e9003db 100644
--- a/news/index.html
+++ b/news/index.html
@@ -1,5 +1,5 @@
-
Changelog • clustermq
@@ -10,7 +10,7 @@
clustermq
-
0.8.96
+
0.9.0
+
+
clustermq 0.9.0
+
+
Features
+
- Reuse of common data is now supported (#154)
+- Jobs now error instead of stalling upon unexpected worker disconnect (#150)
+- Workers now error if they cannot establish a connection within a time limit
+- Error if
n_jobs
and max_calls_worker
provide insufficient call slots (#258)
+- Request 1 GB by default in SGE template (#298) @nickholway
+- Error and warning summary now orders by index and severity (#304)
+- A call can have multiple warnings forwarded, not only the last one
+
+
+
Bugfix
+
- Fix bug where max memory reporting by
gc()
may be in a different column (#240)
+- Fix passing numerical
job_id
to qdel
in PBS (#265)
+- The job port/id pool is now used properly upon binding failure (#270) @luwidmer
+- Common data size warning is now only displayed when exceeding limits (#287)
+
+
+
Under the hood
+
- Complete rewrite of the worker API
+- We no longer depend on the
purrr
package
+
+
clustermq 0.8.95
CRAN release: 2020-07-01
- We are now using ZeroMQ via
Rcpp
in preparation for v0.9
(#151)
-- New
multiprocess
backend via callr
instead of forking (#142, #197) FIXME: test case
+- New
multiprocess
backend via callr
instead of forking (#142, #197)
- Sending data on sockets is now blocking to avoid excess memory usage (#161)
-
multicore
, multiprocess
schedulers now support logging (#169)
@@ -149,16 +174,25 @@ clustermq 0.8.1
clustermq 0.8.0
CRAN release: 2017-11-11
+
+
Features
- Templates changed:
clustermq:::worker
now takes only master as argument
-- Fix a bug where copies of
common_data
are collected by gc too slowly (#19)
- Creating
workers
is now separated from Q
, enabling worker reuse (#45)
- Objects in the function environment must now be
export
ed explicitly (#47)
-- Messages on the master are now processed in threads (#42)
- Added
multicore
qsys using the parallel
package (#49)
- New function
Q_rows
using data.frame rows as iterated arguments (#43)
-- Jobs will now be submitted as array if possible
- Job summary will now report max memory as reported by
gc
(#18)
+
+
Bugfix
+
- Fix a bug where copies of
common_data
are collected by gc too slowly (#19)
+
+
+
Under the hood
+
- Messages on the master are now processed in threads (#42)
+- Jobs will now be submitted as array if possible
+
+
clustermq 0.7.0
CRAN release: 2017-08-28
- Initial release on CRAN
@@ -168,11 +202,11 @@ clustermq 0.7.0
diff --git a/pkgdown.js b/pkgdown.js
index a1b8b6d..5fccd9c 100644
--- a/pkgdown.js
+++ b/pkgdown.js
@@ -70,7 +70,7 @@
/* Search marking --------------------------*/
var url = new URL(window.location.href);
var toMark = url.searchParams.get("q");
- var mark = new Mark("div.col-md-9");
+ var mark = new Mark("main#main");
if (toMark) {
mark.mark(toMark, {
accuracy: {
diff --git a/pkgdown.yml b/pkgdown.yml
index bf01959..c9b6076 100644
--- a/pkgdown.yml
+++ b/pkgdown.yml
@@ -1,9 +1,9 @@
-pandoc: 2.14.2
-pkgdown: 2.0.2
+pandoc: 3.1.6
+pkgdown: 2.0.7
pkgdown_sha: ~
articles:
quickstart: quickstart.html
technicaldocs: technicaldocs.html
userguide: userguide.html
-last_built: 2022-03-04T21:08Z
+last_built: 2023-09-23T19:47Z
diff --git a/reference/LOCAL.html b/reference/LOCAL.html
index 39fcc64..a79e230 100644
--- a/reference/LOCAL.html
+++ b/reference/LOCAL.html
@@ -1,5 +1,5 @@
-Placeholder for local processing — LOCAL • clustermqPlaceholder for local processing — LOCAL • clustermq
@@ -10,7 +10,7 @@
clustermq
- 0.8.96
+ 0.9.0