As a successor of the packages
BatchJobs and
BatchExperiments,
batchtools provides a parallel implementation of Map for high
performance computing systems managed by schedulers like Slurm, Sun Grid
Engine, OpenLava, TORQUE/OpenPBS, Load Sharing Facility (LSF) or Docker
Swarm (see the setup section in the
vignette).
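To make the "parallel implementation of Map" concrete, here is a minimal sketch of a typical session. It assumes batchtools is installed and no scheduler is configured, so the jobs simply run one after another in the current R session. The function `square` and its inputs are hypothetical toy values chosen for illustration.

```r
library(batchtools)

# Temporary registry (file.dir = NA): job metadata and results are kept in a
# throwaway directory below tempdir().
reg <- makeRegistry(file.dir = NA, seed = 1)

# Hypothetical toy task: square a number.
square <- function(x) x^2

# batchMap() defines one job per element of x, much like Map(square, x = 1:5),
# but nothing is computed until the jobs are submitted.
ids <- batchMap(fun = square, x = 1:5, reg = reg)

# Submit the jobs and block until all of them have terminated.
submitJobs(ids, reg = reg)
waitForJobs(reg = reg)

# Collect the results as a list, ordered by job id.
res <- reduceResultsList(reg = reg)
```

On a configured cluster the same calls would hand the jobs to the scheduler instead of running them locally.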
Main features:
Convenience: All relevant batch system operations (submitting,
listing, killing) are either handled internally or abstracted via
simple R functions
Portability: With a well-defined interface, the source is independent
from the underlying batch system - prototype locally, deploy on any
high performance cluster
Reproducibility: Every computational part has an associated seed
stored in a database which ensures reproducibility even when the
underlying batch system changes
Abstraction: The code layers for algorithms, experiment definitions
and execution are cleanly separated, which lets you write readable and
maintainable code to manage large-scale computer experiments
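The portability point can be sketched as follows: the job definition stays the same, and only the cluster functions attached to the registry change. This is an illustration under assumptions, not a site-ready setup. The toy function is hypothetical, and the commented-out Slurm line assumes that a template named "slurm-simple" is available in your installation.

```r
library(batchtools)

reg <- makeRegistry(file.dir = NA, seed = 1)

# Local prototyping: run jobs sequentially in the current R session.
reg$cluster.functions <- makeClusterFunctionsInteractive()

# Deployment on a Slurm cluster would swap in a different backend, e.g.:
# reg$cluster.functions <- makeClusterFunctionsSlurm(template = "slurm-simple")
# (the template name is an assumption; use the one that fits your site)

# Hypothetical toy task: sum the integers from 1 to n.
batchMap(fun = function(n) sum(seq_len(n)), n = c(10, 100), reg = reg)
submitJobs(reg = reg)
waitForJobs(reg = reg)

totals <- unlist(reduceResultsList(reg = reg))
```

Everything from `batchMap()` onward is unchanged when the backend is switched, which is what allows prototyping locally before moving to a cluster.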
Why batchtools?
The development of BatchJobs and BatchExperiments is discontinued for the following reasons:
Maintainability: The packages
BatchJobs and
BatchExperiments are
tightly connected, which makes maintenance difficult. Changes have to
be synchronized and tested against the current CRAN versions for
compatibility. Furthermore, BatchExperiments violates CRAN policies by
calling internal functions of BatchJobs.
Database issues: Although we invested weeks to mitigate issues with
locks of the SQLite database or file system (staged queries, file
system timeouts, …), BatchJobs kept working unreliably on some
systems with high latency under certain conditions. This made
BatchJobs unusable for many users.
BatchJobs and
BatchExperiments will
remain on CRAN, but new features are unlikely to be ported back. The
vignette
contains a section comparing the packages.
Resources
JOSS Paper: Short paper on
batchtools. Please cite this if you use batchtools.
Paper on
BatchJobs/BatchExperiments: The
described concept still holds for batchtools and most examples work
analogously (see the
vignette
for differences between the packages).
Citation
Please cite the JOSS paper using
the following BibTeX entry:
@article{,
doi = {10.21105/joss.00135},
url = {https://doi.org/10.21105/joss.00135},
year = {2017},
month = {feb},
publisher = {The Open Journal},
volume = {2},
number = {10},
author = {Michel Lang and Bernd Bischl and Dirk Surmann},
title = {batchtools: Tools for R to work on batch systems},
journal = {The Journal of Open Source Software}
}
Related Software
clustermq is a similar
approach which also supports multiple schedulers. It uses the ZeroMQ
network protocol for communication and shines if you have millions of
fast jobs.
batch assists in splitting
and submitting jobs to LSF and MOSIX clusters.
flowr supports LSF, Slurm,
TORQUE and Moab and provides a scatter-gather approach to define
computational jobs.
future.batchtools implements batchtools as a backend for future.
doFuture, together with future.batchtools, connects batchtools to foreach.
drake uses graphs to
define computational jobs. batchtools is used as a backend via
future.batchtools.
Contributing to batchtools
This R package is licensed under the
LGPL-3. If you
encounter problems using this software (lack of documentation,
misleading or wrong documentation, unexpected behaviour, bugs, …) or
just want to suggest features, please open an issue in the issue
tracker. Pull requests
are welcome and will be included at the discretion of the author. If you
have customized a template file for your (larger) computing site, please
share it: fork the repository, place your template in inst/templates
and send a pull request.
batchtools
Package website: release | dev
Installation
Install the stable version from CRAN:
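A minimal sketch of this step, assuming the package is published on CRAN under the name batchtools. The `requireNamespace()` guard is an addition for illustration that skips the download when the package is already installed.

```r
# Install batchtools from CRAN unless it is already available.
if (!requireNamespace("batchtools", quietly = TRUE)) {
  install.packages("batchtools")
}
```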
For the development version, use devtools:
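A sketch of the development install, assuming the sources are hosted on GitHub under `mllg/batchtools`. The repository path is an assumption not stated in this text, so check it against the project page before running.

```r
# install.packages("devtools")  # uncomment if devtools is not installed yet

# Assumed repository location; adjust if the project lives elsewhere.
devtools::install_github("mllg/batchtools")
```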
Next, you need to set up batchtools for your HPC (it will run sequentially otherwise). See the vignette for instructions.