
future.apply: Apply Function to Elements in Parallel using Futures

Introduction

The purpose of this package is to provide worry-free parallel alternatives to base-R “apply” functions, e.g. apply(), lapply(), and vapply(). The goal is that one should be able to replace any of these in one's code with its futurized equivalent and things will just work. For example, instead of doing:

library(datasets)
library(stats)
y <- lapply(mtcars, FUN = mean, trim = 0.10)

one can do:

library(future.apply)
plan(multisession) ## Run in parallel on local computer

library(datasets)
library(stats)
y <- future_lapply(mtcars, FUN = mean, trim = 0.10)
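
As a quick sanity check, the following sketch runs both versions side by side and compares the results. The variable names y_seq and y_par and the choice of two workers are illustrative assumptions, not part of the package.

```r
library(future.apply)
plan(multisession, workers = 2)  # two local background R sessions (assumed count)

# Base-R version and its futurized equivalent, with identical arguments
y_seq <- lapply(mtcars, FUN = mean, trim = 0.10)
y_par <- future_lapply(mtcars, FUN = mean, trim = 0.10)

# Same names and same values; only where the work runs has changed
print(all.equal(y_seq, y_par))

plan(sequential)  # shut down the background workers
```

Resetting the plan to sequential at the end is optional, but it releases the background R sessions once they are no longer needed.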

Reproducibility is part of the core design, which means that perfect, parallel random number generation (RNG) is supported regardless of the amount of chunking, type of load balancing, and future backend being used. To enable parallel RNG, use argument future.seed = TRUE.
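
The sketch below illustrates what this reproducibility means in practice. It assumes a fixed integer seed (42L) instead of future.seed = TRUE so that separate calls can be compared; the draw() helper and the chunk size are hypothetical choices made for illustration.

```r
library(future.apply)
plan(multisession, workers = 2)

# Hypothetical helper: one random draw per element
draw <- function(i) rnorm(1)

# Same seed, different chunking: default chunks vs. one element per future
y1 <- future_lapply(1:6, FUN = draw, future.seed = 42L)
y2 <- future_lapply(1:6, FUN = draw, future.seed = 42L, future.chunk.size = 1L)

# Same seed, different backend
plan(sequential)
y3 <- future_lapply(1:6, FUN = draw, future.seed = 42L)

# All three runs should produce the same random numbers
print(identical(y1, y2) && identical(y1, y3))
```

With future.seed = TRUE, the random streams are derived from the current RNG state instead, so calling set.seed() beforehand is the usual way to make such a run repeatable.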

Role

Where does the future.apply package fit in the software stack? You can think of it as a sibling to foreach, furrr, BiocParallel, plyr, etc. Just as parallel provides parLapply(), foreach provides foreach(), BiocParallel provides bplapply(), and plyr provides llply(), future.apply provides future_lapply(). Below is a table summarizing this idea:

| Package | Functions | Backends |
|---|---|---|
| future.apply | Future versions of the common go-to *apply() functions available in base R (in the base and stats packages): future_apply(), future_by(), future_eapply(), future_Filter(), future_lapply(), future_kernapply(), future_Map(), future_mapply(), future_.mapply(), future_replicate(), future_sapply(), future_tapply(), and future_vapply(). The following function is not implemented: future_rapply() | All future backends |
| parallel | mclapply(), mcmapply(), clusterMap(), parApply(), parLapply(), parSapply(), ... | Built-in and conditional on operating system |
| foreach | foreach(), times() | All future backends via doFuture |
| furrr | future_imap(), future_map(), future_pmap(), future_map2(), ... | All future backends |
| BiocParallel | Bioconductor's parallel mappers: bpaggregate(), bpiterate(), bplapply(), and bpvec() | All future backends via doFuture (because it supports foreach) or via BiocParallel.FutureParam (direct BiocParallelParam support; prototype) |
| plyr | **ply(..., .parallel = TRUE) functions: aaply(), ddply(), dlply(), llply(), ... | All future backends via doFuture (because it uses foreach internally) |
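
To make the sibling relationship concrete, the sketch below runs the same toy computation through parLapply() from the parallel package and through future_lapply(). The square() helper, the input vector, and the two-worker setup are illustrative assumptions.

```r
library(parallel)
library(future.apply)

x <- 1:4
square <- function(i) i^2  # hypothetical toy function

# parallel: the cluster is created, passed around, and stopped explicitly
cl <- makeCluster(2)
y_parallel <- parLapply(cl, x, square)
stopCluster(cl)

# future.apply: the backend is chosen once via plan(), the call mirrors lapply()
plan(multisession, workers = 2)
y_future <- future_lapply(x, FUN = square)
plan(sequential)

print(identical(y_parallel, y_future))
```

The main design difference visible here is that the future.apply call itself carries no reference to a cluster object; where the work runs is decided separately by plan().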

Note that, except for the built-in parallel package, none of these higher-level APIs implement their own parallel backends, but they rather enhance existing ones. The foreach framework leverages backends such as doParallel, doMC and doFuture, and the future.apply framework leverages the future ecosystem and therefore backends such as built-in parallel, future.callr, and future.batchtools.
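
Because the backend comes from the future ecosystem, the same future.apply code can be pointed at a different backend by changing only the plan() call. The following is a minimal sketch of that idea; the run() wrapper and the two backends chosen are assumptions made for illustration.

```r
library(future.apply)

# Hypothetical wrapper: the computation itself never mentions a backend
run <- function() future_sapply(1:4, FUN = function(x) sqrt(x))

plan(sequential)                  # run in the current R session
y_sequential <- run()

plan(multisession, workers = 2)   # run in two background R sessions
y_multisession <- run()

plan(sequential)                  # release the background workers

# Other backends plug in the same way, e.g. plan(future.callr::callr),
# assuming the future.callr package is installed (not run here).
print(identical(y_sequential, y_multisession))
```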

Separating future_lapply() and friends from the future package helps clarify the purpose of the future package, which is to define and provide the core Future API that higher-level parallel APIs can build on and into which any futurized parallel backend can be plugged.

The API and identity of the future.apply package will be kept close to the *apply() functions in base R. In other words, it will neither keep growing nor be expanded with new, more powerful apply-like functions beyond those core ones in base R. Such extended functionality should be part of a separate package.
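
As an illustration of how closely the API follows base R, the sketch below calls a few base functions and their future_ counterparts with the same arguments. The input list x and the two-worker plan are assumptions chosen for the example.

```r
library(future.apply)
plan(multisession, workers = 2)

x <- list(a = 1:3, b = 4:6)

# sapply() vs. future_sapply(): same arguments, same simplification
s_base   <- sapply(x, FUN = sum)
s_future <- future_sapply(x, FUN = sum)

# vapply() vs. future_vapply(): FUN.VALUE is specified the same way
v_base   <- vapply(x, FUN = mean, FUN.VALUE = numeric(1))
v_future <- future_vapply(x, FUN = mean, FUN.VALUE = numeric(1))

# mapply() vs. future_mapply(): multiple inputs iterated in parallel
m_base   <- mapply(function(a, b) a + b, 1:3, 4:6)
m_future <- future_mapply(function(a, b) a + b, 1:3, 4:6)

plan(sequential)

print(identical(s_base, s_future) &&
      identical(v_base, v_future) &&
      identical(m_base, m_future))
```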

Installation

R package future.apply is available on CRAN and can be installed in R as:

install.packages("future.apply")

Pre-release version

To install the pre-release version that is available in Git branch develop on GitHub, use:

remotes::install_github("futureverse/future.apply", ref="develop")

This will install the package from source.

Contributing

To contribute to this package, please see CONTRIBUTING.md.
