Snakemake executor plugin: LSF
LSF is a common high-performance computing batch system.
Specifying Project and Queue
LSF clusters can have mandatory resource indicators for accounting and scheduling, [Project]{.title-ref} and [Queue]{.title-ref}, respectively. These resources are usually omitted from Snakemake workflows in order to keep the workflow definition independent of the platform. However, it is also possible to specify them inside the workflow as resources in the rule definition (see `snakefiles-resources`{.interpreted-text role="ref"}).
To specify them at the command line, define them as default resources:
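As a minimal sketch, such an invocation can be assembled and printed from Python. The resource names `lsf_project` and `lsf_queue` are assumed here, and `my_project` and `normal` are placeholder values, not real cluster settings.

```python
import shlex


def default_resources_command(project: str, queue: str) -> list[str]:
    """Return a Snakemake argv that sets LSF project and queue as default resources."""
    return [
        "snakemake",
        "--executor", "lsf",
        "--default-resources",
        f"lsf_project={project}",  # assumed resource name for the LSF project
        f"lsf_queue={queue}",      # assumed resource name for the LSF queue
    ]


# Placeholder values; substitute the project and queue of your cluster.
command = default_resources_command("my_project", "normal")
print(shlex.join(command))
```

The printed line can be pasted into a shell on the cluster's submit host.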
If individual rules require e.g. a different queue, you can override the default per rule:
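A per-rule override could be sketched as follows, using Snakemake's `--set-resources` option with its `<rule>:<resource>=<value>` syntax. The rule name `heavy_rule` and the queue names `normal` and `long` are hypothetical placeholders.

```python
import shlex


def per_rule_queue_override(rule: str, queue: str) -> list[str]:
    """Return the extra Snakemake arguments that set a queue for one rule only."""
    return ["--set-resources", f"{rule}:lsf_queue={queue}"]


# Default queue for all rules, then a different queue for a single rule.
base = ["snakemake", "--executor", "lsf", "--default-resources", "lsf_queue=normal"]
command = base + per_rule_queue_override("heavy_rule", "long")
print(shlex.join(command))
```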
Usually, it is advisable to persist such settings via a configuration profile, which can be provided system-wide, per user, and in addition per workflow.
This is an example of the relevant profile settings:
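The following sketch writes such a profile as a `config.yaml` file. The keys mirror the command-line options (`--executor`, `--jobs`, `--default-resources`); the list layout under `default-resources`, the job limit of 100, and the project and queue values are assumptions for illustration.

```python
from pathlib import Path
import tempfile

# Assumed profile content; adjust the placeholder values to your cluster.
PROFILE_YAML = """\
executor: lsf
jobs: 100
default-resources:
  - "lsf_project=my_project"
  - "lsf_queue=normal"
"""


def write_profile(profile_dir: Path) -> Path:
    """Write the profile text to <profile_dir>/config.yaml and return the file path."""
    profile_dir.mkdir(parents=True, exist_ok=True)
    config_path = profile_dir / "config.yaml"
    config_path.write_text(PROFILE_YAML)
    return config_path


# Demonstration in a temporary directory; a real profile would live in a
# permanent location.
with tempfile.TemporaryDirectory() as tmp:
    path = write_profile(Path(tmp) / "lsf")
    print(path.read_text())
```

The resulting directory can then be passed to Snakemake with `--profile`.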
Ordinary SMP jobs
Most jobs will be carried out by programs which are either single-core scripts or threaded programs, hence SMP (shared memory programs) in nature. Any given `threads` and `mem_mb` requirements will be passed to LSF. For example, a rule with `threads: 8` and `mem_mb=14000` gives its jobs 14 GB of memory and 8 CPU cores. It is advisable to use reasonable default resources, such that you don't need to specify them for every rule. Snakemake already has reasonable defaults built in, which are automatically activated when using any non-local executor (hence also with `lsf`). Use `mem_mb_per_cpu` to request memory per CPU, which is the standard LSF style.
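An SMP rule of the kind described above might look like the following Snakefile fragment (Snakemake rule syntax, not standalone Python). The rule name, file paths, and shell command are hypothetical; `mem_mb=14000` is assumed as the value corresponding to 14 GB.

```python
rule a:
    input:
        "data/input.txt"
    output:
        "results/sorted.txt"
    # Eight CPU cores for the threaded program.
    threads: 8
    resources:
        # About 14 GB of memory, given in megabytes.
        mem_mb=14000
    shell:
        "sort --parallel={threads} {input} > {output}"
```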
MPI jobs
Snakemake's LSF backend also supports MPI jobs; see `snakefiles-mpi`{.interpreted-text role="ref"} for details.
Advanced Resource Specifications
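A workflow rule may support a number of resource specifications. For an LSF cluster, a mapping between Snakemake and LSF needs to be performed.
You can use the following specifications:

| LSF | Snakemake | Notes |
|---|---|---|
| `-q` | `lsf_queue` | |
| `-W` | `walltime` | |
| `-R "rusage[mem=<memory_amount>]"` | `mem`, `mem_mb` | `mem`: string with unit; `mem_mb`: number of megabytes |
| `-R "rusage[mem=<memory_amount>]"` | `mem_mb_per_cpu` | |
| `-R span[hosts=1]` | `mpi` | `span[hosts=1]` keeps all slots on one host; affected by `mpi` |
| `-R span[ptile=<ptile>]` | `ptile` | used together with `mpi` |
| other `bsub` arguments | `lsf_extra` | string passed to `bsub` |

Each of these can be part of a rule, e.g.: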
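A rule combining several of these resources could be sketched as the following Snakefile fragment (Snakemake rule syntax, not standalone Python). The rule name, file names, queue name, and values are hypothetical, and the `walltime` value is assumed to be in minutes.

```python
rule summarize:
    input:
        "results/sorted.txt"
    output:
        "results/summary.txt"
    resources:
        # Queue to submit to (placeholder name).
        lsf_queue="normal",
        # Run time limit; assumed to be interpreted as minutes.
        walltime=120,
        # Memory request in megabytes.
        mem_mb=8000
    shell:
        "uniq -c {input} > {output}"
```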
`walltime` and `runtime` are synonyms.
Please note: as `--mem` and `--mem-per-cpu` are mutually exclusive, their corresponding resource flags `mem`/`mem_mb` and `mem_mb_per_cpu` are mutually exclusive, too. You can reserve either the memory a compute node has to provide or the memory required per CPU, but not both (LSF does not make any distinction between real CPU cores and those provided by hyperthreads). The executor will convert the provided options based on the cluster configuration.
Additional custom job configuration
There are various `bsub` options not directly supported via the resource definitions shown above. You may use the `lsf_extra` resource to specify additional flags to `bsub`. Again, prefer a profile for specifying such resources.
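As an illustration, a rule passing an extra flag through `lsf_extra` might look like this Snakefile fragment (Snakemake rule syntax, not standalone Python). The rule name, file names, and the `train_model` command are hypothetical, and the `-gpu num=1` request is only an example of a `bsub` option; check which options your cluster accepts.

```python
rule gpu_task:
    input:
        "data/model_input.txt"
    output:
        "results/model_output.txt"
    resources:
        # Passed on to bsub as additional flags (example value).
        lsf_extra="-gpu num=1"
    shell:
        "train_model {input} > {output}"
```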
Clusters that use per-job memory requests instead of per-core
By default, this plugin converts the specified memory request into the per-core request expected by most LSF clusters. So `threads: 4` and `mem_mb=128` will result in `-R rusage[mem=32]`. If the request should be per-job on your cluster (i.e. `-R rusage[mem=<mem_mb>]`), then set the environment variable `SNAKEMAKE_LSF_MEMFMT` to `perjob`.
The executor automatically detects the request unit from the cluster configuration, so if your cluster does not use MB, you do not need to do anything.
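The per-core and per-job behavior described in this section can be illustrated with a small Python sketch. This is not the plugin's implementation: the function name is hypothetical, rounding up is an assumption, and unit conversion is left out.

```python
import math
import os


def rusage_mem_flag(mem_mb: int, threads: int, env=os.environ) -> str:
    """Return the rusage flag for a per-core (default) or per-job memory request."""
    if env.get("SNAKEMAKE_LSF_MEMFMT") == "perjob":
        # The cluster expects the total memory of the job.
        mem = mem_mb
    else:
        # Default: divide the total request by the number of cores
        # (rounding up is an assumption of this sketch).
        mem = math.ceil(mem_mb / threads)
    return f"-R rusage[mem={mem}]"


print(rusage_mem_flag(128, 4, env={}))
print(rusage_mem_flag(128, 4, env={"SNAKEMAKE_LSF_MEMFMT": "perjob"}))
```

With the values from the text, the first call yields the per-core request of 32 and the second the per-job request of 128.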