Overview of crisprBowtie
crisprBowtie provides two main functions to align short DNA sequences
to a reference genome using the short read aligner bowtie (Langmead et
al. 2009) and return the alignments as R objects: runBowtie and
runCrisprBowtie. It utilizes the Bioconductor package Rbowtie to
access the Bowtie program in a platform-independent manner. This means
that users do not need to install Bowtie prior to using crisprBowtie.
The latter function (runCrisprBowtie) is specifically designed to map
and annotate CRISPR guide RNA (gRNA) spacer sequences using CRISPR
nuclease objects and CRISPR genomic arithmetics defined in the
Bioconductor package crisprBase. This enables a fast and accurate
on-target and off-target search of gRNA spacer sequences for virtually
any type of CRISPR nuclease. It also provides an off-target search
engine for our main gRNA design package, crisprDesign, which is part of
the crisprVerse ecosystem. See the addSpacerAlignments function in
crisprDesign for more details.
Installation and getting started
Software requirements
OS Requirements
This package is supported on macOS, Linux, and Windows machines. It
was developed and tested on R version 4.2.1.
Installation from Bioconductor
crisprBowtie can be installed from the Bioconductor devel branch
using the following commands in a fresh R session:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version="devel")
BiocManager::install("crisprBowtie")
The complete documentation for the package can be found
here.
Building a bowtie index
To use runBowtie or runCrisprBowtie, users need to first build a
Bowtie genome index. For a given genome, this step has to be done only
once. The Rbowtie package conveniently provides the function
bowtie_build to build a Bowtie index for any custom genome supplied as
a FASTA file.
As an example, we build a Bowtie index for a small portion of the human
chromosome 1 (chr1.fa file provided in the crisprBowtie package) and
save the index file as myIndex to a temporary directory:
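The index-building step described above can be sketched as follows. This is a minimal example, not necessarily the exact code of the original vignette: it assumes that chr1.fa sits in an example/ subfolder of the installed crisprBowtie package (the subfolder name is an assumption; the check below stops early if the file is not found there).

```r
library(Rbowtie)

# Locate the example FASTA file shipped with crisprBowtie.
# Assumption: the file lives under "example/" in the installed package.
fasta <- system.file(package="crisprBowtie", "example/chr1.fa")
stopifnot(nzchar(fasta))

# Build the index in a temporary directory; all index files are
# written with the prefix "myIndex".
tempDir <- tempdir()
Rbowtie::bowtie_build(fasta,
                      outdir=tempDir,
                      force=TRUE,
                      prefix="myIndex")
```

The index is then referred to by its directory and prefix, that is file.path(tempDir, "myIndex"), without any file extension.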
To learn how to create a Bowtie index for a complete genome or
transcriptome, please visit our tutorial
page.
Alignment using runCrisprBowtie
As an example, we align 6 spacer sequences (of length 20bp) to the
custom genome built above, allowing a maximum of 3 mismatches between
the spacer and protospacer sequences.
We specify that the search is for the wild-type Cas9 (SpCas9) nuclease
by providing the CrisprNuclease object SpCas9 available through the
crisprBase package. The argument canonical=FALSE specifies that
non-canonical PAM sequences are also considered (NAG and NGA for
SpCas9). The function getAvailableCrisprNucleases in crisprBase
returns a character vector of available CrisprNuclease objects found
in crisprBase.
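A call matching the description above might look like the sketch below. The six spacer sequences are illustrative placeholders, and the location of chr1.fa inside the package (example/) is an assumption. The index is rebuilt here only so that the block runs on its own.

```r
library(Rbowtie)
library(crisprBowtie)

# Rebuild the small example index so that this block is self-contained.
# Assumption: chr1.fa lives under "example/" in the installed package.
fasta <- system.file(package="crisprBowtie", "example/chr1.fa")
stopifnot(nzchar(fasta))
tempDir <- tempdir()
Rbowtie::bowtie_build(fasta, outdir=tempDir, force=TRUE, prefix="myIndex")

# Load the SpCas9 CrisprNuclease object from crisprBase.
data(SpCas9, package="crisprBase")

# Six illustrative 20-nt spacer sequences (placeholders; any of them
# may turn out to have no alignment in the example genome).
spacers <- c("TCCGCGGGCGACAATGGCAT",
             "TGATCCCGCGCTCCCCGATG",
             "CCGGGAGCCGGGGCTGGACG",
             "CCACCCTCAGGTGTGCGGCC",
             "CGGAGGGCTGCAGAAAGCCT",
             "GGTGATGGCGCGGGCCGGGC")

# Allow up to 3 mismatches and also consider non-canonical PAMs.
alignments <- runCrisprBowtie(spacers,
                              bowtie_index=file.path(tempDir, "myIndex"),
                              crisprNuclease=SpCas9,
                              n_mismatches=3,
                              canonical=FALSE)
alignments
```

Setting canonical=TRUE instead would restrict the search to canonical PAM sequences only.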
Applications beyond CRISPR
The function runBowtie is similar to runCrisprBowtie, but does not
impose constraints on PAM sequences. It can be used to search for any
short read sequence in a genome.
Example using RNAi (siRNA design)
Seed-related off-targets caused by mismatch tolerance outside of the
seed region are a well-studied and well-characterized problem in RNA
interference (RNAi) experiments. runBowtie can be used to map
shRNA/siRNA seed sequences to reference genomes to predict putative
off-targets:
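As a rough sketch of such a seed search, the block below maps two seed sequences against the small example index. The seed sequences and the mismatch setting are illustrative placeholders rather than values from a real experiment, and the location of chr1.fa inside the package (example/) is an assumption.

```r
library(Rbowtie)
library(crisprBowtie)

# Rebuild the small example index so that this block is self-contained.
# Assumption: chr1.fa lives under "example/" in the installed package.
fasta <- system.file(package="crisprBowtie", "example/chr1.fa")
stopifnot(nzchar(fasta))
tempDir <- tempdir()
Rbowtie::bowtie_build(fasta, outdir=tempDir, force=TRUE, prefix="myIndex")

# Two illustrative 13-nt seed sequences (placeholders).
seeds <- c("GTAAGCGGAGTGT", "AACGGGGAGATTG")

# No nuclease and no PAM constraint: plain short-sequence alignment,
# here allowing up to 2 mismatches.
seedAlignments <- runBowtie(seeds,
                            bowtie_index=file.path(tempDir, "myIndex"),
                            n_mismatches=2)
seedAlignments
```

For a realistic off-target prediction, the index would be built from a complete genome or transcriptome rather than from the small example FASTA.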
References
Langmead, Ben, Cole Trapnell, Mihai Pop, and Steven L. Salzberg. 2009.
“Ultrafast and Memory-Efficient Alignment of Short DNA Sequences to the
Human Genome.” Genome Biology 10 (3): R25.
https://doi.org/10.1186/gb-2009-10-3-r25.
crisprBowtie: alignment of gRNA spacer sequences using bowtie
Authors: Jean-Philippe Fortin
Date: July 13, 2022