Freddie
Freddie is an annotation-free isoform detection and discovery tool. As input, it takes transcriptomic long reads (e.g. Oxford Nanopore reads) aligned to the reference genome with a splice aligner. Freddie does not assume that the splice aligner (e.g. minimap2) is supplied with an annotation file of known splice isoforms. If you use Freddie in your publications, please cite Freddie’s paper. To access the version of Freddie benchmarked in the paper, check the benchmarking branch.
Running Freddie
Using Snakemake
The whole Freddie pipeline is readily available using Snakemake. You can inspect the pipeline in the Snakefile, and you can add samples and other paths (e.g. Gurobi license, reference genome) to run Freddie on in the config.yaml file. After editing config.yaml, run Snakemake with your specific settings; just make sure to use --use-conda so that all requirements are installed on the fly. Note that the cluster stage uses the Gurobi solver, which requires a license. If your affiliation is academic, you can obtain a license from Gurobi free of charge. Make sure to update the license path in config.yaml to point to the installed license file.
Manually
Installation as a Conda package
Freddie is available as a Conda package. However, since Freddie uses Gurobi, you need to install Gurobi alongside Freddie in order to use the Freddie clustering module:
This will install Gurobi Python dependencies for Freddie and will add four Freddie scripts to your PATH.
Installation using Git
The simplest way to install the dependencies is using Conda:
There are a few scripts/stages in Freddie:
py/freddie_split.py: Partitions the reads into independent sets that can be processed in parallel
py/freddie_segment.py: Computes the canonical segmentation for each read set
py/freddie_cluster.py: Clusters the reads using their canonical segmentation representation
py/freddie_isoforms.py: Generates consensus isoforms of each cluster and outputs them in GTF format
Obtaining Gurobi license
Freddie’s clustering stage (py/freddie_cluster.py) uses Gurobi Solver. To use Freddie, you need to have a Gurobi license installed. Luckily, you can obtain one for free using an academic email address:
Go to https://pages.gurobi.com/registration to register an account. Make sure to select “Academic” to obtain the free license.
Once you have registered and logged in to Gurobi.com with your account, go here to obtain an academic license code. Write this license code down, because Gurobi does not save it for you.
Install the license on your machine using the following commands:
By default, the Gurobi license installer, grbgetkey, installs the license file to /home/$USER/gurobi.lic. If you installed the license somewhere else, you need to tell Freddie where the license is installed. If you are going to use the supplied Snakemake file, edit the following lines in the config.yaml file:
gurobi:
    license: /home/$USER/gurobi.lic
If you are going to use Freddie’s scripts directly, set the following Bash environment variable before running freddie_cluster.py:
export GRB_LICENSE_FILE=/path/to/gurobi.lic
Freddie stages
Freddie takes as input a SAM file of long-read splice alignments to the genome, which can be generated by an aligner such as minimap2 or deSALT. As output, Freddie generates a GTF file of the detected isoforms.
Align
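The alignment step can be run with any splice aligner. As a rough sketch, assuming minimap2 is installed and on PATH, the helper below builds a spliced-alignment command that writes SAM output. The function name, file names, and thread count are placeholders, and the splice preset is one reasonable choice rather than the exact setting used by Freddie's authors.

```python
import shlex


def minimap2_splice_command(reference, reads, out_sam, threads=1):
    """Build (but do not run) a minimap2 spliced-alignment command.

    Hypothetical helper for illustration; it is not part of Freddie.
    """
    return [
        "minimap2",
        "-a",                # emit SAM instead of the default PAF
        "-x", "splice",      # preset for long-read spliced alignment
        "-t", str(threads),  # number of threads
        "-o", str(out_sam),  # write alignments to this file
        str(reference),
        *[str(r) for r in reads],
    ]


# Placeholder file names, used only to show the resulting command line.
cmd = minimap2_splice_command("genome.fa", ["reads.fastq"], "sample.sam", threads=4)
print(shlex.join(cmd))
```

To execute the command, the list can be passed to subprocess.run(cmd, check=True).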
Sort
Before running the split stage, the SAM file needs to be sorted and indexed using SAMtools.
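As a minimal sketch of this step, assuming samtools is installed and on PATH, the helper below builds the two commands that coordinate-sort a SAM file into a BAM file and then index it. The function name and file names are hypothetical placeholders.

```python
import shlex


def samtools_sort_index_commands(sam_path, bam_path, threads=1):
    """Build (but do not run) the samtools sort and index commands.

    Hypothetical helper for illustration; it is not part of Freddie.
    """
    sort_cmd = [
        "samtools", "sort",
        "-@", str(threads),   # additional threads for sorting and compression
        "-o", str(bam_path),  # output file; format follows the .bam extension
        str(sam_path),
    ]
    index_cmd = ["samtools", "index", str(bam_path)]
    return [sort_cmd, index_cmd]


# Placeholder file names, used only to show the resulting command lines.
commands = samtools_sort_index_commands("sample.sam", "sample.sorted.bam", threads=4)
for command in commands:
    print(shlex.join(command))
```

The resulting sorted and indexed BAM file is what the split stage expects through its --bam/-b argument.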
Split (aka partition)
Split takes the following arguments:
--reads/-r: Space-separated paths to reads in FASTQ or FASTA
--bam/-b: BAM file of read alignments from a split/splice long-read mapper that are position sorted and indexed.
--outdir/-o: Output TSV file of split stage. Default: freddie_split/
Segment
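Among the arguments of the segment stage, --variance-factor describes a simple threshold rule: a breakpoint is prefixed when its signal exceeds the average of all signals plus -vf times their standard deviation. The sketch below illustrates only that rule. The function name, the input as a plain list of per-breakpoint signals, the use of the population standard deviation, and the reading of the rule as mean + vf * std are assumptions for illustration, not Freddie's actual implementation.

```python
from statistics import mean, pstdev


def prefixed_breakpoints(signals, variance_factor=3.0):
    """Return indices of breakpoints whose signal exceeds mean + vf * std.

    Hypothetical helper; assumes one numeric signal per candidate breakpoint.
    """
    if not signals:
        return []
    # Threshold as read from the --variance-factor description.
    threshold = mean(signals) + variance_factor * pstdev(signals)
    return [i for i, signal in enumerate(signals) if signal > threshold]


# Nineteen weak candidate breakpoints and one strong outlier at index 19.
example_signals = [1.0] * 19 + [100.0]
print(prefixed_breakpoints(example_signals))
```

With the default factor of 3.0, only signals far above the typical level pass the threshold, so a flat signal profile yields no prefixed breakpoints at all.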
Segment takes the following arguments:
--split-dir/-s: SPLIT output directory of the split stage
--threads/-t: Number of threads. Default: 1
--sigma/-sd: Standard deviation parameter for the Gaussian filter. Default: 5.0
--threshold-rate/-tp: Coverage percentage threshold for segments. Default: 0.90
--variance-factor/-vf: Variance factor for heuristic of prefixing breakpoints. Any breakpoint with signal greater than -vf times the standard deviation plus the average of the signals will be prefixed. Default: 3.0
--max-problem-size/-mps: Maximum allowed problem size after which the problem will be broken down uniformly. Default: 50
--min-read-support-outside: Minimum contrasting coverage support required for a breakpoint. Default: 3
--outdir/-o: Output directory of segment stage. Default: freddie_segment/
Cluster
The cluster stage uses the Gurobi solver, which requires a license. If your affiliation is academic, you can obtain a license from Gurobi free of charge.
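Before launching the cluster stage, it can help to check that a license file is actually discoverable. The sketch below looks only at the two locations this README mentions: the GRB_LICENSE_FILE environment variable and gurobi.lic in the home directory. The function name and its parameters are hypothetical, and Gurobi itself may search additional locations that this check does not cover.

```python
import os
from pathlib import Path


def resolve_gurobi_license(env=None, home=None):
    """Return the license file path if found, otherwise None.

    Hypothetical helper; checks GRB_LICENSE_FILE first, then ~/gurobi.lic.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    explicit = env.get("GRB_LICENSE_FILE")
    if explicit:
        # An explicit setting is treated as authoritative: no fallback.
        path = Path(explicit)
        return path if path.is_file() else None

    default = home / "gurobi.lic"
    return default if default.is_file() else None


license_path = resolve_gurobi_license()
if license_path is None:
    print("No Gurobi license file found; set GRB_LICENSE_FILE first.")
else:
    print(f"Using Gurobi license at {license_path}")
```

Treating an explicitly set GRB_LICENSE_FILE as authoritative is a design choice in this sketch: a typo in the variable is reported as a missing license rather than silently masked by a file in the home directory.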
Cluster takes the following arguments:
--segment-dir/-s: SEGMENT output directory of the segment stage
--gap-offset/-go: Constant +/- slack used in unaligned gap condition. Default: 20
--epsilon/-e: +/- ratio of length as slack used in unaligned gap condition. Default: 0.2
--max-rounds/-mr: Maximum allowed number of rounds per sub-partition of a read set. Default: 30
--min-isoform-size/-is: Minimum read support allowed for an isoform. Default: 3
--timeout/-to: Gurobi timeout per isoform in minutes. Default: 4
--threads/-t: Number of threads. Default: 1
--logs-dir/-l: Directory path where logs will be written. Default: No logging
--outdir/-o: Output directory of cluster stage. Default: freddie_cluster/
Isoforms
Isoforms takes the following arguments:
--split-dir/-s: SPLIT output directory of the split stage
--cluster-dir/-c: CLUSTER output directory of the cluster stage
--output/-o: Output GTF file of isoforms stage. Default: freddie_isoforms.gtf
--threads/-t: Number of threads. Default: 1
Citing Freddie
Freddie’s paper was originally accepted to the proceedings of the RECOMB 2021 conference and is now accepted to Nucleic Acids Research (still in press):
Freddie: Annotation-independent Detection and Discovery of Transcriptomic Alternative Splicing Isoforms.
Baraa Orabi, Brian McConeghy, Ning Xie, Xuesen Dong, Cedric Chauve, Faraz Hach.
Nucleic Acids Research 2022; DOI: 10.1093/nar/gkac1112