Running viralFlye
To run, activate the conda environment first, then invoke the pipeline:
conda activate viralFlye
./viralFlye.py
viralFlye takes an existing metaFlye assembly directory as input. Use metaFlye v2.9+ with the --meta option.
You will also need the original reads that were used for the assembly.
Additional options are shown when the viralFlye script is run without parameters.
Output
viralFlye output can be found in the outdir directory (by default, the same as the input directory).
It consists of three FASTA files, linears_viralFlye.fasta, components_viralFlye.fasta and circulars_viralFlye.fasta,
and a txt file that lists all erroneously circularized components.
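As a quick sanity check on a finished run, the three FASTA files listed above can be summarized with a few lines of Python. This is only a minimal sketch using the standard library: the helper names read_fasta_lengths and summarize_outdir are hypothetical and not part of viralFlye, and the only things taken from this README are the three file names.

```python
from pathlib import Path

# File names as listed in this README; everything else here is illustrative.
OUTPUT_FILES = (
    "linears_viralFlye.fasta",
    "components_viralFlye.fasta",
    "circulars_viralFlye.fasta",
)


def read_fasta_lengths(path):
    """Return {record_id: sequence_length} for a plain-text FASTA file."""
    lengths = {}
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the ID.
                parts = line[1:].split()
                current = parts[0] if parts else f"record_{len(lengths) + 1}"
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


def summarize_outdir(outdir):
    """Map each FASTA file found in outdir to (record_count, total_bases).

    Files that are absent are skipped rather than treated as an error.
    """
    summary = {}
    for name in OUTPUT_FILES:
        path = Path(outdir) / name
        if path.exists():
            lengths = read_fasta_lengths(path)
            summary[name] = (len(lengths), sum(lengths.values()))
    return summary
```

Calling summarize_outdir on the output directory of a run returns one entry per FASTA file that exists, which makes it easy to see at a glance how many circular, linear and component sequences were recovered.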
Prediction of hosts within the sample is performed by a separate script, crispr_host_match.py.
It takes a metaFlye result as input, extracts circular and linear isolated contigs, predicts viruses and CRISPR spacers, and matches them using BLAST.
The result (BLAST output format 6) can be found in the blast.out file in the output folder.
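BLAST output format 6 is a tab-separated table, so blast.out can be loaded without extra dependencies. The sketch below assumes the default 12-column layout of that format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); whether crispr_host_match.py keeps the default columns, and which side of the match is the query, is not stated here. The names parse_blast6 and best_hit_per_query are hypothetical helpers, not part of viralFlye.

```python
from collections import namedtuple

# Default column order of BLAST tabular output (format 6).
# Assumption: blast.out uses this default layout.
BLAST6_COLUMNS = (
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
)
Hit = namedtuple("Hit", BLAST6_COLUMNS)


def parse_blast6(path):
    """Parse a BLAST format 6 file into a list of Hit records."""
    hits = []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment lines
            fields = line.split("\t")
            if len(fields) < len(BLAST6_COLUMNS):
                raise ValueError(f"expected 12 columns, got {len(fields)}: {line!r}")
            hits.append(Hit(
                fields[0], fields[1], float(fields[2]),
                int(fields[3]), int(fields[4]), int(fields[5]),
                int(fields[6]), int(fields[7]), int(fields[8]), int(fields[9]),
                float(fields[10]), float(fields[11]),
            ))
    return hits


def best_hit_per_query(hits):
    """Keep the highest-bitscore hit for each query sequence ID."""
    best = {}
    for hit in hits:
        if hit.qseqid not in best or hit.bitscore > best[hit.qseqid].bitscore:
            best[hit.qseqid] = hit
    return best
```

Reducing the table to one best hit per query is just one convenient way to read the matches; filtering on pident or evalue instead would work on the same Hit records.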
Dependencies
viralFlye package depends on the following software
viralFlye
viralFlye is a pipeline to recover high-quality viral genomes from long-read metagenomic sequencing.
Installation
You need the conda package manager for the installation process. To install viralFlye locally, run:
This will create a conda environment viralFlye, which contains all dependencies.
Alternatively, viralFlye can be installed through bioconda, https://bioconda.github.io/recipes/viralflye/README.html
You also need the Pfam HMM database for viral genome identification. If you don’t have it yet, download using:
License
viralFlye is distributed under a BSD license. See the LICENSE file for details.
Credits
Code contributors: