sccmec - A tool for typing SCCmec cassettes in assemblies
sccmec
sccmec is a tool for typing SCCmec cassettes in assemblies. It was designed to be easy to
use. Unlike its predecessor, staphopia-sccmec, sccmec is much simpler to maintain and
update. This is because of camlhmp, which allows the typing schema to be defined in a
YAML file.
Contributing
If you would like to become a curator for sccmec, please let me know! This could be in the
form of adding new SCCmec types, updating existing ones, or adjusting thresholds. I’m open
to any and all suggestions!
Supported SCCmec Types
The following SCCmec types are supported by sccmec.
Note: sccmec utilizes the API from camlhmp, with the defaults for --yaml-targets,
--yaml-regions, --regions and --targets already set. Please don’t let this confuse you
when you see all the camels!
Usage
Usage: sccmec [OPTIONS]
sccmec - typing SCCmec cassettes in assemblies
╭─ Required Options ──────────────────────────────────────────────────────────────────────────────╮
│ * --input -i TEXT Input file in FASTA format to classify [required] │
│ * --yaml-targets -yt TEXT YAML file documenting the targets and types [required] │
│ * --yaml-regions -yr TEXT YAML file documenting the regions and types [required] │
│ * --targets -t TEXT Query targets in FASTA format [required] │
│ * --regions -r TEXT Query regions in FASTA format [required] │
╰─────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Filtering Options ─────────────────────────────────────────────────────────────────────────────╮
│ --min-targets-pident INTEGER Minimum percent identity of targets to count a hit │
│ [default: 90] │
│ --min-targets-coverage INTEGER Minimum percent coverage of targets to count a hit │
│ [default: 80] │
│ --min-regions-pident INTEGER Minimum percent identity of regions to count a hit │
│ [default: 85] │
│ --min-regions-coverage INTEGER Minimum percent coverage of regions to count a hit │
│ [default: 83] │
╰─────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Additional Options ────────────────────────────────────────────────────────────────────────────╮
│ --prefix -p TEXT Prefix to use for output files [default: sccmec] │
│ --outdir -o PATH Directory to write output [default: ./] │
│ --force Overwrite existing reports │
│ --verbose Increase the verbosity of output │
│ --silent Only critical errors will be printed │
│ --version Print schema and camlhmp version │
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────╯
As mentioned above, sccmec utilizes the camlhmp API. Please note, however, that the
--yaml-targets, --yaml-regions, --regions and --targets options are already set to
the SCCmec defaults. This means you only need to provide the --input option with your
assembly file.
Example Usage
sccmec takes an assembly file as input; both uncompressed and gzip-compressed files are
supported.
If needed, you can adjust the --min-targets-pident, --min-targets-coverage,
--min-regions-pident and/or --min-regions-coverage options to be more or less strict
depending on your needs. Please note, though, that the defaults are set to the recommended values.
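As a rough illustration of the options described above, the following Python sketch assembles an sccmec command line from the documented flags only. The helper name `build_sccmec_command`, the file name `assembly.fasta.gz` and the prefix `sample1` are hypothetical, and the sketch only builds and prints the command; actually running it assumes sccmec is installed and on your PATH.

```python
import shlex

# Filtering options documented in the usage text above
THRESHOLD_OPTIONS = {
    "min_targets_pident",
    "min_targets_coverage",
    "min_regions_pident",
    "min_regions_coverage",
}


def build_sccmec_command(assembly, prefix=None, outdir=None, **thresholds):
    """Return the sccmec argument list for a single assembly (hypothetical helper)."""
    cmd = ["sccmec", "--input", str(assembly)]
    if prefix is not None:
        cmd += ["--prefix", prefix]
    if outdir is not None:
        cmd += ["--outdir", str(outdir)]
    for name, value in thresholds.items():
        if name not in THRESHOLD_OPTIONS:
            raise ValueError(f"unknown threshold option: {name}")
        # e.g. min_regions_coverage=90 becomes: --min-regions-coverage 90
        cmd += ["--" + name.replace("_", "-"), str(int(value))]
    return cmd


# Placeholder file name and prefix; only --input is strictly required
cmd = build_sccmec_command("assembly.fasta.gz", prefix="sample1", min_regions_coverage=90)
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Leaving out the threshold keywords keeps the recommended defaults, since the corresponding flags are then simply not passed.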
Once the tool has completed, you will find five output files in the output directory
(the current directory by default), which are described below.
Output Files
sccmec will generate five output files:
File Name                      Description
{PREFIX}.tsv                   A tab-delimited file with the predicted type
{PREFIX}.targets.blastn.tsv    A tab-delimited file of all target-specific blast hits
{PREFIX}.targets.details.tsv   A tab-delimited file with details for each type based on targets
{PREFIX}.regions.blastn.tsv    A tab-delimited file of all full cassette blast hits
{PREFIX}.regions.details.tsv   A tab-delimited file with details for each type based on full cassettes
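To make the naming pattern concrete, here is a small Python sketch that lists the five expected output paths for a given --prefix and --outdir and reports which ones are missing. The helper names are hypothetical; the suffixes and defaults are taken from the list and usage text above.

```python
from pathlib import Path

# Suffixes of the five output files listed above
OUTPUT_SUFFIXES = (
    ".tsv",
    ".targets.blastn.tsv",
    ".targets.details.tsv",
    ".regions.blastn.tsv",
    ".regions.details.tsv",
)


def expected_outputs(prefix="sccmec", outdir="./"):
    """Paths sccmec is expected to write for the given --prefix and --outdir."""
    return [Path(outdir) / f"{prefix}{suffix}" for suffix in OUTPUT_SUFFIXES]


def missing_outputs(prefix="sccmec", outdir="./"):
    """Expected output files that do not exist (yet)."""
    return [path for path in expected_outputs(prefix, outdir) if not path.is_file()]


for path in expected_outputs(prefix="type-v", outdir="results"):
    print(path.name)
```

A check like `missing_outputs(...)` can be handy in a pipeline to confirm a run finished before the reports are parsed.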
Example {PREFIX}.tsv
sample type subtype mecA targets regions coverage hits target_schema target_schema_version region_schema region_schema_version camlhmp_version params target_comment region_comment comment
type-v V Va + ccrC1,IS431,IS431_1,IS431_2,mecA,mecR1 Va 100.00 12 sccmec_targets 1.2.0 sccmec_regions 1.2.0 1.0.1 min-targets-coverage=80;min-targets-pident=90;min-regions-coverage=83;min-regions-pident=85 Coverage based on 12 hits;There were one or more overlapping hits
Column                  Description
sample                  The sample name as determined by --prefix
type                    The predicted type (based on targets and full cassettes)
subtype                 The predicted subtype (based on full cassettes)
mecA                    The mecA gene status (+ = present; - = absent or not a significant hit)
targets                 The targets for the given type that had a hit
regions                 The regions for the given type that had a hit
coverage                The coverage of the full cassette in the regions column
hits                    The number of hits that made up the full cassette coverage
target_schema           The schema used to determine the type based on targets
target_schema_version   The version of the schema used to determine the type based on targets
region_schema           The schema used to determine the type based on full cassettes
region_schema_version   The version of the schema used to determine the type based on full cassettes
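The columns above can be read with Python's standard csv module. The sketch below is a minimal example, not part of sccmec itself: `read_report` and `parse_params` are hypothetical helper names, and the inline record reuses the values from the example row, with the three comment columns left empty because the flattened example does not show which of them holds the comment text.

```python
import csv
import io

COLUMNS = [
    "sample", "type", "subtype", "mecA", "targets", "regions", "coverage", "hits",
    "target_schema", "target_schema_version", "region_schema", "region_schema_version",
    "camlhmp_version", "params", "target_comment", "region_comment", "comment",
]


def parse_params(params):
    """Turn 'key=value;key=value' into a dict with integer values."""
    pairs = (item.split("=", 1) for item in params.split(";") if item)
    return {key: int(value) for key, value in pairs}


def read_report(handle):
    """Read a {PREFIX}.tsv report into a list of simplified records."""
    results = []
    for row in csv.DictReader(handle, delimiter="\t"):
        results.append({
            "sample": row["sample"],
            "type": row["type"],
            "subtype": row["subtype"],
            "mecA_present": row["mecA"] == "+",
            "targets": row["targets"].split(",") if row["targets"] else [],
            "params": parse_params(row["params"]),
        })
    return results


# Values copied from the example row; comment columns intentionally left empty
values = {
    "sample": "type-v", "type": "V", "subtype": "Va", "mecA": "+",
    "targets": "ccrC1,IS431,IS431_1,IS431_2,mecA,mecR1", "regions": "Va",
    "coverage": "100.00", "hits": "12",
    "target_schema": "sccmec_targets", "target_schema_version": "1.2.0",
    "region_schema": "sccmec_regions", "region_schema_version": "1.2.0",
    "camlhmp_version": "1.0.1",
    "params": "min-targets-coverage=80;min-targets-pident=90;"
              "min-regions-coverage=83;min-regions-pident=85",
}
example = "\t".join(COLUMNS) + "\n" + "\t".join(values.get(c, "") for c in COLUMNS) + "\n"

report = read_report(io.StringIO(example))
print(report[0]["type"], report[0]["subtype"], report[0]["targets"])
```

For a real run, replace the `io.StringIO(example)` handle with `open("sccmec.tsv", newline="")`, or whichever prefix was used.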
Example {PREFIX}.regions.details.tsv
sample type status targets missing coverage hits schema schema_version camlhmp_version params comment
type-v Ia False Ia 17.67 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
type-v Ib False Ib 16.61 2 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 2 hits
type-v IIa False IIa 11.85 11 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 11 hits;There were one or more overlapping hits
type-v IIb False IIb 0.00 0 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83
type-v IIc False IIc 17.39 4 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 4 hits;There were one or more overlapping hits
type-v IId False IId 0.00 0 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83
type-v IIe False IIe 1.54 1 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83
type-v III False III 24.50 18 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 18 hits;There were one or more overlapping hits
type-v IVa False IVa 29.35 13 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 13 hits;There were one or more overlapping hits
type-v IVb False IVb 33.19 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
type-v IVc False IVc 23.56 14 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 14 hits;There were one or more overlapping hits
type-v IVd False IVd 7.78 1 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83
type-v IVg False IVg 30.66 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
type-v IVi False IVi 30.85 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
type-v IVj False IVj 30.58 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
type-v IVk False IVk 16.00 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
type-v IVl False IVl 19.79 13 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 13 hits;There were one or more overlapping hits
type-v IVm False IVm 25.73 14 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 14 hits;There were one or more overlapping hits
type-v IVn False IVn 28.15 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
type-v Va True Va 100.00 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
type-v Vb False Vb 64.55 17 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 17 hits;There were one or more overlapping hits
type-v Vc False Vc 50.14 17 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 17 hits;There were one or more overlapping hits
type-v VI False VI 29.79 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
type-v VII False VII 45.86 15 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 15 hits;There were one or more overlapping hits
type-v VIII False VIII 16.95 9 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 9 hits;There were one or more overlapping hits
type-v IX False IX 15.33 11 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 11 hits;There were one or more overlapping hits
type-v X False X 13.68 16 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 16 hits;There were one or more overlapping hits
type-v XI False XI 0.00 0 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83
type-v XII False XII 19.37 15 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 15 hits;There were one or more overlapping hits
type-v XIII False XIII 28.39 12 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 12 hits;There were one or more overlapping hits
type-v XIV False XIV 14.50 16 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 16 hits;There were one or more overlapping hits
type-v XV False XV 17.21 11 sccmec_regions 1.2.0 1.0.1 min-coverage=85;min-pident=83 Coverage based on 11 hits;There were one or more overlapping hits
This file provides a detailed view of the results. The columns are:
Column            Description
sample            The sample name as determined by --prefix
type              The type being tested
status            The status of the type (True if passed)
targets           The targets for the given type that had a match
missing           The targets for the given type that were not found
coverage          The coverage of the full cassette
hits              The number of hits that made up the full cassette coverage
schema            The schema used to determine the type
schema_version    The version of the schema used to determine the type
camlhmp_version   The version of camlhmp used to determine the type
params            The parameters used to determine the type
comment           A small comment about the result
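A details file can be summarized in a few lines of Python. The sketch below is a hedged example with hypothetical helper names. It assumes that a status of True marks a type that passed, which is what the Va row in the example (100.00 coverage, status True) suggests. The inline rows reuse four rows of that example; the targets and missing columns are left empty because the flattened example does not show which of the two holds the value.

```python
import csv
import io

HEADER = [
    "sample", "type", "status", "targets", "missing", "coverage", "hits",
    "schema", "schema_version", "camlhmp_version", "params", "comment",
]


def load_details(handle):
    """Read a details file, keeping the columns needed for a quick summary."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        rows.append({
            "type": row["type"],
            # Assumption: "True" marks a type that passed the thresholds
            "passed": row["status"] == "True",
            "coverage": float(row["coverage"]),
            "hits": int(row["hits"]),
        })
    return rows


# (type, status, coverage, hits) copied from four rows of the example
sample_rows = [
    ("IIb", "False", "0.00", "0"),
    ("Va", "True", "100.00", "12"),
    ("Vb", "False", "64.55", "17"),
    ("Vc", "False", "50.14", "17"),
]
lines = ["\t".join(HEADER)]
for type_name, status, coverage, hits in sample_rows:
    record = {"sample": "type-v", "type": type_name, "status": status,
              "coverage": coverage, "hits": hits}
    lines.append("\t".join(record.get(column, "") for column in HEADER))
example = "\n".join(lines) + "\n"

rows = load_details(io.StringIO(example))
passed = [r["type"] for r in rows if r["passed"]]
failed = sorted((r for r in rows if not r["passed"]),
                key=lambda r: r["coverage"], reverse=True)
print("passed:", passed)
print("closest miss:", failed[0]["type"], failed[0]["coverage"])
```

Sorting the failed types by coverage gives a quick view of how far the next-best candidates were from the threshold.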
The {PREFIX}.targets.blastn.tsv and {PREFIX}.regions.blastn.tsv files are the standard BLAST output with -outfmt 6.
Installation
You can install sccmec using conda.
Citations
If you use sccmec in your research, please cite the following:
camlhmp
🐪Classification through yAML Heuristic Mapping Protocol 🐪
Petit III RA. camlhmp: Classification through yAML Heuristic Mapping Protocol (GitHub)
BLAST
Basic Local Alignment Search Tool
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009)
Naming
I considered thinking of a fun name for this tool, but sometimes it’s best to get straight to the point! So, here we are with sccmec.
License
I’m not a lawyer and MIT has always been my go-to license. So, MIT it is!
Curators