$ tooldistillator --help
usage: tooldistillator <tool> <options>
Extract information from tool output(s) to JSON and/or aggregate several JSON reports
options:
-h, --help show this help message and exit
-v, --version show program's version number and exit
--logfile LOGFILE Log file location
Tools:
{abricate,amrfinderplus,argnorm,bakta,bandage,bracken,bwa,checkm2,coreprofiler,concoct,coverm,dastool,deeparg,drep,eggnogmapper,fastp,fastqc,filtlong,flye,groot,gtdbtk,integronfinder2,isescan,kraken2,maxbin2,megahit,metabat2,mmseqs2linclust,mmseqs2taxonomy,multiqc,plasmidfinder,polypolish,prodigal,quast,recentrifuge,refseqmasher,shovill,staramr,sylph,sylphtax,tabular_file,tooltest,summarize}
abricate Extract information from abricate's output report i.e., OUTPUT.tsv
amrfinderplus Extract information from amrfinderplus's output report i.e., report.tsv
argnorm Extract information from argnorm's output report i.e., output.tsv
bakta Extract information from bakta's output report i.e., OUTPUT.json
bandage Extract information from bandage's output report i.e., OUTPUT.txt
bracken Extract information from bracken's output report i.e., report.tsv
bwa Extract information from bwa's output report i.e., input.bam
checkm2 Extract information from checkm2's output report i.e., quality_report.tsv
concoct Extract information from concoct's output report i.e., merge_cut_clusters.csv
coreprofiler Extract information from coreprofiler's output report i.e., results.tsv
coverm Extract information from coverm coverage output report i.e., coverage_report.tsv
dastool Extract information from dastool's summary report i.e., summary_bins.tabular
deeparg Extract information from deeparg's output report i.e., report.txt
drep Extract information from drep's output report i.e., quality_and_cluster_summary.csv (Widb file)
eggnogmapper Extract information from eggnogmapper's output report i.e., annotations_report.tsv
fastp Extract information from fastp's output report i.e., report.json
fastqc Extract information from fastqc's output report i.e., report.txt
filtlong Extract information from filtlong's output report i.e., input.fastq
flye Extract information from flye's output report i.e., contig.fasta
groot Extract information from groot's output report i.e., report.tsv
gtdbtk Extract information from gtdbtk's output report i.e., taxonomy_summary.tsv
integronfinder2 Extract information from integronfinder2's output report i.e., OUTPUT.integrons, OUTPUT.summary
isescan Extract information from isescan's output report i.e., OUTPUT.tsv
kraken2 Extract information from kraken2's output report i.e., report.tsv
maxbin2 Extract information from maxbin2's output report i.e., bin_summary.tsv
megahit Extract information from megahit's output assembly i.e., assembly.fasta
metabat2 Extract information from metabat2's output report i.e., cluster_membership.tsv
mmseqs2linclust Extract information from mmseqs2 linclust module output clustering i.e., cluster.fasta
mmseqs2taxonomy Extract information from mmseqs2 taxonomy module output i.e., tax_output.tsv
multiqc Extract information from multiqc's output report i.e., output.html
plasmidfinder Extract information from plasmidfinder's output report i.e., plasmidfinder.tsv
polypolish Extract information from polypolish's output report i.e., contig.fasta
prodigal Extract information from prodigal's output fasta i.e., cds.fasta
quast Extract information from quast's output report i.e., report.tsv
recentrifuge Extract information from recentrifuge's output report i.e., data.tsv
refseqmasher Extract information from refseqmasher's output report i.e., OUTPUT.tsv
shovill Extract information from shovill's output report i.e., contig.fasta
staramr Extract information from staramr's output report i.e., resfinder.tsv
sylph Extract information from sylph's output report i.e., report.tsv
sylphtax Extract information from sylph-tax's output report i.e., merge_report.tsv
tabular_file Extract information from tabular_file's output report i.e., report.tsv
tooltest Extract information from tooltest's output report i.e., unitest
summarize Aggregate several JSON reports generated from a tool output
Tool specification
For each tool, the requirements can be accessed using the --help argument, e.g.
$ tooldistillator abricate --help
usage: tooldistillator.py abricate <options>
Extract information from output(s) of abricate (OUTPUT.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
abricate version for abricate
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for abricate
--hid HID historic ID for abricate file from galaxy for abricate
You can also test the command using the test data available in test/data/dummy folders. For example:
When several reports are passed to a tool at once, only one JSON file is generated for all reports.
When a tool accepts several kinds of files (e.g. shovill takes the contig.fasta, but can also take the alignment BAM and the assembly graph file), multiple-file mode cannot be used.
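The single-report and multi-report invocations can be sketched as follows. This is a minimal illustration in Python, not part of tooldistillator itself: the helper `build_distill_command`, the file names, and the version values are hypothetical, and only flags listed in the abricate help output above are used. The commands are printed rather than executed.

```python
import shlex


def build_distill_command(tool, reports, output, **options):
    """Assemble an argument list for a tooldistillator subcommand.

    Hypothetical helper: it only strings together flags documented in the
    help output and does not run anything.
    """
    if not reports:
        raise ValueError("at least one report path is required")
    argv = ["tooldistillator", tool, *reports, "-o", output]
    for name, value in options.items():
        # Keyword names map directly to the long flags, e.g. hid -> --hid
        argv += [f"--{name}", str(value)]
    return argv


# Placeholder file names and version values, for illustration only.
single = build_distill_command(
    "abricate", ["report.tsv"], "abricate_output.json",
    analysis_software_version="1.0.1", hid="1",
)
multiple = build_distill_command(
    "abricate", ["sample1.tsv", "sample2.tsv"], "abricate_output.json",
)

print(shlex.join(single))
print(shlex.join(multiple))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` once tooldistillator is installed and the input files exist.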
Aggregate JSON reports from different tools
To aggregate JSON reports from different tools in one final JSON file, you can use the summarize subcommand:
$ tooldistillator summarize --help
usage: tooldistillator summarize <options> <list of reports>
Aggregate several reports
positional arguments:
tooldistillator_reports
list of tooldistillator reports
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output file path for summary
-f, --force Overwrite to the output file mandatory
You can try it using test data in test/data/dummy/summarize folder:
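A minimal sketch of such an aggregation, assuming two JSON reports already produced by tooldistillator. The helper `summarize_command` and the report names are placeholders, not the actual contents of the test folder; only the `-o` and `--force` options from the help output above are used, and the command is printed rather than executed.

```python
import shlex


def summarize_command(reports, output, force=False):
    """Assemble the argument list for `tooldistillator summarize`.

    Hypothetical helper following the documented usage:
    tooldistillator summarize <options> <list of reports>
    """
    if not reports:
        raise ValueError("at least one tooldistillator report is required")
    argv = ["tooldistillator", "summarize", "-o", output]
    if force:
        argv.append("--force")
    return argv + list(reports)


# Placeholder report names, for illustration only.
reports = ["abricate_output.json", "bakta_output.json"]
cmd = summarize_command(reports, "summary_output.json", force=True)
print(shlex.join(cmd))
```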
Abricate
$ tooldistillator abricate --help
usage: tooldistillator.py abricate <options>
Extract information from output(s) of abricate (OUTPUT.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
abricate version for abricate
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for abricate
--hid HID historic ID for abricate file from galaxy for abricate
AMRFinderPlus
$ tooldistillator amrfinderplus --help
usage: tooldistillator.py amrfinderplus <options>
Extract information from output(s) of amrfinderplus (report.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
amrfinderplus version number for amrfinderplus
--hid HID historic ID for amrfinderplus file from galaxy for amrfinderplus
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for amrfinderplus
--point_mutation_report_path POINT_MUTATION_REPORT_PATH
point mutation report file for amrfinderplus
--point_mutation_report_hid POINT_MUTATION_REPORT_HID
historic ID for point mutation report file from galaxy for amrfinderplus
--nucleotide_sequence_path NUCLEOTIDE_SEQUENCE_PATH
nucleotide identified sequence fasta file for amrfinderplus
--nucleotide_sequence_hid NUCLEOTIDE_SEQUENCE_HID
historic ID for nucleotide sequence fasta file from galaxy for amrfinderplus
argNorm
$ tooldistillator argnorm --help
usage: tooldistillator.py argnorm <options>
Extract information from output(s) of argnorm (output.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version for argnorm
--hid HID historic ID for argnorm file from galaxy
Bakta
$ tooldistillator bakta --help
usage: tooldistillator.py bakta <options>
Extract information from output(s) of bakta (OUTPUT.json)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID historic ID to bakta result file from galaxy for bakta
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
bakta version for bakta
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for bakta
--annotation_tabular_path ANNOTATION_TABULAR_PATH
annotation file in TSV format for bakta
--annotation_tabular_hid ANNOTATION_TABULAR_HID
historic ID for annotation tsv file from galaxy for bakta
--gff_file_path GFF_FILE_PATH
annotation file result in gff3 format for bakta
--gff_file_hid GFF_FILE_HID
historic ID for gff file from galaxy for bakta
--annotation_genbank_path ANNOTATION_GENBANK_PATH
annotation file in genbank format for bakta
--annotation_genbank_hid ANNOTATION_GENBANK_HID
historic ID for annotation genbank file from galaxy for bakta
--annotation_embl_path ANNOTATION_EMBL_PATH
annotation file in embl format for bakta
--annotation_embl_hid ANNOTATION_EMBL_HID
historic ID for annotation embl file from galaxy for bakta
--contig_sequences_path CONTIG_SEQUENCES_PATH
contig sequences in fasta ([output].fna) for bakta
--contig_sequences_hid CONTIG_SEQUENCES_HID
historic ID for contigs fasta file from galaxy for bakta
--nucleotide_annotation_path NUCLEOTIDE_ANNOTATION_PATH
nucleotide file ([output].ffn) of the annotation for bakta
--nucleotide_annotation_hid NUCLEOTIDE_ANNOTATION_HID
historic ID for nucleotide file from galaxy for bakta
--amino_acid_annotation_path AMINO_ACID_ANNOTATION_PATH
amino acid file of the annotation for bakta
--amino_acid_annotation_hid AMINO_ACID_ANNOTATION_HID
historic ID for amino acid sequence file from galaxy for bakta
--hypothetical_protein_path HYPOTHETICAL_PROTEIN_PATH
hypothetical protein CDS amino acid sequences as fasta for bakta
--hypothetical_protein_hid HYPOTHETICAL_PROTEIN_HID
historic ID for hypothetical protein file from galaxy for bakta
--hypothetical_tabular_path HYPOTHETICAL_TABULAR_PATH
hypothetical protein CDS for bakta
--hypothetical_tabular_hid HYPOTHETICAL_TABULAR_HID
historic ID for hypothetical tabular file from galaxy for bakta
--summary_result_path SUMMARY_RESULT_PATH
summary file of the bakta analysis in txt format for bakta
--summary_result_hid SUMMARY_RESULT_HID
historic ID for summary file from galaxy for bakta
--plot_file_path PLOT_FILE_PATH
genome annotation plot file path for bakta
--plot_file_hid PLOT_FILE_HID
historic ID for plot file from galaxy for bakta
Bakta generates a complete JSON file as output, which is too large and redundant for database integration.
The ToolDistillator Bakta module:
Uses a subset of the information from the Bakta JSON file
Adds a summary of the analysis if the Bakta summary file is provided (optional)
Removes the nucleotide and amino acid sequence information and adds the paths of the sequence files if provided
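The idea of dropping embedded sequences and keeping only a pointer to the sequence file can be sketched as below. This is an illustration under stated assumptions, not the actual ToolDistillator code: the key names in `SEQUENCE_KEYS`, the toy report, and the output field `nucleotide_annotation_path` are chosen for the example and may not match the real Bakta or ToolDistillator JSON layout.

```python
import json

# Assumed names of sequence-carrying keys, for illustration only.
SEQUENCE_KEYS = {"nt", "aa", "sequence"}


def strip_sequences(node):
    """Return a copy of a JSON-like structure without sequence-carrying keys."""
    if isinstance(node, dict):
        return {
            key: strip_sequences(value)
            for key, value in node.items()
            if key not in SEQUENCE_KEYS
        }
    if isinstance(node, list):
        return [strip_sequences(item) for item in node]
    return node


# Toy stand-in for an annotation report; not real Bakta output.
toy_report = {
    "genome": {"genus": "Escherichia"},
    "features": [
        {"type": "cds", "gene": "dnaA", "nt": "ATGTCA", "aa": "MS"},
    ],
}

slim = strip_sequences(toy_report)
# Record where the sequences live instead of embedding them (placeholder path).
slim["nucleotide_annotation_path"] = "annotation.ffn"
print(json.dumps(slim, indent=2))
```

The function builds a new structure rather than editing in place, so the original report stays available if the full sequences are needed later.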
Bandage
$ tooldistillator bandage --help
usage: tooldistillator.py bandage <options>
Extract information from output(s) of bandage (OUTPUT.txt)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
bandage version number for bandage
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for bandage
--hid HID historic ID for bandage file from galaxy for bandage
--bandage_plot_path BANDAGE_PLOT_PATH
bandage plot file for bandage
--bandage_plot_hid BANDAGE_PLOT_HID
historic ID for bandage plot from galaxy for bandage
Bracken
$ tooldistillator bracken --help
usage: tooldistillator.py bracken <options>
Extract information from output(s) of bracken (report.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID to kraken report file from Galaxy for bracken
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
bracken version for bracken
--reference_database_version REFERENCE_DATABASE_VERSION
bracken DB version for bracken
--read_len READ_LEN read length value for bracken
--level LEVEL level to estimate abundance for bracken
--threshold THRESHOLD
number of reads required PRIOR to abundance estimation for bracken
--kraken_report_path KRAKEN_REPORT_PATH
New kraken report estimated from bracken for bracken
--kraken_report_hid KRAKEN_REPORT_HID
Historic ID to kraken results file from Galaxy for bracken
BWA
$ tooldistillator bwa --help
usage: tooldistillator.py bwa <options>
Extract information from output(s) of bwa (input.bam)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID to bwa contigs file from Galaxy for bwa
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
bwa version number for bwa
--reference_database_version REFERENCE_DATABASE_VERSION
bwa reference genome for bwa
--paired_second_file_path PAIRED_SECOND_FILE_PATH
if paired inputs are used for bwa
--paired_second_file_hid PAIRED_SECOND_FILE_HID
Galaxy HID to the paired file for bwa
CheckM2
$ tooldistillator checkm2 --help
usage: tooldistillator.py checkm2 <options>
Extract information from output(s) of checkm2 (quality_report.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
checkm2 version number for checkm2
--hid HID historic ID for checkm2 file from galaxy for checkm2
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for checkm2
--diamond_results_path DIAMOND_RESULTS_PATH
DIAMOND_RESULTS file for checkm2
--diamond_results_hid DIAMOND_RESULTS_HID
historic ID for DIAMOND results file from galaxy for checkm2
--protein_zip_path PROTEIN_ZIP_PATH
protein sequence fasta files in a ZIP file for checkm2
--protein_zip_hid PROTEIN_ZIP_HID
historic ID for protein sequence fasta files in a ZIP file from galaxy for checkm2
--checkm2_log_path CHECKM2_LOG_PATH
Checkm2 log file for checkm2
--checkm2_log_hid CHECKM2_LOG_HID
historic ID for Checkm2 log file from galaxy for checkm2
Concoct
$ tooldistillator concoct --help
usage: tooldistillator.py concoct <options>
Extract information from output(s) of concoct (merge_cut_clusters.csv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
concoct version number for concoct
--hid HID historic ID for concoct file from galaxy
--fasta_bin_zip_folder_path FASTA_BIN_ZIP_FOLDER_PATH
Fasta bin folder for concoct
--fasta_bin_zip_folder_hid FASTA_BIN_ZIP_FOLDER_HID
historic ID for fasta bin folder from galaxy for concoct
--contigs_cut_up_file_path CONTIGS_CUT_UP_FILE_PATH
Contigs cut up fasta file for Concoct
--contigs_cut_up_file_hid CONTIGS_CUT_UP_FILE_HID
historic ID for contigs cut up fasta file from galaxy for concoct
--coordinates_cut_up_file_path COORDINATES_CUT_UP_FILE_PATH
Coordinates contigs cut up bed file for concoct
--coordinates_cut_up_file_hid COORDINATES_CUT_UP_FILE_HID
historic ID for coordinates contigs cut up bed file from galaxy for concoct
--coverage_table_file_path COVERAGE_TABLE_FILE_PATH
Coverage table file from Concoct
--coverage_table_file_hid COVERAGE_TABLE_FILE_HID
historic ID for coverage table file from galaxy for concoct
--log_file_path LOG_FILE_PATH
Log file from Concoct
--log_file_hid LOG_FILE_HID
historic ID for log file from galaxy for concoct
CoreProfiler
$ tooldistillator coreprofiler --help
usage: tooldistillator.py coreprofiler <options>
Extract information from output(s) of coreprofiler (results.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID for coreprofiler file from galaxy for coreprofiler
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
coreprofiler version for coreprofiler
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for coreprofiler
--profiles_json_path PROFILES_JSON_PATH
JSON file containing info about files with temporary alleles for coreprofiler
--profiles_json_hid PROFILES_JSON_HID
Historic ID for profiles JSON file from galaxy for coreprofiler
--alleles_fna_path ALLELES_FNA_PATH
FASTA file with new alleles sequences if detected for coreprofiler
--alleles_fna_hid ALLELES_FNA_HID
Historic ID for new alleles FASTA file from galaxy for coreprofiler
CoverM
$ tooldistillator coverm --help
usage: tooldistillator.py coverm <options>
Extract information from output(s) of coverm (coverage_report.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version number for coverm
--hid HID historic ID for coverm file from galaxy for coverm
--dereplication_cluster_definition_path DEREPLICATION_CLUSTER_DEFINITION_PATH
dereplicated representative and member lines file from coverM
--dereplication_cluster_definition_hid DEREPLICATION_CLUSTER_DEFINITION_HID
Historic ID to dereplicated representative and member lines file from Galaxy
--dereplication_representative_fasta_zip_path DEREPLICATION_REPRESENTATIVE_FASTA_ZIP_PATH
representative genome files in a ZIP file from coverM
--dereplication_representative_fasta_zip_hid DEREPLICATION_REPRESENTATIVE_FASTA_ZIP_HID
Historic ID to representative genome files from Galaxy
DasTool
$ tooldistillator dastool --help
usage: tooldistillator.py dastool <options>
Extract information from output(s) of dastool (summary_bins.tabular)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version number for dastool
--hid HID historic ID for dastool bins summary file from galaxy for dastool
--fasta_bin_zip_folder_path FASTA_BIN_ZIP_FOLDER_PATH
Bin fasta format ZIP folder for DasTool
--fasta_bin_zip_folder_hid FASTA_BIN_ZIP_FOLDER_HID
Historic ID to Bin fasta format ZIP folder from Galaxy
--contig_to_bin_file_path CONTIG_TO_BIN_FILE_PATH
Contig to bin file from DasTool
--contig_to_bin_file_hid CONTIG_TO_BIN_FILE_HID
Historic ID to contig to bin file from Galaxy
--quality_and_completness_file_path QUALITY_AND_COMPLETNESS_FILE_PATH
Quality and completness file from DasTool
--quality_and_completness_file_hid QUALITY_AND_COMPLETNESS_FILE_HID
Historic ID to quality and completness file from Galaxy
--protein_sequences_file_path PROTEIN_SEQUENCES_FILE_PATH
Proteins sequences fasta file from DasTool
--protein_sequences_file_hid PROTEIN_SEQUENCES_FILE_HID
Historic ID to proteins sequences fasta file from Galaxy
--unbinned_sequences_file_path UNBINNED_SEQUENCES_FILE_PATH
Unbinned sequences fasta file from DasTool
--unbinned_sequences_file_hid UNBINNED_SEQUENCES_FILE_HID
Historic ID to unbinned sequences fasta file from Galaxy
--log_file_path LOG_FILE_PATH
Log file from DasTool
--log_file_hid LOG_FILE_HID
Historic ID to log file from Galaxy
DeepARG
$ tooldistillator deeparg --help
usage: tooldistillator.py deeparg <options>
Extract information from output(s) of deeparg (ARG prediction and quantification)
positional arguments:
report Path to report(s)
options:
-h, --help Show this help message and exit
-o OUTPUT, --output OUTPUT Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
DeepARG version
--reference_database_version REFERENCE_DATABASE_VERSION
Database version used for DeepARG
--hid HID Historic ID for DeepARG file from Galaxy
--report_ARG_merged_path REPORT_ARG_MERGED_PATH
DeepARG ARG merged report file
--report_ARG_merged_hid REPORT_ARG_MERGED_HID
Historic ID to ARG merged report file from Galaxy
--report_ARG_merged_quant_subtype_path REPORT_ARG_MERGED_QUANT_SUBTYPE_PATH
DeepARG ARG merged quantitative subtype report file
--report_ARG_merged_quant_subtype_hid REPORT_ARG_MERGED_QUANT_SUBTYPE_HID
Historic ID to ARG merged quantitative subtype report file from Galaxy
--report_ARG_merged_quant_type_path REPORT_ARG_MERGED_QUANT_TYPE_PATH
DeepARG ARG merged quant type report file
--report_ARG_merged_quant_type_hid REPORT_ARG_MERGED_QUANT_TYPE_HID
Historic ID to ARG merged quantitative type report file from Galaxy
--report_potential_ARG_path REPORT_POTENTIAL_ARG_PATH
DeepARG potential ARG report file
--report_potential_ARG_hid REPORT_POTENTIAL_ARG_HID
Historic ID to potential ARG report file from Galaxy
--sequence_clean_file_path SEQUENCE_CLEAN_FILE_PATH
Sequence clean file from DeepARG
--sequence_clean_file_hid SEQUENCE_CLEAN_FILE_HID
Historic ID to sequence clean file from Galaxy
--bam_clean_file_path BAM_CLEAN_FILE_PATH
Binary Alignment clean file from DeepARG
--bam_clean_file_hid BAM_CLEAN_FILE_HID
Historic ID to clean alignment file from Galaxy
--sam_clean_file_path SAM_CLEAN_FILE_PATH
Sequence Alignment clean file from DeepARG
--sam_clean_file_hid SAM_CLEAN_FILE_HID
Historic ID to clean alignment file from Galaxy
--bam_clean_sorted_file_path BAM_CLEAN_SORTED_FILE_PATH
Binary Alignment clean sorted file from DeepARG
--bam_clean_sorted_file_hid BAM_CLEAN_SORTED_FILE_HID
Historic ID to clean sorted alignment file from Galaxy
--daa_clean_align_file_path DAA_CLEAN_ALIGN_FILE_PATH
DAA clean align file from DeepARG
--daa_clean_align_file_hid DAA_CLEAN_ALIGN_FILE_HID
Historic ID to DAA sorted alignment file from Galaxy
dRep
$ tooldistillator drep --help
usage: tooldistillator.py drep <options>
Extract information from output(s) of dRep (dereplication results)
positional arguments:
report Path to report(s)
options:
-h, --help Show this help message and exit
-o OUTPUT, --output OUTPUT Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
dRep version number
--hid HID Historic ID to dRep clusters file from Galaxy
--fasta_dereplicated_bin_zip_folder_path FASTA_DEREPLICATED_BIN_ZIP_FOLDER_PATH
Dereplicated bin fasta format ZIP folder for dRep
--fasta_dereplicated_bin_zip_folder_hid FASTA_DEREPLICATED_BIN_ZIP_FOLDER_HID
Historic ID to dereplicated bin fasta format ZIP folder from Galaxy
--bdb_file_path BDB_FILE_PATH
Bdb file for dRep
--bdb_file_hid BDB_FILE_HID
Historic ID to Bdb file from Galaxy
--cdb_file_path CDB_FILE_PATH
Cdb file for dRep
--cdb_file_hid CDB_FILE_HID
Historic ID to Cdb file from Galaxy
--chdb_file_path CHDB_FILE_PATH
Chdb file for dRep
--chdb_file_hid CHDB_FILE_HID
Historic ID to Chdb file from Galaxy
--mdb_file_path MDB_FILE_PATH
Mdb file for dRep
--mdb_file_hid MDB_FILE_HID
Historic ID to Mdb file from Galaxy
--ndb_file_path NDB_FILE_PATH
Ndb file for dRep
--ndb_file_hid NDB_FILE_HID
Historic ID to Ndb file from Galaxy
--sdb_file_path SDB_FILE_PATH
Sdb file for dRep
--sdb_file_hid SDB_FILE_HID
Historic ID to Sdb file from Galaxy
--wdb_file_path WDB_FILE_PATH
Wdb file for dRep
--wdb_file_hid WDB_FILE_HID
Historic ID to Wdb file from Galaxy
--winning_genomes_pdf_path WINNING_GENOMES_PDF_PATH
Winning genomes pdf file for dRep
--winning_genomes_pdf_hid WINNING_GENOMES_PDF_HID
Historic ID to winning genomes pdf file from Galaxy
--cluster_scoring_pdf_path CLUSTER_SCORING_PDF_PATH
Cluster scoring pdf file for dRep
--cluster_scoring_pdf_hid CLUSTER_SCORING_PDF_HID
Historic ID to cluster scoring pdf file from Galaxy
--clustering_scatterplots_pdf_path CLUSTERING_SCATTERPLOTS_PDF_PATH
Clustering scatterplots pdf file for dRep
--clustering_scatterplots_pdf_hid CLUSTERING_SCATTERPLOTS_PDF_HID
Historic ID to clustering scatterplots pdf file from Galaxy
--primary_clustering_dendrogram_pdf_path PRIMARY_CLUSTERING_DENDROGRAM_PDF_PATH
Primary clustering dendrogram pdf file for dRep
--primary_clustering_dendrogram_pdf_hid PRIMARY_CLUSTERING_DENDROGRAM_PDF_HID
Historic ID to primary clustering dendrogram pdf file from Galaxy
--secondary_clustering_dendrogram_pdf_path SECONDARY_CLUSTERING_DENDROGRAM_PDF_PATH
Secondary clustering dendrogram pdf file for dRep
--secondary_clustering_dendrogram_pdf_hid SECONDARY_CLUSTERING_DENDROGRAM_PDF_HID
Historic ID to secondary clustering dendrogram pdf file from Galaxy
--secondary_clustering_mds_pdf_path SECONDARY_CLUSTERING_MDS_PDF_PATH
Secondary clustering MDS pdf file for dRep
--secondary_clustering_mds_pdf_hid SECONDARY_CLUSTERING_MDS_PDF_HID
Historic ID to secondary clustering MDS pdf file from Galaxy
--log_file_path LOG_FILE_PATH
Log file from dRep
--log_file_hid LOG_FILE_HID
Historic ID to dRep log file from Galaxy
EggNOG-mapper
$ tooldistillator eggnogmapper --help
usage: tooldistillator.py eggnogmapper <options>
Extract information from output(s) of eggnogmapper annotation results files
positional arguments:
report Path to report(s)
options:
-h, --help Show this help message and exit
-o OUTPUT, --output OUTPUT Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
EggNOG-mapper version
--reference_database_version REFERENCE_DATABASE_VERSION
Database version used for EggNOG-mapper
--hid HID Historic ID for EggNOG-mapper file from Galaxy
--seed_orthologs_report_path SEED_ORTHOLOGS_REPORT_PATH
EggNOG-mapper seed orthologs result file
--seed_orthologs_report_hid SEED_ORTHOLOGS_REPORT_HID
Historic ID to seed orthologs result file from Galaxy
--orthologs_report_path ORTHOLOGS_REPORT_PATH
EggNOG-mapper orthologs result file
--orthologs_report_hid ORTHOLOGS_REPORT_HID
Historic ID to orthologs result file from Galaxy
Fastp
$ tooldistillator fastp --help
usage: tooldistillator.py fastp <options>
Extract information from output(s) of fastp (report.json)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
fastp version number for fastp
--hid HID historic ID for fastp file from galaxy for fastp
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for fastp
--trimmed_forward_R1_path TRIMMED_FORWARD_R1_PATH
forward R1 trimmed file for fastp
--trimmed_forward_R1_hid TRIMMED_FORWARD_R1_HID
historic ID for forward reads file from galaxy for fastp
--trimmed_reverse_R2_path TRIMMED_REVERSE_R2_PATH
reverse R2 trimmed file for fastp
--trimmed_reverse_R2_hid TRIMMED_REVERSE_R2_HID
historic ID for reverse reads file from galaxy for fastp
--html_report_path HTML_REPORT_PATH
html fastp report path for fastp
--html_report_hid HTML_REPORT_HID
historic ID for fastp html report for fastp
Fastqc
$ tooldistillator fastqc --help
usage: tooldistillator.py fastqc <options>
Extract information from output(s) of fastqc (report.txt)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
fastqc version number for fastqc
--hid HID historic ID for fastqc file from galaxy for fastqc
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for fastqc
--html_report_path HTML_REPORT_PATH
html fastqc report path for fastqc
--html_report_hid HTML_REPORT_HID
historic ID for fastqc html report for fastqc
Filtlong
$ tooldistillator filtlong --help
usage: tooldistillator.py filtlong <options>
Extract information from output(s) of filtlong (input.fastq)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
filtlong version number for filtlong
--hid HID Historic ID to filtlong contigs file from Galaxy for filtlong
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for filtlong
Flye
$ tooldistillator flye --help
usage: tooldistillator.py flye <options>
Extract information from output(s) of flye (contig.fasta)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID to flye contigs file from Galaxy for flye
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
flye version number for flye
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for flye
--contig_graph_path CONTIG_GRAPH_PATH
Assembly graph file for flye
--contig_graph_hid CONTIG_GRAPH_HID
Historic ID to assembly graph from Galaxy for flye
--tsv_file_path TSV_FILE_PATH
Assembly info file from flye for flye
--tsv_file_hid TSV_FILE_HID
Historic ID to assembly info file from Galaxy for flye
Groot
$ tooldistillator groot --help
usage: tooldistillator.py groot <options>
Extract information from output(s) of groot (report.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version number for groot
--hid HID historic ID for groot file from galaxy
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for groot
--groot_log_path GROOT_LOG_PATH
log file for groot
--groot_log_hid GROOT_LOG_HID
historic ID for Groot log file from galaxy
--bam_file_path BAM_FILE_PATH
binary Alignment file from groot
--bam_file_hid BAM_FILE_HID
historic ID for groot alignment file from Galaxy
GTDB-tk
$ tooldistillator gtdbtk --help
usage: tooldistillator.py gtdbtk <options>
Extract information from output(s) of gtdbtk taxonomy files
positional arguments:
report Path to report(s)
options:
-h, --help Show this help message and exit
-o OUTPUT, --output OUTPUT Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
GTDB-tk version
--reference_database_version REFERENCE_DATABASE_VERSION
Database version used for GTDB-tk
--hid HID Historic ID for GTDB-tk file from Galaxy
--classify_directory_path CLASSIFY_DIRECTORY_PATH
GTDB-tk classify ZIP directory
--classify_directory_hid CLASSIFY_DIRECTORY_HID
Historic ID to classify ZIP directory from Galaxy
--align_directory_path ALIGN_DIRECTORY_PATH
GTDB-tk align ZIP directory
--align_directory_hid ALIGN_DIRECTORY_HID
Historic ID to align ZIP directory from Galaxy
--identify_directory_path IDENTIFY_DIRECTORY_PATH
GTDB-tk identify ZIP directory
--identify_directory_hid IDENTIFY_DIRECTORY_HID
Historic ID to identify ZIP directory from Galaxy
--log_file_path LOG_FILE_PATH
GTDB-tk log file
--log_file_hid LOG_FILE_HID
Historic ID to log file from Galaxy
Integronfinder2
$ tooldistillator integronfinder2 --help
usage: tooldistillator.py integronfinder2 <options>
Extract information from output(s) of integronfinder2 (OUTPUT.integrons, OUTPUT.summary)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID historic ID for integronfinder file from galaxy for integronfinder2
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
integronfinder version for integronfinder2
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for integronfinder2
--summary_file_path SUMMARY_FILE_PATH
integronfinder summary file path for integronfinder2
--summary_file_hid SUMMARY_FILE_HID
historic ID for summary file from galaxy for integronfinder2
ISEScan
$ tooldistillator isescan --help
usage: tooldistillator.py isescan <options>
Extract information from output(s) of isescan (OUTPUT.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID for isescan file from galaxy for isescan
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
isescan version for isescan
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for isescan
--orf_fna_path ORF_FNA_PATH
fasta file with nucleotide orf sequences for isescan
--orf_fna_hid ORF_FNA_HID
Historic ID for orf fasta file from galaxy for isescan
--orf_faa_path ORF_FAA_PATH
fasta file with amino acid orf sequences for isescan
--orf_faa_hid ORF_FAA_HID
Historic ID for orf amino acid file from galaxy for isescan
--is_fna_path IS_FNA_PATH
fasta file with nucleotide IS sequences for isescan
--is_fna_hid IS_FNA_HID
Historic ID for IS file from galaxy for isescan
--summary_path SUMMARY_PATH
summary of isescan analysis for isescan
--summary_hid SUMMARY_HID
Historic ID for summary file from galaxy for isescan
--annotation_path ANNOTATION_PATH
isescan annotation gff3 file for isescan
--annotation_hid ANNOTATION_HID
Historic ID for gff annotation file from galaxy for isescan
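The isescan options above come in pairs: a file path and, on Galaxy, the matching historic ID. As a minimal sketch, assuming the HID options can be left out when running outside Galaxy, the Python snippet below expands such pairs into command-line options. The helper name secondary_file_flags, the file names and the HID values are hypothetical; only the option prefixes are taken from the help output. Other tools do not always follow this naming pattern exactly, so check each tool's --help first.

```python
def secondary_file_flags(files):
    """Turn {prefix: (path, hid)} into a flat list of --<prefix>_path / --<prefix>_hid options."""
    flags = []
    for prefix, (path, hid) in files.items():
        flags += [f"--{prefix}_path", path]
        if hid is not None:  # assumption: the HID is only supplied for Galaxy datasets
            flags += [f"--{prefix}_hid", str(hid)]
    return flags


# Hypothetical paths and HIDs; the prefixes match the isescan options listed above.
extra = secondary_file_flags({
    "orf_fna": ("sample.orf.fna", 12),
    "summary": ("sample.sum", None),
    "annotation": ("sample.gff", 14),
})
cmd = ["tooldistillator", "isescan", "sample.tsv", "-o", "isescan_output.json", *extra]
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run once tooldistillator is installed.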
Kraken2
$ tooldistillator kraken2 --help
usage: tooldistillator.py kraken2 <options>
Extract information from output(s) of kraken2 (report.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID kraken report hid for kraken2
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
kraken2 version for kraken2
--reference_database_version REFERENCE_DATABASE_VERSION
kraken2 DB version for kraken2
--seq_classification_file_path SEQ_CLASSIFICATION_FILE_PATH
file containing the classification of each reads for kraken2
--seq_classification_file_hid SEQ_CLASSIFICATION_FILE_HID
historic ID for read classification file from Galaxy for kraken2
MaxBin2
$ tooldistillator maxbin2 --help
usage: tooldistillator.py maxbin2 <options>
Extract information from output(s) of maxbin2 (bin_summary.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID bin summary file hid for maxbin2
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version for maxbin2
--fasta_bin_zip_folder_path FASTA_BIN_ZIP_FOLDER_PATH
folder containing fasta bins from binning for maxbin2
--fasta_bin_zip_folder_hid FASTA_BIN_ZIP_FOLDER_HID
historic ID for folder containing fasta bins from Galaxy for maxbin2
--bin_predicted_markers_zip_folder_path BIN_PREDICTED_MARKERS_ZIP_FOLDER_PATH
folder containing predicted markers files for maxbin2
--bin_predicted_markers_zip_folder_hid BIN_PREDICTED_MARKERS_ZIP_FOLDER_HID
historic ID for folder containing predicted markers files from Galaxy for maxbin2
--too_short_sequences_file_path TOO_SHORT_SEQUENCES_PATH
too short sequences fasta file for maxbin2
--too_short_sequences_file_hid TOO_SHORT_SEQUENCES_HID
historic ID for too short sequences fasta file from Galaxy for maxbin2
--unclassified_sequences_file_path UNCLASSIFIED_SHORT_SEQUENCES_PATH
unclassified sequences fasta file for maxbin2
--unclassified_sequences_file_HID UNCLASSIFIED_SHORT_SEQUENCES_HID
historic ID for unclassified sequences fasta file from Galaxy for maxbin2
--marker_gene_presence_file_path MARKER_GENE_PRESENCE_FILE_PATH
gene marker presence file for maxbin2
--marker_gene_presence_file_hid MARKER_GENE_PRESENCE_FILE_HID
historic ID for gene marker presence file from Galaxy for maxbin2
--marker_gene_presence_plot_file_path MARKER_GENE_PRESENCE_PLOT_FILE_PATH
gene marker presence plot file for maxbin2
--marker_gene_presence_plot_file_hid MARKER_GENE_PRESENCE_PLOT_FILE_HID
historic ID for gene marker presence plot file from Galaxy for maxbin2
--log_file_path LOG_FILE_PATH
Log file from maxbin2
--log_file_hid LOG_FILE_HID
historic ID to log file from Galaxy for maxbin2
Megahit
$ tooldistillator megahit --help
usage: tooldistillator.py megahit <options>
Extract information from output(s) of megahit (assembly.fasta)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID assembly fasta file hid for megahit
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version for megahit
--intermediate_contig_folder_path INTERMEDIATE_CONTIG_FOLDER_PATH
folder containing intermediate contigs from assembly for megahit
--intermediate_contig_folder_hid INTERMEDIATE_CONTIG_FOLDER_HID
historic ID for folder containing intermediate contigs from assembly from Galaxy for megahit
--log_file_path LOG_FILE_PATH
Log file from megahit
--log_file_hid LOG_FILE_HID
historic ID to log file from Galaxy for megahit
Metabat2
$ tooldistillator metabat2 --help
usage: tooldistillator.py metabat2 <options>
Extract information from output(s) of metabat2 (cluster_membership.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID cluster membership file hid for metabat2
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version for metabat2
--fasta_bin_zip_folder_path FASTA_BIN_ZIP_FOLDER_PATH
folder containing fasta bins from binning for metabat2
--fasta_bin_zip_folder_hid FASTA_BIN_ZIP_FOLDER_HID
historic ID for folder containing fasta bins from Galaxy for metabat2
--too_short_sequences_file_path TOO_SHORT_SEQUENCES_PATH
too short sequences fasta file for metabat2
--too_short_sequences_file_hid TOO_SHORT_SEQUENCES_HID
historic ID for too short sequences fasta file from Galaxy for metabat2
--unbinned_sequences_file_path UNBINNED_SHORT_SEQUENCES_PATH
unbinned sequences fasta file for metabat2
--unbinned_sequences_file_HID UNBINNED_SHORT_SEQUENCES_HID
historic ID for unbinned sequences fasta file from Galaxy for metabat2
--low_depth_sequences_file_path LOW_DEPTH_SHORT_SEQUENCES_PATH
low depth sequences fasta file for metabat2
--low_depth_sequences_file_hid LOW_DEPTH_SHORT_SEQUENCES_HID
historic ID for low depth fasta file from Galaxy for metabat2
MMseqs2linclust
$ tooldistillator mmseqs2linclust --help
usage: tooldistillator.py mmseqs2linclust <options>
Extract information from output(s) of mmseqs2linclust (rep_seqs.fasta : file with representative cluster sequences)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID rep_seq file hid for mmseqs2 linclust
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version for mmseqs2
--cluster_fasta_like_path CLUSTER_FASTA_LIKE_PATH
file containing all the clustered sequences for mmseqs2 linclust
--cluster_fasta_like_hid CLUSTER_FASTA_LIKE_HID
historic ID for file containing all the clustered sequences from Galaxy for mmseqs2 linclust
--tsv_file_path TSV_FILE_PATH
Cluster TSV file from mmseqs2 linclust
--tsv_file_hid TSV_FILE_HID
historic ID to cluster TSV file from Galaxy for mmseqs2 linclust
MMseqs2taxonomy
$ tooldistillator mmseqs2taxonomy --help
usage: tooldistillator.py mmseqs2taxonomy <options>
Extract information from output(s) of mmseqs2taxonomy (tax_output.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID rep_seq file hid for mmseqs2 taxonomy
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version for mmseqs2
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for mmseqs2
--kraken_report_path KRAKEN_REPORT_PATH
file containing kraken taxonomy report for mmseqs2 taxonomy
--kraken_report_hid KRAKEN_REPORT_HID
historic ID for kraken taxonomy report from Galaxy for mmseqs2 taxonomy
--krona_report_path KRONA_REPORT_PATH
file containing krona taxonomy report for mmseqs2 taxonomy
--krona_report_hid KRONA_REPORT_HID
historic ID for krona taxonomy report from Galaxy for mmseqs2 taxonomy
MultiQC
$ tooldistillator multiqc --help
usage: tooldistillator.py multiqc <options>
Extract information from output(s) of multiqc (output.html)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
multiqc version for multiqc
--hid HID historic ID for multiqc file from galaxy for multiqc
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for multiqc
Plasmidfinder
$ tooldistillator plasmidfinder --help
usage: tooldistillator.py plasmidfinder <options>
Extract information from output(s) of plasmidfinder (plasmidfinder.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID for plasmidfinder file from galaxy for plasmidfinder
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
plasmidfinder version for plasmidfinder
--reference_database_version REFERENCE_DATABASE_VERSION
plasmidfinder DB version for plasmidfinder
--plasmid_result_tabular_path PLASMID_RESULT_TABULAR_PATH
plasmidfinder results in tabular format for plasmidfinder
--plasmid_result_tabular_hid PLASMID_RESULT_TABULAR_HID
plasmidfinder results hid in Galaxy for plasmidfinder
--genome_hit_path GENOME_HIT_PATH
fasta file with hits in genome, doesn't work for multiple input for plasmidfinder
--genome_hit_hid GENOME_HIT_HID
Historic ID for genome hit file from galaxy for plasmidfinder
--plasmid_hit_path PLASMID_HIT_PATH
fasta file with plasmid sequences, doesn't work for multiple input for plasmidfinder
--plasmid_hit_hid PLASMID_HIT_HID
Historic ID for plasmid sequence hit file from galaxy for plasmidfinder
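The help text above notes that --genome_hit_path and --plasmid_hit_path do not work with multiple inputs. A wrapper script could guard against that combination before calling the tool. The following Python sketch shows one way to do it; the function name check_plasmidfinder_args and the file names are hypothetical, and whether tooldistillator performs a similar check itself is not stated here.

```python
# Options documented above as not working with multiple input reports.
SINGLE_INPUT_ONLY = ("--genome_hit_path", "--plasmid_hit_path")


def check_plasmidfinder_args(reports, options):
    """Raise ValueError if single-input-only options are combined with several reports."""
    used = [opt for opt in options if opt in SINGLE_INPUT_ONLY]
    if len(reports) > 1 and used:
        raise ValueError(f"{', '.join(used)} cannot be used with {len(reports)} reports")
    return True


# One report plus a hit file: accepted.
ok = check_plasmidfinder_args(["sample1.tsv"], ["--genome_hit_path", "genome_hits.fasta"])

# Two reports plus a hit file: rejected.
try:
    check_plasmidfinder_args(
        ["sample1.tsv", "sample2.tsv"], ["--plasmid_hit_path", "plasmid_hits.fasta"]
    )
    rejected = False
except ValueError:
    rejected = True

print(ok, rejected)
```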
Polypolish
$ tooldistillator polypolish --help
usage: tooldistillator.py polypolish <options>
Extract information from output(s) of polypolish (contig.fasta)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID to polypolish contigs file from Galaxy for polypolish
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
polypolish version number for polypolish
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for polypolish
Prodigal
$ tooldistillator prodigal --help
usage: tooldistillator.py prodigal <options>
Extract information from output(s) of prodigal (output.fnn)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID to prodigal contigs file from Galaxy for prodigal
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
prodigal version number for prodigal
--protein_translation_file_path PROTEIN_TRANSLATION_FILE_PATH
Proteins from all the sequences fasta file for prodigal
--protein_translation_file_hid PROTEIN_TRANSLATION_FILE_HID
Historic ID to proteins fasta file from Galaxy for prodigal
--potential_gene_start_file_path POTENTIAL_GENE_START_FILE_PATH
Potential genes start file for prodigal
--potential_gene_start_file_hid POTENTIAL_GENE_START_FILE_HID
Historic ID to potential genes start file from Galaxy for prodigal
--gbk_genes_coordinate_file_path GBK_GENES_COORDINATE_FILE_PATH
GBK genes coordinate file for prodigal
--gbk_genes_coordinate_file_hid GBK_GENES_COORDINATE_FILE_HID
Historic ID to GBK genes coordinate file from Galaxy for prodigal
--gff_genes_coordinate_file_path GFF_GENES_COORDINATE_FILE_PATH
GFF3 genes coordinate file for prodigal
--gff_genes_coordinate_file_hid GFF_GENES_COORDINATE_FILE_HID
Historic ID to GFF3 genes coordinate file from Galaxy for prodigal
--sco_genes_coordinate_file_path SCO_GENES_COORDINATE_FILE_PATH
SCO genes coordinate file for prodigal
--sco_genes_coordinate_file_hid SCO_GENES_COORDINATE_FILE_HID
Historic ID to SCO genes coordinate file from Galaxy for prodigal
Quast
$ tooldistillator quast --help
usage: tooldistillator.py quast <options>
Extract information from output(s) of quast (report.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID to quast file from Galaxy for quast
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for quast
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
Quast version number for quast
--quast_html_path QUAST_HTML_PATH
Quast html report file for quast
--quast_html_hid QUAST_HTML_HID
Historic ID to quast html file from Galaxy for quast
Recentrifuge
$ tooldistillator recentrifuge --help
usage: tooldistillator.py recentrifuge <options>
Extract information from output(s) of recentrifuge (data.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID historic ID to recentrifuge data file provided by Galaxy for recentrifuge
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
recentrifuge version for recentrifuge
--reference_database_version REFERENCE_DATABASE_VERSION
ncbi taxonomy DB version for recentrifuge
--rcf_stat_path RCF_STAT_PATH
recentrifuge statistic file for recentrifuge
--rcf_stat_hid RCF_STAT_HID
historic ID provided by Galaxy for recentrifuge
--rcf_html_path RCF_HTML_PATH
recentrifuge html report file for recentrifuge
--rcf_html_hid RCF_HTML_HID
historic ID to recentrifuge html report file from Galaxy for recentrifuge
RefseqMasher
$ tooldistillator refseqmasher --help
usage: tooldistillator.py refseqmasher <options>
Extract information from output(s) of refseqmasher (OUTPUT.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID to refseq result from Galaxy for refseqmasher
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for refseqmasher
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
refseqmasher version number for refseqmasher
Shovill
$ tooldistillator shovill --help
usage: tooldistillator.py shovill <options>
Extract information from output(s) of shovill (contig.fasta)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID to shovill contigs file from Galaxy for shovill
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
shovill version number for shovill
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for shovill
--contig_graph_path CONTIG_GRAPH_PATH
Assembly graph file for shovill
--contig_graph_hid CONTIG_GRAPH_HID
Historic ID to assembly graph from Galaxy for shovill
--bam_file_path BAM_FILE_PATH
Binary Alignment file from shovill for shovill
--bam_file_hid BAM_FILE_HID
Historic ID to alignment file from Galaxy for shovill
StarAMR
$ tooldistillator staramr --help
usage: tooldistillator.py staramr <options>
Extract information from output(s) of staramr (resfinder.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID provided by Galaxy for resfinder file for staramr
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
tool version for staramr
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for staramr
--mlst_file_path MLST_FILE_PATH
mlst output file from staramr for staramr
--mlst_file_hid MLST_FILE_HID
Historic ID provided by Galaxy for mlst file for staramr
--plasmidfinder_file_path PLASMIDFINDER_FILE_PATH
plasmid output file from staramr for staramr
--plasmidfinder_file_hid PLASMIDFINDER_FILE_HID
Historic ID provided by Galaxy for plasmid for staramr
--pointfinder_file_path POINTFINDER_FILE_PATH
pointfinder output file from staramr for staramr
--pointfinder_file_hid POINTFINDER_FILE_HID
Historic ID provided by Galaxy for pointfinder for staramr
--setting_file_path SETTING_FILE_PATH
setting file from staramr analysis for staramr
--setting_file_hid SETTING_FILE_HID
Historic ID provided by Galaxy for settings for staramr
Sylph
$ tooldistillator sylph --help
usage: tooldistillator.py sylph <options>
Extract information from output(s) of sylph (report.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version for sylph
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for sylph
--hid HID historic ID for sylph file from galaxy
Sylph-tax
$ tooldistillator sylphtax --help
usage: tooldistillator.py sylphtax <options>
Extract information from output(s) of sylphtax (merge_report.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
version for sylphtax
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for sylphtax
--hid HID historic ID for sylphtax file from galaxy
--taxonomic_profile_folder_path TAXONOMIC_PROFILE_FOLDER_PATH
taxonomic profile folder for sylph-tax
--taxonomic_profile_folder_hid TAXONOMIC_PROFILE_FOLDER_HID
historic ID to taxonomic profile folder from Galaxy
Tabular_file
$ tooldistillator tabular_file --help
usage: tooldistillator.py tabular_file <options>
Extract information from output(s) of tabular_file (report.tsv)
positional arguments:
report Path to report(s)
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output location
--hid HID Historic ID provided by Galaxy for tabular file for tabular_file
--analysis_software_name ANALYSIS_SOFTWARE_NAME
Tool name to the input file for tabular_file
--reference_database_version REFERENCE_DATABASE_VERSION
DB version for tabular_file
--analysis_software_version ANALYSIS_SOFTWARE_VERSION
Software version to the input file for tabular_file
ToolDistillator: a tool to extract and aggregate information from different tool outputs to JSON parsable files
ToolDistillator is a tool that extracts information from the output files of specific tools, exposes it as JSON files, and aggregates it over several tools.
It can produce either a single file for each tool or a summarized file from a set of reports.
It was initially developed for use on Galaxy, and some options are only available there (e.g. extracting the historic ID from a Galaxy analysis).
The tool was inspired by the hAMRonization project (authors: @dfornika, @fmaguire, @raphenya, @jodyphelan, @pvanheus).
Content
Installation
Requirement
Conda installation
Installation from sources
Clone the GitLab repository
Move inside the created folder
Install dependencies
Usage
Command list
Tool specification
For each tool, the requirements can be accessed using the --help argument, e.g. tooldistillator kraken2 --help.
You can also test the command using the test data available in the test/data/dummy folders.
Parse multiple same inputs
It is possible to parse multiple reports from the same tool at once by giving a list of reports as the argument.
This generates a single JSON file for all reports.
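As a minimal sketch of such a multi-report call, the Python snippet below assembles the command line for several reports of the same tool. The helper name build_command and the file names are hypothetical; the positional report arguments and the -o option come from the per-tool help output.

```python
def build_command(tool, reports, output):
    """Return the argv list for one tooldistillator call on several reports."""
    if not reports:
        raise ValueError("at least one report is required")
    # Layout taken from the help output: tooldistillator <tool> report(s) -o OUTPUT
    return ["tooldistillator", tool, *reports, "-o", output]


# Hypothetical file names, used for illustration only.
cmd = build_command(
    "kraken2",
    ["sample1_report.tsv", "sample2_report.tsv"],
    "kraken2_output.json",
)
print(" ".join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run, once tooldistillator is installed and the report files exist.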
When a tool accepts several kinds of files (e.g. shovill takes contig.fasta as its report, but can also take the alignment BAM and assembly graph files), these additional files cannot be combined with multiple reports!
Aggregate JSON reports from different tools
To aggregate JSON reports from different tools into one final JSON file, you can use the summarize subcommand.
You can try it using the test data in the test/data/dummy/summarize folder.
Available tools
A diverse set of tools is available, along with a generic one for tabular files with headers. There is also a command to aggregate JSON outputs.
Abricate
AMRFinderPlus
argNorm
Bakta
The Bakta tool generates a complete JSON file as output, which is too large and redundant for database integration. The ToolDistillator Bakta:
Bandage
Bracken
BWA
CheckM2
Concoct
CoreProfiler
CoverM
DasTool
DeepARG
dRep
EggNOG-mapper
Fastp
Fastqc
Filtlong
Flye
Groot
GTDB-tk
Integronfinder2
ISEScan
Kraken2
MaxBin2
Megahit
Metabat2
MMseqs2linclust
MMseqs2taxonomy
MultiQC
Plasmidfinder
Polypolish
Prodigal
Quast
Recentrifuge
RefseqMasher
Shovill
StarAMR
Sylph
Sylph-tax
Tabular_file
Galaxy
This tool is also available for Galaxy on the ToolShed
Contributing
If you want to contribute, please read our Contributing guidelines
Citation
Please cite the ABRomics project when using the tool
Licence
GNU GENERAL PUBLIC LICENSE V.3
Contact
You can contact the ABRomics team at abromics-support@groupes.france-bioinformatique.fr.