🐪 camlhmp 🐪 - Classification through yAML Heuristic Mapping Protocol
camlhmp is a tool for generating organism typing tools from YAML schemas. Through discussions
with Tim Read, we identified a need for a straightforward method to define and manage typing
schemas for organisms of interest. YAML was chosen for its simplicity and readability.
Purpose
The primary purpose of camlhmp is to provide a framework that enables researchers to
independently define typing schemas for their organisms of interest using YAML. This
approach facilitates the management and analysis of biological data for researchers at any
level of experience.
camlhmp does not supply pre-defined typing schemas. Instead, it equips researchers
with the necessary tools to create and maintain their own schemas, ensuring these schemas
can easily remain up to date with the latest scientific developments.
Finally, the development of camlhmp was driven by a practical need to streamline the
maintenance of multiple organism typing tools. Managing these tools separately is
time-consuming and challenging. camlhmp simplifies this by providing a single
framework shared by every tool.
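To make the idea of a schema-driven typing tool concrete, here is a minimal sketch in Python. It is not camlhmp's implementation: the schema layout, the key names (`types`, `targets`), the gene names, and the `call_types` helper are all hypothetical, and the schema is written as a Python dict only so the example is self-contained. In practice the same structure would live in a YAML file.

```python
# Hypothetical schema, written as a dict for self-containment. The key names
# and gene names are illustrative assumptions, not camlhmp's schema format.
SCHEMA = {
    "name": "example typing",
    "types": [
        {"name": "Type I", "targets": ["geneA", "geneB"]},
        {"name": "Type II", "targets": ["geneA", "geneC"]},
    ],
}


def call_types(schema, detected_targets):
    """Return the names of all types whose required targets were all detected."""
    detected = set(detected_targets)
    return [
        entry["name"]
        for entry in schema["types"]
        if set(entry["targets"]) <= detected
    ]


if __name__ == "__main__":
    # geneA and geneC together satisfy only the "Type II" definition
    print(call_types(SCHEMA, ["geneA", "geneC"]))
```

The point of the sketch is that the typing logic stays generic while everything organism-specific sits in the schema, so updating a typing scheme means editing data rather than code.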
Quick Start
To quickly get started with camlhmp, you can install it through Bioconda and run the
command-line interface:
# Install camlhmp through Bioconda
conda create -n camlhmp -c conda-forge -c bioconda camlhmp
conda activate camlhmp
camlhmp --help
# Example usage of camlhmp-blast-alleles
# Acquire test data
wget https://raw.githubusercontent.com/rpetit3/camlhmp/refs/heads/main/tests/data/blast/alleles/spn-pbptype.yaml
wget https://raw.githubusercontent.com/rpetit3/camlhmp/refs/heads/main/tests/data/blast/alleles/spn-pbptype.fasta
wget https://github.com/rpetit3/camlhmp/raw/refs/heads/main/tests/data/blast/alleles/SRR2912551.fna.gz
# Run camlhmp-blast-alleles
camlhmp-blast-alleles \
--yaml spn-pbptype.yaml \
--targets spn-pbptype.fasta \
--input SRR2912551.fna.gz
Running camlhmp-blast-alleless with following parameters:
--input SRR2912551.fna.gz
--yaml spn-pbptype.yaml
--targets spn-pbptype.fasta
--outdir ./
--prefix camlhmp
--min-pident 95
--min-coverage 95
Starting camlhmp for S. pneumoniae PBP typing...
Running tblastn...
Processing hits...
Final Results...
S. pneumoniae PBP typing
┏━━━┳━━━┳━━━┳━━━┳━━━┳━━━┳━━━┳━━━┳━━━┳━━━━┳━━━┳━━━━┳━━━┳━━━━┳━━━┳━━━━┳━━━┳━━━━┳━━━┳━━━━┓
┃ … ┃ … ┃ … ┃ … ┃ … ┃ … ┃ … ┃ … ┃ … ┃ 1… ┃ … ┃ 2… ┃ … ┃ 2… ┃ … ┃ 2… ┃ … ┃ 2… ┃ … ┃ 2… ┃
┡━━━╇━━━╇━━━╇━━━╇━━━╇━━━╇━━━╇━━━╇━━━╇━━━━╇━━━╇━━━━╇━━━╇━━━━╇━━━╇━━━━╇━━━╇━━━━╇━━━╇━━━━┩
│ … │ … │ … │ … │ … │ … │ … │ … │ … │ │ 0 │ 1… │ … │ 5… │ │ 2 │ … │ 1… │ … │ │
└───┴───┴───┴───┴───┴───┴───┴───┴───┴────┴───┴────┴───┴────┴───┴────┴───┴────┴───┴────┘
Writing outputs...
Final predicted type written to ./camlhmp.tsv
tblastn results written to ./camlhmp.tblastn.tsv
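The run above writes its results as tab-separated files. As a rough sketch of how they could be inspected afterwards, the Python snippet below reads a TSV with a header row and prints each row as column/value pairs. It makes no assumption about which columns camlhmp writes; the helper names `parse_tsv`, `read_tsv`, and `show_rows` are made up for this example.

```python
import csv
from pathlib import Path


def parse_tsv(lines):
    """Parse tab-delimited lines (first line is the header) into a list of dicts."""
    return list(csv.DictReader(lines, delimiter="\t"))


def read_tsv(path):
    """Read a tab-delimited file from disk."""
    with open(path, newline="") as handle:
        return parse_tsv(handle)


def show_rows(rows):
    """Print every row as indented column/value pairs."""
    for number, row in enumerate(rows, start=1):
        print(f"Row {number}")
        for column, value in row.items():
            print(f"  {column}: {value}")


if __name__ == "__main__":
    # File name taken from the example run above (default --outdir and --prefix)
    results = Path("camlhmp.tsv")
    if results.exists():
        show_rows(read_tsv(results))
    else:
        print(f"{results} not found; run camlhmp-blast-alleles first.")
```

Printing one column per line avoids the truncation seen in the terminal table when a result has many columns.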
For more example commands and outputs, see the documentation for each command at
https://rpetit3.github.io/camlhmp/.
Installation
camlhmp is available through PyPI and Bioconda. While you can install it through PyPI, it is
recommended to install it through Bioconda so that non-Python dependencies are also installed.
System Requirements
camlhmp has been developed and tested on x86-64 Linux and macOS systems.
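| OS      | Architecture | Supported?                |
|---------|--------------|---------------------------|
| Linux   | x86-64       | ✅                        |
| Linux   | aarch64      | ❌ (missing dependencies) |
| macOS   | x86-64       | ✅                        |
| macOS   | arm64        | ❌ (missing dependencies) |
| Windows | x86-64       | ❌ (consider using WSL2)  |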
[!TIP]
Docker containers are available from biocontainers/camlhmp
which can be used with the --platform flag to run on Apple Silicon and ARM-based Linux systems.
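If you are unsure which row of the support table applies to your machine, the following Python sketch looks it up using the standard `platform` module. The table contents come from the section above; the `is_supported` helper, the alias mapping, and the choice to treat unlisted combinations as unsupported are assumptions made for this example.

```python
import platform

# Support table from the System Requirements section, keyed by
# (platform.system() value, normalized architecture name).
SUPPORTED = {
    ("Linux", "x86-64"): True,
    ("Linux", "aarch64"): False,
    ("Darwin", "x86-64"): True,   # platform.system() reports macOS as "Darwin"
    ("Darwin", "arm64"): False,
    ("Windows", "x86-64"): False,
}

# Common spellings returned by platform.machine(), mapped to the table's names
ARCH_ALIASES = {"x86_64": "x86-64", "amd64": "x86-64"}


def is_supported(system, machine):
    """Return True only if the OS/architecture pair is marked supported above."""
    arch = ARCH_ALIASES.get(machine.lower(), machine.lower())
    # Combinations missing from the table are treated as unsupported
    return SUPPORTED.get((system, arch), False)


if __name__ == "__main__":
    system, machine = platform.system(), platform.machine()
    status = "listed as supported" if is_supported(system, machine) else "not listed as supported"
    print(f"{system} {machine}: {status}")
```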
Bioconda Installation
conda create -n camlhmp -c conda-forge -c bioconda camlhmp
conda activate camlhmp
camlhmp
🐪 camlhmp 🐪 - Classification through YAML Heuristic Mapping Protocol
Available camlhmp commands
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ command               ┃ description                                                          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ camlhmp-blast-alleles │ Classify assemblies using BLAST against alleles of a set of genes    │
│ camlhmp-blast-regions │ Classify assemblies using BLAST against larger genomic regions       │
│ camlhmp-blast-targets │ Classify assemblies using BLAST against individual genes or proteins │
│ camlhmp-extract       │ Extract typing targets from a set of reference sequences             │
└───────────────────────┴──────────────────────────────────────────────────────────────────────┘
PyPI Installation
To install camlhmp through PyPI, you can use pip:
pip install camlhmp
camlhmp
🐪 camlhmp 🐪 - Classification through YAML Heuristic Mapping Protocol
Available camlhmp commands
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ command               ┃ description                                                          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ camlhmp-blast-alleles │ Classify assemblies using BLAST against alleles of a set of genes    │
│ camlhmp-blast-regions │ Classify assemblies using BLAST against larger genomic regions       │
│ camlhmp-blast-targets │ Classify assemblies using BLAST against individual genes or proteins │
│ camlhmp-extract       │ Extract typing targets from a set of reference sequences             │
└───────────────────────┴──────────────────────────────────────────────────────────────────────┘
[!WARNING]
Installing through PyPI will not install non-Python dependencies. You will need to install
these manually.
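After a pip-based install, one quick way to check whether external tools are reachable is to look for them on your PATH. The sketch below uses Python's `shutil.which` for that. The tool list is an assumption: `tblastn` appears in the example run above and `blastn` is another BLAST+ program, but the exact set camlhmp needs is not stated here, so adjust the list as required. The `missing_tools` helper is made up for this example.

```python
import shutil

# Assumption: BLAST+ programs to look for. Edit this list to match the
# dependencies your camlhmp commands actually need.
BLAST_TOOLS = ["blastn", "tblastn"]


def missing_tools(tools):
    """Return the tools from the given list that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(BLAST_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```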
Naming
If I’m being honest, I really wanted to name a tool with “camel” in it because they are my
wife’s favorite animal 🐪 and they also remind me of my friends in Oman!
Once it was decided YAML was going to be the format for defining schemas, I quickly stumbled
on “Classification through YAML” and soon found out I wasn’t the only one who had thought
of “CAML”. But, no matter, it was decided it would be something with “CAML”, then Tim Read
came in with the save and suggested “Heuristic Mapping Protocol”. So, here we are - camlhmp!
License
I’m not a lawyer and MIT has always been my go-to license. So, MIT it is!
Artificial Intelligence Disclaimer
As of v1.1.3, camlhmp has been developed with minimal assistance of Artificial
Intelligence (AI). GitHub Copilot was used for auto-completion, but otherwise all
code was written and reviewed by the author.
Full documentation for camlhmp can be found at https://rpetit3.github.io/camlhmp/.
Citing camlhmp
If you make use of camlhmp in your analysis, please cite the following:
camlhmp
Petit III RA, Read TD. camlhmp: Classification through yAML Heuristic Mapping Protocol. (GitHub)
BLAST+
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009)
Funding
Support for this project came (in part) from the Wyoming Public Health Division, and the Center for Applied Pathogen Epidemiology and Outbreak Control (CAPE).