git clone https://github.com/tseemann/snp-dists.git
cd snp-dists
make
# run tests
make check
bats test/test.sh # if you have BATS installed
# install into $HOME/.local/bin
make install
Options
snp-dists -h (help)
USAGE
snp-dists [opts] aligned.fasta[.gz] > matrix.tsv
OPTIONS
-h Show this help
-v Print version and exit
-j CPUS Threads to use [1]
-q Quiet mode; no progress messages
-a Count all differences not just [AGTC]
-k Keep case, don't uppercase all letters
-m Output MOLTEN instead of TSV
-L Output lower-triangle only (unique pairs)
-c Use comma instead of tab in output
-b Blank top left corner cell
-t Add column headers when using molten format
-x INT Stop counting distance beyond this [99999]
URL
https://github.com/tseemann/snp-dists
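The -x cutoff listed in the help text above lends itself to a small illustration. The following is a minimal Python sketch, not the tool's C implementation: the function name `snp_distance` and the parameter `max_dist` are hypothetical, and the exact boundary behaviour (stopping as soon as the running count reaches the limit) is an assumption. It counts only sites where both sequences carry a differing A, G, T or C, which mirrors the documented default.

```python
def snp_distance(a: str, b: str, max_dist: int = 99999) -> int:
    """Count differing A/G/T/C sites between two aligned sequences.

    Counting stops once the running total reaches max_dist
    (assumed boundary; the real tool may differ by one).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    bases = set("AGTC")
    count = 0
    # Uppercase first, as the tool does by default.
    for x, y in zip(a.upper(), b.upper()):
        # Sites with a gap or ambiguous letter on either side are skipped.
        if x != y and x in bases and y in bases:
            count += 1
            if count >= max_dist:
                break  # short-circuit: no need to scan the rest
    return count
```

For example, `snp_distance("ACGTACGT", "ACGAACGA")` counts both differing sites, while passing `max_dist=1` stops after the first one. With a low cutoff the loop leaves very divergent pairs early, which is where the time saving on large alignments would come from.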
snp-dists
Convert a FASTA alignment to SNP distance matrix
Quick Start
Installation
snp-dists is written in C to the C99 standard and only depends on zlib.
Bioconda
Containers
Docker images are available on dockerhub and quay.io. These are maintained by the StaPH-B workgroup. Dockerfiles can be found here.
Source
Options
snp-dists -h (help)
snp-dists -v (version)
Prints the name and version separated by a space in standard Unix fashion.
snp-dists -q (quiet mode)
Don’t print informational messages, only errors.
snp-dists -c (CSV instead of TSV)
snp-dists -b (omit the toolname/version)
snp-dists -L (lower-triangle only)
Advanced options
By default, all letters are (1) uppercased and (2) ignored if not A,G,T or C.
snp-dists -a (don’t just count AGTC)
Normally one would not want to count ambiguous letters and gaps as a “difference”, but you can enable this option if you want them counted.
snp-dists -k (don’t uppercase any letters)
You may wish to preserve case, for example so that lower-case characters are masked in the comparison.
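The interaction between the default behaviour and the -a and -k switches can be sketched as follows. This is an illustrative Python model under stated assumptions, not the tool's C code: the helper names `normalise` and `count_differences` and the keyword arguments `count_all` and `keep_case` are hypothetical, and representing an ignored site as `None` is a choice made for this sketch.

```python
def normalise(seq, count_all=False, keep_case=False):
    """Prepare one aligned sequence for comparison.

    count_all models -a, keep_case models -k (hypothetical names).
    Ignored sites are represented as None.
    """
    if not keep_case:
        seq = seq.upper()          # default step (1): uppercase everything
    if count_all:
        return list(seq)           # -a: every character takes part
    # default step (2): anything other than A, G, T, C is ignored
    return [c if c in "AGTC" else None for c in seq]


def count_differences(a, b):
    """Count sites that differ and are not ignored in either sequence."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )
```

With the defaults, a gap in `"AC-T"` compared with `"ACGT"` contributes nothing. With `count_all=True` it counts as one difference. With `keep_case=True`, a lower-case `c` is no longer in the A/G/T/C set and is therefore masked, which is the masking effect described above.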
snp-dists -m (“molten” output format)
snp-dists -m -t (“molten” output format with column headers; requires -m “molten” format enabled)
snp-dists -x INT (stop counting after INT SNPs)
Once a distance between two samples becomes very large there is often not much point keeping on counting. The -x option allows you to “short-circuit” the counting. This can reduce computation time significantly on large alignments if you only care about small distances.
Issues
Report bugs and give suggestions on the Issues page
Related software
Licence
GPL Version 3
Authors