mice computes synteny blocks from genomes expressed as sequences of genomic elements.
These elements can come from a genome graph (e.g., unitigs of a compacted de Bruijn graph), or from any other segmentation such as k-mers, genes, or MUMs/MEMs.
The input of mice is a GFF file in which each feature has an ID attribute (1-based index) specifying the element used in the path spelling the genome or chromosome.
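To make the input format concrete, here is a minimal Python sketch that reads such a GFF and recovers, for each sequence, the ordered list of element IDs that spells it. It is not mice's own parser. It assumes that column 1 (seqid) names the genome or chromosome and that features are ordered by start coordinate; strand is ignored for brevity. The toy records, the `unitig` feature type, and the helper names `parse_attributes` and `read_paths` are hypothetical.

```python
from collections import defaultdict


def parse_attributes(field):
    """Parse a GFF attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def read_paths(lines):
    """Return {seqid: [element IDs ordered by start coordinate]}.

    Assumption: seqid (column 1) identifies the genome or chromosome.
    """
    features = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a complete 9-column GFF record
        seqid, start = cols[0], int(cols[3])
        element = int(parse_attributes(cols[8])["ID"])  # 1-based element index
        features[seqid].append((start, element))
    return {
        seqid: [element for _, element in sorted(feats, key=lambda f: f[0])]
        for seqid, feats in features.items()
    }


# Hypothetical toy input: two sequences that share elements 1 and 2.
toy_gff = [
    "##gff-version 3",
    "genomeA\t.\tunitig\t1\t100\t.\t+\t.\tID=1",
    "genomeA\t.\tunitig\t101\t250\t.\t+\t.\tID=2",
    "genomeA\t.\tunitig\t251\t300\t.\t+\t.\tID=3",
    "genomeB\t.\tunitig\t1\t150\t.\t+\t.\tID=2",
    "genomeB\t.\tunitig\t151\t250\t.\t-\t.\tID=1",
]

paths = read_paths(toy_gff)
print(paths)
```

In this toy input, genomeA is spelled by elements 1, 2, 3 and genomeB by elements 2, 1.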
Installation
mice is written in Rust, so you only need cargo to install it from the source directory:
cargo install --path .
Alternatively, mice is available on bioconda (use conda or mamba):
mamba install -c bioconda mice
Quick start
We provide five E. coli genomes as an example dataset.
Use the provided graph
A precomputed example/graph.gff.gz is included. Uncompress it (for example: gunzip -c example/graph.gff.gz > graph.gff) and go directly to running mice.
(Optional) Build the pangenome graph yourself
Install ggcat:
Build a compacted de Bruijn graph:
Convert the graph to GFF:
Run mice
Usage
<GRAPH_INPUT>: input graph file (GFF, or GFA with paths representing genomes)
Options
-o, --out-dir <DIR>: Output directory (default: mice_output)
-r, --remove-dup <X>: Remove an element if it occurs at least X times in any genome (0 = disable, default: 0)
-m, --min-size <bp>: After the first compression, drop unmerged elements shorter than <bp> base pairs, then recompress (default: 0)
-s, --no-group-by: Treat every path as its own genome
-h, --help, -V, --version
Output
In <OUT_DIR>, mice writes:
output.gff: block annotations (GFF)
paths.txt: genomes rewritten as sequences of synteny blocks
partitions.txt: for each synteny block, the elements it contains
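The -r, --remove-dup rule listed under Options can be illustrated with a short Python sketch. This is one reading of the option's description, not mice's actual implementation: it assumes that an element reaching the threshold in any single genome is removed from every genome, and that genomes are plain lists of element IDs. The function name `remove_duplicated` and the toy paths are hypothetical.

```python
from collections import Counter


def remove_duplicated(paths, threshold):
    """Drop elements that occur at least `threshold` times in any one genome.

    Assumption: a banned element is removed from all genomes.
    threshold == 0 disables the filter, mirroring the documented default.
    """
    if threshold == 0:
        return {genome: list(path) for genome, path in paths.items()}
    banned = set()
    for path in paths.values():
        counts = Counter(path)
        banned.update(e for e, n in counts.items() if n >= threshold)
    return {
        genome: [e for e in path if e not in banned]
        for genome, path in paths.items()
    }


# Hypothetical toy paths: element 2 occurs twice in genomeA.
toy_paths = {"genomeA": [1, 2, 3, 2, 4], "genomeB": [1, 2, 4]}

filtered = remove_duplicated(toy_paths, 2)
unfiltered = remove_duplicated(toy_paths, 0)
print(filtered)
```

With a threshold of 2, element 2 is dropped from both genomes under this reading; with 0, the paths are returned unchanged.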