Coelho, L.P., Alves, R., del Río, Á.R. et al. Towards the biogeography of
prokaryotic genes. Nature 601, 252–256 (2022).
[DOI: 10.1038/s41586-021-04233-4](https://doi.org/10.1038/s41586-021-04233-4)
Command line tool to query the Global Microbial Gene Catalog (GMGC).
Install
GMGC-mapper runs on Python 3.6-3.10 and requires prodigal to be available for
genome mode.
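Before installing, it can help to confirm that both requirements are met. The following is a minimal sketch, not part of GMGC-mapper itself: the helper names `have_tool` and `python_supported` are made up for illustration, and it assumes the interpreter is available as `python3` and that prodigal is installed under the executable name `prodigal`.

```shell
#!/bin/sh
# Succeeds when the named program can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds when a MAJOR.MINOR version string (e.g. "3.8") lies in the
# supported 3.6-3.10 range stated above.
python_supported() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -eq 3 ] && [ "$minor" -ge 6 ] && [ "$minor" -le 10 ]
}

if have_tool python3; then
    version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if python_supported "$version"; then
        echo "Python $version: supported"
    else
        echo "Python $version: outside the 3.6-3.10 range"
    fi
else
    echo "python3 not found on PATH"
fi

# prodigal is only needed for genome mode.
if have_tool prodigal; then
    echo "prodigal: found"
else
    echo "prodigal: not found (needed for genome mode)"
fi
```

If prodigal is reported as missing, gene-based inputs can still be mapped; only genome mode depends on it.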
Conda install
The easiest way to install GMGC-mapper is through bioconda, which will ensure
all dependencies (including prodigal) are installed automatically:
conda install -c bioconda gmgc-mapper
pip install
Alternatively, GMGC-mapper is available from PyPI, so it can be installed
through pip:
pip install GMGC-mapper
Note that this does not install prodigal (which is necessary for the
genome-based workflow).
Install from source
Finally, especially if you are retrieving the cutting-edge version from GitHub,
you can install from the source tree with the standard Python installation procedure.
GMGC-mapper
CITATION
If you use results from this tool, please cite the paper listed at the top of this document (Coelho et al., 2022).
Examples
The nucleotide input (--nt-genes) is optional, but it should be used if available so that the quality of the hits can be refined.
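As an illustration of gene-based input, here is a hedged sketch of the two variants, using the options listed under Parameters. It assumes the tool is installed under the executable name `gmgc-mapper`; the file names `genes.faa` and `genes.fna` and the output directory `output` are placeholders, not files shipped with the tool.

```shell
# Hypothetical input and output names; substitute your own.
aa_genes=genes.faa
nt_genes=genes.fna
outdir=output

# The leading `echo` only prints each command; remove it to run the tool
# (assumed here to be installed as `gmgc-mapper`).

# Protein sequences only:
echo gmgc-mapper --aa-genes "$aa_genes" -o "$outdir"

# Protein and nucleotide sequences together, so hit quality can be refined:
echo gmgc-mapper --nt-genes "$nt_genes" --aa-genes "$aa_genes" -o "$outdir"
```

Printing the commands first makes it easy to confirm the paths before starting a run.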
If your input is a metagenome, you can use NGLess for assembly and gene prediction. For more details, read the docs.
Output
The output folder will contain the results of the run. For more details, read the docs. A description of the outputs is also written to the output folder for convenience.
Parameters
-i/--input: path to the input genome file (FASTA, possibly .gz/.bz2/.xz compressed).
-o/--output: output directory (will be created if non-existent).
--nt-genes: path to the input DNA gene file (FASTA, possibly .gz/.bz2/.xz compressed).
--aa-genes: path to the input protein gene file (FASTA, possibly .gz/.bz2/.xz compressed).
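Since compressed FASTA input is accepted, a genome can be passed in gzipped form. The sketch below is only an illustration: it assumes the executable is named `gmgc-mapper`, and the scratch directory, the tiny placeholder sequence and the `gmgc_out` directory name are all made up for the example.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# A tiny placeholder FASTA file standing in for a real genome.
printf '>contig1\nACGTACGTACGT\n' > "$workdir/genome.fasta"

# Compress it; -c writes to stdout, so the uncompressed file is kept.
gzip -c "$workdir/genome.fasta" > "$workdir/genome.fasta.gz"

# Genome mode with the long option names. The leading `echo` only prints
# the command; remove it to run the tool (assumed to be `gmgc-mapper`).
echo gmgc-mapper --input "$workdir/genome.fasta.gz" --output "$workdir/gmgc_out"
```

The output directory does not need to exist beforehand, since it is created if missing.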