GLnexus
From DNAnexus R&D: scalable gVCF merging and joint variant calling for population sequencing projects. (GL, genotype likelihood)
Reading
Our 2018 manuscript with collaborators at Regeneron Genetics Center and Baylor College of Medicine details the design of GLnexus and scientific validation using up to 240,000 human exomes and 22,600 genomes. Compared to the DNAnexus cloud-native deployment used for such large projects, this open-source version produces identical scientific results but lacks some of the scalability and production-oriented features.
Prebuilt executables
For each tagged revision, the Releases page has a static executable suitable for most Linux x86-64 hosts; just download it and run chmod +x glnexus_cli. Each release also provides a lightweight Docker image wrapping glnexus_cli.
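The download-and-chmod step can be sketched as a pair of small shell functions. This is only an illustration: glnexus_release_url and fetch_glnexus_cli are hypothetical helper names, and the URL assumes GitHub's usual release-asset layout with an asset named glnexus_cli, so check the Releases page for the actual link.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical, not part of GLnexus.

# Print the assumed download URL for a given release tag.
# Assumption: the asset is named glnexus_cli and follows GitHub's
# standard releases/download/<tag>/<asset> layout.
glnexus_release_url() {
    printf 'https://github.com/dnanexus-rnd/GLnexus/releases/download/%s/glnexus_cli\n' "$1"
}

# Download the static executable for a tag into the current directory
# and mark it executable.
fetch_glnexus_cli() {
    tag=$1
    # -f: fail on HTTP errors, -L: follow redirects, -o: output file
    curl -fsSL -o glnexus_cli "$(glnexus_release_url "$tag")" || return 1
    chmod +x glnexus_cli
}
```

With these defined, fetch_glnexus_cli vX.Y.Z followed by ./glnexus_cli should print the usage message, with vX.Y.Z replaced by the desired tagged revision.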
Build & test
The GLnexus build process has a number of dependencies, but produces a standalone, statically-linked executable glnexus_cli. The easiest way to build it is to use our Dockerfile to control all the compile-time dependencies, then simply copy the static executable out of the resulting Docker container and put it anywhere you like.
# Clone repo
git clone https://github.com/dnanexus-rnd/GLnexus.git
cd GLnexus
git checkout vX.Y.Z # optional, check out desired revision
# Build GLnexus in docker
docker build --target builder -t glnexus_tests .
# Run GLnexus unit tests.
docker run --rm glnexus_tests
# Copy the static GLnexus executable to the current working directory.
docker run --rm -v "$(pwd)":/io glnexus_tests cp glnexus_cli /io
# Run it to see its usage message.
./glnexus_cli
To build GLnexus without Docker, make sure you have gcc 5+, CMake 3.2+, and all the dependencies indicated in the Dockerfile.
Then,
git clone https://github.com/dnanexus-rnd/GLnexus.git
cd GLnexus
cmake -Dtest=ON . && make -j$(nproc) && ctest -V
Performance profiling
The Performance wiki page has practical advice for deploying GLnexus on a powerful server.
The code has some hooks for performance profiling using perf and FlameGraph.
To profile performance within the DNAnexus applet, run the applet as usual plus -i perf=true. This produces an output file genotype.stacks containing sampling observation counts for common call stacks. To generate an SVG visualization with FlameGraph:
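A minimal sketch of that last step follows. It assumes flamegraph.pl from the FlameGraph repository is on PATH and that genotype.stacks is in the folded-stack format flamegraph.pl accepts; neither is confirmed here. render_flamegraph is a hypothetical helper name, not part of GLnexus.

```shell
#!/bin/sh
# Sketch only: render_flamegraph is a hypothetical helper.
# Assumptions: flamegraph.pl is on PATH, and the stacks file is in the
# folded-stack format ("frame;frame;frame count") that it reads.
render_flamegraph() {
    stacks=$1
    svg=$2
    # Check the input before redirecting, so no empty SVG is left behind.
    if [ ! -r "$stacks" ]; then
        echo "cannot read stacks file: $stacks" >&2
        return 1
    fi
    if ! command -v flamegraph.pl >/dev/null 2>&1; then
        echo "flamegraph.pl not found on PATH" >&2
        return 127
    fi
    # flamegraph.pl writes the SVG to standard output.
    flamegraph.pl "$stacks" > "$svg"
}
```

For example, render_flamegraph genotype.stacks genotype.svg would write the visualization next to the applet output, and the resulting SVG can be opened in a web browser.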
NEW for 2020: Accurate, scalable cohort variant calls using DeepVariant and GLnexus (by Google Health team) including public bucket with 1000 Genomes Project modern resequencing products.
Getting Started
The Getting Started wiki page has a tutorial for first-time users.
After building without Docker, you will also find ./glnexus_cli in the GLnexus directory.
Coding conventions
Status, defined early in types.h; S(), defined just below Status.
Libraries used