Cactus uses many different algorithms and individual code contributions, principally from Joel Armstrong, Glenn Hickey, Mark Diekhans and Benedict Paten. We are particularly grateful to:
Yung H. Tsin and Nima Norouzi for contributing their 3-edge connected components program code, which is crucial in constructing the cactus graph structure; see: Tsin, Y.H., “A simple 3-edge-connected component algorithm,” Theory of Computing Systems, vol. 40, no. 2, 2007, pp. 125-142.
Bob Harris for providing endless support for his LastZ pairwise, blast-like genome alignment tool.
Melissa Jane Hubiz and Adam Siepel for halPhyloP and Phast.
B Gulhan, R Burhans, R Harris, M Kandemir, M Haeussler, A Nekrutenko for KegAlign, the GPU-accelerated version of LastZ.
The instructions below are meant primarily for developers. Everyone else should try to use the precompiled binaries (Linux X86) or Docker image from the latest release instead.
The top-level cactus interface (cactus, cactus-pangenome, cactus-hal2maf, etc.) is always a Python package that is pip installed into a Python virtualenv. This package runs several other tools as subprocesses, which are compiled into binaries.
Cactus contains many submodules, so it is necessary to clone with --recursive or to run git submodule update --init --recursive inside the cactus directory after cloning.
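As a minimal sketch of the two options above, the helpers below wrap a fresh recursive clone and a repair of an existing clone. The repository URL is an assumption (it is not stated in this section), and the function names are hypothetical.

```shell
# Assumed upstream URL (not stated in this section); override it to use a fork.
CACTUS_REPO_URL="${CACTUS_REPO_URL:-https://github.com/ComparativeGenomicsToolkit/cactus.git}"

# Fresh clone that fetches all submodules in one step.
clone_cactus() {
    git clone --recursive "$CACTUS_REPO_URL" cactus
}

# Repair a clone that was made without --recursive.
# The optional first argument is the clone directory (defaults to ./cactus).
init_cactus_submodules() {
    (cd "${1:-cactus}" && git submodule update --init --recursive)
}
```

Call clone_cactus for a new checkout, or init_cactus_submodules path/to/cactus if the submodule directories turned out to be empty.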
If you have Docker installed, you can now run Cactus. All binaries, such as lastz and cactus-consolidated, will be run via Docker using the latest release. Singularity binaries can be used in place of Docker binaries with the --binariesMode singularity flag. Note that you must use Singularity 2.3 - 2.6 or Singularity 3.1.0+. Singularity 3 versions below 3.1.0 are incompatible with Cactus (see issue #55 and issue #60).
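The version constraint above can be checked before a run. The following is only a sketch: singularity_version_ok is a hypothetical helper, it expects a bare version string such as 3.8.7 (extracting that from the output of singularity --version is left to the caller), and it assumes that major versions above 3 also count as "3.1.0+".

```shell
# Returns 0 if the version falls in a range listed above as supported
# (2.3 - 2.6, or 3.1.0 and later), and 1 otherwise.
singularity_version_ok() {
    sv_version=$1
    sv_major=${sv_version%%.*}      # text before the first dot
    sv_rest=${sv_version#*.}        # text after the first dot
    sv_minor=${sv_rest%%.*}         # second component
    case "$sv_major.$sv_minor" in
        *[!0-9.]*|.*|*.) return 1 ;;  # reject anything that is not numeric major.minor
    esac
    if [ "$sv_major" -eq 2 ]; then
        [ "$sv_minor" -ge 3 ] && [ "$sv_minor" -le 6 ]
    elif [ "$sv_major" -eq 3 ]; then
        [ "$sv_minor" -ge 1 ]
    else
        # Assumption: anything newer than 3.x is treated as supported.
        [ "$sv_major" -gt 3 ]
    fi
}
```

For example, singularity_version_ok 3.0.3 || echo "unsupported Singularity version" would print the warning, since 3.0.x is below 3.1.0.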
By default, cactus will use the image corresponding to the latest release when running docker binaries. This is usually okay, but can be overridden with the CACTUS_DOCKER_ORG and CACTUS_DOCKER_TAG environment variables. For example, to use GPU release 2.4.4, run export CACTUS_DOCKER_TAG=v2.4.4-gpu before running cactus.
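If you would rather not export the variable for the whole session, one option is to set it for a single command only. The helper below is a sketch; with_cactus_tag is a hypothetical name and not part of Cactus.

```shell
# Run one command with a specific image tag, leaving the rest of the
# shell session untouched.
with_cactus_tag() {
    wct_tag=$1
    shift
    CACTUS_DOCKER_TAG="$wct_tag" "$@"
}

# Usage (the cactus arguments are whatever you would normally pass):
#   with_cactus_tag v2.4.4-gpu cactus <your usual arguments>
```

The same pattern would work for CACTUS_DOCKER_ORG.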
Compiling the binaries
In order to compile the binaries locally and not use a Docker image, you need some dependencies installed. On Ubuntu (we’ve tested on 20.04 and 22.04), you can look at the Cactus Dockerfile for guidance. To obtain the apt-get command:
grep apt-get Dockerfile | head -1 | sed -e 's/RUN //g' -e 's/apt-get/sudo apt-get/g'
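To see what this pipeline does, the sketch below runs it against a small stand-in file. The file contents and package list are made up for illustration and are not the real Cactus Dockerfile.

```shell
# Hypothetical two-line Dockerfile; the package list is invented.
demo_dockerfile=$(mktemp)
printf 'FROM ubuntu:22.04\nRUN apt-get update && apt-get install -y build-essential git\n' > "$demo_dockerfile"

# Same pipeline as above, pointed at the demo file instead of ./Dockerfile:
#   grep    keeps lines mentioning apt-get
#   head -1 keeps only the first such line
#   sed     drops the "RUN " prefix and prepends sudo to every apt-get
apt_cmd=$(grep apt-get "$demo_dockerfile" | head -1 | sed -e 's/RUN //g' -e 's/apt-get/sudo apt-get/g')
echo "$apt_cmd"
# prints: sudo apt-get update && sudo apt-get install -y build-essential git

rm -f "$demo_dockerfile"
```

Review the printed command before running it, since it installs system packages with sudo.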
Progressive Cactus can be built on ARM CPUs, including on Mac, but Minigraph-Cactus is currently X86-only.
To build Cactus, run (from inside cactus/):
make -j 8
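The 8 here is just a parallel job count. As a sketch, it could instead be derived from the machine; cpu_count and build_cactus are hypothetical helper names.

```shell
# Report the number of CPUs: nproc on Linux, sysctl on macOS,
# falling back to 8 if neither gives an answer.
cpu_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        sysctl -n hw.ncpu 2>/dev/null || echo 8
    fi
}

# Run from inside cactus/.
build_cactus() {
    make -j "$(cpu_count)"
}
```

On machines with little memory, a lower job count may be the safer choice.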
In order to run the Minigraph-Cactus pipeline, you must also run
build-tools/downloadPangenomeTools
If you want to work with MAF, including running cactus-hal2maf, you must also run
build-tools/downloadMafTools
In order to toggle between local and Docker binaries, use the --binariesMode command line option. If --binariesMode is not specified, local binaries will be used if found in PATH, otherwise a Docker image will be used.
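The fallback rule above can be approximated in a few lines. This is only an illustration of the described behavior, not how Cactus itself performs the check: default_binaries_mode is a hypothetical helper, the binary it probes (lastz by default) is an assumption, and the mode names local and docker are assumed alongside the singularity value mentioned earlier.

```shell
# Guess which binaries would be used when --binariesMode is omitted,
# by checking whether a given binary can be found on PATH.
default_binaries_mode() {
    if command -v "${1:-lastz}" >/dev/null 2>&1; then
        echo local
    else
        echo docker
    fi
}
```

Passing --binariesMode explicitly avoids any dependence on what happens to be on PATH.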
Cactus
Cactus is a reference-free whole-genome alignment program, as well as a pangenome graph construction toolkit.
Getting Cactus
Getting help
Please subscribe to the cactus-announce low-volume mailing list so we may reach out about releases and other announcements.
To ask questions or request help, please use the Cactus GitHub Discussions.
To file a bug report or enhancement request against the code or documentation, create a GitHub Issue.
Align Genomes from Different Species
Align Genomes from the Same Species and Build Pangenome Graphs
Acknowledgements
Cactus uses many different algorithms and individual code contributions, principally from Joel Armstrong, Glenn Hickey, Mark Diekhans and Benedict Paten. We are particularly grateful to:
last-train from last. last-train citation
Installing Manually From Source
The instructions below are meant primarily for developers. Everyone else should try to use the precompiled binaries (Linux X86) or Docker image from the latest release instead.
The top-level cactus interface (cactus, cactus-pangenome, cactus-hal2maf, etc.) is always a Python package that is pip installed into a Python virtualenv. This package runs several other tools as subprocesses, which are compiled into binaries.
Cloning Cactus
Cactus contains many submodules, so it is necessary to clone with --recursive or to run git submodule update --init --recursive inside the cactus directory after cloning.
Creating the Python virtualenv
Cactus requires Python >= 3.9 along with Python development headers and libraries.
Install virtualenv first if needed with python3 -m pip install virtualenv.
To create the Python virtual environment, run (from inside cactus/):
If you have Docker installed, you can now run Cactus. All binaries, such as lastz and cactus-consolidated, will be run via Docker using the latest release. Singularity binaries can be used in place of Docker binaries with the --binariesMode singularity flag. Note that you must use Singularity 2.3 - 2.6 or Singularity 3.1.0+. Singularity 3 versions below 3.1.0 are incompatible with Cactus (see issue #55 and issue #60).
By default, Cactus will use the image corresponding to the latest release when running Docker binaries. This is usually okay, but can be overridden with the CACTUS_DOCKER_ORG and CACTUS_DOCKER_TAG environment variables. For example, to use GPU release 2.4.4, run export CACTUS_DOCKER_TAG=v2.4.4-gpu before running cactus.
Compiling the binaries
In order to compile the binaries locally and not use a Docker image, you need some dependencies installed. On Ubuntu (we’ve tested on 20.04 and 22.04), you can look at the Cactus Dockerfile for guidance. To obtain the apt-get command:
grep apt-get Dockerfile | head -1 | sed -e 's/RUN //g' -e 's/apt-get/sudo apt-get/g'
Progressive Cactus can be built on ARM CPUs, including on Mac, but Minigraph-Cactus is currently X86-only.
To build Cactus, run (from inside cactus/):
make -j 8
In order to run the Minigraph-Cactus pipeline, you must also run
build-tools/downloadPangenomeTools
If you want to work with MAF, including running cactus-hal2maf, you must also run
build-tools/downloadMafTools
In order to toggle between local and Docker binaries, use the --binariesMode command line option. If --binariesMode is not specified, local binaries will be used if found in PATH, otherwise a Docker image will be used.
Building on Mac
These are the steps I used to build Cactus on a new M4 Mac Mini with MacOS Sequoia 15.5:
Developer Tools
Install command-line developer tools. I did this by typing make on the command line (in Terminal), and accepting the prompt in the pop-up window to install them. The version installed, as obtained from pkgutil --pkg-info=com.apple.pkg.CLTools_Executables was
Homebrew
I pasted the install command from the Homebrew homepage into the Terminal and ran it. For me this command was the following, but you’re probably better off getting it from the webpage
Then I installed the brew dependencies for Cactus
When installing libomp above (use brew reinstall libomp if you missed it), it printed some messages about setting:
So I added some flags to this effect to ~/.zprofile. Doing so, and including CFLAGS, is crucial for the build to work.
Make sure to reload the profile to apply the changes immediately:
Virtualenv
Python3 seemed to be installed by default. In order to install virtualenv I ran
That’s it
You should now be able to run the steps to clone, setup the virtualenv and compile the binaries exactly as they are described above.
Note that Minigraph-Cactus is not (yet) supported on Mac.
I recommend running
in order to test your installation (you will need to have run build-tools/downloadMafTools when setting up the binaries for this to work).