To compile GSL from source (make sure that the GSL include and lib paths are exported):
./configure --prefix=${HOME}/program/gsl
make && make install
# you may need to add this line to your .bashrc
export LD_LIBRARY_PATH="${HOME}/program/gsl/lib:$LD_LIBRARY_PATH"
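When GSL lives under a custom prefix, the compiler and linker may also need to find its headers and libraries at build time. The following is a minimal sketch, assuming the `${HOME}/program/gsl` prefix used above and a GCC-compatible toolchain that reads `CPATH` and `LIBRARY_PATH`. Whether the Haplomap build actually needs these variables is an assumption, so adjust as needed.

```shell
# Assumed install prefix, matching the ./configure --prefix used above
GSL_HOME="${HOME}/program/gsl"

# Header search path used by GCC-compatible compilers
export CPATH="${GSL_HOME}/include${CPATH:+:${CPATH}}"
# Link-time library search path
export LIBRARY_PATH="${GSL_HOME}/lib${LIBRARY_PATH:+:${LIBRARY_PATH}}"
# Run-time library search path
export LD_LIBRARY_PATH="${GSL_HOME}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# gsl-config is installed with GSL; if present, report the installed version
if [ -x "${GSL_HOME}/bin/gsl-config" ]; then
    "${GSL_HOME}/bin/gsl-config" --version
fi
```

The `${VAR:+:${VAR}}` form appends the previous value only when it is non-empty, which avoids a stray trailing colon in the search path.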
Build and install to path:
cd ${haplomap_repo}
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/path/to/directory/bin ..
make
1. Prepare the MPD measnum id file: one id per row, suffixed with “-m” or “-f” (m: male, f: female).
26720-m
26720-f
9940-f
...
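A malformed id may only surface as an error later in the pipeline, so it can help to check the file first. The following is a minimal sketch, assuming every line must be digits followed by `-m` or `-f`, as in the example above. `check_ids` is a hypothetical helper name and is not part of Haplomap.

```shell
# check_ids FILE: print (with line numbers) every line that is not
# "<digits>-m" or "<digits>-f"; return non-zero if any are found.
check_ids() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 2
    fi
    # -n: show line numbers, -v: invert the match, -E: extended regex
    if grep -nvE '^[0-9]+-[mf]$' "$1"; then
        echo "invalid ids found in $1" >&2
        return 1
    fi
    return 0
}
```

For example, `check_ids drug-diet.ids.txt` prints nothing and returns 0 when every line is well formed. Blank lines are reported as invalid.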
2. Edit the file paths in config.yaml in the workflows folder.
Only edit the HBCGM section:
HBCGM:
    # working directory
    WORKSPACE: "/data/bases/fangzq/MPD/results_drug_diet"
    # path to haplomap
    BIN: "/home/fangzq/github/HBCGM/build/bin"
    # MPD id file, one id per line
    TRAIT_IDS: "/data/bases/fangzq/MPD/drug-diet.ids.txt"
    # if true, use individual animal data; default (false): use strain means
    USE_RAWDATA: false
    # strains metadata: map strain abbrev to full name, jax ids, etc.
    # see docs folder to view examples
    STRAIN_ANNO: "/data/bases/shared/haplomap/PELTZ_20210609/strains.metadata.csv"
    # filtered VCF files after variant calling step
    VCF_DIR: "/data/bases/shared/haplomap/PELTZ_20210609/VCFs"
    # Ensembl-vep output after variant calling step
    VEP_DIR: "/data/bases/shared/haplomap/PELTZ_20210609/VEP"
    ## Optional files
    # genetic relationship file from PLINK output
    GENETIC_REL: "/data/bases/shared/haplomap/PELTZ_20210609/mouse54_grm.rel"
    # gene expression file
    GENE_EXPRS: "/data/bases/shared/haplomap/PELTZ_20210609/mus.compact.exprs.txt"
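Because the config is mostly absolute paths, a quick existence check can catch typos before a long run. The following is a rough sketch, assuming paths are written as double-quoted values starting with `/`, as in the example above. `check_config_paths` is a hypothetical helper, and it matches text rather than parsing YAML properly.

```shell
# check_config_paths FILE: report double-quoted absolute paths in FILE
# that do not exist on disk; return non-zero if any are missing.
check_config_paths() {
    # Drop comment lines, extract "/..." values, strip the quotes,
    # then test each path (read line by line so spaces are preserved).
    missing=$(grep -vE '^[[:space:]]*#' "$1" \
        | grep -oE '"/[^"]*"' \
        | tr -d '"' \
        | while IFS= read -r p; do
              [ -e "$p" ] || echo "missing: $p"
          done)
    if [ -n "$missing" ]; then
        echo "$missing"
        return 1
    fi
    return 0
}
```

Running `check_config_paths workflows/config.yaml` lists each missing path on its own line. Treat the output as a hint only: a working directory such as `WORKSPACE` might legitimately not exist yet.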
3. Run the haplomap pipeline
3.1 Create the conda environment
conda env create -n hbcgm -f environment.yaml
3.2 Run on a local computing node
source activate hbcgm
# modify the file path in haplomap and run with 24 cores
snakemake -s workflows/haplomap.smk \
--configfile workflows/config.yaml \
-k -p -j 24
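Before committing 24 cores, a dry run can show which jobs would be executed. The following is a minimal sketch using Snakemake's `-n` (dry-run) and `-p` (print shell commands) flags with the same workflow and config paths as above. `dry_run` is a hypothetical wrapper name introduced only for this example.

```shell
# dry_run [EXTRA_ARGS...]: list the jobs Snakemake would run, without running them.
dry_run() {
    # -n: dry run, -p: print the shell commands of each job
    snakemake -s workflows/haplomap.smk \
        --configfile workflows/config.yaml \
        -n -p "$@"
}
```

Any extra arguments are passed through, so `dry_run -k` previews the run with keep-going behaviour enabled.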
3.3 Run on an HPC cluster, e.g. Stanford Sherlock (Slurm)
Edit slurm.submit.sh and change the file path to HBCGM/workflows.
Edit workflows/slurm_config.yaml and specify the resources you need.
Haplomap
Haplotype-based computational genetic mapping (a.k.a. HBCGM)
Haplomap is the successor to HBCGM, which was last actively developed in 2010, and has been adopted as a replacement for the original HBCGM.
Citation:
See what’s new in the CHANGELOG.
Dependency
Works on both Linux and MacOS
Haplomap:
For Variant Calling, you need:
Running pipeline
Installation
Install from source
Ubuntu
MacOS
or compile GSL from source (make sure that the GSL include and lib paths are exported)
Usage
Run haplomap standalone
See more detail in haplomap subfolder: Run haplomap standalone
Use snakemake workflow to run Mouse Phenome Database (MPD) datasets
0. Variant calling
See variant calling using GATK, BCFtools, svtools.
e.g.
The Mouse Phenome Database has more than 10K datasets. Configure the files below to run them.
1. Prepare MPD measnum id file. One id per row, suffixed with “-m” or “-f” (f: female, m: male)
2. Edit the config.yaml file path in workflows folder: only edit HBCGM section.
3. run haplomap pipeline
3.1 create conda envs
3.2 run on a local computing node.
3.3 Run on the HPC, e.g. Stanford Sherlock
e.g. Sherlock slurm
slurm.submit.sh, change file path to HBCGM/workflows
workflows/slurm_config.yaml, specify the resource you need.
Output
For an explanation of the output, see: Run haplomap standalone
Contact
Email:
Copyright and License Information
Copyright (C) 2019-2022 Stanford University, Zhuoqing Fang and Gary Peltz.
Authors: Zhuoqing Fang and Gary Peltz.
The original HBCGM (the maximal haplotype construction method) was developed by Dr. David Dill and Dr. Gary Peltz at Stanford.
HBCGM/Haplomap is patented by Dr. Gary Peltz.
This program is distributed under a license that restricts commercial use. Please see the attached LICENSE file for details.