HORmon is a tool for annotation of alpha satellite arrays in centromeres of a newly assembled human genome. HORmon consists of two modules:
Monomer Inference extracts draft human monomers based on the given alpha-satellite consensus template and centromeric sequence.
HORmon polishes the monomers extracted in the previous stage to make them consistent with the Centromere Evolution postulate, extracts HORs, and decomposes the centromeric sequence into HORs.
HORmon has been used to infer monomers from the recently announced complete human genome assembly of the CHM13 cell line generated by the Telomere-to-Telomere Consortium.
The data generated in the paper that describes HORmon (Kunyavskaya et al., 2021) can be found at Figshare.
The data include extracted monomers and HORs from all live alpha satellite arrays in the CHM13 cell line, as well as annotations of these arrays.
A Jupyter notebook that reproduces the figures of the HOR paper is available on GitHub.
Installation
The recommended way to install HORmon is with the conda package manager:
conda install -c bioconda hormon
Alternatively, HORmon can be built and installed from source as described below.
Requirements:
Linux only. macOS is not yet supported.
Python 3.6+
biopython
clustalo
joblib
python-edlib
setuptools
networkx
pygraphviz
stringdecomposer
The required Python packages can be installed through conda using:
conda install --file requirements.txt
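Before building from source, it can help to confirm that the requirements listed above are visible in the current environment. The following is a minimal sketch and not part of HORmon. The import names (`Bio` for biopython, `edlib` for python-edlib) and the assumption that clustalo is an executable named `clustalo` on PATH are ours. stringdecomposer is left out because this README does not say how it is invoked.

```python
import importlib.util
import shutil
import sys

# Import names assumed for the Python packages listed under Requirements.
PYTHON_MODULES = ["Bio", "joblib", "edlib", "setuptools", "networkx", "pygraphviz"]
# Command-line tools assumed to be on PATH under these names.
EXECUTABLES = ["clustalo"]


def missing_requirements(modules=PYTHON_MODULES, executables=EXECUTABLES):
    """Return the names of modules and executables that cannot be found."""
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    missing += [e for e in executables if shutil.which(e) is None]
    return missing


if __name__ == "__main__":
    if sys.version_info < (3, 6):
        print("Python 3.6+ is required")
    problems = missing_requirements()
    print("Missing:", ", ".join(problems) if problems else "nothing")
```

An empty result only means the names were found. It does not check package versions.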
Installing from source
git clone https://github.com/ablab/HORmon.git
cd HORmon
python3 setup.py install --record hormon_files.txt
Then, HORmon is available as the commands monomer_inference and HORmon.
Afterward, to uninstall HORmon, run:
xargs rm -rf < hormon_files.txt
Quick start
Monomer Inference
The Monomer Inference script needs two parameters: (1) the (centromeric) sequence and (2) the monomer template:
The resulting monomers are written to toy8_mi/final/monomers.fa and the sequence annotation to toy8_mi/final/final_decomposition.tsv.
HORmon
HORmon takes as input: (1) the (centromeric) sequence, (2) the monomers from the “monomer_inference” stage, (3) a centromere id for nicer output, (4) the minimum number of occurrences for monomers, monomer pairs, and HORs, and (5) the output folder.
Note: HORmon should be launched on each centromere independently. We currently cannot guarantee adequate results when HORmon is run on all centromeres simultaneously.
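Because HORmon should be run on one centromere at a time, a multi-record FASTA file has to be split first. The sketch below is one way to do that with the Python standard library alone. It assumes that each FASTA record corresponds to one centromere and that the first word of each header is unique and safe to use as a file name. The function name `split_fasta` and the output naming scheme are illustrative and not part of HORmon.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir):
    """Write each record of a multi-FASTA file to its own file.

    Returns the list of written paths, in input order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    handle = None
    with open(fasta_path) as fasta:
        for line in fasta:
            if line.startswith(">"):
                if handle is not None:
                    handle.close()
                # Assumption: the first header word is unique and file-safe.
                words = line[1:].split()
                record_id = words[0] if words else f"record{len(written) + 1}"
                path = out_dir / f"{record_id}.fa"
                handle = open(path, "w")
                written.append(path)
            # Lines before the first header are skipped.
            if handle is not None:
                handle.write(line)
    if handle is not None:
        handle.close()
    return written
```

Each resulting file can then be given to HORmon as the sequence input of a separate run.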
Output:
toy8/mn.fa – final monomers
toy8/final_decomposition.tsv – monomer decomposition
toy8/HORs.tsv – HORs description
toy8/HORdecomposition.tsv – HORs decomposition
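After a run finishes, the output folder can be checked for the four files listed above. This is a small sketch rather than a HORmon feature: the helper names are hypothetical, and `toy8` is the output folder name used in the file list above. Counting `>` header lines gives the number of final monomers, assuming `mn.fa` is a standard FASTA file as its extension suggests.

```python
from pathlib import Path

# File names taken from the Output list above.
EXPECTED_OUTPUTS = ["mn.fa", "final_decomposition.tsv", "HORs.tsv", "HORdecomposition.tsv"]


def missing_outputs(out_dir):
    """Return the expected output files that are absent from out_dir."""
    out_dir = Path(out_dir)
    return [name for name in EXPECTED_OUTPUTS if not (out_dir / name).is_file()]


def count_fasta_records(fasta_path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(fasta_path) as fasta:
        return sum(1 for line in fasta if line.startswith(">"))


if __name__ == "__main__":
    absent = missing_outputs("toy8")
    if absent:
        print("Missing output files:", ", ".join(absent))
    else:
        print("Final monomers:", count_fasta_records(Path("toy8") / "mn.fa"))
```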
CentromereArchitect
CentromereArchitect (an early version of HORmon), as described in the paper, is available on the centromere-architect branch. Please cite Dvorkina et al., 2021.
Feedback and bug reports
Your comments, bug reports, and suggestions are very welcome.
Please leave them at our GitHub repository tracker.
Cite