The HiCool R/Bioconductor package provides an end-to-end interface to
process and normalize Hi-C paired-end fastq reads into .(m)cool files.
The heavy lifting (fastq mapping, pairs parsing and pairs filtering) is
performed by the underlying lightweight hicstuff python library
(https://github.com/koszullab/hicstuff).
Pairs filtering follows the approach described in
Cournac et al., 2012, as implemented
in hicstuff.
The cooler library (https://github.com/open2c/cooler)
is used to parse pairs into a multi-resolution, balanced .mcool file.
.(m)cool is a compact, indexed HDF5 file format specifically tailored
for efficiently storing Hi-C data. The .(m)cool file format was
developed by Abdennur and Mirny and
published in 2019.
Internally, all these external dependencies are automatically installed and
managed in R by a basilisk environment.
Processing .fastq paired-end files into a .mcool Hi-C contact matrix
The main processing function offered in this package is HiCool().
One simply needs to specify:
The path to each fastq file;
The genome reference, as a .fasta sequence, a pre-computed bowtie2 index
or a supported ID (hg38, mm10, dm6, R64-1-1, WBcel235, GRCz10,
Galgal4).
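A minimal sketch of such a call is shown below. The fastq paths are hypothetical placeholders, and the argument names (`r1`, `r2`, `genome`, `output`) reflect our reading of the package documentation; check `?HiCool` for the authoritative signature. The pipeline is only launched when the package is installed and both read files exist, so the snippet is safe to paste as-is.

```r
# Hypothetical input files: replace with the paths to your own reads.
r1 <- "sample_R1.fq.gz"
r2 <- "sample_R2.fq.gz"

# Arguments passed to HiCool(). `genome` is one of the supported IDs here;
# a .fasta file or a pre-computed bowtie2 index should work as well.
args <- list(r1 = r1, r2 = r2, genome = "R64-1-1", output = "HiCool/")

# Only run the pipeline when the package is installed and the reads exist.
can_run <- requireNamespace("HiCool", quietly = TRUE) && all(file.exists(r1, r2))
if (can_run) {
    x <- do.call(HiCool::HiCool, args)
} else {
    message("HiCool or the input fastq files are missing; nothing was run.")
}
```

Building the argument list first keeps the inputs easy to inspect or log before a long-running job is started.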
Reporting
On top of processing fastq reads, HiCool provides convenient reports for
single/multiple sample(s).
x <- importHiCoolFolder(output = 'HiCool/', hash = '55IONQ')
HiCReport(x)
Installation
As an R/Bioconductor package, HiCool should be very easy to install. The only
dependency is R (>= 4.2). In R, one can run:
if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("HiCool")
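If you are unsure whether your R installation meets the version requirement, a quick check along these lines can be run beforehand. This is only a convenience sketch; the variable names `min_r` and `r_ok` are ours.

```r
# Minimum R version stated above.
min_r <- "4.2"

# getRversion() returns the running R version and can be compared to a string.
r_ok <- getRversion() >= min_r
if (!r_ok) {
    message("R ", getRversion(), " is older than ", min_r,
            "; upgrade R before installing HiCool.")
}
```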
The first time HiCool() is executed, a basilisk environment
is set up automatically. A few dependencies are installed
in this environment:
python (pinned 3.9.1)
numpy (pinned 1.23.4)
bowtie2 (pinned 2.4.5)
samtools (pinned 1.7)
hicstuff (pinned 3.1.5)
cooler (pinned 0.8.11)
HiCExperiment ecosystem
HiCool is integrated within the HiCExperiment ecosystem in Bioconductor.
Read more about the HiCExperiment class and handling Hi-C data in R
in the HiCExperiment package documentation.