ViewBS has several top-level commands, each of which determines the required and optional arguments. These top-level commands can be divided into two parts: methylation reports and data visualization of functional regions.
The methylation report part has several top-level commands that generate reports about read coverage, distribution of methylation levels, global methylation level, etc.
The visualization part for functional regions also has several top-level commands. For ViewBS, the first input that users should provide is the regions of interest. These regions could be functional elements such as genes, transposable elements (TE), or differentially methylated regions (DMR). The other input that users should provide is the methylation information, which is the output of a BS-seq aligner such as Bismark.
If you want to install ViewBS in an existing conda environment, please run:
conda activate <your_environment_name>
## If you want to install a specific version,
## replace 'viewbs' with 'viewbs=<version_number>' (e.g. 'viewbs=0.1.10')
conda install -c bioconda viewbs
Scenario 2
If you want to install ViewBS in a new conda environment, please run:
## You can change `env4viewbs` to any other name you like
conda create -n env4viewbs -c bioconda viewbs
## To activate the environment
conda activate env4viewbs
Installation with Docker
docker pull xie186/viewbs
## Use "docker run <image name> ViewBS" in place of "ViewBS". Here is an example:
cd ViewBS_testdata/
docker run -v ${PWD}:/data -w /data bc1743f3418f ViewBS MethOneRegion --region chr5:19497000-19499600 --sample bis_WT.tab.gz,WT --sample bis_cmt23.tab.gz,cmt23 --outdir MethOneRegion --prefix chr5_19497000-19499600 --context CHG
## 1) ${PWD}:/data: mounts the current directory at /data inside the Docker container
## 2) bc1743f3418f: IMAGE ID (run `docker image ls` to get the IMAGE ID).
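Typing the full `docker run` prefix for every command gets repetitive, so a small shell wrapper can help. The sketch below is only a convenience and not part of ViewBS: the function name `viewbs_docker` and the variable `VIEWBS_IMAGE` are made up for illustration, and the default image name is assumed to be the one pulled above.

```shell
# Hypothetical convenience wrapper (not part of ViewBS).
# Mounts the current directory at /data and forwards all arguments to ViewBS
# inside the container. VIEWBS_IMAGE is an assumed variable that may hold an
# image name or IMAGE ID; it defaults to the image pulled above.
viewbs_docker() {
    docker run -v "${PWD}:/data" -w /data "${VIEWBS_IMAGE:-xie186/viewbs}" ViewBS "$@"
}
```

With this defined, the example above could be run from `ViewBS_testdata/` as `viewbs_docker MethOneRegion --region chr5:19497000-19499600 ...` with the remaining options unchanged.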
To make the installation of dependencies easier, a helper script was developed: perl INSTALL.pl can be used to install and check the dependencies.
You can also install the dependencies step by step, as shown below:
ViewBS uses the genome-wide cytosine methylation report as its input file. The report is sorted by chromosomal coordinates, also contains the sequence context, and is in the following format:
NOTES: If you use a tool other than Bismark to generate the methylation information, you can still use ViewBS in one of two ways: 1) if you have a BAM file (e.g. one generated by bwa-meth), you can run MethylDackel with '--cytosine_report' to output the methylation information in genome-wide cytosine methylation report format; 2) we also include scripts (https://github.com/xie186/ViewBS/tree/master/lib/scripts) that convert the results of other tools (BSseeker and Brat) to genome-wide cytosine methylation report format. If no script for your tool is included, please feel free to contact us at xie186@purdue.edu
Tips: how to generate Genome-wide Cytosine Methylation Report
If you have already finished the mapping with Bismark, you should have a SAM/BAM file. Let's say you have a SAM file named test.sam. To generate the Genome-wide Cytosine Methylation Report, run:
### This step will generate several files:
bismark_methylation_extractor --bedGraph --CX test.sam
### This step will generate the report file named by -o (test.bis_rep.cov)
coverage2cytosine -CX -o test.bis_rep.cov --genome_folder ara/ test.bismark.cov
*For BS-seq data processed not by Bismark but by other tools such as BRAT or BS-Seeker2, ViewBS provides support for converting DNA methylation data in other formats to the genome-wide cytosine methylation report format. Support for other tools will be developed upon request. If you have DNA methylation data generated by other tools and have difficulty converting the format, just post in the issues. We're happy to add new functions for file format conversion.*
Since ViewBS uses Bio::DB::HTS::Tabix to quickly retrieve information from the input (TAB-delimited) files, the Genome-wide Cytosine Methylation Report files should be bgzipped and tabix indexed.
bgzip test.bis_rep.cov ## test.bis_rep.cov.gz will be generated. Note: test.bis_rep.cov should be sorted by chromosome coordinates.
tabix -C -p vcf test.bis_rep.cov.gz ## test.bis_rep.cov.gz.csi will be generated. Now test.bis_rep.cov.gz can be used as input for ViewBS.
## If no chromosome is longer than 2^29-1 bp, you can also run:
tabix -p vcf test.bis_rep.cov.gz ## test.bis_rep.cov.gz.tbi will be generated. Now test.bis_rep.cov.gz can be used as input for ViewBS.
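The two preconditions mentioned in the comments above (the report must be coordinate-sorted, and the index type depends on chromosome length) can be checked before indexing. The following is a minimal sketch using only awk; the function names are hypothetical and not part of ViewBS, the report is assumed to be uncompressed with the chromosome in column 1 and the position in column 2, and the `.fai` file is assumed to be the one written by `samtools faidx`.

```shell
# Hypothetical helpers, not part of ViewBS.

# Succeeds if FILE (tab-delimited, chromosome in column 1, position in
# column 2) has non-decreasing positions within each chromosome and every
# chromosome forms one contiguous block.
is_coord_sorted() {
    awk -F '\t' '
        $1 != chr {
            if ($1 in seen) { bad = 1; exit }   # chromosome reappears later
            seen[$1] = 1; chr = $1; prev = 0
        }
        $2 + 0 < prev { bad = 1; exit }         # position went backwards
        { prev = $2 + 0 }
        END { exit bad + 0 }
    ' "$1"
}

# Prints "-C" (CSI index) if any sequence in the .fai file (name in column 1,
# length in column 2) is longer than 2^29-1 = 536870911 bp; prints nothing
# otherwise, in which case a .tbi index is sufficient.
tabix_index_flag() {
    awk -F '\t' '$2 + 0 > 536870911 { big = 1 } END { if (big) print "-C" }' "$1"
}
```

For example, `is_coord_sorted test.bis_rep.cov && bgzip test.bis_rep.cov` followed by `tabix $(tabix_index_flag TAIR10_chr_all.fasta.fai) -p vcf test.bis_rep.cov.gz` would choose the index type automatically.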
Besides providing sample and region information on the command line, you can also read this information from a TEXT file. For example, if you are interested in more than one group of genes and want to compare their DNA methylation patterns within one sample, the methylation information can be read from a TEXT file. Instead of giving explicit sample information pairs, write "file:" followed by the name of the TEXT file. In this case, you can use --sample only once and you cannot use --region.
The genes were divided into quintiles based on gene expression level. The Rank1 group was the group with the lowest expression level. Users can use this method to study the correlation between DNA methylation and gene expression.
Here is the figure generated by the command line above:
MethOneRegion
ViewBS MethOneRegion outputs the methylation information for one region given by the user and then plots the methylation levels across that chromosomal region.
Here is an example:
To generate the figure above, you can use the following command line:
In ViewBS, all the figure objects will be saved into RDS files. The users can restore the RDS files and merge the figures into one graph.
There are two ways to do this: 1) use the helper script named mer_fig.R; 2) write an R script that reads the RDS files and merges the figures into one graph with cowplot.
Please see the following for the help information:
$ Rscript ../../lib/scripts/mer_fig.R -h
USAGE
Usage: Rscript mer_fig.R --input <fig1.rds,fig2.rds> --labels <A,B,C,D> [options]
DESCRIPTION
mer_fig.R is developed to merge figures into on graph.
Options
-help | -h
Prints the help message and exits.
--input [required]
- RDS files. <fig1.rds,fig2.rds...>
--labels [optional]
- Labesl for each figure. Default: <A,B,C,D...>
--output [optional]
- Output files for the graph. Default: cowplot_mer_fig.pdf
--ncol [optional]
- Number of columns on the graph.
--base_height [optional]
- The height (in inches) of each sub-plot
--base_aspect_ratio [optional]
- The aspect ratio of each sub-plot. Default: 1.6
Error: Please check the help information!
Execution halted
2. Use the template below to merge multiple figures into one graph.
library(cowplot) # https://cran.r-project.org/web/packages/cowplot/vignettes/introduction.html
p1 <- readRDS("BisNonConvRate/cmt2_proj_allsam.tab.rds")
p2 <- readRDS("MethGlobal/cmt2_proj_allsam.tab.rds")
plot2by2 <- plot_grid(p1, p2,
labels=c("A", "B"), ncol = 2)
save_plot("plot2by2.pdf", plot2by2,
ncol = 2, # we're saving a grid plot of 2 columns
#nrow = 2, # and 2 rows
# aspect ratio of each individual subplot
base_aspect_ratio = 2
)
Here is what plot2by2.png looks like:
Further improvement of the graph can be done in Inkscape if a PDF file was generated.
ViewBS
Workflow of ViewBS
Here is the workflow of ViewBS:
Installation
Installation via conda [recommended]
Install conda
First you need to install miniconda following the instructions here: https://conda.io/en/latest/miniconda.html
Scenario 1
If you want to install ViewBS in an existing conda environment, please run:
Scenario 2
If you want to install ViewBS in a new conda environment, please run:
Installation with Docker
Installation of dependencies step by step
Download the latest version:
Install htslib
Perl version: >v5.8.7
Perl packages:
R version: > 3.3.0
R packages
Install the required libraries in R:
Preparation of input files
Please see the Bismark website for details of the genome-wide cytosine methylation report format.
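As a quick sanity check on a report file, the pooled methylation level per sequence context can be computed directly from its columns. The sketch below assumes the seven tab-separated columns of the Bismark genome-wide cytosine report (chromosome, position, strand, methylated count, unmethylated count, context, trinucleotide context) and an uncompressed input file. The function name is hypothetical, and this is not necessarily how ViewBS itself computes methylation levels.

```shell
# Hypothetical helper, not part of ViewBS.
# Prints one line per context (e.g. CG, CHG, CHH) with the pooled level
# sum(methylated) / sum(methylated + unmethylated).
# Assumed columns: chr, position, strand, methylated count,
# unmethylated count, context, trinucleotide context.
context_meth_level() {
    awk -F '\t' '
        { meth[$6] += $4; total[$6] += $4 + $5 }
        END {
            for (ctx in total)
                if (total[ctx] > 0)    # skip contexts without any coverage
                    printf "%s\t%.4f\n", ctx, meth[ctx] / total[ctx]
        }
    ' "$1" | LC_ALL=C sort
}
```

Calling `context_meth_level test.bis_rep.cov` before compressing the file gives a rough per-context figure to compare against the plots.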
For details, please see the link below: https://github.com/xie186/ViewBS/wiki/Support-for-nonBismark-results
Note: tabix and bgzip binaries are now part of the HTSlib project. https://github.com/samtools/htslib
Here is an example:
USAGE
Download test data
https://gitlab.com/BS-seq/ViewBS_testdata
Top commands of ViewBS
MethCoverage
An Example of Reverse Cumulative Plot with x-axis representing the coverage of BS-seq.
To generate the figure above, use the command shown as below:
Under methCoverage folder, there will be three files generated.
BisNonConvRate
An Example of BisNonConvRate
To generate the figure above, use the command shown as below:
Under BisNonConvRate, there will be three files generated.
GlobalMethLev
An Example of GlobalMethLev
To generate the figure above, use the command shown as below:
Under methGlobal, there will be three files generated.
MethLevDist
An Example of MethLevDist
To generate the figure above, use the command shown as below:
MethGeno
An example of MethGeno
To generate the figure above, use the command shown as below:
Note: the fai file can be generated by samtools:
samtools faidx TAIR10_chr_all.fasta
MethHeatmap
Region file format:
Note: If the file has a 4th column, each value in this column should be unique.
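The uniqueness requirement on the 4th column is easy to check before running the command. The following is a minimal sketch, assuming a tab-delimited region file; the function name is hypothetical and not part of ViewBS.

```shell
# Hypothetical helper, not part of ViewBS.
# Prints each duplicated 4th-column value once and returns a non-zero status
# if any duplicate was found. Lines with fewer than 4 columns are ignored.
check_unique_region_ids() {
    awk -F '\t' '
        NF >= 4 && seen[$4]++ == 1 { print $4; dup = 1 }   # second occurrence
        END { exit dup + 0 }
    ' "$1"
}
```

If the function prints nothing and succeeds, every value in the 4th column is unique.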
An example of MethHeatmap
To generate the figure above, use the command shown as below:
MethOverRegion
An example of MethOverRegion
The TEXT file should follow the following format:
Here is an example:
See an example here: https://gitlab.com/BS-seq/ViewBS_testdata/blob/master/testdata/TAIR10_Transposable_Elements.chr1.bed
How to merge figures into one graph
1. Use the R script in ViewBS
Example as below:
Where to find help
If you find bugs or have feature requests, please report them here: https://github.com/readbio/ViewBS/issues
Commercial use
ViewBS uses GNU GPLv3 and is free for use by academic users. If you want to use it in commercial settings, please contact us.
How to cite
Xiaosan Huang, Shaoling Zhang, Kongqing Li, Jyothi Thimmapuram, Shaojun Xie; ViewBS: a powerful toolkit for visualization of high-throughput bisulfite sequencing data, Bioinformatics, btx633, https://doi.org/10.1093/bioinformatics/btx633
Authors
Drs. Xiaosan Huang (huangxs@njau.edu.cn), Kong-Qing Li (likq@njau.edu.cn) and Shaoling Zhang (slzhang@njau.edu.cn).
Drs. Shaojun Xie (xie186@purdue.edu) and Jyothi Thimmapuram (jyothit@purdue.edu).