Python implementation of the GreyListChIP Bioconductor package.

For a ChIP-seq experiment with a paired input and ChIP sample, this tool calculates a greylist from the input sample of that pair: the regions where the input has unusually high depth. Peaks called in those regions are questionable for that particular input-ChIP pair.

The reason for doing this is that peak callers can have trouble in high-depth input regions, even though the caller adjusts for the reads in the input lane. It is standard practice to filter with the ENCODE blacklist, which covers commonly problematic regions, but there can be additional, sample-specific high-depth input regions that the blacklist does not cover. This tool flags peaks that fall into those sample-specific high-input regions as questionable.

This implementation improves on the R implementation by not needing a separate genome file and by being easy to run from the command line. It contains no original ideas.

https://bioconductor.org/packages/release/bioc/html/GreyListChIP.html is the source of the idea and the algorithm.

usage

Run chipseq-greylist on your input BAM file for each input-ChIP pair:

chipseq-greylist bamfile

This will produce a few files:

  • bamfile-input-greystats.csv: bootstrapped negative binomial parameters and estimated threshold
  • bamfile-input-greydepth.tsv: sambamba windowed depth
  • bamfile-input-grey.bed: BED file of greylist regions exceeding coverage threshold in the input file
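The steps behind these three files can be sketched roughly as follows. This is a minimal illustration, not the code chipseq-greylist actually runs: it assumes the threshold is an upper quantile of a negative binomial fitted to resampled window depths, and it swaps in a simple method-of-moments fit, using only the standard library, for whatever fitting routine the tool uses. The function names, the 0.99 quantile, the number of replicates, the sample size, and the in-memory window tuples are all hypothetical choices made for the example.

```python
import math
import random
import statistics


def fit_nb_moments(counts):
    """Method-of-moments negative binomial fit (an assumed stand-in for the
    real fitting step). Returns (size, prob)."""
    mean = statistics.fmean(counts)
    var = statistics.variance(counts)
    if var <= mean:
        raise ValueError("counts are not overdispersed; fit is undefined")
    size = mean * mean / (var - mean)
    prob = size / (size + mean)
    return size, prob


def nb_quantile(size, prob, q, max_k=1_000_000):
    """Smallest count k whose cumulative NB probability reaches q."""
    cdf = 0.0
    for k in range(max_k + 1):
        # Work in log space so large means do not underflow to zero.
        log_pmf = (math.lgamma(k + size) - math.lgamma(size)
                   - math.lgamma(k + 1)
                   + size * math.log(prob) + k * math.log1p(-prob))
        cdf += math.exp(log_pmf)
        if cdf >= q:
            return k
    raise RuntimeError("quantile not reached within max_k")


def bootstrap_threshold(depths, reps=100, sample_size=30000, q=0.99, seed=0):
    """Average the q-quantile over fits to resampled window depths.
    reps, sample_size and q are assumed values, not the tool's defaults."""
    rng = random.Random(seed)
    thresholds = []
    for _ in range(reps):
        sample = rng.choices(depths, k=sample_size)  # with replacement
        size, prob = fit_nb_moments(sample)
        thresholds.append(nb_quantile(size, prob, q))
    return statistics.fmean(thresholds)


def greylist_regions(windows, threshold):
    """Merge overlapping or touching windows whose depth exceeds threshold.

    windows: (chrom, start, end, depth) tuples sorted by chrom, then start.
    Returns a list of (chrom, start, end) regions.
    """
    regions = []
    for chrom, start, end, depth in windows:
        if depth <= threshold:
            continue
        last = regions[-1] if regions else None
        if last and last[0] == chrom and start <= last[2]:
            regions[-1] = (chrom, last[1], max(last[2], end))
        else:
            regions.append((chrom, start, end))
    return regions
```

In this sketch the fitted parameters and the averaged threshold correspond to what the greystats file reports, the windows to the windowed depth file, and the merged regions to the BED file.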

You can now filter out or annotate peaks falling in the greylist regions by intersecting the peaks with the greylist file. For example:

bedtools intersect -wao -a bamfile-peaks.bed -b bamfile-input-grey.bed > bamfile-peaks-greylist-annotated.bed
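The annotated file can then be split into clean and greylisted peaks. The sketch below relies on how bedtools intersect -wao reports results: each line ends with the number of overlapping base pairs, and peaks with no overlap are still reported with an overlap of 0. The function name split_peaks is hypothetical, and the code assumes the first three columns of each line are the peak's chrom, start, and end.

```python
def split_peaks(lines):
    """Split `bedtools intersect -wao` output into clean and greylisted peaks.

    lines: any iterable of tab-separated lines, such as an open file handle.
    Returns (clean, flagged): clean is a list of (chrom, start, end) peaks
    with no greylist overlap; flagged maps each overlapping peak to its
    total number of overlapping base pairs.
    """
    clean = {}
    flagged = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        peak = tuple(fields[:3])
        overlap = int(fields[-1])  # -wao puts the overlap in the last column
        if overlap > 0:
            # A peak overlapping several greylist regions appears once per
            # region, so the overlaps are summed.
            flagged[peak] = flagged.get(peak, 0) + overlap
        else:
            clean[peak] = 0
    return list(clean), flagged
```

Passing an open handle on bamfile-peaks-greylist-annotated.bed to split_peaks would return the peaks to keep and the peaks to treat as questionable. Whether to drop the flagged peaks or only annotate them is left to the analysis.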