ExcludonFinder

An easy-to-use tool for identifying and analyzing excludons in genomes from RNA-seq data.

Outline

Given RNA-seq data, reads are aligned against the reference genome (1) and per-nucleotide coverage is calculated (2). Convergent (-> <-) and divergent (<- ->) gene pairs are extracted, and the median coverage is calculated for each of them (3). A transcriptional unit (TU) is annotated for each gene (4) based on gene coverage: a coverage-decay threshold is set, and the transcription start and end sites (TSS and TTS) are placed where gene expression decays below this threshold. If the TUs of a convergent or divergent pair overlap, the pair is annotated as an excludon (5).
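Steps (3) and (5) can be illustrated with a minimal Python sketch. It is not ExcludonFinder's own code: the `Gene` record, its field names, and the helper functions are hypothetical, genes are assumed to lie on a single contig, and the TU coordinates are assumed to have been annotated already in step (4).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Gene:
    # Hypothetical record; field names are illustrative, not the tool's format.
    name: str
    strand: str    # "+" or "-"
    start: int     # gene start (inclusive)
    end: int       # gene end (inclusive)
    tu_start: int  # TU start as annotated in step (4)
    tu_end: int    # TU end as annotated in step (4)


def pair_orientation(left: Gene, right: Gene) -> Optional[str]:
    """Classify a neighboring pair; `left` has the smaller start coordinate."""
    if left.strand == "+" and right.strand == "-":
        return "convergent"   # -> <-
    if left.strand == "-" and right.strand == "+":
        return "divergent"    # <- ->
    return None               # same strand: not a candidate pair


def tus_overlap(a: Gene, b: Gene) -> bool:
    """True if the two closed TU intervals share at least one position."""
    return a.tu_start <= b.tu_end and b.tu_start <= a.tu_end


def find_excludons(genes: List[Gene]) -> List[Tuple[str, str, str]]:
    """Return (left, right, orientation) for adjacent pairs with overlapping TUs."""
    ordered = sorted(genes, key=lambda g: g.start)
    hits = []
    for left, right in zip(ordered, ordered[1:]):
        orientation = pair_orientation(left, right)
        if orientation is not None and tus_overlap(left, right):
            hits.append((left.name, right.name, orientation))
    return hits


if __name__ == "__main__":
    genes = [
        Gene("geneA", "+", 100, 500, 80, 650),
        Gene("geneB", "-", 600, 900, 560, 950),
        Gene("geneC", "+", 1000, 1400, 990, 1450),
    ]
    # geneA/geneB are convergent and their TUs overlap (560-650);
    # geneB/geneC are divergent but their TUs do not overlap.
    print(find_excludons(genes))
```

Only immediate neighbors are compared here, which keeps the sketch short; whether the tool also considers non-adjacent genes is not stated in this README.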

Features

  • Fast parallel processing for large datasets
  • Support for both short and long-read data
  • Support for paired-end and single-end RNA-seq data
  • Built-in quality checks and mapping statistics

Installation

conda install -c bioconda excludonfinder

From source

git clone https://github.com/Alvarosmb/ExcludonFinder.git
cd ExcludonFinder
conda env create -f environment.yml
conda activate ExcludonFinder

Usage

If installed with conda:

ExcludonFinder -f <reference.fasta> -1 <reads_R1.fastq> -2 <reads_R2.fastq> -g <annotation.gff>

If installed from source:

./scripts/ExcludonFinder -f <reference.fasta> -1 <reads_R1.fastq> -2 <reads_R2.fastq> -g <annotation.gff>

Options

- `-f`: Reference genome in FASTA format
- `-1`: Input FASTQ file for Read 1
- `-2`: Input FASTQ file for Read 2
- `-g`: Annotation file in GFF format
- `-t`: Coverage threshold (default: 0.5)
- `-j`: Number of threads (default: 8)
- `-l`: Long-read data
- `-o`: Custom output dir (default: `./output`)
- `-k`: Keep intermediate files (default: remove)
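The `-t` option is described here only as a coverage threshold. One plausible reading, and it is an assumption rather than documented behavior, is that a gene's TU is extended outward from the annotated gene for as long as per-nucleotide coverage stays at or above `threshold × median gene coverage`. The sketch below illustrates that reading; the function name `extend_tu`, the 0-based coordinates, and the use of a single strand-agnostic coverage list are all hypothetical simplifications.

```python
from statistics import median
from typing import Sequence, Tuple


def extend_tu(
    coverage: Sequence[float],
    gene_start: int,
    gene_end: int,
    threshold: float = 0.5,
) -> Tuple[int, int]:
    """Extend a gene's boundaries while coverage stays above a cutoff.

    coverage   -- per-nucleotide depth, indexed by 0-based position
    gene_start -- first position of the annotated gene (inclusive)
    gene_end   -- last position of the annotated gene (inclusive)
    threshold  -- fraction of the gene's median coverage (assumed meaning of -t)
    """
    gene_median = median(coverage[gene_start:gene_end + 1])
    if gene_median == 0:
        # Unexpressed gene: nothing to extend.
        return gene_start, gene_end

    cutoff = threshold * gene_median

    # Walk upstream (toward lower coordinates) until coverage drops below cutoff.
    tu_start = gene_start
    while tu_start > 0 and coverage[tu_start - 1] >= cutoff:
        tu_start -= 1

    # Walk downstream (toward higher coordinates) in the same way.
    tu_end = gene_end
    while tu_end < len(coverage) - 1 and coverage[tu_end + 1] >= cutoff:
        tu_end += 1

    return tu_start, tu_end


if __name__ == "__main__":
    depth = [0, 2, 6, 10, 10, 12, 10, 8, 5, 4, 1, 0]
    # Gene spans positions 3..6; median coverage is 10, so the cutoff is 5.
    print(extend_tu(depth, 3, 6, threshold=0.5))  # (2, 8)
```

Under this reading, a lower `-t` value yields longer TUs and therefore more overlapping pairs, while a higher value keeps TUs close to the annotated gene boundaries.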

Example

./scripts/ExcludonFinder \
 -f data/example/E.coli_K12_MG1655.fasta \
 -1 data/example/test_R1.fastq \
 -2 data/example/test_R2.fastq \
 -g data/example/E.coli_K12_MG1655.gff \
 -t 0.5 \
 -j 4

Examples

The data/examples directory contains test RNA-seq data from E. coli K12 MG1655. For faster testing and analysis, the dataset is reduced to reads mapping only to the first 50 genes. Expected results can be found in data/examples/output/.

Citation

If you find this tool useful, please cite:

Alvaro Sanmartin, Pablo Iturbe, Jeronimo Rodriguez-Beltran, Iñigo Lasa. ExcludonFinder: Mapping Transcriptional Overlaps Between Neighboring Genes