TIN score calculation
Given a set of BAM files and a gene annotation BED file, calculates the
Transcript Integrity Number (TIN) for each transcript.
Main usage
python calculate-tin.py [-h] [options]
Parameters
--version show program's version number and exit
-h, --help show this help message and exit
-i INPUT_FILES, --input=INPUT_FILES
Input BAM file(s). "-i" takes these input: 1) a single
BAM file. 2) "," separated BAM files (no spaces
allowed). 3) directory containing one or more bam
files. 4) plain text file containing the path of one
or more bam files (Each row is a BAM file path). All
BAM files should be sorted and indexed using samtools.
[required]
-r REF_GENE_MODEL, --refgene=REF_GENE_MODEL
Reference gene model in BED format. Must be a standard
12-column BED file. [required]
-c MINIMUM_COVERAGE, --minCov=MINIMUM_COVERAGE
Minimum number of reads mapped to a transcript.
default=10
-n SAMPLE_SIZE, --sample-size=SAMPLE_SIZE
Number of equal-spaced nucleotide positions picked
from mRNA. Note: if this number is larger than the
length of mRNA (L), it will be halved until it's
smaller than L. default=100
--names=SAMPLE_NAMES sample names, comma separated (no spaces allowed);
number must match the number of provided bam_files
-s, --subtract-background
Subtract background noise (estimated from intronic
reads). Only use this option if there are substantial
intronic reads.
-p NRPROCESSES, --processes=NRPROCESSES
Number of child processes for the parallelization.
Default: 1
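The note on --sample-size above (halving the number until it is smaller than the mRNA length L) can be illustrated with a minimal sketch. This is not the tool's own code: the function names, the use of integer halving, and the 0-based, evenly spaced offsets are assumptions made for illustration only.

```python
def effective_sample_size(sample_size, mrna_length):
    """Halve sample_size until it is smaller than mrna_length.

    Assumption: integer halving; for very short mRNAs this can reach 0.
    """
    n = sample_size
    while n >= mrna_length and n > 0:
        n //= 2
    return n


def pick_positions(mrna_length, sample_size=100):
    """Return evenly spaced 0-based offsets along an mRNA (hypothetical helper)."""
    n = effective_sample_size(sample_size, mrna_length)
    # Integer arithmetic avoids floating-point rounding issues.
    return [(i * mrna_length) // n for i in range(n)]


print(effective_sample_size(100, 80))  # 50
print(pick_positions(10))              # [0, 1, 3, 5, 6, 8]
```

With the default of 100 and a 10-nucleotide mRNA, the sketch halves 100 to 50, 25, 12 and finally 6, then spreads six positions across the transcript.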
Information
The tool was forked from the script tin.py (v2.6.4) of the
RSeQC package to achieve some speed-up.
This program calculates the transcript integrity number (TIN) for each transcript
(or gene) in the BED file. TIN is conceptually similar to RIN (RNA integrity number),
but it provides a transcript-level measurement of RNA quality and is more sensitive
when measuring low-quality RNA samples:
The TIN score of a transcript measures the RNA integrity of that
transcript.
The median TIN score across all transcripts can be used to measure the RNA integrity
of the “RNA sample”.
TIN ranges from 0 (the worst) to 100 (the best). TIN = 60 means that 60% of the
transcript would be covered if the read coverage were uniform.
TIN is set to 0 if the transcript has no coverage or if the number of reads
covering it is below the cutoff (-c/--minCov).
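The interpretation of the score given above can be made concrete with a small sketch. It assumes the Shannon-entropy definition used by the RSeQC TIN method (TIN = 100 * exp(H) / k, with H computed over the coverage at k sampled positions); this README does not spell out the formula, so treat the code as an illustration rather than the tool's implementation. The function name is hypothetical, and the minimum-coverage cutoff is not modelled.

```python
import math


def tin_score(coverage):
    """Entropy-based TIN for a list of per-position read depths.

    Assumption: TIN = 100 * exp(H) / k, where H is the Shannon entropy of
    the normalized non-zero coverage values and k = len(coverage).
    """
    k = len(coverage)
    total = sum(coverage)
    if k == 0 or total == 0:
        return 0.0  # no sampled positions or no coverage at all
    entropy = 0.0
    for depth in coverage:
        if depth > 0:
            p = depth / total
            entropy -= p * math.log(p)
    return 100.0 * math.exp(entropy) / k


uniform = [5] * 100             # every sampled position equally covered
partial = [3] * 60 + [0] * 40   # only 60% of positions covered, uniformly

print(round(tin_score(uniform), 6))  # 100.0
print(round(tin_score(partial), 6))  # 60.0
```

Under this definition, a transcript whose reads cover 60 of 100 sampled positions uniformly scores 60, which matches the "60% of the transcript" reading.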
Extended usage
Additionally, this repository provides three simple Python scripts.
TIN score merging
Merge TIN score tables for multiple samples.
python merge-tin.py [-h] [options]
Parameters
-h, --help show this help message and exit
-v {DEBUG,INFO,WARN,ERROR,CRITICAL}, --verbosity {DEBUG,INFO,WARN,ERROR,CRITICAL}
Verbosity/Log level. Defaults to ERROR
-l LOGFILE, --logfile LOGFILE
Store log to this file.
--input-files INFILES
Space-separated paths to the input tables.
--output-file OUTFILE
Path for the outfile with merged TIN scores
The output file is also formatted as a TSV table.
TIN score plotting
Create per-sample boxplots of TIN scores.
python plot-tin.py [-h] [options]
Parameters
-h, --help show this help message and exit
-v {DEBUG,INFO,WARN,ERROR,CRITICAL}, --verbosity {DEBUG,INFO,WARN,ERROR,CRITICAL}
Verbosity/Log level. Defaults to ERROR
-l LOGFILE, --logfile LOGFILE
Store log to this file.
--input-file INFILE Path to the table with merged TIN scores
--output-file-prefix OUTFILE_PREFIX
Prefix for the path to the TIN boxplots.
The boxplots are generated in PDF and
PNG formats under
output-file-prefix+.pdf and output-file-prefix+.png.
TIN score summary
Calculate simple summary statistics for the per-sample TIN scores.
python summarize-tin.py [-h] [options]
Parameters
-h, --help show this help message and exit
-v {DEBUG,INFO,WARN,ERROR,CRITICAL}, --verbosity {DEBUG,INFO,WARN,ERROR,CRITICAL}
Verbosity/Log level. Defaults to ERROR
-l LOGFILE, --logfile LOGFILE
Store log to this file.
--input-file INFILE Path to the table with merged TIN scores
--output-file OUTFILE
Path for the output table with TIN statistics.
The output file is also formatted as a TSV table.
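As a rough illustration of what per-sample summary statistics look like, the sketch below computes the mean and median TIN for each sample. The table layout is an assumption (a header row, transcript IDs in the first column, one column per sample); the actual layout of the merged table and the statistics reported by summarize-tin.py may differ.

```python
import csv
import io
import statistics


def summarize(tsv_text):
    """Return {sample: {"mean": ..., "median": ...}} for a merged TIN table.

    Assumed layout: tab-separated, header row, first column = transcript ID,
    remaining columns = one TIN score column per sample.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]
    columns = {sample: [] for sample in samples}
    for row in reader:
        for sample, value in zip(samples, row[1:]):
            columns[sample].append(float(value))
    return {
        sample: {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for sample, values in columns.items()
    }


# Hypothetical example data, not real tool output.
example = (
    "transcript\tsample_a\tsample_b\n"
    "tx1\t80\t90\n"
    "tx2\t60\t70\n"
    "tx3\t0\t50\n"
)
stats = summarize(example)
print(stats["sample_a"]["median"])  # 60.0
print(stats["sample_b"]["median"])  # 70.0
```

The median is the statistic described earlier as a measure of the RNA integrity of a whole sample.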
Run locally
In order to use the scripts you will need to clone this repository and install
the dependencies:
git clone https://github.com/zavolanlab/tin-score-calculation
cd tin-score-calculation
pip install .
You may want to install the dependencies inside a virtual environment,
e.g., using virtualenv. Alternatively, if you use conda, we provide an environment recipe too; in that case, just run conda env create.
Alternatively, you can install the package via PyPI or conda.
Some of the dependencies require specific system libraries to be installed; however, the package manager should take care of this.
You can then find the scripts in the scripts/ directory and run them as described in
the Main usage and Extended usage sections.
To run the tool with minimum test files, try:
Run inside container
If you have Docker installed, you can also pull the Docker image and execute the scripts as follows:
docker run -it quay.io/biocontainers/tin-score-calculation:0.6--pyh5e36f6f_0 calculate-tin.py --help
docker run -it quay.io/biocontainers/tin-score-calculation:0.6--pyh5e36f6f_0 merge-tin.py --help
docker run -it quay.io/biocontainers/tin-score-calculation:0.6--pyh5e36f6f_0 plot-tin.py --help
docker run -it quay.io/biocontainers/tin-score-calculation:0.6--pyh5e36f6f_0 summarize-tin.py --help
NOTE: To run the tool on your own data in this manner, you will probably
need to mount a volume so that the container can read input files from,
and write persistent output to, the host file system.
Version
0.6.3
Contact
Please see the list of contributors for contact information.