Tinscan: TE-Insertion-Scanner
Scan whole genome alignments for transposon insertion signatures.
Table of contents
Algorithm overview
Options and usage
Installing Tinscan
Requirements:
You can set up a conda environment with the required dependencies using the YAML files in this repo:
For ARM64 (Apple Silicon Macs), create a virtual Intel env.
For all other operating systems, use environment.yml.
With the conda env active, you can now install Tinscan.
Install from PyPI.
For the latest development version, clone and install from this repository.
Example usage
Find insertion events in genome A (target) relative to genome B (query).
Prepare Input Genomes
Split A and B genomes into two directories containing one scaffold per file. Check that sequence names are unique within genomes.
Output: data/A_target_split/*.fa, data/B_query_split/*.fa
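To illustrate what this preparation step involves, here is a minimal Python sketch that splits a multi-FASTA file into one file per sequence and rejects duplicate sequence names. It is not Tinscan's own implementation: the function name `split_fasta` is hypothetical, and it assumes that a sequence name is the header text up to the first whitespace and that names are safe to use as file names.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir):
    """Write each record of a multi-FASTA file to its own <name>.fa file.

    Hypothetical helper for illustration only. Returns the sequence names
    in input order and raises ValueError on an empty or duplicate name.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = []
    seen = set()
    handle = None
    try:
        with open(fasta_path) as fasta:
            for line in fasta:
                if line.startswith(">"):
                    # Assumption: name = header text up to the first whitespace.
                    fields = line[1:].split()
                    if not fields:
                        raise ValueError("FASTA header without a sequence name")
                    name = fields[0]
                    if name in seen:
                        raise ValueError(f"duplicate sequence name: {name}")
                    seen.add(name)
                    names.append(name)
                    # Close the previous record's file before starting a new one.
                    if handle is not None:
                        handle.close()
                    handle = open(out_dir / f"{name}.fa", "w")
                # Lines before the first header are skipped.
                if handle is not None:
                    handle.write(line)
    finally:
        if handle is not None:
            handle.close()
    return names
```

Running the same check on both genomes before alignment catches name collisions early, since each output file is named after its sequence.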
Align Genomes
Align each scaffold from genome B onto each genome A scaffold. Report alignments with >= 60% identity and length >= 100bp.
Output: A_Inserts/A_Inserts_vs_B.tab
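The reporting thresholds above amount to a simple filter on each alignment. The sketch below shows that filter in Python. The `Hit` record and its fields are hypothetical and do not reflect the column layout of the .tab file that Tinscan writes; coordinates are assumed to be 1-based and inclusive.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    # Hypothetical alignment record, not Tinscan's actual output format.
    target: str      # sequence name in genome A
    t_start: int     # 1-based inclusive start on the target
    t_end: int       # 1-based inclusive end on the target
    query: str       # sequence name in genome B
    identity: float  # percent identity of the alignment


def passes_filter(hit, min_identity=60.0, min_length=100):
    """Return True if the hit meets both the identity and length thresholds."""
    length = hit.t_end - hit.t_start + 1
    return hit.identity >= min_identity and length >= min_length


hits = [
    Hit("A1", 1, 100, "B1", 60.0),    # exactly at both thresholds: kept
    Hit("A1", 500, 598, "B1", 95.0),  # 99 bp: too short
    Hit("A2", 1, 2000, "B2", 59.9),   # identity below 60%
]
kept = [h for h in hits if passes_filter(h)]
```

Both thresholds are inclusive here, matching the ">=" wording above.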
Note: Alignment tasks can be limited to a specified set of pairwise comparisons where appropriate (e.g. when homologous chromosome pairs are known between assemblies) using the option --pairs.
Comparisons are specified with a tab-delimited text file, where column 1 contains sequence names from genome A and column 2 contains sequence names from genome B.
In the example Chromosome_pairs.txt, Chr A2 has been assembled as two scaffolds (B2, B3) in genome B.
#Chromosome_pairs.txt
A1	B1
A2	B2
A2	B3
A3	B4
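As a rough illustration of how such a pairs file maps onto alignment tasks, the Python sketch below reads a two-column tab-delimited file into a dictionary from each genome A sequence name to the genome B sequences it should be compared against. The function name `read_pairs` is hypothetical, and treating lines that start with "#" as comments is an assumption, not documented Tinscan behaviour.

```python
from collections import defaultdict


def read_pairs(path):
    """Map each genome A sequence name to a list of genome B sequence names.

    Hypothetical helper for illustration only. Blank lines and lines
    starting with '#' are skipped (an assumption about the file format).
    """
    pairs = defaultdict(list)
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.split("\t")
            if len(fields) < 2:
                raise ValueError(f"expected two tab-separated columns: {line!r}")
            a_name, b_name = fields[0].strip(), fields[1].strip()
            pairs[a_name].append(b_name)
    return dict(pairs)
```

For a file in which A2 is paired with both B2 and B3, the entry for A2 would hold both names, so the one-to-many case of a chromosome split across scaffolds needs no special handling.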
Find Insertions
Scan alignments for insertion events and report them as GFF3 annotations on genome A.
Output: A_Inserts/A_Inserts_vs_B_l100_id80.gff3
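For readers unfamiliar with GFF3, the sketch below shows how a single candidate insertion on a genome A sequence could be written as one nine-column, tab-separated GFF3 line. It only illustrates the file format: the function name `gff3_line`, the source value "tinscan", the feature type "insertion", and the example coordinates are all assumptions, and the feature types and attributes that Tinscan actually writes may differ.

```python
def gff3_line(seqid, start, end, feature_id,
              source="tinscan", feature_type="insertion", strand="."):
    """Format one GFF3 feature line with 1-based inclusive coordinates.

    Hypothetical helper: source and feature_type defaults are placeholders,
    not values confirmed from Tinscan's output.
    """
    if start < 1 or end < start:
        raise ValueError("GFF3 requires 1 <= start <= end")
    columns = [
        seqid,             # column 1: sequence name in genome A
        source,            # column 2: program that produced the feature
        feature_type,      # column 3: feature type
        str(start),        # column 4: start
        str(end),          # column 5: end
        ".",               # column 6: score (not set)
        strand,            # column 7: strand
        ".",               # column 8: phase (only used for CDS features)
        f"ID={feature_id}",  # column 9: attributes
    ]
    return "\t".join(columns)


# Example with made-up coordinates on sequence A2.
line = gff3_line("A2", 15000, 21500, "insert_1")
```

A complete GFF3 file would also start with the `##gff-version 3` directive line.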
License
Software provided under MIT license.