目录

FASTQ tools

fastqtk is a fast and lightweight tool for interleaving/deinterleaving/counting/trimming FASTQ files.

Installation

git clone https://github.com/ndaniel/fastqtk.git
cd fastqtk
make

Usage

Usage:   fastqtk <command> <arguments>

Command:
      interleave       interleaves two paired-end FASTQ files.
      deinterleave     splits an (already) interleaved (paired-end) FASTQ file.
      count            counts all reads from a FASTQ file.
      lengths          summary statistics for lengths of reads from a FASTQ file.
      count-lengths    number of reads and summary statistics for lengths of reads from a FASTQ file.
      tab-4            converts a FASTQ file to a text tab-delimited file with 4 columns.
      tab-8            converts a (interleaved paired-end) FASTQ file to text tab-delimited file with 8 columns.
      detab            converts a text tab-delimited file with 4/8 columns (converted using tab4/tab8) to FASTQ file.
      retain-5         retains the first N bp from 5'end of the reads from a FASTQ file.
      retain-3         retains the last N bp from 3'end of the reads from a FASTQ file.
      trim-5           trims 5' end of the reads from a FASTQ file.
      trim-3           trims 3' end of the reads from a FASTQ file.
      trim-id          trims reads ids (removes everything after first space) from a FASTQ file.
      trim-poly        trims poly-A/C/G/T/N tails at both ends of the reads sequences from a FASTQ file.
      drop-se          drops unpaired reads from an interleaved paired-end FASTQ file.
      drop-short       drops reads that have short sequences (below a given threshold).
      fq2fa            converts a FASTQ file to FASTA file.
      fa2fq            converts a FASTA file to FASTQ file.
      compress-id      lossy compression of the reads ids from a FASTQ file.
      NtoA             replaces all Ns in reads sequences with As in a FASTQ file.
      rev-com          reverse complements all reads in a FASTQ file.

Examples

  • Interleave two FASTQ paired-end files:

      fastqtk interleave reads1.fq reads2.fq out.fq
      fastqtk interleave reads1.fq reads2.fq - | gzip > out.fq.gz
  • Deinterleave a FASTQ file:

      fastqtk deinterleave reads.fq ou1.fq ou2.fq
      zcat reads.fq.gz | fastqtk deinterleave - ou1.fq ou2.fq
  • Count reads from a FASTQ file:

      fastqtk count reads.fq count.txt
  • Summary statistics regarding lengths of reads from a FASTQ file:

      fastqtk lengths reads.fq lengths.txt
  • Count of reads and summary statistics regarding lengths of reads from a FASTQ file:

      fastqtk count-lengths reads.fq count.txt lengths.txt
  • Convert a FASTQ file into a text tab-delimited file with four columns:

      fastqtk tab4 reads.fq fastq.txt
      
  • Convert an interleaved FASTQ file into a text tab-delimited file with 8 columns:

      fastqtk tab8 reads.fq fastq.txt
  • Convert back text tab-delimited file with 4 or 8 columns into a FASTQ file:

      fastqtk detab fastq.txt reads.fq
  • Remove the reads that are strictly shorter than 30bp from a FASTQ file:

      fastqtk drop-short 30 reads.fq out.fq
      
  • Replace all Ns in reads sequences from a FASTQ file:

      fastqtk NtoA reads.fq out.fq
  • Trim 10bp from 5’ end of the reads sequences from FASTQ file:

      fastqtk trim5 10 reads.fq out.fq
      
  • Trim 10bp from 3’ end of the reads sequences from FASTQ file:

      fastqtk trim3 10 reads.fq out.fq
      
  • Retain the first 70bp from 5’ end of the reads sequences and trim the rest:

      fastqtk retain5 70 reads.fq out.fq
      
  • Retain the first 70bp from 3’ end of the reads sequences and trim the rest:

      fastqtk retain3 70 reads.fq out.fq
      
  • Trim N or Ns from both ends of the reads sequences from FASTQ file:

      fastqtk trim-poly N 1 reads.fq out.fq
  • Trim polyA, which are strictly or equally longer than 15bp, from both ends of the reads sequences from FASTQ file:

      fastqtk trim-poly A 15 reads.fq out.fq
  • Compress the reads ids from a FASTQ file:

      fastqtk count reads.fq count.txt
      fastqtk compress-id count.txt reads.fq out.fq
      
  • Compress the reads ids from a FASTQ file (without counting the reads and when it is known that number of reads is below 200 million reads):

      fastqtk compress-id 200000000 reads.fq out.fq
      
  • Remove the unpaired reads from an interleaved FASTQ file:

      fastqtk drop-seq reads.fq out.fq
  • Convert a FASTQ file to a FASTA file:

      fastqtk fq2fa reads.fq reads.fa
  • Convert a FASTA file to a FASTQ file:

      fastqtk fa2fq reads.fa reads.fq
      
  • Reverse complement all reads from a FASTQ file:

      fastqtk rev-com reads.fq reads_revcom.fq
      
      
      
关于

用于快速处理FASTQ格式的测序数据文件,提供格式转换、质量评估和过滤等功能。

238.0 KB
邀请码
    Gitlink(确实开源)
  • 加入我们
  • 官网邮箱:gitlink@ccf.org.cn
  • QQ群
  • QQ群
  • 公众号
  • 公众号

版权所有:中国计算机学会技术支持:开源发展技术委员会
京ICP备13000930号-9 京公网安备 11010802032778号