fqtools is a software suite for fast processing of FASTQ files. Various file manipulations are supported. See below for a full list of the subcommands available and a brief description of their purpose. Most of the individual subcommands take either a single file or a pair of files as input. If no input file is specified, fqtools will attempt to read data from stdin. In this case, it is advisable to specify the format of the data provided. For subcommands that generate FASTQ data, either a single file or a pair of files will be generated. If no -o argument is provided, single files will be written to stdout.
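As a rough illustration of reading from stdin, the sketch below writes a two-read FASTQ file and feeds it to the count subcommand, stating the input format explicitly with -f F. The file name sample.fastq and its contents are made up for the example. Placing the global -f option before the subcommand name is an assumption based on the argument list in this README; check fqtools -h for the exact usage.

```shell
# Write a tiny two-read FASTQ file (hypothetical example data).
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' > sample.fastq

# Feed the data to fqtools on stdin. There is no file extension to inspect,
# so the format is given explicitly: -f F means uncompressed FASTQ.
# Assumption: global options are placed before the subcommand name.
if command -v fqtools >/dev/null 2>&1; then
    fqtools -f F count < sample.fastq
else
    echo "fqtools not found on PATH; skipping" >&2
fi
```

The guard around the call only keeps the snippet harmless on machines where fqtools has not been built yet.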
Citation
If you use fqtools in published work, please include a reference to my Bioinformatics paper:
Droop, A. P. (2016). fqtools: An efficient software suite for modern FASTQ file manipulation. Bioinformatics (Oxford, England). DOI: 10.1093/bioinformatics/btw088
Installation
fqtools requires building against both the zlib and htslib libraries:
zlib is required for processing compressed (.gz) data. The code relies on several recent zlib file IO functions, so the zlib version must be >= 1.2.3.5.
htslib is required for reading BAM files. If htslib is not installed, download and compile htslib. Then, alter the HTSDIR path in the fqtools Makefile to point to the htslib source directory.
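One way to check the zlib version requirement before building is sketched below. It assumes pkg-config is installed and has a zlib entry, which is not stated in this README. The helper name version_ge is made up for the example.

```shell
# Return success if version $1 is at least version $2.
# Relies on sort -V (GNU coreutils and recent BSD sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: pkg-config is available and knows about zlib.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists zlib; then
    zlib_version=$(pkg-config --modversion zlib)
    if version_ge "$zlib_version" 1.2.3.5; then
        echo "zlib $zlib_version is recent enough"
    else
        echo "zlib $zlib_version is older than 1.2.3.5" >&2
    fi
else
    echo "could not query zlib via pkg-config" >&2
fi
```

If pkg-config is not available, the ZLIB_VERSION macro in zlib.h carries the same information.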
If zlib is already installed, the build can be performed along the following lines:
git clone https://github.com/alastair-droop/fqtools
cd fqtools/
git clone https://github.com/samtools/htslib
cd htslib/
autoheader
autoconf
./configure
make
make install
cd ..
make
You might need to run make install as sudo make install. The htslib library must be installed in a location where the built fqtools program can find it, because the fqtools executable is dynamically linked to htslib. If you cannot (or do not want to) install htslib, you must add the directory containing the libhts.so file to your LD_LIBRARY_PATH variable.
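For the case where htslib is built but not installed, a minimal sketch of the LD_LIBRARY_PATH setup follows. The directory in HTS_LIB_DIR is a placeholder that assumes htslib was compiled inside the current directory; point it at wherever libhts.so actually lives.

```shell
# Hypothetical location of the compiled htslib tree; adjust as needed.
HTS_LIB_DIR="$PWD/htslib"

# Prepend the directory, keeping any existing LD_LIBRARY_PATH entries.
# The ${VAR:+...} form avoids a trailing colon when the variable is unset.
export LD_LIBRARY_PATH="$HTS_LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "$LD_LIBRARY_PATH"
```

The setting only lasts for the current shell session; adding the export line to a shell startup file makes it persistent.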
Licence
fqtools is released under the GNU General Public License version 3.
Subcommands
The fqtools suite contains the following subcommands:
view        View FASTQ files
head        View the first reads in FASTQ files
count       Count FASTQ file reads
header      View FASTQ file header data
sequence    View FASTQ file sequence data
quality     View FASTQ file quality data
header2     View FASTQ file secondary header data
fasta       Convert FASTQ files to FASTA format
basetab     Tabulate FASTQ base frequencies
qualtab     Tabulate FASTQ quality character frequencies
type        Attempt to guess the FASTQ quality encoding type
validate    Validate FASTQ files
find        Find FASTQ reads containing specific sequences
trim        Trim reads in a FASTQ file
qualmap     Translate quality values using a mapping file
Each subcommand has its own set of arguments. The global arguments are:
-h             Show this help message and exit.
-v             Show the program version and exit.
-d             Allow DNA sequence bases (ACGTN)
-r             Allow RNA sequence bases (ACGUN)
-a             Allow ambiguous sequence bases (RYKMSWBDHV)
-m             Allow mask sequence base (X)
-u             Allow uppercase sequence bases
-l             Allow lowercase sequence bases
-p CHR         Set the pair replacement character (default "%")
-b BUFSIZE     Set the input buffer size
-B BUFSIZE     Set the output buffer size
-q QUALTYPE    Set the quality score encoding
-f FORMAT      Set the input file format
-F FORMAT      Set the output file format
-i             Read interleaved input file pairs
-I             Write interleaved output file pairs
CHR
This character will be replaced by the pair value when writing paired files.
BUFSIZE
Possible suffixes are [bkMG]. If no suffix is given, the value is in bytes.
QUALTYPE
u    Do not assume a specific quality score encoding
s    Interpret quality scores as Sanger encoded
o    Interpret quality scores as Solexa encoded
i    Interpret quality scores as Illumina encoded
FORMAT
F    uncompressed FASTQ format (.fastq)
f    compressed FASTQ format (.fastq.gz)
b    unaligned BAM format (.bam)
u    attempt to infer format from file extension (default .fastq.gz)
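To make the quality encoding choice more concrete, the sketch below decodes a single quality character into its integer score under different offsets. The offsets (33 for Sanger, 64 for Solexa and the older Illumina 1.3 to 1.7 encodings) are the usual FASTQ conventions rather than values taken from the fqtools source, and the helper name qual_score is made up for the example. Solexa scores also use a different scale, which this sketch does not convert.

```shell
# Convert one quality character to its integer score by subtracting the
# encoding offset from the character's numeric code.
qual_score() {
    char=$1
    offset=$2
    # A leading single quote makes printf emit the character's numeric code.
    code=$(printf '%d' "'$char")
    echo $((code - offset))
}

qual_score I 33   # 'I' read as Sanger (offset 33)
qual_score h 64   # 'h' read as Illumina (offset 64)
```

The same character gives different scores under different encodings, which is why stating -q explicitly can matter when the encoding cannot be inferred.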