SeqStat is a package that contains tools
to generate stats from a FASTQ file,
merge those stats for multiple samples,
and validate the generated stats files.
Mode - Generate
Generate outputs several stats on a FASTQ file.
Outputted stats:
  Bases
    Total number
    Base qualities, with the number of bases having that quality
    Number of each nucleotide
  Reads
    Total number
    Minimum length
    Maximum length
    A histogram of the average base qualities
    The quality encoding (Sanger, Solexa, etc.)
    A histogram of the read lengths
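To illustrate what these statistics mean, here is a minimal Python sketch that computes them from FASTQ text. It is not SeqStat's implementation: the function name `fastq_stats`, the output layout, the four-line record layout, and the Sanger quality offset of 33 are all assumptions made for this example.

```python
from collections import Counter
from io import StringIO

PHRED_OFFSET = 33  # assumption: Sanger encoding


def fastq_stats(handle):
    """Collect per-base and per-read statistics from FASTQ text (hypothetical layout)."""
    nucleotides = Counter()     # number of each nucleotide
    base_qualities = Counter()  # quality score -> number of bases with that score
    read_lengths = Counter()    # read length -> number of reads
    mean_qualities = Counter()  # average read quality (integer) -> number of reads

    lines = [line.rstrip("\n") for line in handle]
    # Assumes four-line records: header, sequence, '+', quality string.
    for i in range(0, len(lines) - 3, 4):
        seq, qual = lines[i + 1], lines[i + 3]
        scores = [ord(c) - PHRED_OFFSET for c in qual]
        nucleotides.update(seq)
        base_qualities.update(scores)
        read_lengths[len(seq)] += 1
        if scores:
            mean_qualities[sum(scores) // len(scores)] += 1

    return {
        "bases": {
            "total": sum(nucleotides.values()),
            "nucleotides": dict(nucleotides),
            "qualities": dict(base_qualities),
        },
        "reads": {
            "total": sum(read_lengths.values()),
            "min_length": min(read_lengths) if read_lengths else 0,
            "max_length": max(read_lengths) if read_lengths else 0,
            "length_histogram": dict(read_lengths),
            "mean_quality_histogram": dict(mean_qualities),
        },
    }


example = "@r1\nACGT\n+\nIIII\n@r2\nGGA\n+\n##I\n"
stats = fastq_stats(StringIO(example))
print(stats["bases"]["total"], stats["reads"]["min_length"], stats["reads"]["max_length"])
```

For the two example reads this reports 7 bases in total, with read lengths ranging from 3 to 4.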
Mode - Merge
Merge combines multiple SeqStat files and keeps the sample/library/readgroup structure.
If required, this structure can be collapsed; the output file then does not have any sample/library/readgroup structure.
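The difference between merging and collapsing can be sketched in Python as follows. The nested dictionary layout (sample, then library, then readgroup, then counts) and the functions `merge` and `collapse` are hypothetical and only illustrate the idea; the actual SeqStat file format may differ.

```python
from collections import Counter


def merge(a, b):
    """Recursively merge two nested dicts, summing numeric leaves."""
    out = dict(a)
    for key, value in b.items():
        if key in out and isinstance(value, dict):
            out[key] = merge(out[key], value)
        elif key in out:
            out[key] = out[key] + value
        else:
            out[key] = value
    return out


def collapse(tree, depth=3):
    """Sum the count dicts found `depth` levels down (sample/library/readgroup)."""
    if depth == 0:
        return Counter(tree)
    total = Counter()
    for child in tree.values():
        total.update(collapse(child, depth - 1))
    return total


# Hypothetical per-run stats: sample -> library -> readgroup -> counts
run1 = {"sampleA": {"lib1": {"rg1": {"reads": 10, "bases": 1000}}}}
run2 = {
    "sampleA": {"lib1": {"rg2": {"reads": 5, "bases": 500}}},
    "sampleB": {"lib1": {"rg1": {"reads": 2, "bases": 150}}},
}

merged = merge(run1, run2)          # structure is kept
collapsed = dict(collapse(merged))  # structure is dropped
print(collapsed)
```

Summing only works for counts and histograms. Values such as the minimum and maximum read length would need to be combined with `min` and `max` instead.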
Mode - Validate
Validate checks whether a file generated by SeqStat is consistent.
If the aggregated values cannot be regenerated from the rest of the file, the file is considered corrupt.
This should only happen when a user has edited the SeqStat file manually.
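As a rough illustration of this kind of check, the Python sketch below recomputes the aggregate values from the detailed counts and reports any mismatch. The dictionary layout and the `validate` function are assumptions for this example, not the SeqStat format or its actual validation rules.

```python
import copy


def validate(stats):
    """Return a list of inconsistencies between aggregates and detailed counts."""
    errors = []
    bases, reads = stats["bases"], stats["reads"]
    hist = reads["length_histogram"]  # read length -> number of reads

    if sum(bases["nucleotides"].values()) != bases["total"]:
        errors.append("nucleotide counts do not add up to the base total")
    if sum(bases["qualities"].values()) != bases["total"]:
        errors.append("quality counts do not add up to the base total")
    if sum(hist.values()) != reads["total"]:
        errors.append("length histogram does not add up to the read total")
    if sum(length * count for length, count in hist.items()) != bases["total"]:
        errors.append("read lengths do not add up to the base total")
    if hist and (min(hist) != reads["min_length"] or max(hist) != reads["max_length"]):
        errors.append("min/max length do not match the length histogram")
    return errors


# Hypothetical stats for two reads of length 4 and 3
good = {
    "bases": {
        "total": 7,
        "nucleotides": {"A": 2, "C": 1, "G": 3, "T": 1},
        "qualities": {40: 5, 2: 2},
    },
    "reads": {
        "total": 2,
        "min_length": 3,
        "max_length": 4,
        "length_histogram": {3: 1, 4: 1},
    },
}

tampered = copy.deepcopy(good)
tampered["bases"]["total"] = 8  # simulate a manual edit

print(validate(good))      # no inconsistencies
print(validate(tampered))  # reports the mismatching totals
```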
Documentation
For documentation and manuals visit our github.io page.
About
SeqStat is part of the BIOPET tool suite, which is developed at LUMC by the SASC team. Each tool in the BIOPET tool suite is meant to offer a standalone function that can be used to perform a dedicated data analysis task or added as part of a pipeline, for example the SASC team’s biowdl pipelines.
All tools in the BIOPET tool suite are Free/Libre and Open Source Software.
Contact
For any question related to SeqStat, please use the GitHub issue tracker or contact the SASC team directly at: sasc@lumc.nl.