A package for De-Barcoding and Error Correction of sequencing data containing molecular barcodes. For information on getting started, including installing and running Debarcer, see the wiki
A sample config file is provided in /debarcer/config/sample_config.ini, and a sample prepfile is provided in /debarcer/config/library_prep_types.ini.
Parameters provided in the config can also be provided in the command. But parameters in the config have precedence over parameters in the command if duplicated.
The prepfile contains instructions for how different library preps should be handled.
Information for several library preps is already in the prep file. Custom library prep is available but it should be noted that:
a single UMI is allowed
UMI length is constant for all reads
the same spacer sequence is expected for all reads, if present
INPUT_READS: Number of unprocessed fastq files (1-3)
OUTPUT_READS: Number of reheadered fastq files (1-2)
UMI_LOCS: Indices of unprocessed fastq files with a UMI (1-3). Single value or comma-separated
UMI_LENS: Lengths of the UMIs in each fastq with umis. Single value or comma-separated
SPACER: True if a spacer is used (TRUE/FALSE)
SPACER_SEQ: Sequence of the spacer ([A,C,G,T]+)
UMI_POS: Left-most position of the umi within read (1-based). Single value or comma-separated
UMI_INLINE: True id umis are inline with reads, False if umis are in separate input fastq (TRUE/FALSE)
Single-value of parameters UMI_LENS, UMI_POS are propagated to all fastqs with umis.
However, for comma-separated values, info must be listed in the same order as UMI_LOCS,
and the number of values must match that of UMI_LOCS
Multiple spacers are not allowed. It is assumed that the same spacer is used for all input fastqs with umis if spacer if present.
Use empty string or None to specify SPACER_SEQ when SPACER=False
LIBRARY_NAME (eg, SIMSENSEQ-PE) would then be the --prepname argument passed to debarcer.py.
Typical Workflow
Example commands. See the wiki for a full description of parameters
2. Align processed fastqs (outside of debarcer)
debarcer does not align processed fastqs
* align fastqs (eg, with bwa-mem)
* bam should then be coordinate-sorted
* index bam
bam_file.bam and bam_file.bam.bai are required for following steps
3. Error-correct and group UMIs into families
```python
debarcer group -o /path/to/output_dir -r "chrN:posA-posB" -b /path/to/bamfile.bam -d 1 -p 10 -i False
-t False
Debarcer
A package for De-Barcoding and Error Correction of sequencing data containing molecular barcodes. For information on getting started, including installing and running
Debarcer, see the wikiNote : The repository for the original release of Debarcer (V0.3.1) as described in Nature Protocols can be found under https://github.com/oicr-gsi/debarcer/releases/tag/v0.3.1
https://github.com/oicr-gsi/debarcer/tree/master-old
Configuration
A sample config file is provided in
/debarcer/config/sample_config.ini, and a sample prepfile is provided in/debarcer/config/library_prep_types.ini. Parameters provided in the config can also be provided in the command. But parameters in the config have precedence over parameters in the command if duplicated.The prepfile contains instructions for how different library preps should be handled. Information for several library preps is already in the prep file. Custom library prep is available but it should be noted that:
Example information required in the prepfile:
Single-value of parameters UMI_LENS, UMI_POS are propagated to all fastqs with umis. However, for comma-separated values, info must be listed in the same order as UMI_LOCS, and the number of values must match that of UMI_LOCS
Multiple spacers are not allowed. It is assumed that the same spacer is used for all input fastqs with umis if spacer if present. Use empty string or None to specify SPACER_SEQ when SPACER=False
LIBRARY_NAME(eg, SIMSENSEQ-PE) would then be the--prepnameargument passed to debarcer.py.Typical Workflow
Example commands. See the wiki for a full description of parameters
-p “SIMSENSEQ-PE” -pf /path/to/library_prep_types.ini -c /path/to/config.ini -px newfile_name
-r “chrN:posA-posB” -u /path/to/Umifiles/umifile.json -f “1,3,5” -ct 1 -pt 50 -p 10 -m 1000000 -t False -i False -stp nofilter
-mc 500 -rt 95
Dependencies
Debarcer was tested using Python 3.6.4 and depends on the packages pysam and pandas (among others). See
requirements.txt.