Please note that files in read_list.txt need not be in the same format. Each file can independently be either FASTA or FASTQ, and can additionally be compressed in GNU Zip (gzip) format.
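One way to build such a list is sketched below. The function name make_read_list and the ./reads directory are hypothetical and not part of NECAT; the sketch only assumes that read files end in .fasta, .fa, .fastq, or .fq, optionally followed by .gz.

```shell
# Hypothetical helper (not part of NECAT): write the absolute paths of all
# FASTA/FASTQ files under a directory, gzip-compressed or not, to a list file.
make_read_list() {
    reads_dir=$1
    out_file=$2
    # Resolve the directory to an absolute path so every listed path is full.
    abs_dir=$(cd "$reads_dir" && pwd) || return 1
    find "$abs_dir" -type f \
        \( -name '*.fasta' -o -name '*.fa' -o -name '*.fastq' -o -name '*.fq' \
           -o -name '*.fasta.gz' -o -name '*.fa.gz' \
           -o -name '*.fastq.gz' -o -name '*.fq.gz' \) \
        | sort > "$out_file"
}
```

For example, make_read_list ./reads read_list.txt would write one full path per line, which is the layout described above.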
Step 2: Correct raw reads
Correct the raw noisy reads using the following command:
$ necat.pl correct ecoli_config.txt
The pipeline corrects only the longest 40X (PREP_OUTPUT_COVERAGE) of raw reads. The corrected reads are in the files ./ecoli/1-consensus/cns_iter${NUM_ITER}/cns.fasta. The longest 30X (CNS_OUTPUT_COVERAGE) of corrected reads are extracted for assembly and written to the file ./ecoli/1-consensus/cns_final.fasta.
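To see roughly how much coverage a read file represents, the total number of bases can be divided by the genome size. The sketch below assumes that "40X" means total bases equal to 40 times the genome size. The function name fasta_coverage is hypothetical, and the genome size in the usage note is an illustrative value, not one taken from NECAT.

```shell
# Hypothetical helper (not part of NECAT): estimate the coverage of a FASTA
# file as total bases divided by genome size, truncated to an integer.
fasta_coverage() {
    fasta=$1
    genome_size=$2
    # Sum the lengths of all sequence lines (header lines start with '>').
    awk -v g="$genome_size" '
        !/^>/ { bases += length($0) }
        END   { printf "%d\n", int(bases / g) }
    ' "$fasta"
}
```

For example, fasta_coverage ./ecoli/1-consensus/cns_final.fasta 4600000 prints an estimate that can be compared against CNS_OUTPUT_COVERAGE, assuming a genome of about 4.6 Mbp.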
Step 3: Assemble contigs
After correcting the raw reads, we assemble the contigs using the following command. If the correction step has not been run, the command runs it automatically first.
$ necat.pl assemble ecoli_config.txt
The assembled contigs are in the file ./ecoli/4-fsa/contigs.fasta.
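A quick way to inspect the assembly is to list each contig with its length. This is a minimal sketch using awk; the function name contig_lengths is hypothetical and not part of NECAT, and it assumes a standard FASTA file whose sequences may be wrapped over several lines.

```shell
# Hypothetical helper (not part of NECAT): print one "name<TAB>length" line per
# sequence in a FASTA file, handling sequences wrapped over several lines.
contig_lengths() {
    awk '
        /^>/ { if (name != "") print name "\t" len
               name = substr($1, 2); len = 0; next }
             { len += length($0) }
        END  { if (name != "") print name "\t" len }
    ' "$1"
}
```

For example, contig_lengths ./ecoli/4-fsa/contigs.fasta lists the assembled contigs, and the output can be piped to sort -k2,2nr to put the longest first.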
Step 4: Bridge contigs
After assembling the contigs, we run the bridging step using the following command. The command checks whether the preceding steps have been completed and runs them first if not.
$ necat.pl bridge ecoli_config.txt
The bridged contigs are in the file ./ecoli/6-bridge_contigs/bridged_contigs.fasta.
If POLISH_CONTIGS is set, the pipeline uses the corrected reads to polish the bridged contigs. The polished contigs are in the file ./ecoli/6-bridge_contigs/polished_contigs.fasta.
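After a run, it can be handy to confirm that the output files named in the steps above were actually produced. The sketch below is a hypothetical helper, not part of NECAT. It checks only the paths mentioned in this document, and it treats polished_contigs.fasta as expected, which holds only when POLISH_CONTIGS is set.

```shell
# Hypothetical helper (not part of NECAT): report which of the output files
# named in the steps above exist and are non-empty under a project directory.
check_necat_outputs() {
    project_dir=$1
    missing=0
    for rel in \
        1-consensus/cns_final.fasta \
        4-fsa/contigs.fasta \
        6-bridge_contigs/bridged_contigs.fasta \
        6-bridge_contigs/polished_contigs.fasta
    do
        if [ -s "$project_dir/$rel" ]; then
            echo "ok       $rel"
        else
            echo "missing  $rel"
            missing=$((missing + 1))
        fi
    done
    # Non-zero exit status when at least one file is absent or empty.
    [ "$missing" -eq 0 ]
}
```

For example, check_necat_outputs ./ecoli prints one status line per file and returns a non-zero status if any of them is missing.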
Running with multiple computation nodes
On PBS and SGE systems, NECAT can be run on multiple computation nodes. To do so, set the following in the config file (Step 1 of Quick Start):
USE_GRID=true
GRID_NODE=4
In the above example, 4 computation nodes will be used and each computation node will run with THREADS CPU threads.
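Since each node runs with THREADS CPU threads, the total number of threads requested is GRID_NODE multiplied by THREADS. The sketch below computes that product from a config file. The function name total_grid_threads is hypothetical and not part of NECAT, and the sketch assumes the config consists of plain KEY=VALUE lines as in the example above.

```shell
# Hypothetical helper (not part of NECAT): read KEY=VALUE settings from a
# config file and print GRID_NODE * THREADS, the total CPU threads requested.
total_grid_threads() {
    config=$1
    nodes=$(grep -E '^GRID_NODE=' "$config" | tail -n 1 | cut -d= -f2)
    threads=$(grep -E '^THREADS=' "$config" | tail -n 1 | cut -d= -f2)
    # Fail instead of computing with a missing value.
    [ -n "$nodes" ] && [ -n "$threads" ] || return 1
    echo $((nodes * threads))
}
```

For example, with GRID_NODE=4 and an illustrative THREADS=8 in ecoli_config.txt, total_grid_threads ecoli_config.txt prints 32.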
Citation
Chen Y, Nie F, Xie S Q, et al. Efficient assembly of nanopore reads via highly accurate and intact error correction[J]. Nature Communications, 2021, 12(1): 1-10.
Introduction
NECAT is an error correction and de-novo assembly tool for Nanopore long noisy reads.
If you are interested in calling structural variants from Nanopore reads, you are welcome to try our necatsv.
Installation
We have successfully tested NECAT on
If you meet problems in running NECAT like
Please update your perl to a newer version (such as v5.26).
There are two ways to install NECAT.
Install from executable binaries
Build from source codes
After installation, all the executable files can be found in NECAT/Linux-amd64/bin. The command line above is used for adding NECAT/Linux-amd64/bin to the system PATH.
Quick Start
Before running NECAT, please do not forget to add NECAT/Linux-amd64/bin to the system PATH.
Step 1: Create a config file
Create a config file template using the following command:
The template looks like
After filling in and modifying the relevant information, we have
read_list.txt in the second line above contains the full paths of all read files. It looks like
Contact