KEGG Orthology Based Annotation System (KOBAS) is a standalone Python
application in Bioinformatics. KOBAS can assign appropriate KO terms for
queried sequences based on similarity search, and it can further discover
enriched KO terms among the annotation results by frequency of pathways or
statistical significance of pathways.
Installation
Please refer to install.txt under docs directory in KOBAS source
Usage
KOBAS composes of two major scripts, which are similar to usual command line
tools: the behavor can be adjusted by swiches and configuration file. The whole
work flow can be divided to two steps: sequence annotation and pathway discovery.
Sequence Annotation
Given a FASTA file of protein, you can annotate them by this command:
blast2ko.py <FASTA FILE>
All the result are printed default, you can redirect into a file. If the input
are nuclear acid sequences, you can adjust blast2ko behavor by the command
option “-p” as follows:
blast2ko.py -p blastx <FASTA FILE>
Pathway Discovery
Once the annotation is complete, pathfind can be used to discover enriched
pathways by two ways: frequency prior and statistally significance prior.
You can find the most frequent pathways among the annotation result:
pathfind.py <ANNOATION FILE>
Or, you can also find the most statistically enriched pathways:
‘-k’ option specifies the statistical method to rank, ‘b’ represents binomial
distribution, ‘c’ represents chisquare distribution and ‘h’ represents
hypergeometry distrubtion. is three-charater abbreviation
of the organism in KEGG GENES as backgroud distribution. Also, you can specifiy
an data file as backround like this:
Please refer other details to online manpage by the command “ -h”.
Contact
Please, report bugs, comments and suggestions to the email below. Also, if you
succeed in compiling this module under a platform other than Linux, please, send
us a note.
KEGG Orthology Based Annotation System (KOBAS) is a standalone Python application in Bioinformatics. KOBAS can assign appropriate KO terms for queried sequences based on similarity search, and it can further discover enriched KO terms among the annotation results by frequency of pathways or statistical significance of pathways.
Please refer to install.txt under docs directory in KOBAS source
KOBAS composes of two major scripts, which are similar to usual command line tools: the behavor can be adjusted by swiches and configuration file. The whole work flow can be divided to two steps: sequence annotation and pathway discovery.
Sequence Annotation
Given a FASTA file of protein, you can annotate them by this command:
All the result are printed default, you can redirect into a file. If the input are nuclear acid sequences, you can adjust blast2ko behavor by the command option “-p” as follows:
Pathway Discovery
Once the annotation is complete, pathfind can be used to discover enriched pathways by two ways: frequency prior and statistally significance prior. You can find the most frequent pathways among the annotation result:
Or, you can also find the most statistically enriched pathways:
‘-k’ option specifies the statistical method to rank, ‘b’ represents binomial distribution, ‘c’ represents chisquare distribution and ‘h’ represents hypergeometry distrubtion. is three-charater abbreviation of the organism in KEGG GENES as backgroud distribution. Also, you can specifiy an data file as backround like this:
Please refer other details to online manpage by the command “ -h”.
Please, report bugs, comments and suggestions to the email below. Also, if you succeed in compiling this module under a platform other than Linux, please, send us a note.
Authors: Xizeng Mao, Jianmin Wu, Tao Cai, Chen Xie, Liping Wei
Email: kobas@mail.cbi.pku.edu.cn
Web: http://kobas.cbi.pku.edu.cn
Enjoy :)