
Coding Potential Calculator

Introduction

Coding Potential Calculator (CPC) is a Support Vector Machine-based classifier that assesses the protein-coding potential of a transcript (i.e., whether a cDNA/RNA transcript could encode a peptide) based on six biologically meaningful sequence features. It takes nucleotide sequences in FASTA format as input and reports the coding status of each sequence together with the “supporting evidence” for that call.
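
Because CPC expects nucleotide FASTA input, it can help to sanity-check a file before submitting it. The following is a minimal sketch using standard shell tools; it is not part of CPC, and the helper names count_fasta_records and has_only_nucleotides are hypothetical. It assumes Unix line endings and treats any IUPAC nucleotide code as valid.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the CPC distribution.

# Count FASTA records: every record starts with a '>' header line.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Succeed only if no sequence line (non-header line) contains a character
# outside the IUPAC nucleotide codes, in upper or lower case.
has_only_nucleotides() {
    ! grep -v '^>' "$1" | grep -q '[^ACGTUNRYKMSWBDHVacgtunrykmswbdhv]'
}
```

For example, `count_fasta_records my_seqs.fa` prints the number of sequences, and `has_only_nucleotides my_seqs.fa || echo "not nucleotide FASTA"` flags a file that looks like protein (my_seqs.fa is a placeholder name).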

Prerequisites:

  1. NCBI BLAST package: a local copy can be downloaded from http://www.ncbi.nlm.nih.gov/blast/

  2. A relatively comprehensive protein database. Either UniRef90 or NCBI nr is suitable.
    The database should be named “prot_db” and placed under the data/ subdirectory.
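
A quick way to confirm that the required command-line tools are available before installing is sketched below. This check is not part of CPC; the function name missing_tools is hypothetical, and the tool list is an assumption drawn from the commands used in the Install steps (formatdb from the BLAST package, plus make, gzip and tar).

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of the CPC distribution.

# Print the name of each listed tool that is not found on the PATH;
# return non-zero if at least one is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Assumed tool list, taken from the commands in the Install steps.
missing_tools formatdb make gzip tar || echo "Install the tools listed above first." >&2
```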

Install:

  1. Unpack the tarball:

    tom@linux$ gzip -dc cpc-0.9-r2.tar.gz | tar xf -

  2. Build third-party packages:

    tom@linux$ cd cpc-0.9-r2
    tom@linux$ export CPC_HOME="$PWD"
    tom@linux$ cd libs/libsvm
    tom@linux$ gzip -dc libsvm-2.81.tar.gz | tar xf -
    tom@linux$ cd libsvm-2.81
    tom@linux$ make clean && make
    tom@linux$ cd ../..
    tom@linux$ gzip -dc estate.tar.gz | tar xf -
    tom@linux$ cd estate
    tom@linux$ make clean && make

  3. Format the BLAST database, name it “prot_db”, and place it under cpc/data/:

    tom@linux$ cd $CPC_HOME/data
    tom@linux$ formatdb -i (your_fasta_file) -p T -n prot_db

  4. Run the prediction:

    tom@linux$ cd $CPC_HOME
    tom@linux$ bin/run_predict.sh (input_seq) (result_in_table) (working_dir) (result_evidence)
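
The last two steps can be guarded with a small wrapper that refuses to run when the database is missing. This is only a sketch and not part of CPC: the function names db_is_formatted and run_cpc are hypothetical, and the file-name check assumes the usual output of the legacy formatdb tool for protein databases (prot_db.phr, prot_db.pin and prot_db.psq for a single volume, or a prot_db.pal alias file when a large database is split into volumes).

```shell
#!/bin/sh
# Hypothetical wrapper, not part of the CPC distribution.

# Succeed if DIR holds a formatted protein database named prot_db.
# Assumes legacy formatdb naming: .phr/.pin/.psq for a single volume,
# or a .pal alias file when the database was split into volumes.
db_is_formatted() {
    dir=$1
    if [ -f "$dir/prot_db.phr" ] && [ -f "$dir/prot_db.pin" ] && [ -f "$dir/prot_db.psq" ]; then
        return 0
    fi
    [ -f "$dir/prot_db.pal" ]
}

# Run the prediction only when CPC_HOME is set and the database is in place.
# All arguments are passed through to bin/run_predict.sh unchanged.
run_cpc() {
    if [ -z "${CPC_HOME:-}" ]; then
        echo "CPC_HOME is not set" >&2
        return 1
    fi
    if ! db_is_formatted "$CPC_HOME/data"; then
        echo "prot_db not found under $CPC_HOME/data" >&2
        return 1
    fi
    # Run from CPC_HOME, as in step 4; relative paths resolve from there.
    (cd "$CPC_HOME" && bin/run_predict.sh "$@")
}
```

A call would then look like `run_cpc input.fa result.tab work_dir result.evidence`, where the four file names are placeholders for the (input_seq), (result_in_table), (working_dir) and (result_evidence) arguments.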

=========
See the website for a tutorial and more details: http://cpc.cbi.pku.edu.cn

Contact: cpc@mail.cbi.pku.edu.cn
