目录

DNAbarcodes

Simple generator of DNA barcodes for sequencing experiments

A pair of simple scripts to generate DNA barcodes with nice properties:

  • arbitrary length of code
  • a minimum Hamming distance between each code
  • a GC content percentage upper and lower bound for each code
  • a guarantee that no code will contain runs of longer than three of the same base

Dependencies

  • Python 3.6 (though >3.4 may work)
  • Numba 0.36
  • Numpy > 1.12

Best best is to use an Anaconda python distribution. Anaconda 5.0.1 of Python 3.6 worked out of the box for me.

Usage

There are two scripts: generate_barcodes.py and validate_barcodes.py. generate_barcodes.py takes a length and filename, and outputs all possible codes of that length:

$ python generate_barcodes <length> <output file>

Once the potential barcode strings have been generated, you can run validate_barcodes.py to get your barcodes by specifying the file of all possible codes, the name where your barcodes should be written, and how many barcodes you need:

$ python validate_barcodes <input file> <output file> <number of codes>
关于

用于生成和评估DNA条形码序列,以支持高通量测序实验

50.0 KB
邀请码
    Gitlink(确实开源)
  • 加入我们
  • 官网邮箱:gitlink@ccf.org.cn
  • QQ群
  • QQ群
  • 公众号
  • 公众号

版权所有:中国计算机学会技术支持:开源发展技术委员会
京ICP备13000930号-9 京公网安备 11010802047560号