kcounter

A simple package for counting DNA k-mers in Python. Written in Rust.

Installation

There are two ways to install kcounter:

  • Using pip:
pip install kcounter
  • Using conda:
conda install -c conda-forge -c bioconda kcounter

Usage

Currently, kcounter provides a single function, count_kmers, which returns a dictionary mapping each k-mer of the chosen size found in the sequence to its count.

>>> import kcounter
>>> kcounter.count_kmers('AAACTTTTTT', 3)
{'AAA': 1.0, 'ACT': 1.0, 'AAC': 1.0, 'CTT': 1.0, 'TTT': 4.0}
>>> kcounter.count_kmers('AAACTTTTTT', 4)
{'AACT': 1.0, 'CTTT': 1.0, 'ACTT': 1.0, 'AAAC': 1.0, 'TTTT': 3.0}
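
To make the counting rule explicit, here is a minimal pure-Python sketch of overlapping k-mer counting. It is only an illustration of the idea, not kcounter's Rust implementation; the helper name count_kmers_reference is hypothetical, and the float values are chosen to mirror the output shown above.

```python
from collections import Counter


def count_kmers_reference(sequence, k):
    """Count every overlapping k-mer in `sequence`.

    Hypothetical helper for illustration; it is not part of kcounter.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length n has n - k + 1 overlapping windows of length k
    # (none at all when the sequence is shorter than k).
    windows = (sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Convert to float to mirror the value type in the kcounter output above.
    return {kmer: float(n) for kmer, n in Counter(windows).items()}


counts_k3 = count_kmers_reference("AAACTTTTTT", 3)
counts_k4 = count_kmers_reference("AAACTTTTTT", 4)
```

A 10-base sequence yields 8 windows for k=3 and 7 windows for k=4, which is why the counts above sum to 8 and 7 respectively.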

The relative_frequencies parameter can be used to obtain relative k-mer frequencies:

>>> kcounter.count_kmers('AAACTTTTTT', 3, relative_frequencies=True)
{'AAC': 0.125, 'TTT': 0.5, 'CTT': 0.125, 'ACT': 0.125, 'AAA': 0.125}
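
The values above are consistent with dividing each count by the total number of k-mers in the sequence (8 for this example). The following pure-Python sketch shows that normalization under this assumption; relative_kmer_frequencies is a hypothetical helper, not a kcounter function.

```python
from collections import Counter


def relative_kmer_frequencies(sequence, k):
    """Return each k-mer's share of all overlapping k-mers in `sequence`.

    Hypothetical helper for illustration; it is not part of kcounter.
    """
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    if total == 0:
        # Sequence shorter than k: there is nothing to normalize.
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


frequencies = relative_kmer_frequencies("AAACTTTTTT", 3)
```

By construction the frequencies of a non-empty result sum to 1, which makes profiles from sequences of different lengths comparable.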

The canonical_kmers parameter aggregates the counts of k-mers that are reverse complements of each other (e.g., AGC/GCT):

>>> kcounter.count_kmers('AAACTTTTTT', 3, canonical_kmers=True)
{'ACT': 1.0, 'AAA': 5.0, 'AAC': 1.0, 'AAG': 1.0}
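
A minimal pure-Python sketch of this aggregation follows. It assumes the canonical form of a k-mer is the lexicographically smaller of the k-mer and its reverse complement, which matches the output above (TTT is counted under AAA, and CTT under AAG) but is not confirmed as kcounter's internal rule. The helper names are hypothetical, and only the bases A, C, G and T are handled.

```python
from collections import Counter

# Translation table mapping each base to its complement (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Complement every base, then reverse the string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical_kmer_counts(sequence, k):
    """Count k-mers, merging each k-mer with its reverse complement.

    Hypothetical helper for illustration; it is not part of kcounter.
    Assumption: the canonical form is the lexicographically smaller string.
    """
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        counts[min(kmer, reverse_complement(kmer))] += 1
    # Convert to float to mirror the value type in the kcounter output above.
    return {kmer: float(n) for kmer, n in counts.items()}


canonical_counts = canonical_kmer_counts("AAACTTTTTT", 3)
```

Merging the two strands this way is useful when the orientation of a sequence is unknown, since a k-mer and its reverse complement describe the same double-stranded site.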

Plans for future versions:

  • Performance improvements.
  • Add a parameter that makes the function return sparse k-mer counts.
  • Implement a function that returns a numpy array.