cvanmf

An implementation of bicrossvalidation for Non-negative Matrix Factorisation (NMF) rank selection, along with methods for analysis and visualisation of NMF decomposition.

For details on the method, please see:

Graphical Abstract

cvaNMF abstract

The left section is a schematic depicting the procedures implemented in cvaNMF; on the right is a summary of results reported in the manuscript (in preparation).

Documentation

Documentation can be found at readthedocs.

Installation

cvanmf is available from bioconda:

conda install --name {envname} -c bioconda -c conda-forge cvanmf

or from PyPI using pip:

pip install cvanmf

Overview

NMF is an unsupervised machine learning technique which represents a numeric input matrix $X$ as a mixture of $k$ underlying parts. In this package we refer to each of these parts as a signature. Each signature can be described by how much each feature contributes to it. For example, we can represent the abundance of bacteria in the human gut as a mixture of 5 signatures.
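To make the idea of signatures concrete, the following is a minimal sketch of NMF using plain NumPy and the standard multiplicative-update rules. It is not cvanmf's API or implementation; the function name `nmf` and the toy data are assumptions made for illustration only. The matrix `W` holds how much each feature contributes to each signature, and `H` holds how much of each signature is present in each sample.

```python
import numpy as np

def nmf(X, k, n_iter=1000, seed=0, eps=1e-10):
    """Illustrative NMF: approximate non-negative X (features x samples) as W @ H."""
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    W = rng.random((n_features, k)) + eps  # feature weights for each signature
    H = rng.random((k, n_samples)) + eps   # signature weights for each sample
    for _ in range(n_iter):
        # Multiplicative updates for the squared (Frobenius) reconstruction error
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data (made up): 6 features x 8 samples generated from 2 signatures
rng = np.random.default_rng(1)
true_W = rng.random((6, 2))
true_H = rng.random((2, 8))
X = true_W @ true_H

W, H = nmf(X, k=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print("W (features x signatures):", W.shape)
print("H (signatures x samples):", H.shape)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Each column of `W` is one signature, and each column of `H` describes one sample as a mixture of those signatures.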

The number of signatures (or rank, $k$) has to be specified when performing NMF, and selecting an appropriate value for $k$ is an important step. We implement bicrossvalidation with Gabriel-style holdouts. Broadly speaking, this method holds out one block of the matrix ($A$) and makes an estimate of it ($A'$) using the remainder of the matrix. How closely $A'$ resembles $A$ is used to identify an appropriate rank.
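The holdout procedure can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not cvanmf's own code: the names `gabriel_holdout_error` and `bicv`, the fold scheme, and the toy data are all hypothetical. The matrix is split into four blocks, with $A$ held out, $B$ and $C$ sharing its rows and columns respectively, and $D$ fully retained. Signatures are learnt from $D$ only, weights for the held-out rows and columns are fitted against $B$ and $C$, and their product gives the estimate $A'$.

```python
import numpy as np

def _update_H(X, W, H, eps=1e-10):
    return H * (W.T @ X) / (W.T @ W @ H + eps)

def _update_W(X, W, H, eps=1e-10):
    return W * (X @ H.T) / (W @ H @ H.T + eps)

def nmf(X, k, rng, n_iter=300):
    """Illustrative NMF by multiplicative updates."""
    W = rng.random((X.shape[0], k)) + 1e-3
    H = rng.random((k, X.shape[1])) + 1e-3
    for _ in range(n_iter):
        H = _update_H(X, W, H)
        W = _update_W(X, W, H)
    return W, H

def gabriel_holdout_error(X, row_mask, col_mask, k, rng, n_iter=300):
    """Hold out block A and estimate it from the blocks B, C and D."""
    A = X[np.ix_(row_mask, col_mask)]     # held-out block
    B = X[np.ix_(row_mask, ~col_mask)]    # held-out rows, retained columns
    C = X[np.ix_(~row_mask, col_mask)]    # retained rows, held-out columns
    D = X[np.ix_(~row_mask, ~col_mask)]   # fully retained block
    W_D, H_D = nmf(D, k, rng, n_iter)     # signatures learnt from D only
    # Fit weights for the held-out rows (against H_D) and columns (against W_D)
    W_A = rng.random((B.shape[0], k)) + 1e-3
    H_A = rng.random((k, C.shape[1])) + 1e-3
    for _ in range(n_iter):
        W_A = _update_W(B, W_A, H_D)
        H_A = _update_H(C, W_D, H_A)
    A_est = W_A @ H_A                     # the estimate A'
    return np.linalg.norm(A - A_est) / np.linalg.norm(A)

def bicv(X, ranks, n_folds=3, seed=0):
    """Mean held-out error for each candidate rank over all row/column fold pairs."""
    rng = np.random.default_rng(seed)
    row_fold = rng.permutation(X.shape[0]) % n_folds
    col_fold = rng.permutation(X.shape[1]) % n_folds
    errors = {}
    for k in ranks:
        errs = [gabriel_holdout_error(X, row_fold == i, col_fold == j, k, rng)
                for i in range(n_folds) for j in range(n_folds)]
        errors[k] = float(np.mean(errs))
    return errors

# Toy data (made up): 30 features x 40 samples from 3 signatures plus slight noise
rng = np.random.default_rng(42)
true_W = rng.random((30, 3)) ** 4
true_H = rng.random((3, 40)) ** 4
X = true_W @ true_H + 0.001 * rng.random((30, 40))

errors = bicv(X, ranks=range(1, 6))
best_k = min(errors, key=errors.get)
for k, err in errors.items():
    print(f"k={k}: mean held-out error {err:.4f}")
print("rank with lowest held-out error:", best_k)
```

Taking the rank with the lowest mean error is only one possible selection rule; in practice it is worth inspecting how the error changes across ranks and repeating the procedure over several random splits.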

Input

Any numeric matrix can be used as input, with samples on columns, and features on rows. Each row should describe something similar, e.g. each is the abundance of a microbe, or abundance of a transcript. A minimum of 2 samples is required. When the number of samples $n$ is close to the number of signatures $k$, signatures are likely to represent individual samples rather than broad patterns.
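As a rough illustration of these input requirements, the sketch below checks the orientation and size of a matrix before decomposition. The helper `prepare_input` and its `min_samples_per_signature` threshold are hypothetical and not part of cvanmf; the threshold in particular is an arbitrary stand-in for "$n$ close to $k$".

```python
import numpy as np

def prepare_input(X, k, min_samples_per_signature=3):
    """Hypothetical helper: basic checks on a features x samples matrix."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("expected a 2-D matrix (features x samples)")
    n_features, n_samples = X.shape
    if n_samples < 2:
        raise ValueError("at least 2 samples (columns) are required")
    # Arbitrary illustrative threshold for "number of samples close to k"
    few_samples = n_samples < min_samples_per_signature * k
    return X, few_samples

# A table with 4 samples on rows and 3 microbes on columns: the wrong way round
samples_by_features = np.array([
    [10, 0, 5],
    [8, 1, 4],
    [0, 7, 2],
    [1, 9, 3],
])

# Transpose so that features are on rows and samples are on columns
X, few_samples = prepare_input(samples_by_features.T, k=2)
print(X.shape)      # (3, 4)
print(few_samples)  # True, since 4 samples < 3 * 2
```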

Container

We provide a container image for linux/amd64 through the GitHub Container Registry (GHCR), with the current version being ghcr.io/apduncan/cvanmf:latest. This is intended either for running the cvanmf command-line tools, or as a container for using cvanmf within pipelines. Please see the documentation for more details.

References

If you use this tool please cite: For details on the method, please see:

For background on NMF see:
