
aricode

R-CMD-check | CRAN Status | Coverage status | Lifecycle: stable

A package for efficient computations of standard clustering comparison measures

Installation

The stable version is available on CRAN:

install.packages("aricode")

The development version is available via:

devtools::install_github("jchiquet/aricode")

Description

Computation of measures for clustering comparison (ARI, AMI, NID and even the $\chi^2$ distance) is usually based on the contingency table. Traditional implementations (e.g., function adjustedRandIndex of package mclust) are in $\Omega(n + uv)$, where

  • $n$ is the size of the vectors whose classifications are to be compared,
  • $u$ and $v$ are the respective numbers of classes in each vector.

In aricode we propose an implementation, based on radix sort, that is in $\Theta(n)$ in time and space.
Importantly, the complexity does not depend on $u$ and $v$. Our implementation of the ARI, for instance, is one or two orders of magnitude faster than some standard implementations in R.
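
To illustrate the idea, here is a minimal base-R sketch that computes the ARI from the counts of the non-empty cells only, obtained by sorting the pairs of labels, so that the full contingency table is never built. This is not the code used inside aricode: the function name ari_by_sorting is hypothetical, and the sketch relies on base R's order(..., method = "radix") rather than the package's own sort.

```r
# Hypothetical helper, for illustration only (not part of aricode).
ari_by_sorting <- function(c1, c2) {
  stopifnot(length(c1) == length(c2))
  n <- length(c1)

  # Recode labels as integers 1..u and 1..v
  x <- match(c1, unique(c1))
  y <- match(c2, unique(c2))

  # Sort the pairs (x, y) lexicographically with base R's radix method
  o <- order(x, y, method = "radix")
  x <- x[o]
  y <- y[o]

  # A new run starts wherever the sorted pair changes;
  # run lengths are the non-empty cells n_ij of the contingency table
  new_run <- c(TRUE, x[-1] != x[-n] | y[-1] != y[-n])
  n_ij <- tabulate(cumsum(new_run))
  n_i <- tabulate(x)  # row margins
  n_j <- tabulate(y)  # column margins

  # Number of unordered pairs among k items (as doubles to avoid overflow)
  choose2 <- function(k) {
    k <- as.numeric(k)
    k * (k - 1) / 2
  }

  s_ij <- sum(choose2(n_ij))
  s_i <- sum(choose2(n_i))
  s_j <- sum(choose2(n_j))
  expected <- s_i * s_j / choose2(n)

  (s_ij - expected) / ((s_i + s_j) / 2 - expected)
}

ari_by_sorting(c(1, 1, 2, 2, 3, 3), c(1, 1, 2, 2, 3, 3))
```

Only the cells that actually occur are counted, which is why the cost does not grow with the product $uv$. The result is undefined (NaN) in degenerate cases such as a single observation.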

Available measures and functions

The functions included in aricode are:

  • (A)RI: computes the (adjusted) Rand index
  • MARI: computes the modified adjusted Rand index as defined in Sundqvist et al., 2023
  • NID: computes the normalized information distance
  • NMI: computes the normalized mutual information
  • NVI: computes the normalized variation of information
  • AMI: computes the adjusted mutual information
  • Chi2: computes the Chi-square statistic
  • Frobenius: computes the Frobenius norm between two classifications as defined in Arlot et al., 2019
  • entropy: computes the conditional and joint entropies
  • compare_clustering: computes all clustering comparison measures at once
  • sort_pairs: radix sort for pairs of elements
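
A short usage sketch for a few of these measures, assuming aricode is installed and that each function takes the two classification vectors as its first two arguments. The toy vectors c1 and c2 are made up for illustration.

```r
library(aricode)

# Two toy partitions of the same six observations
c1 <- c(1, 1, 1, 2, 2, 2)
c2 <- c(1, 1, 2, 2, 3, 3)

ARI(c1, c2)  # adjusted Rand index
NMI(c1, c2)  # normalized mutual information
AMI(c1, c2)  # adjusted mutual information
NID(c1, c2)  # normalized information distance
```

Labels only need to identify groups: a vector of integers, characters or factors of the same length as the other classification should work equally well.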

Timings

Here are some timings comparing the cost of computing the adjusted Rand index with aricode and with the commonly used function adjustedRandIndex of the mclust package. The cost of the latter can be prohibitive for large vectors.
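
A comparison of this kind can be sketched as follows, assuming aricode is installed. The mclust part runs only if that package is available. The vector size, the number of classes and the seed are arbitrary choices for illustration, and the timings obtained will depend on the machine.

```r
library(aricode)

set.seed(1)
n <- 1e5  # arbitrary size, increase to see the gap widen
c1 <- sample(1:100, n, replace = TRUE)
c2 <- sample(1:100, n, replace = TRUE)

# Time the aricode implementation
system.time(ari_aricode <- ARI(c1, c2))

# Time the mclust implementation, if mclust is installed
if (requireNamespace("mclust", quietly = TRUE)) {
  print(system.time(ari_mclust <- mclust::adjustedRandIndex(c1, c2)))
  # Both implementations should agree up to numerical tolerance
  print(all.equal(ari_aricode, ari_mclust))
}
```

For a more careful benchmark, repeat each call several times and vary both n and the number of classes.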
