Beeswarm plots (aka column scatter plots or violin scatter plots) are a
way of plotting points that would ordinarily overlap so that they fall
next to each other instead. In addition to reducing overplotting, they
help visualize the density of the data at each point (similar to a
violin plot), while still showing each data point individually.
ggbeeswarm provides two different methods to create beeswarm-style
plots using ggplot2. It does this by adding two new ggplot geom objects:
geom_quasirandom: Uses a van der Corput sequence or Tukey texturing
(Tukey and Tukey “Strips displaying empirical distributions: I. textured
dot strips”) to space the dots to avoid overplotting. This uses
sherrillmix/vipor.
geom_beeswarm: Uses the beeswarm library to do point-size based offset.
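To give a feel for why a van der Corput sequence spaces points evenly, here is a minimal base R sketch of the base-2 sequence. The function name van_der_corput is hypothetical, and this is not the code used by vipor, which additionally scales the offsets by the estimated density of the data.

```r
# Hypothetical helper: the n-th van der Corput value in a given base.
# It mirrors the base-b digits of n around the radix point,
# e.g. 6 = 110 in binary becomes 0.011 in binary = 0.375.
van_der_corput <- function(n, base = 2) {
  value <- 0
  denom <- 1
  while (n > 0) {
    denom <- denom * base
    value <- value + (n %% base) / denom
    n <- n %/% base
  }
  value
}

# The first eight values fill the unit interval without clumping.
offsets <- vapply(1:8, van_der_corput, numeric(1))
print(offsets)
# 0.5000 0.2500 0.7500 0.1250 0.6250 0.3750 0.8750 0.0625
```

Each new value lands in the largest remaining gap, which is what makes the sequence useful for horizontal offsets: neighbouring points rarely receive similar offsets, unlike with purely random jitter.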
Features:
Can handle categorical variables on the y-axis (thanks @smsaladi,
@koncina)
Automatically dodges if a grouping variable is categorical and
dodge.width is specified (thanks @josesho)
See the examples below.
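As one illustration of the dodging feature, the following sketch maps the categorical drv column of the mpg dataset to colour and sets dodge.width. The choice of mpg, drv and the width of 0.8 is an assumption made for illustration only.

```r
library(ggplot2)
library(ggbeeswarm)

# drv is categorical, so mapping it to colour creates groups within each class;
# dodge.width places those groups side by side instead of on top of each other.
p_dodge <- ggplot(mpg, aes(class, hwy, color = drv)) +
  geom_quasirandom(dodge.width = 0.8)
```

Printing p_dodge draws the plot. geom_beeswarm accepts the same dodge.width argument.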
Installation
This package is on CRAN, so installation should be as simple as:
install.packages('ggbeeswarm')
If you want the development version from GitHub, you can do:
devtools::install_github("eclarke/ggbeeswarm")
Examples
Here is a comparison between geom_jitter and geom_quasirandom on the
iris dataset:
set.seed(12345)
library(ggplot2)
library(ggbeeswarm)
# compare to jitter
ggplot(iris,aes(Species, Sepal.Length)) + geom_jitter()
ggplot(iris,aes(Species, Sepal.Length)) + geom_quasirandom()
# With categorical y-axis
ggplot(mpg,aes(hwy, class)) + geom_quasirandom(groupOnX=FALSE)
# Some groups may have only a few points. Use `varwidth=TRUE` to adjust width dynamically.
ggplot(mpg,aes(class, hwy)) + geom_quasirandom(varwidth = TRUE)
# With categorical y-axis
ggplot(mpg,aes(hwy, class)) + geom_beeswarm(size=.5)
# Also watch out for points escaping from the plot with geom_beeswarm
ggplot(mpg,aes(hwy, class)) + geom_beeswarm(size=.5) + scale_y_discrete(expand=expansion(add=c(0.5,1)))
geom_quasirandom()
Using geom_quasirandom:
Alternative methods
geom_quasirandom can also use several other methods to distribute points, selected through its method argument.
geom_beeswarm()
Using geom_beeswarm:
Alternative methods
Different point distribution priority
Corral runaway points
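As a rough sketch of the options these headings refer to, the calls below assume a recent ggbeeswarm release (0.7 or later, as far as I know) in which geom_beeswarm accepts the method and corral arguments. The particular values and cex settings are chosen for illustration only.

```r
library(ggplot2)
library(ggbeeswarm)

p <- ggplot(iris, aes(Species, Sepal.Length))

# geom_quasirandom: alternative ways to distribute points
p_tukey  <- p + geom_quasirandom(method = "tukey")
p_pseudo <- p + geom_quasirandom(method = "pseudorandom")

# geom_beeswarm: alternative layout method and point placement priority
p_hex     <- p + geom_beeswarm(method = "hex", cex = 2)
p_density <- p + geom_beeswarm(priority = "density", cex = 2)

# geom_beeswarm: a large cex pushes points far sideways;
# corral = "wrap" folds them back into their own column
p_corral <- p + geom_beeswarm(cex = 4, corral = "wrap", corral.width = 0.9)
```

Printing any of these objects draws the corresponding plot, so the variants are easy to compare side by side.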
Authors: Erik Clarke, Scott Sherrill-Mix, and Charlotte Dawson