## 6.9 Making a Violin Plot

### 6.9.1 Problem

You want to make a violin plot to compare density estimates of different groups.

### 6.9.2 Solution

Use `geom_violin()` (Figure 6.22):

``````library(ggplot2)
library(gcookbook) # Load gcookbook for the heightweight data set

# Create a base plot using the heightweight data set
hw_p <- ggplot(heightweight, aes(x = sex, y = heightIn))

hw_p +
  geom_violin()``````

### 6.9.3 Discussion

Violin plots are a way of comparing multiple data distributions. With ordinary density curves, it is difficult to compare more than just a few distributions because the lines visually interfere with each other. With a violin plot, it’s easier to compare several distributions since they’re placed side by side.

A violin plot is a kernel density estimate, mirrored so that it forms a symmetrical shape. Traditionally, violin plots also have a narrow box plot overlaid, with a white dot at the median, as shown in Figure 6.23. The box plot outliers are not displayed, which we accomplish by setting `outlier.colour = NA`:

``````# With ggplot2 older than 3.3.0, use fun.y instead of fun
hw_p +
  geom_violin() +
  geom_boxplot(width = .1, fill = "black", outlier.colour = NA) +
  stat_summary(fun = median, geom = "point", fill = "white", shape = 21, size = 2.5)``````

In this example we layered the objects from the bottom up, starting with the violin, then the box plot, then the white dot at the median, which is calculated using `stat_summary()`.
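
To make the idea of a mirrored kernel density estimate concrete, here is a rough sketch that traces one violin by hand. It assumes the default bandwidth of base R's `density()` and restricts the estimate to the observed range; the names `heights_f`, `dens`, `outline`, and `manual_violin` are made up for this illustration, and the result approximates rather than reproduces what `geom_violin()` draws:

```r
library(ggplot2)
library(gcookbook) # For the heightweight data set

# Kernel density estimate of height for one group (females),
# restricted to the observed range of the data
heights_f <- heightweight$heightIn[heightweight$sex == "f"]
dens <- density(heights_f, from = min(heights_f), to = max(heights_f))

# Mirror the density around a vertical center line:
# left side from bottom to top, then right side from top to bottom
outline <- data.frame(
  x        = c(-dens$y, rev(dens$y)),
  heightIn = c(dens$x, rev(dens$x))
)

manual_violin <- ggplot(outline, aes(x = x, y = heightIn)) +
  geom_polygon(fill = "white", colour = "black")
manual_violin
```

The horizontal axis here is the density itself, which is why violin plots usually hide it: only the shape and the relative widths carry meaning.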

By default, each violin spans only the range of the data, from the minimum to the maximum value, so its flat ends sit at the extremes of the data. To keep the tails, set `trim = FALSE` (Figure 6.24):

``````hw_p +
  geom_violin(trim = FALSE)``````
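
One way to check the effect of `trim` without relying on the picture is to look at the data that ggplot2 computes for the layer. The sketch below uses `layer_data()` for that; the variable names (`trimmed`, `untrimmed`) are only for illustration, and it assumes the computed density positions are stored in the `y` column, as they are for a vertical violin:

```r
library(ggplot2)
library(gcookbook) # For the heightweight data set

hw_p <- ggplot(heightweight, aes(x = sex, y = heightIn))

# layer_data() returns the data frame computed for a layer
trimmed   <- layer_data(hw_p + geom_violin())
untrimmed <- layer_data(hw_p + geom_violin(trim = FALSE))

# Compare the vertical extent of the violins with the range of the raw data
range(heightweight$heightIn)
range(trimmed$y)   # Stops at the data extremes
range(untrimmed$y) # Extends past the data extremes into the tails
```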

By default, the violins are scaled so that the total area of each one is the same (if `trim = TRUE`, the scaling is based on what the area would be if the tails were included). Instead of equal areas, you can use `scale = "count"` to scale the areas proportionally to the number of observations in each group (Figure 6.25). In this example, there are slightly fewer females than males, so the female violin becomes slightly narrower than before:

``````# Scaled area proportional to number of observations
hw_p +
  geom_violin(scale = "count")``````
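
To see why the female violin shrinks, it can help to count the observations in each group. This is a small sketch using base R's `table()`; the name `group_n` is just for illustration:

```r
library(gcookbook) # For the heightweight data set

# Number of observations per group; with scale = "count",
# violin areas are proportional to these counts
group_n <- table(heightweight$sex)
group_n

# The same counts as proportions of all observations
prop.table(group_n)
```

The closer the group sizes are to each other, the less difference `scale = "count"` makes compared to the default equal-area scaling.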

To change the amount of smoothing, use the `adjust` parameter, as described in Recipe 6.3. The default value is 1; use larger values for more smoothing and smaller values for less smoothing (Figure 6.26):

``````# More smoothing
hw_p +