5.12 Creating a Balloon Plot

5.12.1 Problem

You want to make a balloon plot, where the area of the dots is proportional to their numerical value.

5.12.2 Solution

Use geom_point() with scale_size_area(). For this example, we’ll filter the data set countries to only include data from the year 2009, for certain countries we have specified in countrylist:
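The filtering step might look like the following sketch. It assumes that countries is the data set shipped with the gcookbook package and that it has Name and Year columns; the particular names placed in countrylist are illustrative choices rather than a required selection:

```r
library(gcookbook)  # Assumed source of the countries data set
library(dplyr)

# Illustrative selection of countries (an assumption; any set of names works)
countrylist <- c("Canada", "Ireland", "United Kingdom", "United States",
                 "New Zealand", "Iceland", "Japan", "Luxembourg",
                 "Netherlands", "Switzerland")

# Keep only the 2009 rows for the selected countries
cdat <- countries %>%
  filter(Year == 2009, Name %in% countrylist)

cdat
```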

If we just map GDP to size, the value of GDP gets mapped to the radius of the dots (Figure 5.36, left), which is not what we want; a doubling of value results in a quadrupling of area, and this will distort the interpretation of the data. We instead want to map the value of GDP to the area of the dots, which we can do using scale_size_area() (Figure 5.36, right):
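A sketch of the two plots, under some assumptions: the x and y variables (healthexp and infmortality, taken to be columns of the gcookbook countries data), the point styling, the shortened country list, and max_size = 15 are all illustrative choices, not values given in the text:

```r
library(gcookbook)  # Assumed source of the countries data set
library(dplyr)
library(ggplot2)

# Illustrative subset, as in the filtering step (country names are an assumption)
countrylist <- c("Canada", "Ireland", "United Kingdom", "United States",
                 "New Zealand", "Iceland", "Japan", "Luxembourg",
                 "Netherlands", "Switzerland")
cdat <- countries %>%
  filter(Year == 2009, Name %in% countrylist)

# Base plot: GDP is mapped to the size aesthetic
cdat_sp <- ggplot(cdat, aes(x = healthexp, y = infmortality, size = GDP)) +
  geom_point(shape = 21, colour = "black", fill = "cornsilk")

# Default size scale (left panel)
cdat_sp

# Area proportional to GDP, with a larger maximum circle (right panel)
cdat_area <- cdat_sp +
  scale_size_area(max_size = 15)
cdat_area
```

With scale_size_area(), a value of zero maps to a dot of zero area, and max_size sets the size of the dot for the largest value.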

Figure 5.36: Balloon plot with value mapped to radius (left); with value mapped to area (right)

5.12.3 Discussion

The example here is a scatter plot, but that is not the only way to use balloon plots. It may also be useful to use balloon plots to represent values on a grid, where the x- and y-axes are categorical, as in Figure 5.37:
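A grid-style balloon plot could be sketched as follows. This assumes the data come from R's built-in HairEyeColor table, summed over sex to give one count per combination of Hair and Eye; max_size = 20 and the point styling are illustrative choices:

```r
library(dplyr)
library(ggplot2)

# Sum the counts for males and females for each Hair/Eye combination
hec <- HairEyeColor %>%
  as_tibble() %>%               # Long format with columns Hair, Eye, Sex, n
  group_by(Hair, Eye) %>%
  summarize(count = sum(n), .groups = "drop")

# Balloon plot on a grid: both axes are categorical
hec_grid <- ggplot(hec, aes(x = Eye, y = Hair)) +
  geom_point(aes(size = count), shape = 21, colour = "black", fill = "cornsilk") +
  scale_size_area(max_size = 20, guide = "none")

hec_grid
```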

Figure 5.37: Balloon plot with categorical axes and text labels. With guide points to help position text (left); without guide points (right)

In this example we’ve used a few tricks to add the text labels under the circles. First, we used vjust = 1.3 to justify the top of the text slightly below the y coordinate. Next, we wanted to set the y coordinate so that it is at the bottom of each circle. This requires a little wrangling and arithmetic: we first need to convert the levels of Hair and Eye into numeric values, which involves converting these variables from character vectors to factors, and then converting them again to numeric. We then take the numeric value of Hair and subtract a small offset that grows with count. Because the area of each circle is proportional to count, its radius is proportional to the square root of count, so the offset is based on the square root of count. The number that this value is divided by (34 in this case) is found by trial and error; it depends on the particular data values, radius, text size, and output image size.

To help find the correct y offset, we can add guide points in red and adjust the value until they line up with the bottom of each circle. Once we have the correct value, we can place the text and remove the points.
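The label placement and the temporary guide points might be sketched like this. The vjust value and the divisor 34 come from the description above; the HairEyeColor data, max_size = 20, the specific grey ("grey60"), and the text size are assumptions made for illustration:

```r
library(dplyr)
library(ggplot2)

# Sum the counts for males and females for each Hair/Eye combination
hec <- HairEyeColor %>%
  as_tibble() %>%
  group_by(Hair, Eye) %>%
  summarize(count = sum(n), .groups = "drop")

# Circles plus text labels placed just below the bottom of each circle
hec_sp <- ggplot(hec, aes(x = Eye, y = Hair)) +
  geom_point(aes(size = count), shape = 21, colour = "black", fill = "cornsilk") +
  scale_size_area(max_size = 20, guide = "none") +
  geom_text(
    # Numeric position of Hair, shifted down by an offset proportional to sqrt(count)
    aes(y = as.numeric(as.factor(Hair)) - sqrt(count) / 34, label = count),
    vjust = 1.3,
    colour = "grey60",
    size = 4
  )

# Temporary red guide points at the same y position, used while tuning the divisor
hec_guides <- hec_sp +
  geom_point(aes(y = as.numeric(as.factor(Hair)) - sqrt(count) / 34),
             colour = "red", size = 1)

hec_guides  # With guide points
hec_sp      # Final version, without guide points
```

If the output size, max_size, or text size changes, the divisor will most likely need to be tuned again.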

The text under the circles is in a shade of grey. This is so that it doesn’t jump out at the viewer and overwhelm the perceptual impact of the circles, but is still available if the viewer wants to know the exact values.

5.12.4 See Also

To add labels to the circles, see Recipe 5.11 and Recipe 7.1.

See Recipe 5.4 for ways of mapping variables to other aesthetics in a scatter plot.