Chapter 8 Axes

The x- and y-axes provide context for interpreting the displayed data. ggplot will display the axes with defaults that look good in most cases, but you might want to control, for example, the axis labels, the number and placement of tick marks, the tick mark labels, and so on. In this chapter, I’ll cover how to fine-tune the appearance of the axes.

8.1 Swapping X- and Y-Axes

8.1.1 Problem

You want to swap the x- and y-axes on a graph.

8.1.2 Solution

Use coord_flip() to flip the axes (Figure 8.1):
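
As a minimal sketch of the two panels, assuming the built-in PlantGrowth data set (a numeric weight column and a group factor) is the data being plotted; the object names are only for illustration:

```r
library(ggplot2)

# Box plot with the regular orientation
p_regular <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot()

# The same plot with the x- and y-axes swapped
p_flipped <- p_regular + coord_flip()
```

Print either object (for example, `p_flipped`) to draw it.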

Figure 8.1: A box plot with regular axes (left); With swapped axes (right)

8.1.3 Discussion

For a scatter plot, it is trivial to change what goes on the vertical axis and what goes on the horizontal axis: just exchange the variables mapped to x and y. But not all the geoms in ggplot treat the x- and y-axes equally. For example, box plots summarize the data along the y-axis, the lines in line graphs move in only one direction along the x-axis, error bars have a single x value and a range of y values, and so on. If you’re using these geoms and want them to behave as though the axes are swapped, coord_flip() is what you need.

Sometimes when the axes are swapped, the order of items will be the reverse of what you want. On a graph with standard x- and y-axes, the x items start at the left and go to the right, which corresponds to the normal way of reading, from left to right. When you swap the axes, the items still go from the origin outward, which in this case will be from bottom to top – but this conflicts with the normal way of reading, from top to bottom. Sometimes this is a problem, and sometimes it isn’t. If the x variable is a factor, the order can be reversed by using scale_x_discrete() with limits = rev(levels(...)), as in Figure 8.2:
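
A sketch of that reversal, again assuming the PlantGrowth data set, where group is a factor with levels ctrl, trt1, and trt2:

```r
library(ggplot2)

# Reversed level order, so the first group ends up at the top after flipping
reversed_levels <- rev(levels(PlantGrowth$group))

p_flip_rev <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot() +
  coord_flip() +
  scale_x_discrete(limits = reversed_levels)
```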

Figure 8.2: A box plot with swapped axes and x-axis order reversed

8.1.4 See Also

If the variable is continuous, see Recipe 8.3 to reverse the direction.

8.2 Setting the Range of a Continuous Axis

8.2.1 Problem

You want to set the range (or limits) of an axis.

8.2.2 Solution

You can use xlim() or ylim() to set the minimum and maximum values of a continuous axis. Figure 8.3 shows one graph with the default y limits, and one with manually set y limits:

Figure 8.3: Box plot with default range (left); With manually set range (right)

The latter example sets the y range from 0 to the maximum value of the weight column, though a constant value (like 10) could instead be used as the maximum.

8.2.3 Discussion

ylim() is shorthand for setting the limits with scale_y_continuous(). (The same is true for xlim() and scale_x_continuous().) The following are equivalent:
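
For instance, assuming the PlantGrowth box plot used in this recipe, these two forms should produce the same graph (the object names are illustrative):

```r
library(ggplot2)

pg_plot <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot()

y_max <- max(PlantGrowth$weight)

# Shorthand form
p_ylim <- pg_plot + ylim(0, y_max)

# Explicit form: the same limits, set on the scale itself
p_scale <- pg_plot + scale_y_continuous(limits = c(0, y_max))
```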

Sometimes you will need to set other properties of scale_y_continuous(), and in these cases using ylim() and scale_y_continuous() together may result in some unexpected behavior, because only the last of the directives will have an effect: adding a second scale for the same axis replaces the first one. In these two examples, ylim(0, 10) should set the y range from 0 to 10, and scale_y_continuous(breaks=c(0, 5, 10)) should put tick marks at 0, 5, and 10. However, in both cases, only the second directive has any effect:

To make both changes work, get rid of ylim() and set both limits and breaks in scale_y_continuous():
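
A sketch of the combined call, assuming the same PlantGrowth box plot:

```r
library(ggplot2)

pg_plot <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot()

# One scale call carries both settings, so neither replaces the other
p_both <- pg_plot +
  scale_y_continuous(limits = c(0, 10), breaks = c(0, 5, 10))
```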

In ggplot, there are two ways of setting the range of the axes. The first way is to modify the scale, and the second is to apply a coordinate transform. When you modify the limits of the x or y scale, any data outside of the limits is removed – that is, the out-of-range data is not only not displayed, it is removed from consideration entirely. (It will also print a warning when this happens.)

With the box plots in these examples, if you restrict the y range so that some of the original data is clipped, the box plot statistics will be computed based on clipped data, and the shape of the box plots will change.

With a coordinate transform, the data is not clipped; in essence, it zooms in or out to the specified range. Figure 8.4 shows the difference between the two methods:
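
The two methods might be compared like this. The range of 5 to 6.5 is an assumption chosen so that some PlantGrowth weights fall outside it:

```r
library(ggplot2)

pg_plot <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot()

# Scale limits: rows outside 5 to 6.5 are dropped before the
# box plot statistics are computed
p_clipped <- pg_plot + scale_y_continuous(limits = c(5, 6.5))

# Coordinate limits: every row is kept and the view is zoomed
p_zoomed <- pg_plot + coord_cartesian(ylim = c(5, 6.5))

# Number of rows that the scale version discards
n_dropped <- sum(PlantGrowth$weight < 5 | PlantGrowth$weight > 6.5)
```

Drawing p_clipped should emit a warning about removed rows, while p_zoomed draws silently.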

Figure 8.4: Smaller y range using a scale (data has been dropped, so the box plots have changed shape; left); “Zooming in” using a coordinate transform (right)

Finally, it’s also possible to expand the range in one direction, using expand_limits() (Figure 8.5). You can’t use this to shrink the range, however:
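
A sketch that stretches the y range down to 0, assuming the PlantGrowth box plot:

```r
library(ggplot2)

# expand_limits() only grows the range; here it makes sure 0 is included
p_expanded <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot() +
  expand_limits(y = 0)
```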

Figure 8.5: Box plot on which y range has been expanded to include 0

8.3 Reversing a Continuous Axis

8.3.1 Problem

You want to reverse the direction of a continuous axis.

8.3.2 Solution

Use scale_y_reverse() or scale_x_reverse() (Figure 8.6). The direction of an axis can also be reversed by specifying the limits in reversed order, with the maximum first, then the minimum:

Figure 8.6: Box plot with reversed y-axis

8.3.3 Discussion

Like scale_y_continuous(), scale_y_reverse() does not work with ylim(). (The same is true for the x-axis properties.) If you want to reverse an axis and set its range, you must do it within the scale_y_reverse() statement, by setting the limits in reversed order (Figure 8.7):
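
Both variants might look like this, assuming the PlantGrowth box plot; the limits of 8 and 0 are illustrative values:

```r
library(ggplot2)

pg_plot <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot()

# Reverse the axis and keep the automatic range
p_reversed <- pg_plot + scale_y_reverse()

# Reverse the axis and set the range: maximum first, then minimum
p_reversed_limits <- pg_plot + scale_y_reverse(limits = c(8, 0))
```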

Figure 8.7: Box plot with reversed y-axis with manually set limits

8.3.4 See Also

To reverse the order of items on a discrete axis, see Recipe 8.4.

8.4 Changing the Order of Items on a Categorical Axis

8.4.1 Problem

You want to change the order of items on a categorical axis.

8.4.2 Solution

For a categorical (or discrete) axis – one with a factor mapped to it – the order of items can be changed by setting limits in scale_x_discrete() or scale_y_discrete().

To manually set the order of items on the axis, specify limits with a vector of the levels in the desired order. You can also omit items with this vector, as shown in Figure 8.8, left:
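
For example, assuming the PlantGrowth data set, this puts trt1 first:

```r
library(ggplot2)

# The order of the limits vector is the order on the axis
p_ordered <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot() +
  scale_x_discrete(limits = c("trt1", "ctrl", "trt2"))
```

Leaving a level out of the vector (say, `limits = c("ctrl", "trt1")`) drops that group from the plot.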

8.4.3 Discussion

You can also use this method to display a subset of the items on the axis. This will show only ctrl and trt1 (Figure 8.8, right). Note that because data is removed, it will emit a warning when you do this.

Figure 8.8: Box plot with manually specified items on the x-axis (left); With only two items (right)

To reverse the order, set limits = rev(levels(...)), and put the factor inside. This will reverse the order of the PlantGrowth$group factor, as shown in Figure 8.9:

Figure 8.9: Box plot with order reversed on the x-axis

8.4.4 See Also

To reorder factor levels based on data values from another column, see Recipe 15.9.

8.5 Setting the Scaling Ratio of the X- and Y-Axes

8.5.1 Problem

You want to set the ratio at which the x- and y-axes are scaled.

8.5.2 Solution

Use coord_fixed(). This will result in a 1:1 scaling between the x- and y-axes, as shown in Figure 8.10:

8.5.3 Discussion

The marathon data set contains runners’ marathon and half-marathon times. In this case it might be useful to force the x- and y-axes to have the same scaling.

It’s also helpful to set the tick spacing to be the same, by setting breaks in scale_y_continuous() and scale_x_continuous() (also in Figure 8.10):

Figure 8.10: Scatter plot with equal scaling of axes (left); With tick marks at specified positions (right)

If, instead of an equal ratio, you want some other fixed ratio between the axes, set the ratio parameter. With the marathon data, we might want the axis with half-marathon times stretched out to twice that of the axis with the marathon times (Figure 8.11). We’ll also add tick marks twice as often on the x-axis:
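
Here is a sketch of both scalings. The runs data frame is a hypothetical stand-in for the marathon data (times in minutes, with columns named Half and Full for illustration), so the numbers are made up:

```r
library(ggplot2)

# Hypothetical stand-in data: half-marathon and marathon times in minutes
runs <- data.frame(
  Half = c(75, 90, 105, 120, 135),
  Full = c(160, 195, 230, 265, 300)
)

m_plot <- ggplot(runs, aes(x = Half, y = Full)) +
  geom_point()

# 1:1 scaling, with the same tick spacing on both axes
p_equal <- m_plot +
  coord_fixed() +
  scale_y_continuous(breaks = seq(0, 420, 30)) +
  scale_x_continuous(breaks = seq(0, 420, 30))

# ratio is y / x: with 1/2, a unit on y is half as long as a unit on x
p_half <- m_plot +
  coord_fixed(ratio = 1/2) +
  scale_y_continuous(breaks = seq(0, 420, 30)) +
  scale_x_continuous(breaks = seq(0, 420, 15))
```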

Figure 8.11: Scatter plot with a 1/2 scaling ratio for the axes

8.6 Setting the Positions of Tick Marks

8.6.1 Problem

You want to set where the tick marks appear on the axis.

8.6.2 Solution

Usually ggplot does a good job of deciding where to put the tick marks, but if you want to change them, set breaks in the scale (Figure 8.12):

Figure 8.12: Box plot with automatic tick marks (left); With manually set tick marks (right)

8.6.3 Discussion

The location of the tick marks defines where major grid lines are drawn. If the axis represents a continuous variable, minor grid lines, which are fainter and unlabeled, will by default be drawn halfway between each major grid line.

You can also use the seq() function or the : operator to generate vectors for tick marks:
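
A short sketch of both ways to build the vector and pass it to breaks, assuming the PlantGrowth box plot:

```r
library(ggplot2)

# Evenly spaced values from 4 to 7 in steps of 0.5
half_steps <- seq(4, 7, by = 0.5)

# Consecutive integers
whole_steps <- 4:7

p_breaks <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot() +
  scale_y_continuous(breaks = half_steps)
```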

If the axis is discrete instead of continuous, then there is by default a tick mark for each item. For discrete axes, you can change the order of items or remove them by specifying the limits (see Recipe 8.4). Setting breaks will change which of the levels are labeled, but will not remove them or change their order. Figure 8.13 shows what happens when you set limits and breaks (the warning is because we’re using only two of the three levels for group and therefore are dropping some rows):
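
Something along these lines, assuming the PlantGrowth data set: limits keeps trt2 and ctrl (in that order), and breaks labels only ctrl:

```r
library(ggplot2)

p_discrete <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot() +
  scale_x_discrete(limits = c("trt2", "ctrl"), breaks = "ctrl")
```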

Figure 8.13: For a discrete axis, setting limits reorders and removes items, and setting breaks controls which items have labels

8.6.4 See Also

To remove the tick marks and labels (but not the data) from the graph, see Recipe 8.7.

8.7 Removing Tick Marks and Labels

8.7.1 Problem

You want to remove tick marks and labels.

8.7.2 Solution

To remove just the tick labels, as in Figure 8.14 (left), use theme(axis.text.y = element_blank()) (or do the same for axis.text.x). This will work for both continuous and categorical axes:

To remove the tick marks, use theme(axis.ticks=element_blank()). This will remove the tick marks on both axes. (To hide them on just one axis, set axis.ticks.x or axis.ticks.y instead.) In this example, we’ll hide all tick marks as well as the y tick labels (Figure 8.14, center):

To remove the tick marks, the labels, and the grid lines, set breaks to NULL (Figure 8.14, right):
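
The three variants in this recipe might be written as follows, assuming the PlantGrowth box plot:

```r
library(ggplot2)

pg_plot <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot()

# No tick labels on the y-axis
p_no_labels <- pg_plot + theme(axis.text.y = element_blank())

# No tick marks on either axis, and no y tick labels
p_no_ticks <- pg_plot +
  theme(axis.ticks = element_blank(), axis.text.y = element_blank())

# No tick marks, labels, or grid lines for y
p_no_breaks <- pg_plot + scale_y_continuous(breaks = NULL)
```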

Figure 8.14: No tick labels on y-axis (left); No tick marks and no tick labels on y-axis (middle); With breaks=NULL (right)

This will work for continuous axes only; if you remove items from a categorical axis using limits, as in Recipe 8.4, the data with that value won’t be shown at all.

8.7.3 Discussion

There are actually three related items that can be controlled: tick labels, tick marks, and the grid lines. For continuous axes, ggplot() normally places a tick label, tick mark, and major grid line at each value of breaks. For categorical axes, these things go at each value of limits.

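The tick labels on each axis can be controlled independently. So can the tick marks and grid lines, through theme elements such as axis.ticks.x and panel.grid.major.y; setting breaks to NULL, by contrast, removes all three for that axis at once.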

8.8 Changing the Text of Tick Labels

8.8.1 Problem

You want to change the text of tick labels.

8.8.2 Solution

Consider the scatter plot in Figure 8.15, where height is reported in inches:

To set arbitrary labels, as in Figure 8.15 (right), pass values to breaks and labels in the scale. One of the labels has a newline (\n) character, which tells ggplot to put a line break there:
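
A sketch of the idea. The hw data frame is a hypothetical stand-in for the height data (columns ageYear and heightIn are named for illustration), and the label text is arbitrary:

```r
library(ggplot2)

# Hypothetical stand-in data: age in years, height in inches
hw <- data.frame(
  ageYear  = c(12.0, 13.5, 14.2, 15.8, 17.0),
  heightIn = c(52.0, 58.5, 61.0, 66.0, 70.5)
)

# breaks and labels must have the same length; "\n" starts a new line
p_labeled <- ggplot(hw, aes(x = ageYear, y = heightIn)) +
  geom_point() +
  scale_y_continuous(
    breaks = c(50, 56, 60, 66, 72),
    labels = c("Tiny", "Really\nshort", "Short", "Medium", "Tallish")
  )
```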

Figure 8.15: Scatter plot with automatic tick labels (left); With manually specified labels on the y-axis (right)

8.8.3 Discussion

Instead of setting completely arbitrary labels, it is more common to have your data stored in one format, while wanting the labels to be displayed in another. We might, for example, want heights to be displayed in feet and inches (like 5'6") instead of just inches. To do this, we can define a formatter function, which takes in a value and returns the corresponding string. For example, this function will convert inches to feet and inches:
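
One way such a formatter could be written (the name footinch_formatter is chosen here for illustration):

```r
# Convert a number of inches to a string of feet and inches
footinch_formatter <- function(x) {
  foot <- floor(x / 12)  # whole feet
  inch <- x %% 12        # inches left over
  paste0(foot, "'", inch, "\"")
}

sample_heights <- footinch_formatter(c(56, 60, 64))
```

For 56, 60, and 64 inches, this gives 4'8", 5'0", and 5'4". The function can then be passed to a scale, as in `scale_y_continuous(labels = footinch_formatter)`.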

Here’s what it returns for values 56–64 (the backslashes are there as escape characters, to distinguish the quotes in a string from the quotes that delimit a string):

Now we can pass our function to the scale, using the labels parameter (Figure 8.16, left):

Here, the automatic tick marks were placed every five inches, but that looks a little off for this data. We can instead have ggplot set tick marks every four inches, by specifying breaks (Figure 8.16, right):

Figure 8.16: Scatter plot with a formatter function (left); With manually specified breaks on the y-axis (right)

Another common task is to convert time measurements to HH:MM:SS format, or something similar. This function will take numeric minutes and convert them to this format, rounding to the nearest second (it can be customized for your particular needs):
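
A sketch of such a formatter follows. It is one possible implementation, not necessarily the original one: it rounds to whole seconds first, so a value such as 0.999 minutes never produces a seconds field of 60.

```r
# Convert numeric minutes to HH:MM:SS, rounded to the nearest second
timeHMS_formatter <- function(x) {
  total <- round(x * 60)        # minutes to whole seconds
  h <- total %/% 3600
  m <- (total %/% 60) %% 60
  s <- total %% 60
  lab <- sprintf("%02d:%02d:%02d", h, m, s)
  lab <- sub("^00:", "", lab)   # drop an empty hours field
  sub("^0", "", lab)            # drop one leading zero
}

sample_times <- timeHMS_formatter(c(0.33, 50, 51.25, 60, 130.23))
```

The sample values come out as 0:20, 50:00, 51:15, 1:00:00, and 2:10:14.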

Running it on some sample numbers yields:

The scales package, which is installed with ggplot2, comes with some built-in formatting functions:

  • comma() adds commas to numbers, in the thousand, million, billion, etc. places.
  • dollar() adds a dollar sign and rounds to the nearest cent.
  • percent() multiplies by 100, rounds to the nearest integer, and adds a percent sign.
  • scientific() gives numbers in scientific notation, like 3.30e+05, for large and small numbers.

If you want to use these functions, you must first load the scales package, with library(scales).
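
A quick sketch of these formatters called directly on sample numbers. The exact rounding can vary with the scales version, so only the two simplest results are noted in comments:

```r
library(scales)

with_commas  <- comma(1234567)      # "1,234,567"
as_percent   <- percent(0.25)       # "25%"
as_dollars   <- dollar(1234.5)
as_exponents <- scientific(330000)
```

In a plot, they are passed to the labels parameter, as in `scale_y_continuous(labels = comma)`.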

8.9 Changing the Appearance of Tick Labels

8.9.1 Problem

You want to change the appearance of tick labels.

8.9.2 Solution

In Figure 8.17 (left), we’ve manually set the labels to be long – long enough that they overlap:

To rotate the text 90 degrees counterclockwise (Figure 8.17, middle), use:

Rotating the text 30 degrees (Figure 8.17, right) uses less vertical space and makes the labels easier to read without tilting your head:
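
The three panels could be produced along these lines, assuming the PlantGrowth box plot; the long label text is made up for illustration:

```r
library(ggplot2)

# Base plot with deliberately long x-axis labels
pg_plot <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot() +
  scale_x_discrete(
    breaks = c("ctrl", "trt1", "trt2"),
    labels = c("Control", "Treatment 1", "Treatment 2")
  )

# Rotated 90 degrees counterclockwise
p_rot90 <- pg_plot +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

# Rotated 30 degrees
p_rot30 <- pg_plot +
  theme(axis.text.x = element_text(angle = 30, hjust = 1, vjust = 1))
```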

Figure 8.17: X-axis tick labels rotated 0 (left), 90 (middle), and 30 degrees (right)

The hjust and vjust settings specify the horizontal alignment (left/center/right) and vertical alignment (top/middle/bottom).

8.9.3 Discussion

Besides rotation, other text properties, such as size, style (bold/italic/normal), and the font family (such as Times or Helvetica) can be set with element_text(), as shown in Figure 8.18:
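
For example, with illustrative values for each property (the font family must be available on your system when the plot is drawn):

```r
library(ggplot2)

p_styled <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(
    family = "Times",     # font family
    face   = "italic",    # style
    colour = "darkred",
    size   = rel(0.9)     # relative to the theme's base font size
  ))
```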

Figure 8.18: X-axis tick labels with manually specified appearance

In this example, the size is set to rel(0.9), which means that it is 0.9 times the size of the base font size for the theme.

These commands control the appearance of only the tick labels, on only one axis. They don’t affect the other axis, the axis label, the overall title, or the legend. To control all of these at once, you can use the theming system, as discussed in Recipe 9.3.

8.9.4 See Also

See Recipe 9.2 for more about controlling the appearance of the text.

8.10 Changing the Text of Axis Labels

8.10.1 Problem

You want to change the text of axis labels.

8.10.2 Solution

Use xlab() or ylab() to change the text of the axis labels (Figure 8.19):

Figure 8.19: Scatter plot with the default axis labels (left); Manually specified labels for the x- and y-axes (right)

8.10.3 Discussion

By default the graphs will just use the column names from the data frame as axis labels. This might be fine for exploring data, but for presenting it, you may want more descriptive axis labels.

Instead of xlab() and ylab(), you can use labs():

Another way of setting the axis labels is in the scale specification, like this:
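
Here are the three approaches from this recipe side by side. The hw data frame is a hypothetical stand-in for the height data, with columns named ageYear and heightIn for illustration:

```r
library(ggplot2)

# Hypothetical stand-in data: age in years, height in inches
hw <- data.frame(
  ageYear  = c(12.0, 13.5, 14.2, 15.8, 17.0),
  heightIn = c(52.0, 58.5, 61.0, 66.0, 70.5)
)
hw_plot <- ggplot(hw, aes(x = ageYear, y = heightIn)) +
  geom_point()

# xlab() and ylab()
p_xlab <- hw_plot + xlab("Age in years") + ylab("Height in inches")

# labs()
p_labs <- hw_plot + labs(x = "Age in years", y = "Height in inches")

# The name argument of the scale
p_scale_name <- hw_plot + scale_x_continuous(name = "Age in years")
```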

This may look a bit awkward, but it can be useful if you’re also setting other properties of the scale, such as the tick mark placement, range, and so on.

This also applies, of course, to other axis scales, such as scale_y_continuous(), scale_x_discrete(), and so on.

You can also add line breaks with \n, as shown in Figure 8.20:

Figure 8.20: X-axis label with a line break

8.11 Removing Axis Labels

8.11.1 Problem

You want to remove the label on an axis.

8.11.2 Solution

For the x-axis label, use xlab(NULL). For the y-axis label, use ylab(NULL).

We’ll hide the x-axis label in this example (Figure 8.21):
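
A sketch, assuming the PlantGrowth box plot, with the empty-string variant included for comparison:

```r
library(ggplot2)

pg_plot <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot()

# No label, and no space reserved for it
p_no_label <- pg_plot + xlab(NULL)

# Empty label: space is still reserved for the text
p_empty_label <- pg_plot + xlab("")
```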

8.11.3 Discussion

Sometimes axis labels are redundant or obvious from the context, and don’t need to be displayed. In the example here, the x-axis represents group, but this should be obvious from the context. Similarly, if the y tick labels had kg or some other unit in each label, the axis label “weight” would be unnecessary.

Another way to remove the axis label is to set it to an empty string. However, if you do it this way, the resulting graph will still have space reserved for the text, as shown in the graph on the right in Figure 8.21:

Figure 8.21: X-axis label with NULL (left); With the label set to "" (right)

When you use theme() to set axis.title.x = element_blank(), the name of the x or y scale is unchanged, but the text is not displayed and no space is reserved for it. When you set the label to "", the name of the scale is changed and the (empty) text does display.

8.12 Changing the Appearance of Axis Labels

8.12.1 Problem

You want to change the appearance of axis labels.

8.12.3 Discussion

For the y-axis label, it might also be useful to display the text unrotated, as shown in Figure 8.23 (left). The \n in the label represents a newline character:

When you call element_text(), the default angle is 0, so if you set axis.title.y but don’t specify the angle, it will show in this orientation, with the top of the text pointing up. If you change any other properties of axis.title.y and want it to be displayed in its usual orientation, rotated 90 degrees, you must manually specify the angle (Figure 8.23, right):
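
Both orientations might look like this. The hw data frame is a hypothetical stand-in for the height data, and the face, colour, and size values are illustrative:

```r
library(ggplot2)

# Hypothetical stand-in data: age in years, height in inches
hw <- data.frame(
  ageYear  = c(12.0, 13.5, 14.2, 15.8, 17.0),
  heightIn = c(52.0, 58.5, 61.0, 66.0, 70.5)
)
hw_plot <- ggplot(hw, aes(x = ageYear, y = heightIn)) +
  geom_point() +
  ylab("Height\n(inches)")

# Unrotated label: angle = 0
p_upright <- hw_plot +
  theme(axis.title.y = element_text(angle = 0, face = "italic", size = 14))

# Usual orientation, stated explicitly: angle = 90
p_rotated <- hw_plot +
  theme(axis.title.y = element_text(angle = 90, face = "italic",
                                    colour = "darkred", size = 14))
```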

Figure 8.23: Y-axis label with angle = 0 (left); With angle = 90 (right)

8.12.4 See Also

See Recipe 9.2 for more about controlling the appearance of the text.

8.13 Showing Lines Along the Axes

8.13.1 Problem

You want to display lines along the x- and y-axes, but not on the other sides of the graph.

8.13.3 Discussion

If you are starting with a theme that has a border around the plotting area, like theme_bw(), you will also need to unset panel.border (Figure 8.24, right):
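
A sketch of both cases, using a hypothetical stand-in data frame (hw) in place of the height data:

```r
library(ggplot2)

# Hypothetical stand-in data: age in years, height in inches
hw <- data.frame(
  ageYear  = c(12.0, 13.5, 14.2, 15.8, 17.0),
  heightIn = c(52.0, 58.5, 61.0, 66.0, 70.5)
)
hw_plot <- ggplot(hw, aes(x = ageYear, y = heightIn)) +
  geom_point()

# Default theme: just add the axis lines
p_axis_lines <- hw_plot +
  theme(axis.line = element_line(colour = "black"))

# theme_bw(): remove the panel border, then add the axis lines
p_axis_lines_bw <- hw_plot +
  theme_bw() +
  theme(
    panel.border = element_blank(),
    axis.line = element_line(colour = "black")
  )
```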

Figure 8.24: Scatter plot with axis lines (left); With theme_bw(), panel.border must also be made blank (right)

If the lines are thick, the ends will only partially overlap (Figure 8.25, left). To make them fully overlap (Figure 8.25, right), set lineend = "square":
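
For instance, with an exaggerated line width. This sketch assumes ggplot2 3.4.0 or later, where the width argument of element_line() is called linewidth (older versions call it size); hw is again a hypothetical stand-in data frame:

```r
library(ggplot2)

# Hypothetical stand-in data: age in years, height in inches
hw <- data.frame(
  ageYear  = c(12.0, 13.5, 14.2, 15.8, 17.0),
  heightIn = c(52.0, 58.5, 61.0, 66.0, 70.5)
)

# Square line ends extend past the end point, so the corner is filled
p_square_ends <- ggplot(hw, aes(x = ageYear, y = heightIn)) +
  geom_point() +
  theme_bw() +
  theme(
    panel.border = element_blank(),
    axis.line = element_line(colour = "black", linewidth = 4,
                             lineend = "square")
  )
```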

Figure 8.25: With thick lines, the ends don’t fully overlap (left); Full overlap with lineend="square" (right)

8.13.4 See Also

For more information about how the theming system works, see Recipe 9.3.

8.14 Using a Logarithmic Axis

8.14.1 Problem

You want to use a logarithmic axis for a graph.

8.14.3 Discussion

With a log axis, a given visual distance represents a constant proportional change; for example, each centimeter on the y-axis might represent a multiplication of the quantity by 10. In contrast, with a linear axis, a given visual distance represents a constant quantity change; each centimeter might represent adding 10 to the quantity.

Some data sets are exponentially distributed on the x-axis, and others on the y-axis (or both). For example, the Animals data set from the MASS package contains data on the average brain mass (in g) and body mass (in kg) of various mammals, with a few dinosaurs thrown in for comparison:

As shown in Figure 8.26, we can make a scatter plot to visualize the relationship between brain and body mass. With the default linearly scaled axes, it’s hard to make much sense of this graph. Because of a few very large animals, the rest of the animals get squished into the lower-left corner – a mouse barely looks different from a triceratops! This is a case where the data is distributed exponentially on both axes.

ggplot will try to make good decisions about where to place the tick marks, but if you don’t like them, you can change them by specifying breaks and, optionally, labels. In the example here, the automatically generated tick marks are spaced farther apart than is ideal. For the y-axis tick marks, we can get a vector of every power of 10 from 10^0 to 10^3 like this:

The x-axis tick marks work the same way, but because the range is large, R decides to format the output with scientific notation:

And then we can use those values as the breaks, as in Figure 8.27 (left):
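
Put together, the break vectors and the plot might look like this. The sketch assumes the Animals data set from MASS, with columns body and brain, and an x range of 10^-1 to 10^5:

```r
library(ggplot2)
data(Animals, package = "MASS")

y_breaks <- 10^(0:3)    # 1, 10, 100, 1000
x_breaks <- 10^(-1:5)   # 0.1 up to 100000

p_log <- ggplot(Animals, aes(x = body, y = brain, label = rownames(Animals))) +
  geom_text(size = 3) +
  scale_x_log10(breaks = x_breaks) +
  scale_y_log10(breaks = y_breaks)
```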

To instead use exponential notation for the break labels (Figure 8.27, right), use the trans_format() function, from the scales package:
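
A sketch of that approach, pairing trans_format() with math_format() (also from scales) so that each break is labeled as a power of 10; the data set and break ranges are the same assumptions as before:

```r
library(ggplot2)
library(scales)
data(Animals, package = "MASS")

# log10 of each break becomes the exponent shown in the label
p_exponents <- ggplot(Animals, aes(x = body, y = brain, label = rownames(Animals))) +
  geom_text(size = 3) +
  scale_x_log10(
    breaks = 10^(-1:5),
    labels = trans_format("log10", math_format(10^.x))
  ) +
  scale_y_log10(
    breaks = 10^(0:3),
    labels = trans_format("log10", math_format(10^.x))
  )
```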

Figure 8.27: Scatter plot with log10 x- and y-axes, and with manually specified breaks (left); With exponents for the tick labels (right)

Another way to use log axes is to transform the data before mapping it to the x and y coordinates (Figure 8.28). Technically, the axes are still linear – it’s the quantity that is log-transformed:

Figure 8.28: Plot with log transform before mapping to x- and y-axes

The previous examples used a log10 transformation, but it is possible to use other transformations, such as log2 and natural log, as shown in Figure 8.29. It’s a bit more complicated to use these – scale_x_log10() is shorthand, but for these other log scales, we need to spell them out:
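
One way to spell these out is sketched below, with a natural log on x and log2 on y. It names the transformations by string through the trans argument; recent ggplot2 versions also accept the same value under the name transform, so treat the argument name as an assumption about your installed version:

```r
library(ggplot2)
library(scales)
data(Animals, package = "MASS")

p_bases <- ggplot(Animals, aes(x = body, y = brain, label = rownames(Animals))) +
  geom_text(size = 3) +
  # Natural log on the x-axis, labeled as powers of e
  scale_x_continuous(
    trans  = "log",
    breaks = trans_breaks("log", function(x) exp(x)),
    labels = trans_format("log", math_format(e^.x))
  ) +
  # Log base 2 on the y-axis, labeled as powers of 2
  scale_y_continuous(
    trans  = "log2",
    breaks = trans_breaks("log2", function(x) 2^x),
    labels = trans_format("log2", math_format(2^.x))
  )
```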

Figure 8.29: Plot with exponents in tick labels. Notice that different bases are used for the x and y axes.

It’s possible to use a log axis for just one axis. It is often useful to represent financial data this way, because it better represents proportional change. Figure 8.30 shows Apple’s stock price with linear and log y-axes. The default tick marks might not be spaced well for your graph; they can be set with the breaks in the scale:

Figure 8.30: Top: a stock chart with a linear x-axis and log y-axis; bottom: with manual breaks

8.15 Adding Ticks for a Logarithmic Axis

8.15.1 Problem

You want to add tick marks with diminishing spacing for a logarithmic axis.

8.15.3 Discussion

The tick marks created by annotation_logticks() are actually geoms inside the plotting area. There is a long tick mark at each power of 10, and a mid-length tick mark at each 5.

To get the colors of the tick marks and the grid lines to match up a bit better, you can use theme_bw().

By default, the minor grid lines appear visually halfway between the major grid lines, but this is not the same place as the “5” tick marks on a logarithmic scale. To get them to be the same, we can supply a function for the scale’s minor_breaks.

We’ll define breaks_5log10(), which returns 5 times powers of 10 that encompass the values passed to it.

Then we’ll use that function for the minor breaks (Figure 8.32):
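
A sketch of the function and its use. The body of breaks_5log10() is one plausible implementation of the description above, and the plot assumes the Animals data set from MASS:

```r
library(ggplot2)
data(Animals, package = "MASS")

# 5 times each power of 10 spanning the range of x
breaks_5log10 <- function(x) {
  low  <- floor(log10(min(x) / 5))
  high <- ceiling(log10(max(x) / 5))
  5 * 10^(seq.int(low, high))
}

# For a range of 1 to 10000: 0.5, 5, 50, 500, 5000, 50000
example_minor <- breaks_5log10(c(1, 1e4))

p_logticks <- ggplot(Animals, aes(x = body, y = brain, label = rownames(Animals))) +
  geom_text(size = 3) +
  annotation_logticks() +
  scale_x_log10(minor_breaks = breaks_5log10) +
  scale_y_log10(minor_breaks = breaks_5log10) +
  coord_fixed() +
  theme_bw()
```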

Figure 8.32: Log axes with ticks at each 5, and fixed coordinate ratio

8.16 Making a Circular Plot

8.16.1 Problem

You want to make a circular plot.

8.16.2 Solution

Use coord_polar(). For this example we’ll use the wind data set from gcookbook. It contains samples of wind speed and direction for every 5 minutes throughout a day. The direction of the wind is categorized into 15-degree bins, and the speed is categorized into 5 m/s increments:

We’ll plot a count of the number of samples at each SpeedCat and DirCat using geom_histogram() (Figure 8.33). We’ll set binwidth to 15 and make the origin of the histogram start at –7.5, so that each bin is centered around 0, 15, 30, etc.:
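
The sketch below uses wind_demo, a hypothetical stand-in for the wind data with made-up values in columns named DirCat and SpeedCat. In current ggplot2, the bin origin is set with the boundary argument of geom_histogram():

```r
library(ggplot2)

# Hypothetical stand-in data: direction bins (degrees) and speed categories
speed_levels <- c("<5", "5-10", "10-15")
dir_bins <- rep(seq(0, 345, by = 15), times = rep(1:6, 4))
wind_demo <- data.frame(
  DirCat   = dir_bins,
  SpeedCat = factor(rep_len(speed_levels, length(dir_bins)),
                    levels = speed_levels)
)

# 15-degree bins that start at -7.5, so they are centered on 0, 15, 30, ...
p_polar <- ggplot(wind_demo, aes(x = DirCat, fill = SpeedCat)) +
  geom_histogram(binwidth = 15, boundary = -7.5) +
  coord_polar()
```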

Figure 8.33: Polar plot

8.16.3 Discussion

Be cautious when using polar plots, since they can perceptually distort the data. In the example here, at 210 degrees there are 15 observations with a speed of 15–20 and 13 observations with a speed of >20, but a quick glance at the picture makes it appear that there are more observations at >20. There are also three observations with a speed of 10–15, but they’re barely visible.

In this example we can make the plot a little prettier by reversing the legend, using a different palette, adding an outline, and setting the breaks to some more familiar numbers (Figure 8.34):

Figure 8.34: Polar plot with different colors and breaks

It may also be useful to set the starting angle with the start argument, especially when using a discrete variable for theta. The starting angle is specified in radians, so if you know the adjustment in degrees, you’ll have to convert it to radians:
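
The conversion is degrees times pi divided by 180. For example, an illustrative adjustment of -45 degrees:

```r
library(ggplot2)

# Convert the offset from degrees to radians
start_radians <- -45 * pi / 180

polar_coords <- coord_polar(start = start_radians)
```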

Polar coordinates can be used with other geoms, including lines and points. There are a few important things to keep in mind when using these geoms. First, by default, for the variable that is mapped to y (or r), the smallest actual value gets mapped to the center; in other words, the smallest data value gets mapped to a visual radius value of 0. You may be expecting a data value of 0 to be mapped to a radius of 0, but to make sure this happens, you’ll need to set the limits.

Next, when using a continuous x (or theta), the smallest and largest data values are merged. Sometimes this is desirable, sometimes not. To change this behavior, you’ll need to set the limits.

Finally, the theta values of the polar coordinates do not wrap around – it is presently not possible to have a geom that crosses over the starting angle (usually vertical).

I’ll illustrate these issues with an example. The following code creates a data frame from the mdeaths time series data set and produces the graph shown on the left in Figure 8.35:

The first problem is that the data values (ranging from about 1000 to 2100) are mapped to the radius such that the smallest data value is at radius 0. We’ll fix this by setting the y (or r) limits from 0 to the maximum data value, as shown in the graph on the right in Figure 8.35:

Figure 8.35: Polar plot with line (notice the data range of the radius) (left); With the radius representing a data range starting from zero (right)

The next problem is that the lowest and highest month values, 1 and 12, are shown at the same angle. We’ll fix this by setting the x limits from 0 to 12, creating the graph on the left in Figure 8.36 (notice that using xlim() overrides the scale_x_continuous() in p, so it no longer displays breaks for each month; see Recipe 8.2 for more information):

There’s one last issue, which is that the beginning and end aren’t connected. To fix that, we need to modify our data frame by adding one row with a month of 0 that has the same value as the row with month 12. This will make the starting and ending points the same, as in the graph on the right in Figure 8.36 (alternatively, we could add a row with month 13, instead of month 0):
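
The whole sequence can be sketched as follows. The names md, p, and mdnew follow the text; averaging the series by month with aggregate() is an assumption about how the data frame was prepared:

```r
library(ggplot2)

# Average the mdeaths time series by calendar month (1 to 12)
md_all <- data.frame(
  deaths = as.numeric(mdeaths),
  month  = as.numeric(cycle(mdeaths))
)
md <- aggregate(deaths ~ month, data = md_all, FUN = mean)

p <- ggplot(md, aes(x = month, y = deaths)) +
  geom_line() +
  scale_x_continuous(breaks = 1:12)

# Radius starts at 0, and theta covers 0 to 12
p_limits <- p + coord_polar() + ylim(0, max(md$deaths)) + xlim(0, 12)

# Copy the month-12 row as month 0, so the line closes on itself
md_x <- md[md$month == 12, ]
md_x$month <- 0
mdnew <- rbind(md_x, md)

# Swap in the new data frame, keeping the rest of the plot
p_closed <- p %+% mdnew + coord_polar() + ylim(0, max(md$deaths))
```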

Figure 8.36: Polar plot with theta representing x values from 0 to 12 (left); The gap is filled in by adding a dummy data point for month 0 (right)

Note

Notice the use of the %+% operator. When you add a data frame to a ggplot object with %+%, it replaces the default data frame in the ggplot object. In this case, it changed the default data frame for p from md to mdnew.

8.16.4 See Also

See Recipe 10.4 for more about reversing the direction of a legend.

See Recipe 8.6 for more about specifying which values will have tick marks (breaks) and labels.

8.17 Using Dates on an Axis

8.17.1 Problem

You want to use dates on an axis.

8.17.3 Discussion

ggplot handles two kinds of time-related objects: dates (objects of class Date) and date-times (objects of class POSIXt). The difference between these is that Date objects represent dates and have a resolution of one day, while POSIXt objects represent moments in time and have a resolution of a fraction of a second.

Specifying the breaks is similar to doing so with a numeric axis – the main difference is in specifying the sequence of dates to use. We’ll use a subset of the economics data, ranging from mid-1992 to mid-1993. If breaks aren’t specified, they will be automatically selected, as shown in Figure 8.38 (top):

The breaks can be created by using the seq() function with starting and ending dates, and an interval (Figure 8.38, bottom):
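
A sketch, assuming the economics data set that ships with ggplot2 (columns date and psavert); the exact subset dates and the two-month interval are illustrative choices:

```r
library(ggplot2)

# Subset from mid-1992 to mid-1993
econ_mod <- subset(economics,
                   date >= as.Date("1992-05-01") & date < as.Date("1993-06-01"))

# One break every two months, from June 1992 to June 1993
datebreaks <- seq(as.Date("1992-06-01"), as.Date("1993-06-01"), by = "2 month")

p_date_breaks <- ggplot(econ_mod, aes(x = date, y = psavert)) +
  geom_line() +
  scale_x_date(breaks = datebreaks) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))
```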

Figure 8.38: Top: with default breaks on the x-axis; bottom: with breaks specified

Notice that the formatting of the breaks changed. You can specify the formatting by using the date_format() function from the scales package. Here we’ll use "%Y %b", which results in a format like "1992 Jun", as shown in Figure 8.39:
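
For example, under the same assumptions about the data (the economics data set from ggplot2, with illustrative subset dates and break spacing):

```r
library(ggplot2)
library(scales)

econ_mod <- subset(economics,
                   date >= as.Date("1992-05-01") & date < as.Date("1993-06-01"))
datebreaks <- seq(as.Date("1992-06-01"), as.Date("1993-06-01"), by = "2 month")

# Labels such as "1992 Jun"; month names follow the current locale
p_date_labels <- ggplot(econ_mod, aes(x = date, y = psavert)) +
  geom_line() +
  scale_x_date(breaks = datebreaks, labels = date_format("%Y %b")) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))
```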

Figure 8.39: Line graph with date format specified

Common date format options are shown in Table 8.1. They are to be put in a string that is passed to date_format(), and the format specifiers will be replaced with the appropriate values. For example, if you use "%B %d, %Y", it will result in labels like “June 01, 1992”.

Table 8.1: Date format options

Option  Description
%Y      Year with century (2012)
%y      Year without century (12)
%m      Month as a decimal number (08)
%b      Abbreviated month name in current locale (Aug)
%B      Full month name in current locale (August)
%d      Day of month as a decimal number (04)
%U      Week of the year as a decimal number, with Sunday as the first day of the week (00–53)
%W      Week of the year as a decimal number, with Monday as the first day of the week (00–53)
%w      Day of week (0–6, Sunday is 0)
%a      Abbreviated weekday name (Thu)
%A      Full weekday name (Thursday)

Some of these items are specific to the computer’s locale. Months and days have different names in different languages (the examples here are generated with a US locale). You can change the locale with Sys.setlocale(). For example, this will change the date formatting to use an Italian locale:
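
A cautious sketch follows. The locale name "it_IT.UTF-8" is an assumption that is typical for macOS and Linux (Windows uses names such as "Italian"), so the code checks whether the change succeeded and restores the previous setting afterward:

```r
# Remember the current setting so it can be restored
old_locale <- Sys.getlocale("LC_TIME")

# Sys.setlocale() returns "" (with a warning) if the locale is unavailable
result <- suppressWarnings(Sys.setlocale("LC_TIME", "it_IT.UTF-8"))

if (nzchar(result)) {
  # Month names now come out in Italian
  print(format(as.Date("1992-06-01"), "%B %d, %Y"))
}

# Restore the original locale
invisible(Sys.setlocale("LC_TIME", old_locale))
```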

Note that the locale names may differ between platforms, and your computer must have support for the locale installed at the operating system level.

8.17.4 See Also

See ?Sys.setlocale for more about setting the locale.

See ?strptime for information about converting strings to dates, and for information about formatting the date output.

8.18 Using Relative Times on an Axis

8.18.1 Problem

You want to use relative times on an axis.

8.18.2 Solution

Times are commonly stored as numbers. For example, the time of day can be stored as a number representing the hour. Time can also be stored as a number representing the number of minutes or seconds from some starting time. In these cases, you map a value to the x- or y-axis and use a formatter to generate the appropriate axis labels (Figure 8.40):
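
As a sketch, assuming the built-in WWWusage time series (number of users connected, sampled once a minute) and a hypothetical formatter named timeHM_formatter:

```r
library(ggplot2)

# Convert the time series to a data frame of minutes and users
www <- data.frame(
  minute = as.numeric(time(WWWusage)),
  users  = as.numeric(WWWusage)
)

# Format numeric minutes as H:MM
timeHM_formatter <- function(x) {
  h <- floor(x / 60)
  m <- floor(x %% 60)
  sprintf("%d:%02d", h, m)
}

# Relative times as plain numbers
p_minutes <- ggplot(www, aes(x = minute, y = users)) +
  geom_line()

# The same axis with formatted labels
p_formatted <- p_minutes +
  scale_x_continuous(
    name   = "time",
    breaks = seq(0, 100, by = 10),
    labels = timeHM_formatter
  )
```

With this formatter, 0, 50, and 70 minutes are labeled 0:00, 0:50, and 1:10.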

Figure 8.40: Top: relative times on x-axis; bottom: with formatted times

8.18.4 See Also

See Recipe 15.21 for information about converting time series objects to data frames.