This handout accompanies the workshop given on September 4, 2019 at UGA’s DigiLab in the Main Library. There is some overlap with a blog post I did a couple years ago, but this is the first time this material has been presented in a workshop format. As always, please visit joeystanley.com/r for the latest materials.


As I was preparing a workshop on custom themes in ggplot2, I got a little carried away illustrating all the components of the theme function. I decided to simplify that portion of the workshop and create this separate handout that just focuses on theme. It is not yet finished, but it may be of some help to people (including myself!).

1 Data prep and sample plots

First off, let’s load ggplot2 before we get carried away.

library(ggplot2)

I’m going to use similar plots to the ones I used in previous workshops. I’ve got four different plots because when creating a theme, it’s good to try it out on several kinds of plots to make sure they all integrate well.

1.1 Amount of sugar per McDonald’s category

For some of the workshop, I’ll work with the McDonald’s menu items dataset, which you can access from my website. To simplify things a little bit, I’ll take a subset: just four of the nine categories. To make that subset, I’ll use the subset function.

menu <- read.csv("http://joeystanley.com/downloads/menu.csv")
menu_subset <- subset(menu, Category %in% c("Smoothies & Shakes", "Desserts", "Beverages", "Snacks & Sides"))

The default plot will just be the distribution of the amount of sugar in each of the four remaining categories. For the color, I’ll use one of Paul Tol’s color schemes, which I access using the package ggthemes.

m <- ggplot(menu_subset, aes(Category, Sugars, fill = Category)) +
    geom_boxplot(size = 0.75) +
    geom_jitter(color = "gray15") + 
    ggthemes::scale_fill_ptol()

So now, to plot it, all I need to do is call m.

m

Conveniently for us, if I want to make changes to the plot, I can just add additional lines of ggplot2 code to m and it’ll work out like normal. In other words…

m + ggtitle("A Default Plot")

…is shorthand for…

ggplot(menu_subset, aes(Category, Sugars, fill = Category)) +
    geom_boxplot(size = 0.75) +
    geom_jitter(color = "gray15") + 
    ggthemes::scale_fill_ptol() + 
    ggtitle("A Default Plot")

Both will produce this plot:

1.2 Stranger Things ratings

Another dataset I’ll use is the Stranger Things dataset that was used in a previous workshop. Like before, I’ll change the season column to a factor too. For the default plot here, s, I’ll make the same scatterplot of the number of votes the episode got on IMDB by the average rating. I’ll make the dots bigger so they’re easier to see. And for fun, I’ll use a Wes Anderson theme inspired by the movie Fantastic Mr. Fox.

stranger <- read.csv("http://joeystanley.com/data/stranger.csv")
stranger$season = factor(stranger$season)
s <- ggplot(stranger, aes(votes, rating, color = season)) + 
    geom_point(size = 4) + 
    scale_color_manual(values = wesanderson::wes_palette("FantasticFox1")) + 
    labs(title = "Stranger Things episodes",
         subtitle = "Average rating by number of votes on IMDB")
s

1.3 Girlnames

The third is a dataset of the top 25 most common baby girl names in the US in 2017. I’ll read it in and prep it as I did in the intro workshop, except I’ll reverse the order of the girlnames and will make a horizontal bar chart. I’ll also only keep the top 15 just so there aren’t so many bars, and I’ll color it using Color Brewer.

girlnames <- read.delim("http://joeystanley.com/data/girlnames.txt", sep = "\t")
girlnames <- girlnames[1:15,]
girlnames$name = factor(girlnames$name, levels = rev(girlnames$name))
g <- ggplot(girlnames, aes(name, n, fill = n)) + 
    geom_bar(stat = "identity") + 
    scale_fill_distiller(type = "seq", palette = "Purples") + 
    coord_flip() + 
    labs(title = "Top 15 baby girl names in the US in 2017",
         caption = "Data Source: Social Security data via the babynames package")
g

1.4 Cereal nutritional facts

Finally, the last dataset that we’ll use, which was also made available through Kaggle.com, contains nutritional information from about 80 different kinds of cereal. We’ll load it in, make a few changes as before, and make a faceted plot showing the amount of fiber per cereal and its rating, split up by brand.

cereal <- read.csv("http://joeystanley.com/data/cereal.csv")
cereal$mfr <- forcats::fct_recode(cereal$mfr,
                                  "Kellogg's" = "K",
                                  "General Mills" = "G",
                                  "Post" = "P",
                                  "Quaker Oats" = "Q",
                                  "Ralston Purina" = "R",
                                  "Nabisco" = "N")
c <- ggplot(cereal, aes(fiber, rating)) + 
    geom_text(aes(label = name),
              check_overlap = TRUE, vjust = "inward", hjust = "inward") + 
    facet_wrap(~mfr)
c

Note that within geom_text, I’ve added a few extra arguments (check_overlap, vjust, and hjust). I recently learned about these from Hadley Wickham’s book-in-progress, ggplot2 (version 3) which you can view here. check_overlap ensures that labels don’t overlap (removing some if necessary), and setting vjust and hjust to "inward" makes sure they don’t spill over the edges of the plot. Very handy.

2 Theme elements

Before we get too carried away with modifying a plot, we’ll have to take a look at the basic building blocks of a theme: element_line, element_text, element_rect, and element_blank. These are called “theme elements.” Many of the arguments of theme require the output of these functions as their value. This means that you can’t simply change the size of a line with a number like you can in a regular ggplot function (as in geom_line(size = 2)—you can’t do that here). So, we’re going to have to look at some of these elements to be able to use them in theme.
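To make that restriction concrete, here is a minimal sketch. It uses the built-in mtcars data instead of the workshop plots (my substitution, so it runs without downloading anything), and try_theme is a hypothetical helper name I made up for the illustration. As far as I can tell, ggplot2 only checks theme values when the plot is actually built, so the helper builds the plot inside tryCatch.

```r
library(ggplot2)

# Stand-in plot using built-in data (not one of the workshop plots)
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Hypothetical helper: try to build a plot with a theme and report what happens
try_theme <- function(th) {
    tryCatch({
        ggplotGrob(p + th)   # forces the theme to be checked and drawn
        "ok"
    }, error = function(e) "error")
}

# A bare number is not a theme element, so this should fail
bare_number <- try_theme(theme(panel.grid = 2))

# Wrapping the settings in element_line() is what theme() expects
with_element <- try_theme(theme(panel.grid = element_line(color = "lightblue")))

bare_number
with_element
```

The point is just that the same kind of setting that works as a plain argument in a geom has to be wrapped in an element function before theme will accept it.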

2.1 Drawing lines with element_line

One of the most fundamental concepts in a plot is a line. Assuming we have a starting and stopping point set already by the data we’re plotting, we can think of properties of a line that are purely aesthetic: color, thickness, and what type of line it is (solid, dotted, dashed, etc). In ggplot2, we modify these attributes (and a few others) with element_line. One example of a line in a ggplot is the faint white grid underneath the data. We can modify that grid with the panel.grid argument. We’ll get into more detail about modifying the grid later, but for now we’ll use it as an example of a line. If we want to modify the color of the lines, we add element_line and specify that the color is light blue:

m + theme(panel.grid = element_line(color = "lightblue"))

We can add other arguments too:

m + theme(panel.grid = element_line(color = "lightblue", size = 4, linetype = "dotted"))

In fact, if you inspect the element_line function itself, you can see that you can also add things like lineend. With this argument, you can make the ends of the line "round", "butt", or "square". In this example, it’s not very illustrative since the lines extend past the edge of the plotting area, but later when we come across other lines you’d like to modify, you can use that option if you’d like.

However, if you like arrows, you’ll also see that you can add them with the arrow argument. But if you thought you were done with basic ggplot2 elements, it turns out the arrow argument takes the output of the arrow() function, which allows you to modify things like the angle, length, type, and which sides of the line should get the arrow. And if that wasn’t Inception enough for you, the length argument requires the output of the unit function, in which you specify the length and the units.

m + theme(panel.grid = element_line(color = "lightblue", 
                                    arrow = arrow(angle = 20, 
                                                  length = unit(0.2, "inches"), 
                                                  ends = "both", 
                                                  type = "closed")))

This may seem like an overly complex network of functions, but the benefit is that it allows you to modify whatever you want.
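If the nesting is hard to read, one option is to build each piece separately and then hand the finished element to theme. This is only a sketch: the variable names are mine, and mtcars stands in for the workshop data so the example is self-contained.

```r
library(ggplot2)

# Step 1: a length with units (unit() comes from grid; ggplot2 re-exports it)
tip_length <- unit(0.2, "inches")

# Step 2: an arrow description that uses that length
grid_arrow <- arrow(angle = 20, length = tip_length,
                    ends = "both", type = "closed")

# Step 3: a line element that uses that arrow
arrow_line <- element_line(color = "lightblue", arrow = grid_arrow)

# Step 4: hand the finished element to theme()
# (mtcars is a stand-in for the workshop data)
p_arrows <- ggplot(mtcars, aes(wt, mpg)) +
    geom_point() +
    theme(panel.grid = arrow_line)
p_arrows
```

The result should match the one-expression version; the only difference is that each layer of the nesting gets its own name.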

2.2 Drawing rectangles with element_rect

One step up from drawing lines is to draw rectangles with element_rect. Because they consist of four lines, many of the arguments are the same as element_line.

c + theme(panel.border = element_rect(color = "darkred", size = 2, linetype = "dashed"))

Oops. Where did the plot go? So turns out rectangles aren’t just lines anymore because they create the space in the middle. So in element_rect we need to worry about what goes in the middle. By default, it seems like the color is white. What we probably want to do is remove it, or rather, make it transparent. We can do that with the fill argument:

c + theme(panel.border = element_rect(color = "darkred", size = 2, linetype = "dashed",
                                      fill = "transparent"))

So in the theme function, there are several things that are rectangles that you can modify. You can probably think of a few now like the legend or the strips in facets. We’ll use element_rect to modify those.
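As a quick preview of those other rectangles, here is a sketch that styles the facet strips and the legend background. It assumes a stand-in faceted plot built from mtcars (not one of the workshop plots), and the colors are arbitrary choices of mine.

```r
library(ggplot2)

# Stand-in faceted plot with a legend, built from data that ships with R
p_facets <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
    geom_point() +
    facet_wrap(~gear)

p_rect <- p_facets + theme(
    # The strip behind each facet label
    strip.background = element_rect(fill = "lightyellow", color = "darkred"),
    # The box behind the whole legend
    legend.background = element_rect(fill = "gray95", color = "gray40",
                                     linetype = "dashed")
)
p_rect
```

Here the fill is set on purpose, since for these rectangles nothing else sits underneath that we would be covering up.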

2.3 Drawing text with element_text

The last major element in ggplot is text, which we draw with element_text. As you can imagine, there are a lot of things you can change with text, like the font (“family”), its “face” ("bold", "italic", "bold.italic"), color, and size. In this example, we’ll modify a plot’s title.

g + theme(plot.title = element_text(family = "Palatino", face = "bold", color = "mediumpurple4", size = 20))

In addition to these attributes, we can also adjust other things. I’m also going to add the extremely handy debug = TRUE argument. This will make the plot title area yellow and will show a yellow dot where the text is anchored. So, for example, if I want to modify the angle to something unconventional, I can do so, but the result might not be very pretty.

g + theme(plot.title = element_text(family = "Palatino", face = "bold", color = "mediumpurple4", size = 20,
                                    angle = 3, debug = TRUE))

You can see that the anchor point is the top left corner of the plotting area. We may want to adjust the vertical position of this title. Both the hjust and vjust arguments take a number, ranging from 0 (left/bottom) to 1 (right/top). When we set the vertical adjustment to 0, it’ll anchor it to the bottom, which is probably what we wanted.

g + theme(plot.title = element_text(family = "Palatino", face = "bold", color = "mediumpurple4", size = 20,
                                    angle = 3, vjust = 0, debug = TRUE))

If you want to center your title, you can do so by setting hjust to 0.5.

g + theme(plot.title = element_text(family = "Palatino", face = "bold", color = "mediumpurple4", size = 20,
                                    hjust = 0.5, debug = TRUE))

Note that you can use values outside of the range [0,1] and they’ll move around just as you expect:

g + theme(plot.title = element_text(family = "Palatino", face = "bold", color = "mediumpurple4", size = 20,
                                    vjust = -2, hjust = 2, debug = TRUE))
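To compare a few hjust values side by side, a sketch like the following builds one plot per value. It assumes a stand-in plot made from mtcars, and the names base_plot and title_plots are just mine.

```r
library(ggplot2)

# Stand-in plot with a title (mtcars is built into R)
base_plot <- ggplot(mtcars, aes(wt, mpg)) +
    geom_point() +
    ggtitle("Where does the title go?")

# One plot per hjust value: left-aligned, centered, right-aligned
hjust_values <- c(0, 0.5, 1)
title_plots <- lapply(hjust_values, function(h) {
    base_plot + theme(plot.title = element_text(hjust = h, debug = TRUE))
})

# View them one at a time
title_plots[[1]]   # hjust = 0
title_plots[[2]]   # hjust = 0.5
title_plots[[3]]   # hjust = 1
```

With debug = TRUE left on, it is easy to watch the anchor point slide across the title area as hjust changes.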

Finally, you may want to adjust the spacing around the text. To illustrate lineheight, I’ll change the title to something with two lines so you can get a better picture of what it’s doing. So if I really want to add double spacing, I can set it to 2. If I want to do something more subtle, maybe like 1.15.

g + labs(title = "Here is a\nTwo-lined Title",
         subtitle = "This is the subtitle") + 
    theme(plot.title = element_text(family = "Palatino", face = "bold", color = "mediumpurple4", size = 20,
                                    lineheight = 2, debug = TRUE))

For titles, this often won’t be necessary, but for other text elements in your plot, the lineheight may be useful.

The last thing is the margin, which takes the output of the margin function. In it, you specify how large the margins should be around your title area. The order is top, right, bottom, left, which you can maybe remember with the mnemonic “trouble”. For the title, left and right margins don’t mean much, but you can adjust how far the title is from the top of the graph or from the plotting area. Of course, if you want it right up against the plotting area, you can set the bottom margin to zero.

g + theme(plot.title = element_text(family = "Palatino", face = "bold", color = "mediumpurple4", size = 20,
                                    margin = margin(1, 0, 0, 0, unit = "cm"), debug = TRUE))
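If the positional order is hard to remember, margin also accepts the sides by name (t, r, b, l). The sketch below compares the two forms and then uses the named form in a theme. The mtcars plot and the variable names are stand-ins of mine.

```r
library(ggplot2)

# Positional form, as above: top, right, bottom, left
by_position <- margin(1, 0, 0, 0, unit = "cm")

# Named form: only the top margin is given, so the other sides keep their default of 0
by_name <- margin(t = 1, unit = "cm")

# Both should describe the same four numbers
as.numeric(by_position)
as.numeric(by_name)

# Using the named form in a theme (mtcars is a stand-in for the workshop data)
p_margin <- ggplot(mtcars, aes(wt, mpg)) +
    geom_point() +
    ggtitle("A title with some breathing room") +
    theme(plot.title = element_text(margin = margin(t = 1, b = 0.5, unit = "cm")))
p_margin
```

I find the named form easier to read later, since you don’t have to count positions to see which side is being changed.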

So that’s the element_text function. It’s powerful because you can control a lot of things with it.

2.4 Drawing nothing with element_blank

The final category of ggplot elements is element_blank and it’s pretty simple: it draws nothing. An important detail though is that not only does it draw nothing, but it doesn’t even reserve space for that thing. So in the Stranger Things plot, the x- and y-axis labels are useful, but if we wanted to purposely make the plot confusing, we could remove them.

s + theme(axis.title = element_blank())

So if there’s something in your plot by default and you want to remove it, you just simply use element_blank and it’ll zap it from existence.

3 Exploring theme()

Note: This section is not yet complete. As it turns out, there is far more to theme than I anticipated. We would not have had time to cover it in the workshop anyway. I’ll try to fill in the rest of these sections later.

So now that we’ve seen the basic elements of ggplot, we can now use them in practice. The main function that has been driving all the changes in the various themes is the theme function. In previous workshops, I’ve mentioned it in passing and I’ve used it to rotate the text in the x-axis labels, remove the legend, and mess with the background grid. All very different things, but all controlled by one function. In the previous section we saw that theme is used to modify the title, plotting area, and axes too. Let’s take a dive into the function and see what’s going on.

If you look at the help page for theme, you’ll see that there are dozens and dozens of components:

?theme

# theme(line, rect, text, title, aspect.ratio, axis.title, axis.title.x,
#   axis.title.x.top, axis.title.x.bottom, axis.title.y, axis.title.y.left,
#   axis.title.y.right, axis.text, axis.text.x, axis.text.x.top,
#   axis.text.x.bottom, axis.text.y, axis.text.y.left, axis.text.y.right,
#   axis.ticks, axis.ticks.x, axis.ticks.x.top, axis.ticks.x.bottom,
#   axis.ticks.y, axis.ticks.y.left, axis.ticks.y.right, axis.ticks.length,
#   axis.ticks.length.x, axis.ticks.length.x.top, axis.ticks.length.x.bottom,
#   axis.ticks.length.y, axis.ticks.length.y.left, axis.ticks.length.y.right,
#   axis.line, axis.line.x, axis.line.x.top, axis.line.x.bottom, axis.line.y,
#   axis.line.y.left, axis.line.y.right, legend.background, legend.margin,
#   legend.spacing, legend.spacing.x, legend.spacing.y, legend.key,
#   legend.key.size, legend.key.height, legend.key.width, legend.text,
#   legend.text.align, legend.title, legend.title.align, legend.position,
#   legend.direction, legend.justification, legend.box, legend.box.just,
#   legend.box.margin, legend.box.background, legend.box.spacing,
#   panel.background, panel.border, panel.spacing, panel.spacing.x,
#   panel.spacing.y, panel.grid, panel.grid.major, panel.grid.minor,
#   panel.grid.major.x, panel.grid.major.y, panel.grid.minor.x,
#   panel.grid.minor.y, panel.ontop, plot.background, plot.title,
#   plot.subtitle, plot.caption, plot.tag, plot.tag.position, plot.margin,
#   strip.background, strip.background.x, strip.background.y,
#   strip.placement, strip.text, strip.text.x, strip.text.y,
#   strip.switch.pad.grid, strip.switch.pad.wrap, ..., complete = FALSE,
#   validate = TRUE)

That’s a lot. And if you scroll through the help page, there is lots of description about how to use these arguments. Admittedly, the help files can take some practice to understand, but with practice you’ll be able to decipher these without any problems. If you look closely at the arguments though, most of them are pretty well-organized into major categories: axis, legend, panel, plot, and strip. We’ll look at each one of these categories in this section.

3.1 Axis

The arguments that control a plot’s axis are the following.

  1. axis.title
    • axis.title.x
      • axis.title.x.top
      • axis.title.x.bottom
    • axis.title.y
      • axis.title.y.left
      • axis.title.y.right
  2. axis.text
    • axis.text.x
      • axis.text.x.top
      • axis.text.x.bottom
    • axis.text.y
      • axis.text.y.left
      • axis.text.y.right
  3. axis.ticks
    • axis.ticks.x
      • axis.ticks.x.top
      • axis.ticks.x.bottom
    • axis.ticks.y
      • axis.ticks.y.left
      • axis.ticks.y.right
    • axis.ticks.length
      • axis.ticks.length.x
        • axis.ticks.length.x.top
        • axis.ticks.length.x.bottom
      • axis.ticks.length.y
        • axis.ticks.length.y.left
        • axis.ticks.length.y.right
  4. axis.line
    • axis.line.x
      • axis.line.x.top
      • axis.line.x.bottom
    • axis.line.y
      • axis.line.y.left
      • axis.line.y.right

As you can see, they fall nicely into a bit of a hierarchy. The main ones are the title, text, ticks, and line. Within each of those, there are separate arguments for the x- and y-axis. Within those, you can modify the top/bottom or left/right properties.

This hierarchy is not just there to make things easier to remember. Instead, the hierarchy is built into the functions themselves. So if you modify something in a higher level, it’ll percolate to the lower arguments of the hierarchy automatically. Let’s see how that works.

3.1.1 axis.title

Let’s change the axis titles from the Stranger Things plot. Since this is a text element, we can do that by using element_text on axis.title.

s + theme(axis.title = element_text(color = "brown", face = "bold", size = 20))

As you can see, the x-axis label, “votes”, and the y-axis label, “rating”, are both brown, in bold face, and size 20. Let’s say though that we’re fine with the size and bold face, but we want the x-axis title to be one color and the y-axis title to be another. There are several ways of producing identical output, but since we have properties that are in common between the two, it might be most elegant to take advantage of the hierarchy:

s + theme(axis.title = element_text(face = "bold", size = 20),
          axis.title.x = element_text(color = "red"),
          axis.title.y = element_text(color = "blue"))

When you specify properties in a more specific argument (axis.title.x) it’ll add to and, if applicable, override the more general one (axis.title).
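One way to check what an element ends up with after inheritance is calc_element, which ggplot2 exports for resolving a single element against a full theme. The sketch below assumes the default theme_gray as the starting point and reuses the axis.title and axis.title.x settings from above; the variable names are mine.

```r
library(ggplot2)

# Start from the default theme and layer on the same kind of settings as above
my_theme <- theme_gray() +
    theme(axis.title = element_text(face = "bold", size = 20),
          axis.title.x = element_text(color = "red"))

# calc_element() resolves one element against everything it inherits from
resolved_x <- calc_element("axis.title.x", my_theme)
resolved_y <- calc_element("axis.title.y", my_theme)

# Printing shows the settings each title ends up with: the x title should
# carry the color set on axis.title.x plus the face and size from axis.title,
# while the y title only picks up the face and size
resolved_x
resolved_y
```

This can be handy when a plot doesn’t look the way you expect and you want to find out which level of the hierarchy a setting is coming from.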

Finally, to dive even deeper and use additional axis elements, we’ll need to add secondary axes to our plot. I won’t get into those details, but I encourage you to look at the help page for the function sec_axis. Here I’m adding an identical secondary axis on the top, and on the right I’ll add an axis with additional tick marks. Then, with the help of all the left/right/top/bottom arguments, I can adjust those individually with anything from element_text.

s_sec_axis <- s + scale_x_continuous(sec.axis = dup_axis()) + 
    scale_y_continuous(sec.axis = sec_axis(~ ., name = "rating", breaks = seq(0, 10, 0.5)))

s_sec_axis + 
    # All axes titles should be bold and size 20 by default
    theme(axis.title = element_text(face = "bold", size = 20),
          
          # The x axis titles will be red
          axis.title.x = element_text(color = "red"),
          # Remove the bottom one entirely
          axis.title.x.bottom = element_blank(),
          # The top one should be smaller, but still red and bold
          axis.title.x.top = element_text(size = 15),
          
          # The y axis titles will be blue
          axis.title.y = element_text(color = "blue"),
          # The left one will be horizontal and top-aligned
          axis.title.y.left = element_text(angle = 0, vjust = 1),
          # The right one will be bottom aligned and have the debug option on
          axis.title.y.right = element_text(hjust = 1, debug = TRUE))

The plot looks awful. But it illustrates the point that you can adjust anything you want. Generally, you’ll use these quite infrequently, if ever at all. Especially since secondary axes aren’t terribly common. But knowing that you can adjust them is handy.

It’s also important to keep in mind that you don’t need to use the more broad argument to use the more specific one. If you’re fine with the default and just want to change one thing, using a very specific argument is perfectly fine. Here, I’m happy with everything but I just want to add a smidge of extra spacing between the plot and the x-axis title.

s + theme(axis.title.x = element_text(margin = margin(t = 0.5, unit = "cm")))

The properties of hierarchy are super common in the theme function. I won’t do such detailed examples much now that you’ve seen how they’re done.

3.1.2 axis.text

The next part of a plot’s axis that you might want to control is the actual values running along the x- and y-axes. In other words, the highlighted portion in this plot.

s + theme(axis.text = element_text(debug = TRUE))

The kinds of things you can control are very similar to what you could control with the axis titles. x- and y-axes can be independent, and if you have secondary axes, the top/bottom and left/right options are available too. Here is the same ridiculous plot as above, only instead of modifying the titles I’ve modified the axis text.

s + scale_x_continuous(sec.axis = dup_axis()) + 
    scale_y_continuous(sec.axis = sec_axis(~ ., name = "rating", breaks = seq(0, 10, 0.5))) + 
    
    # All axes texts should be bold and size 20 by default
    theme(axis.text = element_text(face = "bold", size = 20),
          
          # The x axis texts will be red
          axis.text.x = element_text(color = "red"),
          # Remove the bottom one entirely
          axis.text.x.bottom = element_blank(),
          # The top one should be smaller, but still red and bold
          axis.text.x.top = element_text(size = 15),
          
          # The y axis texts will be blue
          axis.text.y = element_text(color = "blue"),
          # The left one will be horizontal and top-aligned
          axis.text.y.left = element_text(angle = 0, vjust = 1),
          # The right one will be bottom aligned and have the debug option on
          axis.text.y.right = element_text(hjust = 1, debug = TRUE))

3.1.3 axis.ticks

Todo.

3.1.4 axis.line

The axis.line family of arguments modifies the actual line that runs along the bottom and side of your plot.

m + theme(axis.line = element_line(color = "orange", size = 2))

Because these are lines rather than text, they take the element_line function.

m + theme(axis.line = element_line(color = "orange", size = 2),
          axis.line.y = element_line(linetype = "dashed"))

As always, you can remove the lines you don’t want with element_blank:

m + theme(axis.line = element_line(color = "orange", size = 2),
          axis.line.y = element_line(linetype = "dashed"),
          axis.line.x = element_blank())

And in the off chance that you do have secondary axes, you can modify those independently with axis.line.x.top, axis.line.y.right, etc.
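Here is a small sketch of that, styling the top and bottom x-axis lines differently. It assumes a stand-in plot from mtcars with a duplicated x-axis; the colors are arbitrary.

```r
library(ggplot2)

# Stand-in plot (mtcars ships with R) with a duplicated x-axis on top
p_sec <- ggplot(mtcars, aes(wt, mpg)) +
    geom_point() +
    scale_x_continuous(sec.axis = dup_axis()) +
    theme(
        # Bottom and top axis lines are styled independently
        axis.line.x.bottom = element_line(color = "orange"),
        axis.line.x.top = element_line(color = "purple", linetype = "dashed")
    )
p_sec
```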

3.2 Legend

The next plot element that we can control with theme is the legend. Looking at the list of arguments that pertain to the legend, you can see that there are a lot:

  1. legend.background
  2. legend.margin
  3. legend.spacing
    • legend.spacing.x
    • legend.spacing.y
  4. legend.key
    • legend.key.size
      • legend.key.height
      • legend.key.width
  5. legend.text
    • legend.text.align
  6. legend.title
    • legend.title.align
  7. legend.position
  8. legend.direction
  9. legend.justification
  10. legend.box
    • legend.box.just
    • legend.box.margin
    • legend.box.background
    • legend.box.spacing

I go through these in what seems to me to be a somewhat logical order, starting with ones that people might use the most and moving toward those that require certain plot elements for them to show up.

3.2.1 Legend Titles

The first thing you may find yourself wanting to change is the title of the legend. We’ve seen before how to change the text of the title, but these arguments will let you change the properties of that text. The first is legend.title, which takes the output of element_text, so you can change things like the color, size, font, and anything else. The second is legend.title.align, which is simply a number that says where to align it from 0 (left) to 1 (right). In this example, I make a bigger, dark gray, bold title that is centered.

m + theme(legend.title = element_text(face = "bold", color = "gray20", size = 15),
          legend.title.align = 0.5)

3.2.2 Legend Text

Once the title has been modified, you may want to change the text itself. There are two arguments, legend.text and legend.text.align, and they behave just like the title ones. So here’s a more whimsical example, just to show the possibilities that ggplot2 has to offer.

m + theme(legend.text = element_text(family = "Courier", color = "magenta", angle = 10),
          legend.text.align = 1)

But of course you may want to do something a little less silly in your plots, possibly in conjunction with the legend title.

m + theme(legend.title = element_text(family = "Palatino", face = "bold", color = "gray20", size = 15),
          legend.title.align = 1,
          legend.text = element_text(family = "Palatino", color = "gray30", size = 12),
          legend.text.align = 1)
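One caveat worth flagging: my understanding is that ggplot2 3.5.0 and later deprecate legend.title.align and legend.text.align in favor of setting hjust inside element_text. If you see a deprecation warning, something like the sketch below may be the equivalent. It is an assumption on my part that this matches your installed version, and the mtcars plot is a stand-in for the workshop data.

```r
library(ggplot2)

# Stand-in plot with a legend (mtcars is built into R)
p_legend <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
    geom_point() +
    theme(
        # hjust = 1 right-aligns the text, playing the role of the *.align arguments
        legend.title = element_text(face = "bold", color = "gray20",
                                    size = 15, hjust = 1),
        legend.text = element_text(color = "gray30", size = 12, hjust = 1)
    )
p_legend
```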

3.2.3 Legend Keys

Now that we’ve changed the text, the next thing you may want to do is change the size of the symbols that are on the legend, known as the “keys”. There are four elements. The first is legend.key which takes an element_rect output and controls the rectangle that the keys themselves sit in. By default, there is no border, but if you wanted to add one, for example, you could:

m + theme(legend.key = element_rect(color = "salmon"))

The other three key properties are legend.key.size, which has two more specific arguments, legend.key.height and legend.key.width.

m + theme(legend.key.size = unit(2, "cm"))
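The two more specific arguments work the same way but let the height and width differ. A minimal sketch, again using a stand-in plot from mtcars and sizes I picked arbitrarily:

```r
library(ggplot2)

# Stand-in plot with a legend (mtcars is built into R)
p_keys <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
    geom_point() +
    theme(
        # Tall, narrow keys: height and width are set independently
        legend.key.height = unit(1.5, "cm"),
        legend.key.width = unit(0.5, "cm")
    )
p_keys
```

Like the rest of the hierarchy, setting legend.key.size first and then overriding just one of the two dimensions should also work.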