15.6 Reordering Columns in a Data Frame

15.6.1 Problem

You want to change the order of columns in a data frame.

15.6.2 Solution

Use the select() function from dplyr.

ToothGrowth %>%
  select(dose, len, supp)
#>    dose  len supp
#> 1   0.5  4.2   VC
#> 2   0.5 11.5   VC
#>  ...<56 more rows>...
#> 59  2.0 29.4   OJ
#> 60  2.0 23.0   OJ

The new data frame will contain the columns you specified in select(), in the order you specified. Note that select() returns a new data frame, so if you want to change the original variable, you’ll need to save the new result over it.
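
As a minimal sketch of saving the result, the following works on a copy of the data set; the name tg is a hypothetical variable chosen for illustration, so the built-in ToothGrowth stays untouched:

```r
library(dplyr)

# tg is a hypothetical working copy of the built-in data set
tg <- ToothGrowth
names(tg)
#> [1] "len"  "supp" "dose"

# select() does not modify tg in place, so assign the result back to it
tg <- tg %>%
  select(dose, len, supp)
names(tg)
#> [1] "dose" "len"  "supp"
```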

15.6.3 Discussion

If you are only reordering a few variables and want to keep the rest of the variables in order, you can use everything() as a placeholder for all the remaining columns:

ToothGrowth %>%
  select(dose, everything())
#>    dose  len supp
#> 1   0.5  4.2   VC
#> 2   0.5 11.5   VC
#>  ...<56 more rows>...
#> 59  2.0 29.4   OJ
#> 60  2.0 23.0   OJ

See ?select_helpers for other ways to select columns. You can, for example, select columns by matching parts of the name.
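
For instance, matching on part of a name might look like the sketch below. It uses the starts_with() and contains() helpers; the result names by_prefix and by_fragment are hypothetical and exist only for this illustration:

```r
library(dplyr)

# Move every column whose name starts with "d" to the front
by_prefix <- ToothGrowth %>%
  select(starts_with("d"), everything())
names(by_prefix)
#> [1] "dose" "len"  "supp"

# Move every column whose name contains "pp" to the front
by_fragment <- ToothGrowth %>%
  select(contains("pp"), everything())
names(by_fragment)
#> [1] "supp" "len"  "dose"
```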

Using base R, you can also reorder columns by their name or numeric position. This returns a new data frame, which can be saved over the original.

ToothGrowth[c("dose", "len", "supp")]

ToothGrowth[c(3, 1, 2)]
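
If you want to move one column to the front without typing out the others, one possible base R approach is to build the vector of names with setdiff(). This is only a sketch; rest and reordered are hypothetical names used for illustration:

```r
# All column names except "dose", in their original order
rest <- setdiff(names(ToothGrowth), "dose")

# Put "dose" first, followed by the remaining columns
reordered <- ToothGrowth[c("dose", rest)]
names(reordered)
#> [1] "dose" "len"  "supp"
```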

In these examples, I used list-style indexing. A data frame is essentially a list of vectors, and indexing into it as a list will return another data frame. You can get the same effect with matrix-style indexing:

ToothGrowth[c("dose", "len", "supp")]   # List-style indexing

ToothGrowth[, c("dose", "len", "supp")] # Matrix-style indexing

In this case, both methods return the same result, a data frame. However, when retrieving a single column, list-style indexing will return a data frame, while matrix-style indexing will return a vector:

ToothGrowth["dose"]
#>    dose
#> 1   0.5
#> 2   0.5
#>  ...<56 more rows>...
#> 59  2.0
#> 60  2.0
ToothGrowth[, "dose"]
#>  [1] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
#> [19] 1.0 1.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 0.5 0.5 0.5 0.5 0.5 0.5
#> [37] 0.5 0.5 0.5 0.5 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 2.0 2.0 2.0 2.0
#> [55] 2.0 2.0 2.0 2.0 2.0 2.0

You can use drop=FALSE to ensure that matrix-style indexing returns a data frame, even for a single column:

ToothGrowth[, "dose", drop=FALSE]
#>    dose
#> 1   0.5
#> 2   0.5
#>  ...<56 more rows>...
#> 59  2.0
#> 60  2.0
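
To confirm what each form of indexing returns, you can check the results with is.data.frame(). In this sketch, the variable names as_list, as_matrix, and kept are hypothetical:

```r
as_list   <- ToothGrowth["dose"]                 # List-style indexing
as_matrix <- ToothGrowth[, "dose"]               # Matrix-style indexing
kept      <- ToothGrowth[, "dose", drop = FALSE] # Matrix-style, drop=FALSE

is.data.frame(as_list)
#> [1] TRUE
is.data.frame(as_matrix)
#> [1] FALSE
is.data.frame(kept)
#> [1] TRUE
```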