## 15.14 Recoding a Continuous Variable to a Categorical Variable

### 15.14.1 Problem

You want to recode a continuous variable to a categorical variable.

### 15.14.2 Solution

Use the `cut()` function. In this example, we’ll use the `PlantGrowth` data set and recode the continuous variable `weight` into a categorical variable, `wtclass`:

```r
pg <- PlantGrowth
pg$wtclass <- cut(pg$weight, breaks = c(0, 5, 6, Inf))
pg
#>    weight group wtclass
#> 1    4.17  ctrl   (0,5]
#> 2    5.58  ctrl   (5,6]
#>  ...<26 more rows>...
#> 29   5.80  trt2   (5,6]
#> 30   5.26  trt2   (5,6]
```

### 15.14.3 Discussion

For three categories we specify four bounds, which can include `Inf` and `-Inf`. If a data value falls outside of the specified bounds, it’s categorized as `NA`. The result of `cut()` is a factor, and you can see from the example that the factor levels are named after the bounds.
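As a small illustration of the out-of-bounds behavior, the sketch below uses a made-up vector `x` (not part of `PlantGrowth`) with one value below the lowest break and one above the highest. The names `x` and `xclass` are only for this example:

```r
# Hypothetical values, chosen so that two of them fall outside the breaks
x <- c(-1, 2, 5.5, 7, 12)
xclass <- cut(x, breaks = c(0, 5, 6, 10))

is.factor(xclass)   # cut() returns a factor
levels(xclass)      # the level names are built from the breaks
is.na(xclass)       # TRUE for -1 and 12, which lie outside (0, 10]
```

Using `-Inf` and `Inf` as the outermost breaks is one way to avoid these `NA` values when every observation should land in some category.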

To change the names of the levels, set the labels:

```r
pg$wtclass <- cut(pg$weight, breaks = c(0, 5, 6, Inf),
                  labels = c("small", "medium", "large"))
pg
#>    weight group wtclass
#> 1    4.17  ctrl   small
#> 2    5.58  ctrl  medium
#>  ...<26 more rows>...
#> 29   5.80  trt2  medium
#> 30   5.26  trt2  medium
```
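One way to check a recoding like this is to count the rows that ended up in each level with `table()`. This is a minimal sketch that rebuilds the labeled `wtclass` column so it can run on its own; the `counts` name is just for illustration:

```r
pg <- PlantGrowth
pg$wtclass <- cut(pg$weight, breaks = c(0, 5, 6, Inf),
                  labels = c("small", "medium", "large"))

# Number of plants in each weight class
counts <- table(pg$wtclass)
counts

# Cross-tabulate treatment group against weight class
table(pg$group, pg$wtclass)
```

Note that `labels` needs one entry per interval, that is, one fewer than the number of breaks.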

As the level names indicate, each interval is by default open on the left and closed on the right: it excludes its lower bound and includes its upper bound. To make the lowest interval include its lower bound as well, set `include.lowest = TRUE`. In this example, a weight of exactly 0 would then go into the small category; otherwise, it would be coded as `NA`.
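The effect of `include.lowest` is easiest to see on a value that equals the lowest break. `PlantGrowth` has no weight of 0, so this sketch uses a small made-up vector; `x`, `default_cut`, and `with_lowest` are hypothetical names:

```r
# Hypothetical values, including one equal to the lowest break
x <- c(0, 2.5, 5)

# Default: the first interval is (0,5], so 0 is not covered
default_cut <- cut(x, breaks = c(0, 5, 6, Inf))
default_cut

# With include.lowest = TRUE the first interval becomes [0,5]
with_lowest <- cut(x, breaks = c(0, 5, 6, Inf), include.lowest = TRUE)
with_lowest
```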

If you want the categories to be closed on the left and open on the right, set `right = FALSE`:

```r
cut(pg$weight, breaks = c(0, 5, 6, Inf), right = FALSE)
#>  [1] [0,5)   [5,6)   [5,6)   [6,Inf) [0,5)   [0,5)   [5,6)   [0,5)   [5,6)
#> [10] [5,6)   [0,5)   [0,5)   [0,5)   [0,5)   [5,6)   [0,5)   [6,Inf) [0,5)
#> [19] [0,5)   [0,5)   [6,Inf) [5,6)   [5,6)   [5,6)   [5,6)   [5,6)   [0,5)
#> [28] [6,Inf) [5,6)   [5,6)
#> Levels: [0,5) [5,6) [6,Inf)
```
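The choice of `right` only matters for values that fall exactly on a break. None of the weights in `PlantGrowth` do, so the following sketch compares the two settings on a made-up vector; `x`, `right_closed`, and `left_closed` are hypothetical names:

```r
# Hypothetical values that sit exactly on the inner breaks
x <- c(5, 6)

right_closed <- cut(x, breaks = c(0, 5, 6, Inf))                 # default, right = TRUE
left_closed  <- cut(x, breaks = c(0, 5, 6, Inf), right = FALSE)

# Side-by-side comparison: each value moves up one category when right = FALSE
data.frame(x, right_closed, left_closed)
```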