1.5 Loading Data from an Excel File

1.5.1 Problem

You want to load data from an Excel file.

1.5.2 Solution

The readxl package has the function read_excel() for reading .xls and .xlsx files from Excel. This will read the first sheet of an Excel spreadsheet:

# Only need to install once
install.packages("readxl")

library(readxl)
data <- read_excel("datafile.xlsx", 1)

1.5.3 Discussion

With read_excel(), you can load from other sheets by specifying a number for sheetIndex or a name for sheetName:

data <- read_excel("datafile.xls", sheet = 2)

data <- read_excel("datafile.xls", sheet = "Revenues")

read_excel() uses the first row of the spreadsheet for column names. If you don’t want to use that row for column names, use col_names = FALSE. The columns will instead be named X1, X2, and so on.

By default, read_excel() will infer the type of each column, but if you want to specify the type of each column, you can use the col_types argument. You can also drop columns if you specify the type as "blank".

# Drop the first column, and specify the types of the next three columns
data <- read_excel("datafile.xls", col_types = c("blank", "text", "date", "numeric"))

1.5.4 See Also

See ?read_excel for more options controlling the reading of these files.

There are other packages for reading Excel files. The gdata package has a function read.xls() for reading in .xls files, and the xlsx package has a function read.xlsx() for reading in .xlsx files. They require external software to be installed on your computer: read.xls() requires Java, and read.xlsx() requires Perl.