When the first edition of this book was published five years ago, the phrase “data science” had only recently entered the popular lexicon. Today, the phrase is unavoidable if you’re involved with the sciences, journalism, or high-tech industries. Many interrelated developments have made this possible: there’s a general awareness that understanding quantitative data has tangible benefits; there are better and more widely available educational resources about how to do data science; and finally, the tools have evolved, becoming easier to use and get started with.
The goal of this book is to help you understand your data by visualizing it, and to help you convey that understanding to others. You can think of data analysis as the process of transforming raw data into ideas in somebody’s mind. One of the key techniques for doing this is to create visualizations of the data. Our brains have highly developed visual pattern-detection systems, and data visualizations are an efficient way to use those systems to get quantitative information into a person’s mind.
Each recipe in this book lists a problem and a solution. In most cases, the solution I offer isn’t the only way to do things in R, but it is, in my opinion, the best way. One of the reasons for R’s popularity is that there are many available add-on packages, each of which adds some functionality to R. There are many packages for visualizing data in R, but this book primarily uses ggplot2.
This book isn’t meant to be a comprehensive manual of all the different ways of creating data visualizations in R, but hopefully it will help you figure out how to make the graphics you have in mind. Or, if you’re not sure what you want to make, browsing its pages may give you some ideas about what’s possible.