1  The R Environment

R is an environment for statistical modeling and graphics. Being an environment rather than a package is one of the main things that sets R apart from other statistical programs. A package gives us only a fixed set of tools. An environment lets us modify, combine, and even create tools to suit our specific needs.
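To make "creating a tool" concrete, here is a minimal sketch that combines two built-in functions, sd() and mean(), into a new one. The function name coef_var and the example numbers are made up for illustration.

```r
# A small custom tool: the coefficient of variation
# (standard deviation divided by the mean).
# The name `coef_var` is hypothetical, chosen for this example.
coef_var <- function(x) {
  sd(x) / mean(x)
}

# Try it on a few made-up numbers
coef_var(c(2, 4, 6))
```

Once defined, coef_var() can be used like any function that ships with R.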

1.1 Why Become a UseR?

R is free of charge and runs on Windows, macOS, and several Unix platforms. With R you can start with a data set like this:

  treat nitrogen block height weight leafarea shootarea flowers
1   tip   medium     1    7.5   7.62     11.7      31.9       1
2   tip   medium     1   10.7  12.14     14.1      46.0      10
3   tip   medium     1   11.2  12.76      7.1      66.7      10
4   tip   medium     1   10.4   8.78     11.9      20.3       1
5   tip   medium     1   10.4  13.58     14.5      26.9       4

and, in 8 lines of code or fewer, make a plot like this:
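As a rough sketch of what such code might look like, the example below types in only the five rows shown above and draws a scatterplot with base R graphics. The object name flower_data and the choice of variables to plot (weight against height) are assumptions made for illustration, not necessarily what the figure displays.

```r
# Enter the five rows shown above by hand.
# `flower_data` is a hypothetical name for this example.
flower_data <- data.frame(
  treat     = rep("tip", 5),
  nitrogen  = rep("medium", 5),
  block     = rep(1, 5),
  height    = c(7.5, 10.7, 11.2, 10.4, 10.4),
  weight    = c(7.62, 12.14, 12.76, 8.78, 13.58),
  leafarea  = c(11.7, 14.1, 7.1, 11.9, 14.5),
  shootarea = c(31.9, 46.0, 66.7, 20.3, 26.9),
  flowers   = c(1, 10, 10, 1, 4)
)

# Scatterplot of weight against height
plot(weight ~ height, data = flower_data,
     xlab = "Height", ylab = "Weight", pch = 19,
     main = "Weight versus height (first five rows)")
```

In practice you would read the full data set from a file instead of typing the values in.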

You can also find many powerful tools to fit statistical models (both Bayesian and frequentist), including:

  • Generalized linear models (including linear regression)
  • Survival analysis
  • Time series analysis
  • Multilevel models (also known as hierarchical models, or random- and mixed-effects models)
  • Classification and clustering
  • Sample size and power calculations
  • Multivariate analysis (e.g., factor analysis, principal component analysis, and structural equation modeling)
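To give a flavor of the first item in the list, here is a minimal sketch that fits a linear regression with lm() and then the same model with glm(). It uses mtcars, an example data set that ships with R; the choice of variables (fuel economy mpg and car weight wt) is only for illustration.

```r
# mtcars is a built-in example data set with 32 cars.
# Linear regression of fuel economy (mpg) on weight (wt)
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)

# The same model expressed as a generalized linear model
# with a Gaussian (normal) error distribution
fit_glm <- glm(mpg ~ wt, data = mtcars, family = gaussian())
coef(fit_glm)
```

Both calls estimate the same intercept and slope, because linear regression is the Gaussian special case of a generalized linear model.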

Even better, new tools become available in R all the time. As with other open-source programming languages, everyone can examine and contribute to R’s code. Users constantly publish their own code packages to expand R’s capabilities. As of March 2019, users had contributed over 13,700 packages to the Comprehensive R Archive Network (CRAN), many of which perform complex statistical routines that are not (and may never be) available in other statistical software systems.
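As a minimal sketch of how contributed packages are used, the code below shows the usual install-then-load pattern and then lists the packages already on your machine. The package lme4 (a CRAN package for mixed-effects models) is chosen only as an example, and the install line is commented out because it needs an internet connection.

```r
# Install a contributed package from CRAN (run once per machine).
# Commented out here because it requires an internet connection.
# install.packages("lme4")

# Load an installed package at the start of each session.
# library(lme4)

# See which packages are already installed on this machine
installed <- rownames(installed.packages())
length(installed)        # how many packages are installed
"stats" %in% installed   # the stats package ships with R
```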

On Windows, there are several ways to use R. The standard R graphical user interface (GUI) allows you to point and click to do many basic tasks. Another GUI is R Commander, developed by John Fox at McMaster University. R Commander displays the underlying R code for each analysis to help the user learn the programming language. Tinn-R is another GUI, from Jose Claudio Faria.

These GUIs are friendly and easy to grasp if you’re a beginner. But to use all of R’s capabilities you will need to do more than point and click. A more complete way of using R is through an integrated development environment (IDE), which, in short, helps you code. The most popular IDE for R is RStudio, which organizes the user’s screen into panes that display scripts, objects, graphics, and the R console.

In these notes, we will use RStudio a lot. The goal is for you to start taking full advantage of R’s capabilities.

1.2 Why Isn’t Everyone a UseR?

Many users of statistics don’t use R because they know how to use only one statistical software package, often the one taught in their first statistics course. In the past, R was rarely that first package, but nowadays more schools are teaching it.

Some people have used R but struggle to get comfortable and productive with it, especially if they have little coding experience. Typing commands explicitly is more difficult than pointing and clicking. Also, each package has its own rules to learn. We can find a lot of good help for popular packages written by professional developers, but not so much for smaller packages written by ordinary users. Worst of all, some of the messages R displays when we make a mistake are uninformative, so fixing the problem can be difficult.
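As a small illustration of a terse error message, the sketch below tries to add a number to a piece of text. The call is wrapped in tryCatch() so that the message is captured and printed instead of stopping the script. The exact wording of the message may vary with your R version and language settings.

```r
# "1" (in quotes) is text, not a number, so this addition fails.
# tryCatch() captures the error message instead of halting.
msg <- tryCatch(
  "1" + 1,
  error = function(e) conditionMessage(e)
)
print(msg)

# A first debugging step: check what kind of object you have
class("1")
class(1)
```

The message does not say which value is the problem, which is why checking the class of each object is often a useful first move.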

Don’t get frustrated! You don’t have to be an expert programmer to use R. The benefits are worth spending some time up front.

1.3 Suggestions for Learning R

1.4 How to Get R

On the R Project web page (https://www.r-project.org/), the most important link is on the left-hand side of the screen, under the “Download” heading. Click on the CRAN (Comprehensive R Archive Network) link and choose one of the U.S. mirrors. You will be taken to the page that you will use to download everything R-related.

Once you find the CRAN web page, take the following steps to obtain R:

  1. Click on the “Download R for X” link, where X is your operating system (Linux, macOS, or Windows).
  2. If you are using Windows, click on the “base” subdirectory. This will allow you to download the base R packages.
  3. Click the “Download R 4.X.X for Windows” link. R is updated quite frequently. At the time of this writing, version 4.4.1 is available. Save the .exe file somewhere on your computer.
  4. Double-click on the .exe file once it is downloaded. An installation window will appear to guide you through the setup of R on your machine.
  5. Once you finish, you should have an R icon on your desktop that gives you a shortcut to the R system.
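To confirm that the installation worked, one simple check is to open R and type a few commands at the console. The sketch below is only a suggestion; the version and platform details printed will depend on your installation.

```r
# Show the version of R that is running
R.version.string

# Show the R version, operating system, and loaded packages
sessionInfo()

# A quick arithmetic check that the console responds
1 + 1
```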

1.4.1 How to Get RStudio

RStudio is already installed on the lab workstations. The following information is useful if you need to install RStudio on another machine. You must install R before you install RStudio. Otherwise, RStudio will not work.

Visit https://posit.co/downloads/ and click on “Download RStudio”. Choose the version for your operating system (Linux, macOS, or Windows) and download the installer. Then double-click on the installer file (an .exe file on Windows) and follow the instructions on the screen to install RStudio.