Introduction to R and RStudio
Most of your statistics modules will require the use of the R programming language. This document offers a brief introduction to R.
1. What is R?
R is a programming language for data analysis and graphics (Download). It is the leading programming language among academic statisticians.
Some of the key benefits of using R are:
- A large ecosystem of packages implementing the latest research. There are currently more than 20K contributed packages on CRAN (the Comprehensive R Archive Network), all available in an open-source format. This facilitates the study, testing and comparison of the various methods for your data analysis project. Note, however, that there is no rigorous testing or verification of the distributed packages. This means that some packages might contain bugs (I have found many bugs myself), so use them with care. If you do find a bug, contact the package's maintainer to let them know. Some examples of widely used packages are:
  - Matrix: Sparse matrix representation and computation.
  - dplyr: Data manipulation.
  - ggplot2: Elegant data visualisations.
  - Rcpp: C++ integration.
  - rmarkdown: Literate programming using R.
- Interoperability with the compiled languages C, C++ and Fortran. R provides convenient functions for calling code written in those languages, which can speed up calculations considerably. R also exposes some of its own functions, e.g., its random number generators, for use within C code. See Chapters 5 and 6 of the manual Writing R Extensions on how to integrate compiled code with R.
- Strong community support. There are many resources for learning R at various levels written by R users (see 3 below). Furthermore, you can ask questions and get help at various forums such as StackOverflow and the r-help mailing list.
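As a small illustration of working with a contributed package, the sketch below loads Matrix (one of the packages listed above) and builds a sparse matrix. It assumes Matrix is already installed, which is the case for most standard R installations; the object names A, v and y are made up for this example.

```r
# Assumption: the Matrix package is installed. If it is not, install it
# from CRAN once with: install.packages("Matrix")
library(Matrix)

# Build a 3 x 3 sparse matrix from (row, column, value) triplets.
# Only the three non-zero entries are stored.
A <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 3), x = c(2, 4, 8))

# Inspect the matrix: its dimensions and its number of non-zero entries.
dim(A)
nnzero(A)

# Multiply by a vector of ones and convert the result to a plain numeric vector.
v <- c(1, 1, 1)
y <- as.numeric(A %*% v)
y
```

The same pattern (install once with install.packages(), then load with library() in each session) applies to any other CRAN package.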
2. RStudio
It is not advisable to write code directly in the R console, as this makes saving and editing code difficult. RStudio is an integrated development environment (IDE) for R (and Python). If you are a beginner R user, it is a great environment to use, as it offers a rich set of features to help you get going. In my opinion, it is not an ideal environment for serious R programming: some of its features soon become a distraction, and it does not allow running multiple R sessions in the same window.
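To make the point about saving code concrete, here is a minimal sketch of keeping code in a script file and running it with source(), rather than typing it into the console. The file is written to a temporary location only to keep the example self-contained; in practice you would create and save the script in an editor such as RStudio. The variable names are illustrative.

```r
# Create a small script file. A temporary file is used here purely for
# illustration; normally you would save e.g. "analysis.R" from your editor.
script <- tempfile(fileext = ".R")
writeLines(c("x <- c(1, 2, 3, 4)",
             "m <- mean(x)"),
           script)

# Run every line of the script. By default, the objects it creates
# (x and m) end up in the global workspace.
source(script)
m
```

Because the code lives in a file, it can be edited, rerun and shared, which is not possible with commands typed once into the console.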
The RStudio interface includes:
- syntax-highlighting editor that supports direct code execution,
- built-in R console,
- workspace viewer,
- integrated help and plotting windows.
Other notable features of RStudio are:
- quick navigation to function definitions,
- interactive debugger to diagnose and fix errors,
- management of multiple files using projects,
- integrated version control (enable when creating a project or through project management),
- programmatic interaction with RStudio from R code through the rstudioapi package.
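As a rough sketch of what the rstudioapi package makes possible, the code below queries RStudio from R. It assumes rstudioapi is installed and is guarded so that it does nothing harmful when run outside RStudio; the variable name in_rstudio is made up for this example.

```r
# Check that the rstudioapi package is installed AND that this R session
# is actually running inside RStudio before calling any of its functions.
in_rstudio <- requireNamespace("rstudioapi", quietly = TRUE) &&
  rstudioapi::isAvailable()

if (in_rstudio) {
  # Version of the running RStudio instance.
  print(rstudioapi::versionInfo()$version)

  # Path of the currently active document (an empty string if the
  # document has not been saved or the console has focus).
  print(rstudioapi::getActiveDocumentContext()$path)
} else {
  message("Not running inside RStudio; skipping rstudioapi calls.")
}
```

Guarding with isAvailable() is a sensible habit for any script that might also be run from a plain R console or a batch job.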
You can configure RStudio to suit your preferences via the menu: Tools > Global Options.