Introduction to R and RStudio

Most of your statistics modules will require the use of the R programming language. This document offers a brief introduction to R.

1. What is R?

R is a programming language for data analysis and graphics (Download). It is the leading programming language among academic statisticians.

Some of the key benefits of using R are:

  1. A large ecosystem of packages implementing the latest research. There are currently more than 20,000 contributed packages on CRAN (the Comprehensive R Archive Network), all available in an open-source format. This makes it easier to study, test and compare the various methods for your data analysis project. Note, however, that the distributed packages are not rigorously tested or verified. This means that some packages might contain bugs (I have found many bugs myself), so use them with care. If you do find a bug, contact the package's maintainer to let them know. Some examples of widely used packages are:
    • Matrix: sparse matrix representation and computation.
    • dplyr: data manipulation.
    • ggplot2: elegant data visualisations.
    • Rcpp: C++ integration.
    • rmarkdown: literate programming using R.
  2. Interoperability with the compiled languages C, C++ and Fortran. R provides convenient functions for calling code written in those languages, which can speed up calculations. R also makes some of its own functions, e.g., its random number generators, accessible from C. See Chapters 5 and 6 of the document Writing R Extensions for how to integrate compiled code with R.
  3. Strong community support. There are many resources for learning R at various levels, written by R users (see Section 3 below). Furthermore, you can ask questions and get help on various forums such as Stack Overflow and the r-help mailing list.
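
As a minimal sketch of how a contributed package is used in practice, the code below loads Matrix (one of the packages listed above, assumed here to be installed, as it normally ships with R) and solves a small sparse linear system. The matrix and right-hand side are made-up illustration values. The commented lines show the usual way to install a package from CRAN and to open a help page.

```r
# Install a package from CRAN once (needs an internet connection):
# install.packages("Matrix")

# Load the package in each new R session before using it.
library(Matrix)

# Build a 3 x 3 sparse diagonal matrix; only the non-zero entries are stored.
A <- sparseMatrix(i = 1:3, j = 1:3, x = c(2, 3, 4))
b <- c(2, 6, 12)  # illustrative right-hand side

# Solve the linear system A x = b.
x <- solve(A, b)
print(as.numeric(x))

# Open the documentation of a function:
# help("sparseMatrix")
```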

2. RStudio

It is not advisable to write code directly in the R console, as this makes saving and editing code difficult. RStudio is an integrated development environment (IDE) for R (and Python). If you are a beginner R user, then it is a great environment to use, as it offers a rich set of features to help you get going. In my opinion, it is not an ideal environment for serious R programming, as some of its features soon become a distraction and it does not allow running multiple R sessions in the same window.
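
To illustrate the point about keeping code in a file rather than typing it into the console, here is a small sketch that saves two lines of R code in a script and runs it with source(). The temporary file and the variable names x and y are only placeholders for this example; in practice you would create and edit the script in the RStudio editor.

```r
# Write a two-line script to a temporary file (a stand-in for a script
# that you would normally save from the editor).
script <- tempfile(fileext = ".R")
writeLines(c("x <- c(1, 4, 9)", "y <- sqrt(x)"), script)

# Run every line of the script in the current R session.
source(script)

# The objects created by the script are now available.
print(y)
```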

The RStudio interface includes:

  • syntax-highlighting editor that supports direct code execution,
  • built-in R console,
  • workspace viewer,
  • integrated help and plotting windows.

Other notable features of RStudio are:

  • quick navigation to function definitions,
  • interactive debugger to diagnose and fix errors,
  • management of multiple files using projects,
  • integrated version control (which can be enabled when creating a project or through project management),
  • interaction with RStudio from R code through the rstudioapi package.
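
As a rough sketch of the last feature, the code below queries RStudio from R code using the rstudioapi package. It assumes that rstudioapi is installed; when the code is run outside RStudio (or without the package), the calls are skipped. The variable in_rstudio is a name chosen for this example only.

```r
# Check whether rstudioapi is installed and the code is running in RStudio.
in_rstudio <- requireNamespace("rstudioapi", quietly = TRUE) &&
  rstudioapi::isAvailable()

if (in_rstudio) {
  # Report the version of RStudio in use.
  print(rstudioapi::versionInfo()$version)
  # Path of the file open in the source editor, if any.
  print(rstudioapi::getSourceEditorContext()$path)
} else {
  message("Not running inside RStudio; rstudioapi calls skipped.")
}
```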

You can configure RStudio to suit your preferences via the menu: Tools > Global Options.

3. Further reading

Author: Vangelis Evangelou

Created: 2024-10-01 Tue 08:08