Chapter 1 Fundamentals

In this Chapter you will learn the user interface and basic functions in R. The user interface, as well as the basic functions for mathematical calculations. Further, you will be introduced to one of the most important logical operations: the if else function. The goal of this chapter is to give you an overview of RStudio and to give you a feeling and intuition for R by introducing you to mathematical concepts you already know. Do not forget, nobody is perfect, everyone has problems in the beginning and they can be frustrating, but that is most normal thing, when learning a new programming language. Have fun!

1.1 Getting familiar with RStudio and establishing a Workflow

1.1.1 The user interface

RStudio has four main components and each of them has a purpose:

Figure 1: Standard RStudio Interface
Figure 1: Standard RStudio Interface
  • On the top left you can see the Source window. In this window you write and run your code. The console opens after you choose, which type of Script you want to use:

  • The standard R Script (Ctrl + Shift + N): In this file you can just write your code and run it. You can also make commenting lines with putting a # in front of it.

  • The R Markdown File: In contrast to the standard R Script not everything written in a R Markdown File is automatically considered as Code (if not written after an #). A R Markdown File has options to design an HTML Output and a PDF Output. This can increase your efficiency in terms of working with partners. Further, you can write your code in chunks and have plenty options to work with those chunks. In QM and AQM you will exclusively use R Markdown and over time you will see the advantages of R Markdown.

  • On the top right you can see the Environment. Here you have an overview over all the objects currently loaded in your environment. You will learn more about objects later in the course.

  • On the bottom left you have the Console: This is where the results appear once you execute your R-code. You can also directly type R-code into the console and execute it. However, we cannot save this code which is why we usually work in the Editor.

  • On the bottom right you can see the Output: Plots (if you use the standard R Script) and other things will appear here, don’t worry too much about it for the moment.

1.1.2 How to design it according to your preferences (optional)

  • You can change the appearance of RStudio: Tools > Global Options > Appearance. Here you can change the zoom of the Console, the font size of the your letters and the style of your code. Further, you can change RStudio to a dark theme. Play around with it and find out how it is the most comfortable for you and of course you can change it over time.

  • You can change the Pane Layout meaning where the four components of RStudio should be: RStudio: Tools > Global Options > Pane Layout.

  • You should use key shortcuts. There are pre-installed short-cuts of RStudio, which are really helpful. You should get familiar with them. Tools > Key Shortcut Helps or directly with Ctrl + Shift + K. You can add your own Key Shortcuts and we will do that in this course.

1.2 Lets get started: R as a fancy calculator

1.2.1 Mathmetical Operations in R

You can use R for basic calculations:

#You can use hashtags to comment

1 + 1 #Addition
## [1] 2
1 - 1 #Substraction
## [1] 0
1 * 1 #Multiplication 
## [1] 1
1 / 1 #Division
## [1] 1
2^(1 / 2) #Mixed Terms
## [1] 1.414214

You can use R also for TRUE/FALSE statements for logical statements via the comparison operators:

  • > greater than
  • < smaller than
  • == equal
  • != not equal
  • >= greater than or equal to
  • <= less than or equal to
1 < 3 #TRUE
## [1] TRUE
5 >= 8 #FALSE
## [1] FALSE
11 != 10 #TRUE
## [1] TRUE
22 == 22 #TRUE
## [1] TRUE
7 < 3 #FALSE
## [1] FALSE
5 <= 2+3 #TRUE
## [1] TRUE

You can also use logical operators: - & element-wise AND operator. It returns TRUE if both elements are true - | element-wise OR operator. It returns TRUE if one of the statements is TRUE - ! Logical NOT operator. It returns FALSE if statement is TRUE

5 & 4 < 8 #TRUE
## [1] TRUE
5 | 4 < 8 #TRUE
## [1] TRUE
!5 > 2 #FALSE
## [1] FALSE

1.2.2 Using Commands

For more advanced operations you can and should use functions. Functios are mostly a word with brackets. Within a function, there are so called arguments. These arguments specify the function and give it the information it needs + optional information.

e.g.

  • sqrt(x) taking the square root, x is any number.
  • exp(x) the constant e, x is any number.
  • mean(x) for the mean, x is any number.
  • median(x) for the median, x is any number.
sqrt(x = 36) #square root

exp(x = 0) # exponential of 1

print("U can stay under my umbrella") #with this command you can print what you want 

I explicitly choose easy examples, but sometimes commands can be complicated, because they demand special inputs. To get help, our first step should be to ask R itself:

  • You can put a question mark in front of a function and execute it. In your output under the tab “Help” an explanation with examples will pop up and explain the function.

  • You can also call the help() function and put the function you want to learn about without brackets in the help() function.

?exp() #questionmark
help(exp) #help command

1.2.3 Assigning objects and printing them

Most of the time you will store your results in objects.

  • You can do so by using the <- operator. Afterwards you can work with the objects. Let us assign numbers to out objects.

  • The objects will be saved and you can see them in your environment.

Pizza <- 7.50 #pizza object

Cola <- 3.50 #cola object

Pizza + Cola #addition of objects
## [1] 11

We can then go on work with the content of the objects.

  • For beginners, let us add the object Pizza and the object Cola together.

  • We can also save the result of those object in a new object and work with this object and this can go on forever technically.

Offer <- Pizza + Cola #assigning addition 

Offer #printing the object
## [1] 11
Offer^2 #square the term with ^2
## [1] 121

1.2.4 Vectors

In R a vector contains more than one information.

  • You use the c() command, and divide the information with a ,. Let us compare food prices:
food <- c("Pizza", "Kebab", "Curry", 
          "Fish", "Burrito") #food vector

print(food) #printing it 
## [1] "Pizza"   "Kebab"   "Curry"   "Fish"    "Burrito"
prices <- c(7.50, 6.00, 8.50, 3.00, 11.00) #price vector

print(prices) #printing it
## [1]  7.5  6.0  8.5  3.0 11.0
cola_prices <- c(3.50, 3, 4, 2.50, 3) #cola prices vector

print(cola_prices) #printing it
## [1] 3.5 3.0 4.0 2.5 3.0

Now we can calculate the prices for a decent meal in one step by adding the two vectors together. The vector prices_combined will give us the prices for a meal plus a cola:

prices_combined <- prices + cola_prices #prices combined

print(prices_combined) #printing it
## [1] 11.0  9.0 12.5  5.5 14.0

1.2.5 Object Classes

Objects can contain information of different data types:

Numeric Numbers c(1, 2.4, 3.14, 4)
Character Text c("1", "blue", "fun", "monster")
Logical True or false c(TRUE, FALSE, TRUE, FALSE)
Factor Category c("Strongly disagree",``"Neutral",``"Agree")

For data analysis commands sometimes require special object classes. With the class() command we can find out the class. And with as.numeric for example we can change classes by assigning it to itself, by it is common to assign it to a new object:

#Let us find out the classes 
class(prices) #numeric
## [1] "numeric"
class(food) #character
## [1] "character"
class(cola_prices) #numeric
## [1] "numeric"

We can also change the classes of variables. To do so, we can use as.factor(), as.numeric(), as.character() and so forth. You can do that for every class. Let us change the cola_prices to a vector.

  • To do so, we change call as.character() and put the object in it. Then we assign it to another object called cola_prices_character. This object will have the class "character".
# We want the cola_prices vector to be a character 
cola_prices_character <- as.character(cola_prices)

#Checking it
class(cola_prices_character)
## [1] "character"
print(cola_prices_character)
## [1] "3.5" "3"   "4"   "2.5" "3"

1.2.6 Matrices

1.2.6.1 Making Matrices

There are different ways of building a matrix. Let us start by just binding the vectors as columns together. You can do that cbind() if you want to bind columns together. rbind() is therefore the command to bind rows together.

  • You have to call cbind() and include the vectors you want to bind together

  • The same with rbind()

price_index <- cbind(food, 
                     prices,
                     cola_prices) #We bind it together

print(price_index) #We print it 
##      food      prices cola_prices
## [1,] "Pizza"   "7.5"  "3.5"      
## [2,] "Kebab"   "6"    "3"        
## [3,] "Curry"   "8.5"  "4"        
## [4,] "Fish"    "3"    "2.5"      
## [5,] "Burrito" "11"   "3"
#Let's do the same by binding the rows together

price_index2 <- rbind(food, 
                     prices,
                     cola_prices) #We bind it together

print(price_index2) #We print it 
##             [,1]    [,2]    [,3]    [,4]   [,5]     
## food        "Pizza" "Kebab" "Curry" "Fish" "Burrito"
## prices      "7.5"   "6"     "8.5"   "3"    "11"     
## cola_prices "3.5"   "3"     "4"     "2.5"  "3"

We can also generate a matrix by simulating it. There are a lot of things to care about, simply because a lot is possible:

  • You first call the matrix() command.

  • The first argument is an interval of numbers. These are our total observations if you want.

  • The second and third argument are our number of rows nrow() and our number of columns ncol(). If you multiply them, they have to result in the number of observations you defined before. We have 20 numbers and 4 multiplied by 5 is 20, thus this is fine.

  • Lastly you have to define if the numbers should be included from left to right, thus by row or if they should be ordered from top to bottom. We go through both examples and then it should become clear.

  • The dim() command is helpful, because it shows us to inspect the dimensions.

# Create a matrix
matrix_example <- matrix(1:20, nrow = 4, ncol = 5, byrow = T) #

# Checking it
print(matrix_example)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    2    3    4    5
## [2,]    6    7    8    9   10
## [3,]   11   12   13   14   15
## [4,]   16   17   18   19   20
# Checking the dimensions
dim(matrix_example)
## [1] 4 5
# What happens if byrow is set to FALSE?
matrix_example2 <- matrix(1:20, nrow = 4, ncol = 5, byrow = F)

# Checking it 
print(matrix_example2)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    5    9   13   17
## [2,]    2    6   10   14   18
## [3,]    3    7   11   15   19
## [4,]    4    8   12   16   20
dim(matrix_example2)
## [1] 4 5

1.2.6.2 Working with Matrices

We want to work with matrices. The first tool to learn is how to inspect the matrices:

  • First, you call the matrix. In our example, matrix_example with square brackets.

  • In these brackets you can call single rows by entering the number of the row you want to inspect, and then you put a comma behind it to signal R that you want to have the row.

  • If you put a number behind the comma, you tell R to give you the column with the number.

  • If you want a single number then you have to define a row and a column.

#Let us get used to work with objects
row <- 1 
column <- 1 

#Printing it
print(object1 <- matrix_example[row, ]) #printing the first row
## [1] 1 2 3 4 5
print(object2 <- matrix_example[, column]) #printing the first column 
## [1]  1  6 11 16
print(object3 <- matrix_example[row, column]) #printing first row and column
## [1] 1
print(matrix_example) #printing the matrix
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    2    3    4    5
## [2,]    6    7    8    9   10
## [3,]   11   12   13   14   15
## [4,]   16   17   18   19   20
#More Information 

nrow(matrix_example) #How many rows
## [1] 4
ncol(matrix_example) #How many columns
## [1] 5
dim(matrix_example) #Overall dimensions
## [1] 4 5

1.2.7 Data Frames

The next type of data storage are data frames. These are the standard storage objects for data in R. The reason is simple, matrices are only able to contain one type of variables (numeric, character, etc…).

  • In this example, we have a dataset with the variable country (character), the capital (character), the population in mio (numeric), and a if the country is in europe (logical)
# making an example df
df_example <- data.frame(
    country = c("Austria", "England", 
                "Brazil", "Germany"), 
    capital = c("Vienna", "London", 
                "Brasilia", "Berlin"), 
    pop = c(9.04, 55.98, 215.3, 83.8),
    europe = c(TRUE, FALSE, TRUE, TRUE)
  )

#Checking it
print(df_example)
##   country  capital    pop europe
## 1 Austria   Vienna   9.04   TRUE
## 2 England   London  55.98  FALSE
## 3  Brazil Brasilia 215.30   TRUE
## 4 Germany   Berlin  83.80   TRUE
  • If you want to work with data frames you can call columns by calling the name of the data frame putting a dollar sign behind it and then calling the name of the column, you want to inspect:
df_example$country  
## [1] "Austria" "England" "Brazil"  "Germany"

You see that you get the vector of the column. We can go further and work more with data frames

  • Let us call a single observation in the data frame: You call the column you want to inspect the same way as before, this time you also put square brackets behind and call the number of the observation in the column.

  • We want to have “Brazil”. “Brazil” is the third element of the df_example$country, thus we include 3 in the square brackets:

df_example$country[3]
## [1] "Brazil"

The last thing to do with data frames is to get columns based on conditions. In the next part of this chapter we will get a method to do so, but we can do so as well with the data frame. Imagine you want to have a vector of df$country, but only with countries that have a df$pop bigger than 60. Meaning a population bigger than 60.

  • To do so we call the df$country column and put square brackets behind it

  • Further we need to call in the square brackets the condition. Thus, the variable df_example$pop and set it to bigger than 60. Et voila, we get the columns of df_example$country bigger than 60.

df_example$country[df_example$pop > 60]
## [1] "Brazil"  "Germany"

1.3 ifelse statements and the ifelse() function

One of the most frequently used and therefore most important logical operators in programming in general are ifelse commands. You probably know them from Excel, but every programming language includes them, because of their usefulness. A quick reminder of their logic.

1.3.1 If else statement with only one condition

First, you define an if statement. The if statement is a logical statement for example bigger, smaller than X. The logical statement is your test expression. With this you tell the program to test the condition for an object, in excel a cell or whatever object in your programming language can be tested. The program then checks if the condition is TRUE or FALSE. Until this point every if else is the same and now we will look at some variations:

Figure 2: Logic of if-else statements
Figure 2: Logic of if-else statements

Figure 2 shows the logic of an if else statement with one condition. We say that if the test expression is true something should happen. If not nothing happens, because nothing was defined.

  • We want R to judge our grades in school. For this reason we define an object called grade and assign 1.7 to it.

  • In the next step we call if and open a bracket. We include the test expression in the bracket, which shall be if grade, our grade is smaller than 2

  • Then we open curly brackets to define what should happen if the test expression is TRUE. Meaning what should happen if the grade is better than 2. We define print(“Good Job”).

  • Here is the general logic of a if else statement:

    if (test expression) {

    Body of if

    }

grade <- 1.7 

if(grade < 2) {
  print("Good Job")
} # You write down if(test expression), and then the {body expression}, thus the body expression in fancy brackets.
## [1] "Good Job"
grade <- 2.5 

if(grade < 2) {
  print("Good Job")
} #Since the condition is not met, nothing happens

As you can see if the grade is bigger than 2 nothing happens. Otherwise “Good Job” is printed as intended.

1.3.2 if statements with else condition

An if command on its own is quite useless. Things get interesting, when we also have a body for the else command. What happens now, is that instead of nothing being printed when the test expression is FALSE. Then the body of else is printed.

Figure 3: Standard RStudio Interface
Figure 3: Standard RStudio Interface

Now we just add a body of else into the equation, in R that means we need to define it:

  • We take the if else statement of the chunk before and now we add an else and open curly brackets, where we define the body of else.
grade <- 3.3 #assigning a grade

if (grade <= 2) {
  print("Good Job")
} else {
  print("Life goes on")
} 
## [1] "Life goes on"
grade <- 1.3 #assigning a grade

if (grade <= 2) {
  print("Good Job")
} else {
  print("Life goes on")
}
## [1] "Good Job"

We see that in any case a result is printed.

1.3.3 The ifelse() command

What you see above is the manual way of coding an if else condition. But there is also an ifelse() function in R, where you do not have different colors and fancy brackets everywhere.

  • you call ifelse()

  • The first argument is your test expression

  • The second the body of if

  • The third the body of else

ifelse(test expression, body of if, body of else)

ifelse(grade <=2, "Good Job", "Life goes on") #ifelse command
## [1] "Good Job"

1.3.4 if else ladders/ if else with multiple conditions

The world is a complex place. Sometimes things are not black and white. Thus we have to be prepared for different scenarios. To make it less melodramatic, if else statements are also possible with several bodies of else. Let us take the example that we want to have a statement of R regarding our grade for different number intervals. What R does is to check the first condition, afterward he checks the second test expression and so forth until he finds a hit and prints it. If no expression is found, you have to define a last else expression.

  • You add else if with curly brackets where the test expression is included.

  • Lastly, you define an else with curly brackets for the case there is not test expression

grade <- 3.3 #Assigning a grade

if (grade == 1.0) {
  print("Amazing") 
} else if (grade > 1.0 & grade <= 2.0) {
  print("Good Job")
} else if (grade > 2.0 & grade <= 3.0) {
  print("OK")
} else if (grade > 3.0 & grade <= 4.0) {
  print("Life goes on")
} else {
  print("No expression found")
}
## [1] "Life goes on"
grade <- 1.7 #Assigning a grade

if (grade == 1.0) {
  print("Amazing") 
} else if (grade > 1.0 & grade <= 2.0) {
  print("Good Job")
} else if (grade > 2.0 & grade <= 3.0) {
  print("OK")
} else if (grade > 3.0 & grade <= 4.0) {
  print("Life goes on")
} else {
  print("No expression found")
}
## [1] "Good Job"
grade <- 5.0 #Assigning a grade

if (grade == 1.0) {
  print("Amazing") 
} else if (grade > 1.0 & grade <= 2.0) {
  print("Good Job")
} else if (grade > 2.0 & grade <= 3.0) {
  print("OK")
} else if (grade > 3.0 & grade <= 4.0) {
  print("Life goes on")
} else {
  print("No expression found")
}
## [1] "No expression found"

Again we can do so-called nested ifelse function, using the ifelse() command:

  • Instead of defining an else condition, we define another ifelse() command and do as long as we need to.

  • The last ifelse() is then the only one with a clear else condition.

grade <- 1.7 #Assigning a grade

ifelse(grade == 1.0, "Amazing", 
       ifelse(grade > 1 & grade <= 2, "Good Job", 
              ifelse(grade > 2 & grade <= 3, "OK", 
                     ifelse(grade > 3 & grade <=4, "Life goes on", 
                            "No expression found"
                            )
                     )
              )
       )
## [1] "Good Job"
grade <- 3.3 #Assigning a grade

ifelse(grade == 1.0, "Amazing", 
       ifelse(grade > 1 & grade <= 2, "Good Job", 
              ifelse(grade > 2 & grade <= 3, "OK", 
                     ifelse(grade > 3 &
                            grade <=4, "Life goes on", 
                            "No expression found")
                     )
              )
       )
## [1] "Life goes on"
# The same logic: ifelse(test expression, body expression if, ifelse(test expression 2, body expression if 2)) etc..

1.4 Outlook Fundamentals

This section was a brief introduction to the fundamentals of R. It was kept simple to give you a feeling for a R. These are the absolute basics, which are needed to understand everything, which will be built based on it.

1.5 Exercises Fundamentals

1.5.1 Exercise 1: Making your first Vector

Create a vector called my_vector with the values 1,2,3 and check is class.

#create the vector 
my_vector <- 

#check the class

1.5.2 Exercise 2: Making your first matrix

Create a Matrix called student. This should contain information about the name, age and major. Make three vectors wiht three entries and bind them together to a the matrix student. Print the matrix.

#Create the vectors 

name <- 
age <- 
major <- 

#Create the matrix
student <- 
  
#Print the matrix

1.5.3 Exercise 3: ifelse function

Write an ifelse statement that checks if a given number is positive or negative. If the number is positive, print “Number is positive”, otherwise print “Number is negative”. Feel free to decide if you want to use the ifelse function or the ifelse condition.

#Assigning the number to the object "number"
number <- -4

1.5.4 Exercise 4: ifelse ladders

Write an if-else ladder that categorizes a student’s grade based on their score. The grading criteria are as follows:

Score >= 90: “A” Score >= 80 and < 90: “B” Score >= 70 and < 80: “C” Score >= 60 and < 70: “D” Score < 60: “F”