Lab 1

Due Thursday September 4 at 11:59 PM

This class will introduce you to the basics of the R programming language, but in contrast to courses like STA 101 or 199[1], we will not primarily be using R for data analysis. Instead, we will be using it to run simulations. That is, we will use R to build little computational “laboratories” where we can poke and prod a random system and explore its properties. This skill complements a lot of the math we will be doing. In the modern era, mathematical reasoning and computer simulation are mutually reinforcing; you can check whether you did a calculation correctly by comparing with a simulation, and you can verify that a simulation result is legit by doing a proper calculation. Furthermore, in complex systems where the math is intractable, you can use simulation to explore what’s going on.

During the semester, I will summarize the basic R skills we need in a series of concise “explainers” posted to the course webpage. Here’s what we have so far:

  1. Getting into RStudio;
  2. Understanding the RStudio layout;
  3. Creating your lab write-ups with Quarto;
  4. Typing up pretty equations with Quarto;
  5. Using R as a big calculator;
  6. Introduction to vectors;
  7. Plotting line graphs with the curve function;
  8. Simulating random experiments with the sample function;
  9. if-else and logical operations;
  10. for loops;
  11. Vectorization.

Please dip into those as needed.

In today’s lab, you will use R to simulate some basic random experiments and approximate probabilities. Along the way, you will get a light workout with if-else statements and for loops.

Download the submission template

Go to the files in Canvas and download the blank Quarto file you will use to create your write-up. Once you have this file, upload it to your container and get crackin’!

Task 1 (I do it)

We want to approximate the probability that a fair coin lands on Heads. We already know it’s 0.5. Does the computer agree?

This code flips the coin 30 times:

flips <- sample(c("H", "T"), size = 30, replace = TRUE)
flips
 [1] "T" "H" "T" "H" "T" "H" "H" "H" "T" "H" "H" "H" "T" "H" "H" "H" "T" "H" "T"
[20] "H" "T" "T" "T" "T" "T" "H" "T" "H" "T" "H"

This code computes the proportion of flips that came up heads:

mean(flips == "H")
[1] 0.5333333

It is equivalent to counting the number of heads and dividing by the total number of flips:

sum(flips == "H") / 30
[1] 0.5333333

Do you see why?
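
One way to see why: the comparison flips == "H" produces a logical vector, and R treats TRUE as 1 and FALSE as 0 in arithmetic. Here is a minimal sketch on a tiny hand-written vector (the names toy_flips and is_heads are made up for illustration and are not part of the lab):

```r
# A tiny hand-written vector so the arithmetic is easy to follow
toy_flips <- c("H", "T", "H", "H")

# Comparing to "H" gives one TRUE/FALSE per flip
is_heads <- toy_flips == "H"
is_heads                           # TRUE FALSE TRUE TRUE

# In arithmetic, TRUE counts as 1 and FALSE counts as 0
sum(is_heads)                      # 3 heads
sum(is_heads) / length(toy_flips)  # 0.75
mean(is_heads)                     # also 0.75
```

Dividing by length(toy_flips) rather than a hard-coded number means the line keeps working if you change how many flips you simulate.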

Task 2 (we do it)

Imagine you roll two fair dice. What is the probability that the sum of the two numbers you roll is even? To approximate this, work together with Aurora to complete the following steps:

  • Use a for loop to repeat the experiment 1,000 times:
    • Roll two dice using sample();
    • Add them together;
    • Use if-else to record a 1 if the sum is even, and a 0 if the sum is odd.
  • At the end, compute the proportion of 1s you simulated to estimate the probability.

set.seed(123)

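# Start with an empty vector and append one result per trial
outcomes <- c()

for (i in 1:1000) {
  # Roll two fair dice (with replacement, so doubles are possible)
  rolls <- sample(1:6, size = 2, replace = TRUE)
  
  # %% is the remainder operator: remainder 0 after dividing by 2 means even
  if (sum(rolls) %% 2 == 0) {
    outcomes <- c(outcomes, 1)
  } else {
    outcomes <- c(outcomes, 0)
  }
}

# Proportion of trials where the sum was even
mean(outcomes)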
[1] 0.511
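
As a sanity check on the simulation, this sample space is small enough to list completely: there are only 36 equally likely pairs of rolls. The sketch below uses outer, which is not covered in the explainers, so treat it as optional; all_sums and exact_prob are illustrative names.

```r
# All 36 equally likely sums: rows are the first die, columns the second
all_sums <- outer(1:6, 1:6, "+")

# Fraction of the 36 outcomes whose sum is even
exact_prob <- mean(all_sums %% 2 == 0)
exact_prob  # 0.5
```

Exactly 18 of the 36 outcomes have an even sum, so the true probability is 0.5, and a simulated estimate near that value is what we should expect.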

Task 3 (you do it)

Here is a vector containing all of the cards in a standard 52-card deck:

deck_of_cards <- c("AH", "2H", "3H", "4H", "5H", "6H", "7H", "8H", "9H", "10H", "JH", "QH", "KH",
                   "AD", "2D", "3D", "4D", "5D", "6D", "7D", "8D", "9D", "10D", "JD", "QD", "KD",
                   "AC", "2C", "3C", "4C", "5C", "6C", "7C", "8C", "9C", "10C", "JC", "QC", "KC",
                   "AS", "2S", "3S", "4S", "5S", "6S", "7S", "8S", "9S", "10S", "JS", "QS", "KS")

Each label is a rank followed by a suit. The rank comes first: Ace (A), 2, 3, 4, …, 10, Jack (J), Queen (Q), King (K). Note that 10 takes two characters. The last character is always the suit: hearts (H), diamonds (D), clubs (C), spades (S).

Now, imagine that a five-card hand is randomly dealt to you from a well-shuffled deck. What is the probability that the hand contains at least one ace? Very soon in our course you’ll learn how to do the math and compute the number exactly, but until then, you can approximate it with simulation. Write a for loop that simulates 5,000 five-card hands, and use the results to approximate the probability that your hand contains at least one ace.

There are many correct ways to do this, but you might find the any function helpful.

set.seed(8675309)

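# Start with an empty vector and append one result per hand
outcomes <- c()

for (i in 1:5000) {
  # Deal five distinct cards (no replacement, since a card cannot be dealt twice)
  hand <- sample(deck_of_cards, size = 5, replace = FALSE)
  # any() is TRUE if at least one card in the hand is one of the four aces
  if (any(hand %in% c("AH","AD","AC","AS"))) {
    outcomes <- c(outcomes, 1)
  } else {
    outcomes <- c(outcomes, 0)
  }
}

# Proportion of hands with at least one ace
mean(outcomes)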
[1] 0.3328
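
Since there are many correct ways to do this, here is one alternative sketch, offered as an illustration rather than the intended solution. It assumes two base R functions that are not in the explainers: startsWith, to spot aces by their leading "A", and replicate, to repeat the experiment without an explicit for loop. The seed and the names ranks, suits, has_ace, and exact_prob are arbitrary choices. The last lines preview the exact calculation using the complement rule, 1 minus the probability of a hand with no aces.

```r
# Rebuild the same 52-card deck so this block runs on its own
ranks <- c("A", 2:10, "J", "Q", "K")
suits <- c("H", "D", "C", "S")
deck_of_cards <- paste0(rep(ranks, times = 4), rep(suits, each = 13))

set.seed(1)  # arbitrary seed, used only for reproducibility

# replicate() evaluates the expression 5000 times and collects the results
has_ace <- replicate(5000, {
  hand <- sample(deck_of_cards, size = 5, replace = FALSE)
  any(startsWith(hand, "A"))  # aces are the only cards that begin with "A"
})

# Simulated estimate of P(at least one ace)
mean(has_ace)

# Exact value: 1 - (hands with no aces) / (all five-card hands)
exact_prob <- 1 - choose(48, 5) / choose(52, 5)
exact_prob  # about 0.3412
```

With 5,000 hands, the simulated proportion will typically land within a percentage point or two of the exact value.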

Footnotes

  1. The wise among us know that these are pronounced STAAAWANOWA and STAAAWANANA, respectively.