Lab: Probability distributions

Load packages and data

Exercise 1 (LSAS score of 80 or less at screening)

Check the mean and SD of LSAS scores at screening. Assuming the scores are normally distributed, what is the probability that a person will have a score of 80 or less?

Use the mean() and sd() functions to find the values in the dataset df_clean, then pass them to pnorm(), which returns the cumulative probability up to a given value.

lsas_mean <- mean(df_clean$lsas_screen)
lsas_sd <- sd(df_clean$lsas_screen)
pnorm(q = 80, mean = lsas_mean, sd = lsas_sd)
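
As a rough cross-check of what pnorm() is doing, the same probability can be obtained by converting the cutoff to a z-score and using the standard normal distribution. The sketch below uses made-up values for the mean and SD (example_mean and example_sd are hypothetical, not taken from df_clean), so the number it prints is not the answer to the exercise.

```r
# Hypothetical values for illustration only; NOT computed from df_clean
example_mean <- 85
example_sd <- 20

# Probability of a score of 80 or less, computed directly
p_direct <- pnorm(q = 80, mean = example_mean, sd = example_sd)

# Same probability via the z-score and the standard normal distribution
z <- (80 - example_mean) / example_sd
p_standard <- pnorm(q = z)

# Both approaches should agree
all.equal(p_direct, p_standard)
```

With the real data, replacing the hypothetical values with lsas_mean and lsas_sd should give the same result as the call above.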

Exercise 2 (Probability of having an LSAS score of 100 or more)

What is the probability that a person will have an LSAS score at screening of 100 or more, assuming the scores are normally distributed?

Use the mean() and sd() functions to find the values in the dataset df_clean.

Also, since we want the probability of a value at or above a given point, and pnorm() returns the cumulative probability up to that point, we can use 1 - pnorm().

lsas_mean <- mean(df_clean$lsas_screen)
lsas_sd <- sd(df_clean$lsas_screen)
1 - pnorm(q = 100, mean = lsas_mean, sd = lsas_sd)
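
An alternative to subtracting from 1 is the lower.tail argument of pnorm(), which returns the upper-tail probability directly when set to FALSE. The sketch below again uses hypothetical values (example_mean and example_sd are assumptions for illustration, not values from df_clean).

```r
# Hypothetical values for illustration only; NOT computed from df_clean
example_mean <- 85
example_sd <- 20

# Upper-tail probability as the complement of the cumulative probability
p_complement <- 1 - pnorm(q = 100, mean = example_mean, sd = example_sd)

# Upper-tail probability requested directly
p_upper <- pnorm(q = 100, mean = example_mean, sd = example_sd,
                 lower.tail = FALSE)

# Both approaches should agree
all.equal(p_complement, p_upper)
```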

Exercise 3 (Inclusion rate in the STEpS study)  

Let’s say that the average inclusion rate in the STEpS study was 5 participants per week. What is the probability that they’ll include 3 or fewer people in the coming week?

Since we are looking at a count of events that occur at a known average rate, we can use the cumulative probability function of the Poisson distribution, ppois().

ppois(q = 3, lambda = 5)
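
To see where the cumulative probability comes from, it can help to add up the individual Poisson probabilities of including 0, 1, 2, and 3 participants using dpois(). This is only a sketch to illustrate the relationship between the two functions; the variable names are chosen here for illustration.

```r
# Cumulative probability of 3 or fewer inclusions at a rate of 5 per week
p_cumulative <- ppois(q = 3, lambda = 5)

# Same quantity as the sum of the probabilities of exactly 0, 1, 2 and 3
p_summed <- sum(dpois(x = 0:3, lambda = 5))

# Both approaches should agree
all.equal(p_cumulative, p_summed)
```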

Exercise 4 (Probability of selecting only women in a sample of 5 participants)  

In the STEpS study, 78% of the participants were women. What is the probability that 5 randomly chosen participants would all be women?

Since we are looking at a binary variable, we can use the probability mass function of the binomial distribution, dbinom(), which gives the probability of exactly x successes in size trials.

dbinom(x = 5, size = 5, prob = 0.78)

Or, since these are independent events, we could get the joint probability by multiplying the probability by itself 5 times.

0.78 * 0.78 * 0.78 * 0.78 * 0.78
[1] 0.2887174
# which is the same as
0.78^5
[1] 0.2887174
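
As an informal sanity check, the same probability can be approximated by simulation: draw many samples of 5 participants with rbinom() and count how often all 5 are women. The seed and the number of simulations below are arbitrary choices made for this sketch, and the simulated proportion will only be close to, not exactly equal to, the exact value.

```r
# Arbitrary seed so the simulation is reproducible
set.seed(123)

# Number of simulated samples (arbitrary choice for this sketch)
n_sims <- 100000

# Number of women in each simulated sample of 5 participants
women_per_sample <- rbinom(n = n_sims, size = 5, prob = 0.78)

# Proportion of samples in which all 5 participants are women
p_simulated <- mean(women_per_sample == 5)
p_simulated
```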

Summary

In this lab, we learned about probability distributions and how to use them to calculate the probability of different events. We used the base R functions pnorm(), ppois(), and dbinom() to calculate the probability of a variable falling at, above, or below a certain value.