Description
-
Chapter number: 15
-
Section number: Lab: Foundations for statistical inference - Sampling distributions
-
Other location identifier, if any (e.g., figure number, table number, footnote number, etc.):
Section Titled:
Interlude: Sampling distributions -
Original text:
global_monitor %>% rep_sample_n(size = 50, reps = 15000, replace = TRUE) %>% count(scientist_work)
-
Suggestion for corrected text:
global_monitor %>% rep_sample_n(size = 50, reps = 15000, replace = TRUE) %>% mutate(scientist_work = as.factor(scientist_work)) %>% count(scientist_work, .drop = FALSE)
-
Justification for suggestion:
As currently written, the code silently drops any sample with a 0% success rate, because count() returns no row for a category that never appears in a replicate. To include counts of 0, code the variable of interest as a factor and use the .drop argument of the count() function to retain all levels. Leaving out counts of 0 means a graph of the sampling distribution will omit the left-most values. While this rarely occurs with samples of size 50 and an ample proportion p, it can be an issue with small samples, such as in Exercise 8 of this lab.
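A minimal sketch of the difference, for illustration only. It assumes a stand-in population in which 20% of responses are "Doesn't benefit", and it uses a deliberately small sample size of 5 so that zero counts are common. The seed, the sample size, the number of replicates, and the names counts_dropped and counts_kept are hypothetical choices and are not taken from the lab.

```r
library(dplyr)
library(infer)

set.seed(1234)  # hypothetical seed, only so the sketch is reproducible

# Stand-in population (assumption): 80% "Benefits", 20% "Doesn't benefit"
global_monitor <- tibble(
  scientist_work = c(rep("Benefits", 80000), rep("Doesn't benefit", 20000))
)

# Small samples make replicates with zero "Doesn't benefit" responses likely
samples <- global_monitor %>%
  rep_sample_n(size = 5, reps = 2000, replace = TRUE)

# Original form: a replicate with no "Doesn't benefit" responses has no row
# for that category, so it disappears after the filter
counts_dropped <- samples %>%
  count(scientist_work) %>%
  filter(scientist_work == "Doesn't benefit")

# Suggested form: factor levels are retained, so zero counts appear as n = 0
counts_kept <- samples %>%
  mutate(scientist_work = as.factor(scientist_work)) %>%
  count(scientist_work, .drop = FALSE) %>%
  filter(scientist_work == "Doesn't benefit")

nrow(counts_dropped)  # fewer rows than replicates
nrow(counts_kept)     # one row per replicate
```

Comparing the two row counts shows how many replicates the original form leaves out. Those missing replicates are exactly the ones that would sit at a sample proportion of 0 in the plotted sampling distribution.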