I have a set of 37 products, and I want each product to be rated 20 times, with each person rating three different products. I think I need somewhere in the ballpark of 250 people, because (37 * 20) / 3 = 246.67.
Is there a way to sample three products from the set without replacement, and to do so repeatedly, such that each product ends up being rated exactly 20 times?
We can use the mpg dataset from the ggplot2 package for demonstration purposes. The mpg dataset has 38 distinct car models in it. I can sample three car models from the dataset. Would you happen to know how I can do this approximately 246 more times, ensuring that each car model is sampled 20 times?
library(tidyverse)

# get a character vector of the 38 distinct car models in the mpg dataset
car_model <- mpg %>%
  distinct(model) %>%
  pull(model)

# sample three car models without replacement
sample(
  x = car_model,
  size = 3,
  replace = FALSE
)
Basically, take_one is a function that takes a tibble, bags, in which each model starts with a bag of 20 (the maximum number of times that model can be chosen). The function picks three cars and returns a modified bags, with times decremented in the corresponding rows.
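The original code for take_one is not reproduced here, so the following is only a minimal sketch of what such a function might look like. It assumes a plain data frame standing in for the tibble, with hypothetical columns `model` and `times`, and it returns both the picked models and the updated bags. The model names are placeholders, not the real mpg values.

```r
# Sketch only: `bags` is assumed to have columns `model` and `times`
# (times = how many more ratings that model still needs).
take_one <- function(bags, n = 3) {
  available <- which(bags$times > 0)
  if (length(available) < n) {
    stop("fewer than ", n, " models still have ratings left")
  }
  # sample row indices via sample.int to avoid sample()'s
  # special behaviour when given a single number
  picked <- available[sample.int(length(available), n)]
  bags$times[picked] <- bags$times[picked] - 1
  list(choice = bags$model[picked], bags = bags)
}

# Placeholder data: 38 hypothetical models, 20 ratings each
bags <- data.frame(
  model = paste0("model_", 1:38),
  times = 20,
  stringsAsFactors = FALSE
)

set.seed(42)  # for reproducibility only
result <- take_one(bags)
result$choice            # three distinct models
sum(result$bags$times)   # 757, i.e. 760 minus the three picks
```

Returning the updated bags rather than modifying it in place keeps the function free of side effects, so the caller decides whether to carry the new state forward.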
You then iterate over the 250 people and create three choices for each person to rate. The error occurs because at some point you want to pick three things but only two rows are left (the total number of ratings needed is not a multiple of 3). The good news is that you still get your choices object, which you can then use.
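If the error at the end is a nuisance, one possible variation (not the approach described above) is to always pick the products with the most ratings still outstanding, breaking ties at random, and to let the last person rate whatever is left. This is a sketch under assumptions: the product labels are hypothetical, and the selection is less random than plain sampling because it deliberately keeps the counts balanced.

```r
set.seed(123)  # for reproducibility only

products   <- paste0("product_", 1:37)   # hypothetical product labels
remaining  <- setNames(rep(20L, length(products)), products)
per_person <- 3L

choices <- list()
while (sum(remaining) > 0) {
  available <- names(remaining)[remaining > 0]
  # the last person may get fewer than three products
  k <- min(per_person, length(available))
  # most outstanding ratings first; runif() breaks ties at random
  ranked <- available[order(-remaining[available], runif(length(available)))]
  picked <- ranked[seq_len(k)]
  remaining[picked] <- remaining[picked] - 1L
  choices[[length(choices) + 1L]] <- picked
}

length(choices)                     # 247 people
table(lengths(choices))             # 246 people rate 3 products, 1 rates 2
all(table(unlist(choices)) == 20)   # TRUE
```

With 37 products the total is 37 * 20 = 740 ratings, which leaves a remainder of 2 after dividing by 3, so exactly one person ends up with two products instead of three.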