Suppose I have this code:
library(tidymodels)
library(epiDisplay)
data(Marryage)

null_dist <- Marryage %>%
  specify(response = birthyr) %>%
  hypothesize(null = "point", mu = 1960) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")

null_dist %>%
  visualize()
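There are some things I don't understand.
In the bootstrap, we create resampled ("fake") data by drawing from the original sample with replacement.
If I calculate the mean of each resample, the distribution of those means should be centered on the mean of the original SAMPLE data. For example: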
Marryage %>%
  specify(birthyr ~ sex) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "bootstrap") %>%
  group_by(replicate) %>%
  summarise(nn = n(), su = sum(birthyr), mm = mean(birthyr), ss = sd(birthyr)) %>%
  ggplot() +
  aes(x = mm) +
  geom_histogram()
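To make my expectation concrete, here is a minimal base-R sketch of what I mean by a bootstrap, written without infer. The data in `x` are made up (a hypothetical stand-in for `birthyr`, not the real Marryage values), and `boot_means` is just a name I chose for illustration.

```r
# Hypothetical stand-in for birthyr; these are NOT the Marryage values.
set.seed(1)
x <- round(rnorm(30, mean = 1965, sd = 6))

# One bootstrap replicate = resample x with replacement, same size as x.
# Repeat 1000 times and keep the mean of each replicate.
boot_means <- replicate(1000, mean(sample(x, replace = TRUE)))

mean(x)           # mean of the original sample
mean(boot_means)  # I expect this to be close to mean(x)
```

Calling `hist(boot_means)` on this should give a histogram centered near `mean(x)`, which is the behaviour I expected from the infer pipeline as well.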
However, the histogram created by the first code is centered on 1960, the value specified in the null hypothesis.
Is this histogram being created by type = "simulate"?
Or do I get this histogram because every bootstrap sample was discarded except those with a mean of 1960?
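As a way to probe the second possibility, here is a small base-R experiment. It is only a sketch under my own assumptions: the data are again made up, and the recentering step (`x_shifted`) is something I introduced for illustration. I do not know whether infer does anything like this internally. It only checks whether a distribution centered on 1960 can arise without throwing away any replicate.

```r
# Hypothetical stand-in for birthyr; these are NOT the Marryage values.
set.seed(2)
x <- round(rnorm(30, mean = 1965, sd = 6))
mu0 <- 1960  # the value given to hypothesize(mu = ...)

# Assumption for this experiment: recenter the sample so its mean equals mu0,
# keeping the spread of the data unchanged.
x_shifted <- x - mean(x) + mu0

# Ordinary bootstrap of the recentered sample; no replicate is discarded.
boot_means_shifted <- replicate(1000, mean(sample(x_shifted, replace = TRUE)))

length(boot_means_shifted)  # number of replicates kept
mean(boot_means_shifted)    # expected to be close to mu0
```

If this behaves as I expect, then discarding replicates is not the only way to end up with a histogram centered on 1960.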
Thank you for reading.