Hi, are crimes and storms independent? I was expecting the chi-square test to tell me no, that a disproportionate number of crimes occur when there are storms. Where did I go wrong? Thank you.
# Are crimes and storms independent?
df <- data.frame(no_crime=c(11,32), crime=c(40,167))
rownames(df) = c("no_storm", "storm")
df
chi <- chisq.test(df)
p <- chi$p.value
cat("p = ",round(p,3), "\n")
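# p is a single value, so a plain if/else is enough; a non-significant result does not prove independence.
msg <- if (p < .05) {
  "Reject the null hypothesis: the data provide evidence that the two variables are dependent."
} else {
  "Fail to reject the null hypothesis: the data do not provide evidence that the two variables are dependent."
}
print(msg)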
I am fascinated by your question. Looks like your R code is right. You pass a contingency table to chisq.test.
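One detail worth knowing about the call: for a 2x2 table, chisq.test applies Yates' continuity correction by default (correct = TRUE). The sketch below rebuilds the table from the question as a matrix and runs the test both ways; the names tab, p_corrected, and p_uncorrected are introduced here for illustration only.

```r
# Rebuild the 2x2 table from the question (rows: storm status, columns: crime status).
tab <- matrix(c(11, 32, 40, 167), nrow = 2,
              dimnames = list(c("no_storm", "storm"), c("no_crime", "crime")))

# Default call: Yates' continuity correction is applied to 2x2 tables.
p_corrected <- chisq.test(tab)$p.value

# Same test without the continuity correction.
p_uncorrected <- chisq.test(tab, correct = FALSE)$p.value

cat("p (with correction)    =", round(p_corrected, 3), "\n")
cat("p (without correction) =", round(p_uncorrected, 3), "\n")
```

Either way the p-value stays well above .05, so the correction does not change the conclusion here.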
Two things. First, why do you expect a lower p-value? When there is no storm, there is no crime 11 times and crime 40 times. That’s about 22% no crime (11/51). When there is a storm, there is no crime 32 times and crime 167 times, about 16% (32/199). These are not that different, so the p-value of .472 seems to be accurate.
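The row percentages can be checked directly in R. This is a minimal sketch using the counts from the question; tab, row_pct, and expected are names chosen here for illustration.

```r
# Rebuild the 2x2 table from the question (rows: storm status, columns: crime status).
tab <- matrix(c(11, 32, 40, 167), nrow = 2,
              dimnames = list(c("no_storm", "storm"), c("no_crime", "crime")))

# Row proportions: within each storm condition, the share with and without crime.
row_pct <- prop.table(tab, margin = 1)
print(round(row_pct, 3))

# Counts expected if storm and crime were independent, for comparison with tab.
expected <- chisq.test(tab)$expected
print(round(expected, 1))
```

The observed counts sit close to the expected ones, which is why the test statistic is small.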
Second, we can also debate whether using a dichotomous cut-off for p-values is a good way to determine significance. Either way, the effect size matters as much as or more than the p-value.
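The thread does not say which effect size to use. As one possible sketch, the odds ratio and the phi coefficient are two common choices for a 2x2 table; both are assumptions on my part, and the variable names are illustrative.

```r
# Rebuild the 2x2 table from the question (rows: storm status, columns: crime status).
tab <- matrix(c(11, 32, 40, 167), nrow = 2,
              dimnames = list(c("no_storm", "storm"), c("no_crime", "crime")))

# Odds of crime with a storm divided by odds of crime without a storm.
odds_storm    <- tab["storm", "crime"] / tab["storm", "no_crime"]
odds_no_storm <- tab["no_storm", "crime"] / tab["no_storm", "no_crime"]
odds_ratio    <- odds_storm / odds_no_storm

# Phi coefficient: sqrt(chi-square / n), using the uncorrected statistic.
chi_uncorrected <- chisq.test(tab, correct = FALSE)
phi <- sqrt(unname(chi_uncorrected$statistic) / sum(tab))

cat("odds ratio =", round(odds_ratio, 2), "\n")
cat("phi        =", round(phi, 3), "\n")
```

With these counts the odds ratio comes out around 1.44 and phi around 0.06, a weak association that is consistent with the large p-value.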
Tom, thanks for your note. Now that I think about it,
P(crime) = 207/250 = .828
P(storm) = 199/250 = .796
P(crime & storm) = 167/250 = .668
P(crime | storm) = P(crime & storm) / P(storm) = .668/.796 = .839, which is pretty close to P(crime) = .828, so I guess the variables are pretty close to independent.
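The same arithmetic can be reproduced in R from the table. This is a small sketch with illustrative variable names; it also compares the observed joint probability with the value that exact independence would give, P(crime) * P(storm).

```r
# Rebuild the 2x2 table from the question (rows: storm status, columns: crime status).
tab <- matrix(c(11, 32, 40, 167), nrow = 2,
              dimnames = list(c("no_storm", "storm"), c("no_crime", "crime")))
n <- sum(tab)

p_crime           <- sum(tab[, "crime"]) / n      # 207/250
p_storm           <- sum(tab["storm", ]) / n      # 199/250
p_crime_and_storm <- tab["storm", "crime"] / n    # 167/250

# Conditional probability of crime given a storm.
p_crime_given_storm <- p_crime_and_storm / p_storm

# Joint probability implied by exact independence.
p_joint_if_independent <- p_crime * p_storm

cat("P(crime)                 =", round(p_crime, 3), "\n")
cat("P(crime | storm)         =", round(p_crime_given_storm, 3), "\n")
cat("P(crime & storm)         =", round(p_crime_and_storm, 3), "\n")
cat("P(crime) * P(storm)      =", round(p_joint_if_independent, 3), "\n")
```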