Chi-square test of independence

fcas80 · July 17, 2022, 6:39pm

Hi, Are crimes and storms independent? I was expecting chi-square to tell me no, that a disproportionate number of crimes occur when there are storms. Where is my work incorrect? Thank you.

# Are crimes and storms independent?
df <- data.frame(no_crime=c(11,32), crime=c(40,167))
rownames(df) = c("no_storm", "storm")
df
chi <- chisq.test(df) 
p <- chi$p.value 
cat("p = ",round(p,3), "\n")
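# A single p-value calls for if/else rather than the vectorised ifelse().
# Failing to reject the null is not evidence that the variables are independent.
if (p < .05) {
  cat("Reject the null hypothesis: the data suggest the two variables are dependent.\n")
} else {
  cat("Fail to reject the null hypothesis: no evidence that the two variables are dependent.\n")
}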

TJH-research · July 17, 2022, 7:21pm

Hi Jerry!

I am fascinated by your question. Looks like your R code is right. You pass a contingency table to chisq.test.

Two things. First, why do you expect a lower p-value? When there is no storm, there is no crime 11 times and crime 40 times. That's about 22% no crime (11/51). When there is a storm, there is no crime 32 times and crime 167 times, about 16% no crime (32/199). These are not that different, so the p-value of .472 seems to be accurate.
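The row percentages can be checked directly in R. This is a minimal sketch that assumes the same counts as the original post; the names `tab` and `row_props` are illustrative and do not appear in the thread.

```r
# Same counts as the original post, stored as a matrix (filled column by column)
tab <- matrix(c(11, 32, 40, 167), nrow = 2,
              dimnames = list(weather = c("no_storm", "storm"),
                              outcome = c("no_crime", "crime")))

# Proportion of each outcome within each row (margin = 1 means "by row")
row_props <- prop.table(tab, margin = 1)
round(row_props, 3)
```

If the two rows of `row_props` look alike, the outcome barely depends on the weather, which is what a large p-value reflects.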

Second, we can also debate whether a dichotomous cut-off for p-values is a good way to determine significance. Either way, the effect size matters as much as or more than the p-value.
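For a 2x2 table, two common effect-size summaries are the phi coefficient and the odds ratio. The sketch below is one way to compute them, not something posted in the thread; it assumes the same counts, and `tab`, `phi`, and `odds_ratio` are illustrative names.

```r
# Same counts as the original post
tab <- matrix(c(11, 32, 40, 167), nrow = 2,
              dimnames = list(weather = c("no_storm", "storm"),
                              outcome = c("no_crime", "crime")))

# Phi coefficient: sqrt(chi-square / n), using the uncorrected statistic
chi_raw <- chisq.test(tab, correct = FALSE)
phi <- sqrt(unname(chi_raw$statistic) / sum(tab))

# Odds ratio: odds of crime during a storm vs. odds of crime with no storm
odds_ratio <- (tab["storm", "crime"] / tab["storm", "no_crime"]) /
              (tab["no_storm", "crime"] / tab["no_storm", "no_crime"])

round(c(phi = phi, odds_ratio = odds_ratio), 3)
```

A phi near 0 and an odds ratio near 1 both point to a weak association.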

Best,
Tom

fcas80 · July 18, 2022, 1:51am

Tom, thanks for your note. Now that I think about it,

P(crime) = 207/250 = .828
P(storm) = 199/250 = .796
P(crime & storm) = 167/250 = .668
P(crime | storm) = P(crime & storm) / P(storm) = .668/.796 = .839, which is pretty close to P(crime) = .828, so I guess the variables are pretty close to independent.
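The same arithmetic can be reproduced in R, along with the counts the chi-square test would expect under exact independence. This is a sketch assuming the counts from the original post; the variable names are illustrative.

```r
# Same counts as the original post
tab <- matrix(c(11, 32, 40, 167), nrow = 2,
              dimnames = list(weather = c("no_storm", "storm"),
                              outcome = c("no_crime", "crime")))
n <- sum(tab)                                # 250 observations

p_crime <- sum(tab[, "crime"]) / n           # 207/250
p_storm <- sum(tab["storm", ]) / n           # 199/250
p_joint <- tab["storm", "crime"] / n         # 167/250
p_crime_given_storm <- p_joint / p_storm     # 167/199

# Under independence, P(crime & storm) would equal P(crime) * P(storm)
c(joint = p_joint, product = p_crime * p_storm)

# Expected counts under independence: row total * column total / n
expected <- outer(rowSums(tab), colSums(tab)) / n
round(expected, 2)
```

The chi-square statistic measures how far the observed counts in `tab` sit from `expected`.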

rwalker · July 18, 2022, 8:25am

The Yates correction is doing quite a bit of work here. The uncorrected statistic is 0.859 while the Yates corrected statistic is only 0.516.
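Both statistics can be recomputed by hand to see where the difference comes from. This is a sketch under the assumption that the counts are those of the original post; `stat_raw` and `stat_yates` are illustrative names. The Yates correction shrinks each |observed - expected| by 0.5 before squaring (R caps the shrinkage at the smallest absolute difference, which does not matter here because every difference exceeds 0.5).

```r
# Same counts as the original post
tab <- matrix(c(11, 32, 40, 167), nrow = 2,
              dimnames = list(weather = c("no_storm", "storm"),
                              outcome = c("no_crime", "crime")))

# Expected counts under independence
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Pearson statistic without and with the continuity correction
stat_raw   <- sum((tab - expected)^2 / expected)
stat_yates <- sum((abs(tab - expected) - 0.5)^2 / expected)

round(c(uncorrected = stat_raw, yates = stat_yates), 3)
```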

TJH-research · July 18, 2022, 11:48am

I always learn something new here. For all my p-value doubts, I did just look at .516 and say, "Yep, that looks high. Probably correct."

You can see chisq.test did use Yates correction by inspecting 'method':

df <- data.frame(no_crime=c(11,32), crime=c(40,167))
rownames(df) <- c("no_storm", "storm")
df
chi <- chisq.test(df)

chi$method

But is there a way to ask it not to perform the Yates correction?

Andrzej · July 18, 2022, 12:06pm

Here you are:

chi2 <- chisq.test(df, correct=FALSE)
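To see the effect of the `correct` argument side by side, something like the following should work. It is a self-contained sketch that rebuilds the data frame from the original post; `chi_yates`, `chi_raw`, and `p_values` are illustrative names.

```r
# Rebuild the data from the original post
df <- data.frame(no_crime = c(11, 32), crime = c(40, 167))
rownames(df) <- c("no_storm", "storm")

chi_yates <- chisq.test(df)                   # default: correct = TRUE for 2x2 tables
chi_raw   <- chisq.test(df, correct = FALSE)  # plain Pearson chi-square

# The method string records whether the correction was applied
chi_yates$method
chi_raw$method

# Compare the two p-values
p_values <- c(yates = chi_yates$p.value, uncorrected = chi_raw$p.value)
round(p_values, 3)
```

The uncorrected p-value is smaller than the corrected one, but with these counts neither falls below .05.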

rwalker · July 18, 2022, 2:56pm

The printed output does tell you ("with Yates' continuity correction"), but the OP's custom decision rule reports only the p-value, which hides the fact that the correction was a choice.
