Can anyone explain this madness to me?
I am performing a t-test using the t.test function with default parameters.
My two data vectors are these:
control     test
2.75E-05    0.000395
3.7E-05     0.000429
1.78E-05
2.51E-05
2.05E-05
2.19E-05
2.52E-05
3.15E-05
2.21E-05
Here is the result:
Welch Two Sample t-test
data: control and test
t = -22.589, df = 1.0272, p-value = 0.02602
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.0005909187 -0.0001822813
sample estimates:
mean of x mean of y
2.54e-05 4.12e-04
How is it possible to get such a high p-value from data like this?
By contrast, if I use this dataset:
control     test
0.000187    0.000346
7.41E-05    0.000368
4.27E-05
0.000125
4.92E-05
0.000114
6.56E-05
9.44E-05
7.52E-05
I get
Welch Two Sample t-test
data: as.numeric(mmm) and as.numeric(lll)
t = 14.261, df = 5.7007, p-value = 1.123e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.0002190199 0.0003111578
sample estimates:
mean of x mean of y
3.570000e-04 9.191111e-05
It looks like sorcery to me. The first dataset ought to give a much smaller p-value than the second one.
Moreover, when I do the t-test in LibreOffice, I get the expected results, i.e. p = 2.3E-12 for the first dataset and p = 2.35E-05 for the second.
Could you please turn this into a self-contained reprex (short for reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff, and you're so close right now!
There's also a nice FAQ on how to do a minimal reprex for beginners, below:
What to do if you run into clipboard problems
If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.
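For example, with the reprex package you can do something along these lines (a rough sketch; the file names here are just placeholders and argument names may vary a bit between reprex versions):
# Hypothetical sketch: render a reprex to a file instead of the clipboard
library(reprex)
reprex(input = "my_code.R", outfile = "my_reprex")
# then open the generated markdown file and paste its contents into the forum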
# Here is the first dataset (the one that doesn't yield the expected p-value)
control <- c(2.75e-05, 3.70e-05, 1.78e-05, 2.51e-05, 2.05e-05,
             2.19e-05, 2.52e-05, 3.15e-05, 2.21e-05)
test <- c(0.000395, 0.000429)
t.test(test, control)
#>
#> Welch Two Sample t-test
#>
#> data: test and control
#> t = 22.589, df = 1.0272, p-value = 0.02602
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> 0.0001822813 0.0005909187
#> sample estimates:
#> mean of x mean of y
#> 4.12e-04 2.54e-05
# Here is the second dataset, which yields the expected p-value
control <- c(1.87e-04, 7.41e-05, 4.27e-05, 1.25e-04, 4.92e-05,
             1.14e-04, 6.56e-05, 9.44e-05, 7.52e-05)
test <- c(0.000346, 0.000368)
t.test(test, control)
#>
#> Welch Two Sample t-test
#>
#> data: test and control
#> t = 14.261, df = 5.7007, p-value = 1.123e-05
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> 0.0002190199 0.0003111578
#> sample estimates:
#> mean of x mean of y
#> 3.570000e-04 9.191111e-05
While the size of the difference in means affects the test statistic and therefore the size of the p-value, sample size (degrees of freedom) and variance also have an impact. The big difference for your example when using a Welch t-test (which allows for unequal variances) is in the degrees of freedom.
For a Welch t-test, the degrees of freedom are based on the variance of each group as well as the sample size in each group. This leads to your first example having extremely limited degrees of freedom, so even though the first test statistic is larger, the resulting p-value is also larger. (You can see the equation for the Welch t-test degrees-of-freedom calculation on Wikipedia.)
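To make that concrete, here is a rough sketch of the Welch-Satterthwaite calculation applied to the first dataset from your reprex; with only two observations in test, the approximation collapses to roughly 1 degree of freedom:
# Welch-Satterthwaite approximation of the degrees of freedom,
# using the first dataset from the reprex above
control <- c(2.75e-05, 3.70e-05, 1.78e-05, 2.51e-05, 2.05e-05,
             2.19e-05, 2.52e-05, 3.15e-05, 2.21e-05)
test <- c(0.000395, 0.000429)

se2_control <- var(control) / length(control)  # squared standard error, control
se2_test    <- var(test) / length(test)        # squared standard error, test

df_welch <- (se2_control + se2_test)^2 /
  (se2_control^2 / (length(control) - 1) + se2_test^2 / (length(test) - 1))
df_welch
#> [1] 1.027189   # matches the df = 1.0272 reported by t.test(), up to rounding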
If you do a standard t-test (using var.equal = TRUE), the degrees of freedom for the two examples will be identical, and since the pooled variances aren't so dissimilar, the larger difference in means leads to a smaller p-value.
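For example, reusing the control and test vectors for the first dataset from the sketch above (shown only as an illustration, output omitted):
# Pooled-variance (Student's) t-test on the first dataset;
# df is now length(test) + length(control) - 2 = 9 for both examples,
# so the larger mean difference gives the much smaller p-value,
# which should line up with the equal-variance result you saw in LibreOffice
t.test(test, control, var.equal = TRUE)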
Thanks! You're right that if I select unequal variances in LibreOffice, I get results consistent with R.
But I still do not understand why a t-test should output something like this. The test and control values in scenario 1 appear much more different from each other than those in scenario 2.
Thanks, aosmith, for pointing me in the right direction. I will read up on the Welch t-test and try to understand this phenomenon that puzzles me.