Beginner question about running a t-test

Hi all! I'm new here and new to R as well. I've run into some trouble running a t-test. I have the following data set, where T is the test group, C is the control group, and Before and After are test scores before and after treatment. I am trying to run a t-test to determine whether the treatment caused greater improvement than the control, but I am unsure how to set up the test comparing these two groups. My first instinct was to create a variable called "diff", where diff = After - Before. But how do I separate those differences by treatment? I have only been able to guess at how to do this: I tried to create a group called G1 using G1=diff[which(Group=T)], but that doesn't work, and neither did a couple of other things I tried. Am I overthinking this? >_<
Group Before After
1 T 18 24
2 T 18 25
3 T 21 33
4 T 18 29
5 T 18 33
6 T 20 36
7 T 23 34
8 T 23 36
9 T 21 34
10 T 17 27
11 C 18 29
12 C 24 29
13 C 20 24
14 C 18 26
15 C 24 38
16 C 22 27
17 C 15 22
18 C 19 31

Welcome to the community!

I think you're over-complicating things. When you take the difference between the responses before and after the treatment, each individual's group (treatment or control) doesn't change. So you can apply a t-test to the diff variable using the existing Group variable.
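
As a side note on the attempt in the question: G1=diff[which(Group=T)] fails because inside a function call `=` passes a named argument rather than comparing values, and a bare T is R's shorthand for TRUE rather than the label "T". A minimal sketch of the corrected subsetting, assuming the columns are available as plain vectors (gain, gain_T and gain_C are names made up here; gain is used instead of diff to avoid confusion with base R's diff()):

```r
# Data from the question, entered as plain vectors
Group  <- c(rep("T", 10), rep("C", 8))
Before <- c(18, 18, 21, 18, 18, 20, 23, 23, 21, 17,
            18, 24, 20, 18, 24, 22, 15, 19)
After  <- c(24, 25, 33, 29, 33, 36, 34, 36, 34, 27,
            29, 29, 24, 26, 38, 27, 22, 31)

# Per-subject improvement (hypothetical name)
gain <- After - Before

# Compare with == and quote the group label
gain_T <- gain[Group == "T"]
gain_C <- gain[Group == "C"]

# Two-sample t-test on the improvements (Welch's test is R's default)
t.test(gain_T, gain_C)
```

The explicit subsetting is only needed if you want the two vectors for something else; the test itself can use Group directly.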

If you're planning to do the test manually, I'd suggest looking at the function by. Otherwise, t.test is great.
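
A rough sketch of what the manual route with by could look like, assuming the data sit in a data frame named df with a computed column gain (both names are illustrative, not from the original post). by() splits the data frame by Group and applies a function to each piece; the Welch statistic is then assembled by hand from the group summaries:

```r
df <- data.frame(
  Group  = factor(c(rep("T", 10), rep("C", 8))),
  Before = c(18, 18, 21, 18, 18, 20, 23, 23, 21, 17,
             18, 24, 20, 18, 24, 22, 15, 19),
  After  = c(24, 25, 33, 29, 33, 36, 34, 36, 34, 27,
             29, 29, 24, 26, 38, 27, 22, 31)
)
df$gain <- df$After - df$Before  # hypothetical column name

# Per-group sample size, mean and variance of the improvement
stats <- by(df, df$Group, function(d) {
  c(n = nrow(d), mean = mean(d$gain), var = var(d$gain))
})
print(stats)

s_t <- stats[["T"]]
s_c <- stats[["C"]]

# Welch t statistic: difference in means over its standard error
se       <- sqrt(s_t[["var"]] / s_t[["n"]] + s_c[["var"]] / s_c[["n"]])
t_manual <- (s_t[["mean"]] - s_c[["mean"]]) / se
t_manual
```

The value of t_manual should agree with the statistic reported by t.test() for the same two groups, which is a handy check on the hand calculation.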

It looks like an assignment problem, so I'd suggest getting familiar with the homework policy.

Thank you!
I did feel like I was overcomplicating it; I just wanted to be sure. :)
And it's not a verbatim question from any assignment, I just needed general insight into whether I was overthinking the situation.

Just for future reference: when you post a question here, it's better to do it in the form of a REPRoducible EXample (reprex). To illustrate, in your case the question could look similar to this (this is not an answer).

# Sample data on a copy/paste friendly format
df <- data.frame(
    Before = c(18L, 18L, 21L, 18L, 18L, 20L, 23L, 23L, 21L, 17L, 18L, 24L,
               20L, 18L, 24L, 22L, 15L, 19L),
    After = c(24L, 25L, 33L, 29L, 33L, 36L, 34L, 36L, 34L, 27L, 29L, 29L,
              24L, 26L, 38L, 27L, 22L, 31L),
    Group = as.factor(c("T", "T", "T", "T", "T", "T", "T", "T", "T", "T",
                        "C", "C", "C", "C", "C", "C", "C", "C"))
)
# Code relevant to the issue
t.test(Before ~ Group, data = df)
#> 
#>  Welch Two Sample t-test
#> 
#> data:  Before by Group
#> t = 0.22743, df = 12.116, p-value = 0.8239
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -2.571012  3.171012
#> sample estimates:
#> mean in group C mean in group T 
#>            20.0            19.7
t.test(After ~ Group, data = df)
#> 
#>  Welch Two Sample t-test
#> 
#> data:  After by Group
#> t = -1.2744, df = 14.482, p-value = 0.2226
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -7.631457  1.931457
#> sample estimates:
#> mean in group C mean in group T 
#>           28.25           31.10

Created on 2019-03-17 by the reprex package (v0.2.1)

A reprex is always useful, but I think the OP was asking more about the conceptual approach.

I hope you don't mind my saying so, but I think your code is a little misleading. Applying a t-test separately to Before and After doesn't make much sense, because the variable of interest is the effect of the treatment, not the raw responses. So t.test((After - Before) ~ Group, data = df) would probably be preferable.
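
Put together, a sketch of the test on the improvement scores. The gain column name is an assumption for illustration, and the one-sided alternative is my reading of "greater improvement" in the question rather than something stated in the thread. Since the factor levels sort as C, T, the formula interface estimates mean(C) - mean(T), so "the treatment improved more" corresponds to alternative = "less":

```r
df <- data.frame(
  Before = c(18L, 18L, 21L, 18L, 18L, 20L, 23L, 23L, 21L, 17L, 18L, 24L,
             20L, 18L, 24L, 22L, 15L, 19L),
  After  = c(24L, 25L, 33L, 29L, 33L, 36L, 34L, 36L, 34L, 27L, 29L, 29L,
             24L, 26L, 38L, 27L, 22L, 31L),
  Group  = as.factor(c(rep("T", 10), rep("C", 8)))
)

# Improvement per subject (hypothetical column name)
df$gain <- df$After - df$Before

# Two-sided Welch test: do the mean improvements differ between groups?
res_two <- t.test(gain ~ Group, data = df)
res_two

# One-sided version: is the mean improvement in C smaller than in T?
res_one <- t.test(gain ~ Group, data = df, alternative = "less")
res_one
```

Whether a one-sided test is appropriate depends on having fixed the direction of the hypothesis before looking at the data; if in doubt, the two-sided result is the safer one to report.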
