Tidyverse group by data wrangling

Hello. I have a dataset that looks like the following.

data <- data.frame(
  Resp = rep(c(4,5,1,3,4,2,4,5,1,2,4,5), 4),
  Letters = rep(c("A", "A", "B", "B"), each = 12),
  Groups = rep(c("Group 1", "Group 2"))  # length 2; data.frame() recycles it to all 48 rows
)

What I am looking to do is find the difference in my outcome (Resp) between the two levels of Groups (Group 1 and Group 2) within each level of a grouping variable (Letters).

In other words, my desired output here would be 2 vectors: one with the differences between Group 1 and Group 2 when Letters = A, and one with the differences between Group 1 and Group 2 when Letters = B.

I was hoping to do this in the tidyverse, somehow using group_by, but I couldn't figure it out.

Many thanks for any help here!

@Robin_W Welcome to RStudio Community!

Your example data contains duplicates; this probably does not mimic your problem.

I have created a new toy data set and calculated the difference. See if this is what you wanted. Otherwise, provide minimal data and also the expected outcome.

library(tidyverse)
# Original toy data contains duplicates----
data <- data.frame(
  Resp = rep(c(4,5,1,3,4,2,4,5,1,2,4,5), 4) ,
  Letters = rep(c("A", "A", "B", "B"), each = 12),
  Groups = rep(c("Group 1", "Group 2"))
)


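# New toy data: one row per Letters x Groups combination----
new_toy_data <- data %>%
  distinct() %>%                  # drop exact duplicate rows
  group_by(Letters, Groups) %>%
  slice(1) %>%                    # keep the first row in each Letters x Groups cell
  ungroup()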

new_toy_data
#> # A tibble: 4 × 3
#>    Resp Letters Groups 
#>   <dbl> <chr>   <chr>  
#> 1     4 A       Group 1
#> 2     5 A       Group 2
#> 3     4 B       Group 1
#> 4     5 B       Group 2


# Find difference-----
new_toy_data %>% 
  # reshape
  pivot_wider(names_from = "Groups",
              values_from = "Resp") %>% 
  # find difference
  mutate(difference = `Group 1` - `Group 2`)
#> # A tibble: 2 × 4
#>   Letters `Group 1` `Group 2` difference
#>   <chr>       <dbl>     <dbl>      <dbl>
#> 1 A               4         5         -1
#> 2 B               4         5         -1

Created on 2022-11-01 with reprex v2.0.2
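
Since the question mentioned group_by specifically, here is a minimal sketch of an alternative that skips the reshape. It assumes exactly one row per Letters x Groups combination, as in the toy data above. The names `toy` and `by_letter` are made up for illustration, and the four rows are retyped by hand so the snippet runs on its own.

```r
library(dplyr)

# Same four rows as new_toy_data, rebuilt so this snippet is self-contained
toy <- data.frame(
  Resp = c(4, 5, 4, 5),
  Letters = c("A", "A", "B", "B"),
  Groups = c("Group 1", "Group 2", "Group 1", "Group 2")
)

# Within each level of Letters, subtract the Group 2 value from the Group 1 value.
# Assumption: each Letters level has exactly one row per group.
by_letter <- toy %>%
  group_by(Letters) %>%
  summarise(
    difference = Resp[Groups == "Group 1"] - Resp[Groups == "Group 2"],
    .groups = "drop"
  )

by_letter
```

This should give the same `difference` column as the pivot_wider version. The reshape is probably easier to read once there are more than two groups to compare.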

Thanks for the welcome and thanks for your help! Sorry about the duplicates in my example data; I probably should have just generated some random numbers instead.

Anyhow, the solution was using pivot_wider. I've been able to do what I needed to do.
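
For anyone with several observations per group, here is a rough sketch of how the pivot_wider idea might extend to the original 48-row example and return one vector of differences per level of Letters. It assumes that the n-th Group 1 row within a Letters level is paired with the n-th Group 2 row. That pairing rule is an assumption, not something stated in the thread, and `pair_id`, `diffs` and `diff_list` are made-up names.

```r
library(dplyr)
library(tidyr)

data <- data.frame(
  Resp = rep(c(4,5,1,3,4,2,4,5,1,2,4,5), 4),
  Letters = rep(c("A", "A", "B", "B"), each = 12),
  Groups = rep(c("Group 1", "Group 2"))
)

diffs <- data %>%
  group_by(Letters, Groups) %>%
  mutate(pair_id = row_number()) %>%   # assumed pairing: n-th row of each group
  ungroup() %>%
  pivot_wider(names_from = Groups, values_from = Resp) %>%
  mutate(difference = `Group 1` - `Group 2`)

# One numeric vector of differences per level of Letters
diff_list <- split(diffs$difference, diffs$Letters)
diff_list$A
```

Without an index such as `pair_id`, pivot_wider cannot uniquely identify the rows and returns list-columns with a warning. If your data has a real pairing variable (a subject ID, say), use that instead of `row_number()`.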
