Manipulating stacked datasets

yellowcab74 · November 6, 2024, 5:16pm

For the 5 stacked datasets below, how can I get the average of y for each common combination of x1 and x2? Note that the size of each dataset varies (either 4 or 5 rows). This is like merging the 5 datasets by x1 and x2 and then evaluating the mean of y.
Thanks

Data	x1	x2	y
1	0	0	1
1	1	0	0.96580995
1	4	1	0.9488297
1	6	1	0.93190954
2	0	0	1
2	2	0	0.96580995
2	4	1	0.9488297
2	5	1	0.93190954
2	6	1	0.85664971
3	0	0	0.96580995
3	2	0	0.79060542
3	4	0	0.77424697
3	5	1	0.76613367
3	7	10	0.75801814
4	0	0	1
4	2	0	0.9488297
4	4	1	0.93190954
4	5	1	0.70995093
5	0	0	1
5	2	0	0.67061919
5	4	1	0.66277152
5	5	1	0.65492144
5	6	1	0.63914986

prubin · November 6, 2024, 5:33pm

Assuming that your data is in a data frame named df, you can load the dplyr library and run the following:

library(dplyr)
df2 <- df |> group_by(x1, x2) |> summarize(y_mean = mean(y))

The resulting data frame df2 looks like this:

# A tibble: 8 × 3
# Groups:   x1 [7]
     x1    x2 y_mean
  <dbl> <dbl>  <dbl>
1     0     0  0.993
2     1     0  0.966
3     2     0  0.844
4     4     0  0.774
5     4     1  0.873
6     5     1  0.766
7     6     1  0.809
8     7    10  0.758
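
A side note on the "# Groups: x1 [7]" line in that printout: by default summarize() drops only the last grouping variable, so the result is still grouped by x1. If a plain, ungrouped data frame is preferred, one option is the .groups argument. The sketch below is only an illustration; it rebuilds a small df from datasets 1 and 2 of the question, and the name df2_flat is made up for this example.

```r
library(dplyr)

# Datasets 1 and 2 from the question, used here only as a small example
df <- read.table(header = TRUE, text = "
Data x1 x2 y
1 0 0 1
1 1 0 0.96580995
1 4 1 0.9488297
1 6 1 0.93190954
2 0 0 1
2 2 0 0.96580995
2 4 1 0.9488297
2 5 1 0.93190954
2 6 1 0.85664971
")

# Same summary as above, but .groups = "drop" returns an ungrouped tibble
df2_flat <- df |>
  group_by(x1, x2) |>
  summarize(y_mean = mean(y), .groups = "drop")

print(df2_flat)
```

Calling ungroup() on the result afterwards should have the same effect.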

yellowcab74 · November 6, 2024, 5:55pm

Thanks so much for the fast response. This does exactly what I wanted.

yellowcab74 · November 7, 2024, 3:25pm

As a follow-up to my previous question, how can I get the minimum of y within groups (ranges) of x1 and x2? For example, the results should look as follows:

     x1          x2        min_y
    0-2          0          x.xxx
    3-4          0          x.xxx
    0-2          1          x.xxx
   etc           etc

prubin · November 7, 2024, 4:09pm

First create a new column with the x1 grouping (let's call it x1g). Then repeat the previous solution but group by x1g and x2 (or x1g and x2g if you do groupings on both variables) and change mean(y) to min(y).
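
A minimal sketch of those steps, assuming the x1 ranges are 0-2, 3-4 and everything above 4 (the last bin and its "5+" label are my guess, since the question only shows the first two ranges) and that x2 is left ungrouped. It uses cut() from base R to build x1g; the data frame is rebuilt from the table in the first post so the example runs on its own, and the name df_min is made up for this example.

```r
library(dplyr)

# Stacked data copied from the first post
df <- read.table(header = TRUE, text = "
Data x1 x2 y
1 0 0 1
1 1 0 0.96580995
1 4 1 0.9488297
1 6 1 0.93190954
2 0 0 1
2 2 0 0.96580995
2 4 1 0.9488297
2 5 1 0.93190954
2 6 1 0.85664971
3 0 0 0.96580995
3 2 0 0.79060542
3 4 0 0.77424697
3 5 1 0.76613367
3 7 10 0.75801814
4 0 0 1
4 2 0 0.9488297
4 4 1 0.93190954
4 5 1 0.70995093
5 0 0 1
5 2 0 0.67061919
5 4 1 0.66277152
5 5 1 0.65492144
5 6 1 0.63914986
")

df_min <- df |>
  # Bin x1 into ranges: (-Inf, 2], (2, 4], (4, Inf); the breaks are assumptions
  mutate(x1g = cut(x1, breaks = c(-Inf, 2, 4, Inf),
                   labels = c("0-2", "3-4", "5+"))) |>
  # Group by the binned x1 and the raw x2, then take the minimum of y
  group_by(x1g, x2) |>
  summarize(min_y = min(y), .groups = "drop")

print(df_min)
```

Only combinations that actually occur in the data appear in the result, so a row such as x1 in 0-2 with x2 = 1 will be absent here. To bin x2 as well, add a second cut() column (x2g) and group by x1g and x2g instead.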
