Big dataset - calculation based on group

Hi
I've got a huge data frame with about 280,000 rows. I'm trying to do some basic calculations based on which iteration the data comes from.

For the rows from iteration 1, I need to calculate the mean difference and the standard deviation, and run a t-test. Then I need to do the same for the data from iteration 2, and so on, until I've done this for all 100 iterations.
With about 2,800 rows per iteration, I can't do this by hand 100 times, and there is probably a faster way to go about it. I hope someone can help me out. Thank you :)

      Iteration     Volume  Volume_pred
1             1   5.835843     4.056030
2             1   5.142234     4.933910
3             1   2.752558     2.733790
4             1   8.253517     7.869620
5             1   9.671570     8.172544
6             1   8.370689     8.707087
..           ..
280K        100  8.9912312     9.011232

You can read about how to do calculations by group here:
5 Data transformation | R for Data Science (had.co.nz)
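
As a rough sketch of what that chapter leads to, something like the following might work with dplyr. It assumes the "difference" is Volume minus Volume_pred and that a paired t-test is the test you want; neither is stated in the question, so adjust as needed. The small `df` built here is made-up stand-in data with 3 iterations, not your real data frame.

```r
library(dplyr)

# Made-up stand-in data: 3 iterations of 50 rows each (the real data has 100 iterations).
set.seed(1)
df <- data.frame(
  Iteration = rep(1:3, each = 50),
  Volume    = runif(150, min = 2, max = 10)
)
df$Volume_pred <- df$Volume + rnorm(150, mean = 0, sd = 0.5)

# One row of summary statistics per iteration.
results <- df %>%
  group_by(Iteration) %>%
  summarise(
    n_rows    = n(),
    # Assumption: difference is defined as Volume - Volume_pred.
    mean_diff = mean(Volume - Volume_pred),
    sd_diff   = sd(Volume - Volume_pred),
    # Assumption: a paired t-test comparing Volume with Volume_pred.
    t_stat    = unname(t.test(Volume, Volume_pred, paired = TRUE)$statistic),
    p_value   = t.test(Volume, Volume_pred, paired = TRUE)$p.value,
    .groups   = "drop"
  )

print(results)
```

With your own data you would skip the stand-in part and run the `group_by()` / `summarise()` pipeline directly on your data frame; the result should have one row per iteration, so 100 rows in your case.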
