Bin dataframe into groups of 20

I have a dataframe that is sorted on one column, from high to low.

I then want to take the first 20 rows and add a column of metadata that identifies them as the top group, then do the same for the next 20 rows, then the next 20, and so on.

Here's an example. This data frame is ordered by column B, highest to lowest.

A <- c(1:10)
B <- c(10:1)

df <- data.frame(A, B)

print (df)
    A  B
1   1 10
2   2  9
3   3  8
4   4  7
5   5  6
6   6  5
7   7  4
8   8  3
9   9  2
10 10  1

Now I want to take the rows two at a time, so that each bin holds the next 2 highest values of column B, and label each bin so the result looks like this:

A <- c(1:10)
B <- c(10:1)
C <- c(1,1,2,2,3,3,4,4,5,5)

df <- data.frame(A, B, C)

print (df)
    A  B C
1   1 10 1
2   2  9 1
3   3  8 2
4   4  7 2
5   5  6 3
6   6  5 3
7   7  4 4
8   8  3 4
9   9  2 5
10 10  1 5

Except that I want each bin to hold 20 elements rather than 2, but this is the idea.

Thanks!

Here is one method. To get to groups of 20, you would just change the %/% 2 to %/% 20.

DF <- data.frame(A = 1:10, B = 10:1)
library(dplyr)
DF
    A  B
1   1 10
2   2  9
3   3  8
4   4  7
5   5  6
6   6  5
7   7  4
8   8  3
9   9  2
10 10  1
DF <- DF %>% mutate(C = (row_number() + 1) %/% 2)
DF
    A  B C
1   1 10 1
2   2  9 1
3   3  8 2
4   4  7 2
5   5  6 3
6   6  5 3
7   7  4 4
8   8  3 4
9   9  2 5
10 10  1 5

It works, but the first group is labelled 0 instead of 1 and is only 18 elements long. All the other groups have 20 elements. Why is that?

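Yes, I didn't test the case of groups of 20 and made a mistake. The number added to row_number() has to be the group size minus 1, so for groups of 20 it is 19, not 1. Use this version of that line: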

DF <- DF %>% mutate(C = (row_number() + 19) %/% 20)
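For any group size, the same arithmetic can be wrapped in a small function. This is only a sketch in base R: the name add_bin and the 45-row example data are made up for illustration and are not from the posts above.

```r
# Hypothetical helper (not from the thread): label consecutive rows in bins of size n.
add_bin <- function(df, n) {
  # seq_len(nrow(df)) is 1, 2, ..., nrow(df). Adding n - 1 before the integer
  # division maps rows 1..n to 1, rows (n + 1)..(2 * n) to 2, and so on.
  df$C <- (seq_len(nrow(df)) + n - 1) %/% n
  df
}

# Made-up example data: 45 rows, already sorted on B from high to low.
DF <- data.frame(A = 1:45, B = 45:1)
DF20 <- add_bin(DF, 20)

# Count the rows in each bin.
table(DF20$C)
```

With 45 rows and n = 20, the first two bins have 20 rows each and the last bin holds the remaining 5. Calling add_bin(DF, 2) on the 10-row example gives the same C column as in the question.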
