Trying to select rows of a df within pipes

Anisha · June 5, 2019, 1:32am

Hello,

I am trying to create a new df that contains the top 15 varieties of wines that show up the most in the data set that I am working with.

I am able to get the count of each variety and arrange the counts in descending order, but I am having trouble selecting just the top 15. Can someone help?

When I run the code below I get this error:

Error: Column indexes must be at most 2 if positive, not 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
Call rlang::last_error() to see a backtrace

Code:

WineReviewDataVar <- WineReviewData %>%
  group_by(Variety) %>%
  count()

WineReviewData_Top15Var <- WineReviewDataVar %>%
  arrange(desc(n)) %>%
  WineReviewData_Top15Var[1:15, ]

technocrat · June 5, 2019, 1:51am

I would be more confident in my answer with a reproducible example (a reprex), but try replacing the last line with top_n(15).
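
A minimal sketch of that suggestion on invented data (the `wine` data frame is a hypothetical stand-in for WineReviewData, not the real data set). It assumes the counts are ungrouped, so it uses `count(Variety)` rather than `group_by(Variety) %>% count()`: on a grouped table `top_n()` works within each group, which would not give an overall top 15. Note also that `top_n()` keeps ties, so it can return more than 15 rows.

```r
library(dplyr)

# Invented stand-in for WineReviewData (hypothetical)
set.seed(1)
wine <- data.frame(Variety = sample(LETTERS, 200, replace = TRUE),
                   stringsAsFactors = FALSE)

top15 <- wine %>%
  count(Variety) %>%      # one row per Variety, with the count in column n
  arrange(desc(n)) %>%    # largest counts first
  top_n(15, n)            # keep the 15 largest n; ties are kept as well

top15
```

If the table is already grouped, adding `ungroup()` before `top_n()` should have the same effect.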

FJCC · June 5, 2019, 1:54am

Here is an example with invented data. I used head instead of top_n().

library(dplyr)
set.seed(67489)
df <- data.frame(Variety = sample(LETTERS, 200, replace = TRUE),
                 Score = sample(1:10, 200, replace = TRUE))
head(df)
#>   Variety Score
#> 1       X     3
#> 2       K    10
#> 3       M     8
#> 4       U     3
#> 5       Z     2
#> 6       W    10
dfVar <- df %>% count(Variety)
head(dfVar)
#> # A tibble: 6 x 2
#>   Variety     n
#>   <fct>   <int>
#> 1 A           8
#> 2 B           6
#> 3 C           5
#> 4 D           8
#> 5 E           9
#> 6 F           5
dfTop15 <- dfVar %>%  arrange(desc(n)) %>% head(15)
dfTop15
#> # A tibble: 15 x 2
#>    Variety     n
#>    <fct>   <int>
#>  1 Q          14
#>  2 S          12
#>  3 O          11
#>  4 N          10
#>  5 T          10
#>  6 X          10
#>  7 E           9
#>  8 P           9
#>  9 W           9
#> 10 A           8
#> 11 D           8
#> 12 G           8
#> 13 Z           8
#> 14 L           7
#> 15 Y           7

Created on 2019-06-04 by the reprex package (v0.2.1)
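
On why the original last line failed: `%>%` passes its left-hand side as the first argument of the call on its right, so in `... %>% WineReviewData_Top15Var[1:15, ]` the `1:15` no longer lands in the row position. That probably explains the complaint about column indexes on a two-column table. The sketch below, again on invented data with a hypothetical `wine` data frame, shows two ways to keep positional row selection inside the pipe: the `.` placeholder and dplyr's `slice()`.

```r
library(dplyr)

# Invented stand-in for WineReviewData (hypothetical)
set.seed(1)
wine <- data.frame(Variety = sample(LETTERS, 200, replace = TRUE),
                   stringsAsFactors = FALSE)

counts <- wine %>%
  count(Variety) %>%
  arrange(desc(n))

# `.` stands for the table coming down the pipe, so this is counts[1:15, ]
top15_dot <- counts %>% .[1:15, ]

# slice() selects rows by position; it expects an ungrouped table here
top15_slice <- counts %>% slice(1:15)
```

Both give the same 15 rows as `head(15)`, provided the table is ungrouped and has at least 15 rows.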

Anisha · June 5, 2019, 11:26pm

Thank you!
I've never seen the top_n() function before.

Anisha · June 5, 2019, 11:27pm

Thank you! I didn't think of using the head() function in this way, and it does work!
