omario
1
I have the following dataset:
library(dplyr)
library(gtools)
set.seed(123)
my_data = data.frame(var1 = rnorm(10000,100,100))
I am trying to split this variable into quantile bins (e.g. 100 of them):
final = my_data %>%
  summarise(quants = quantcut(my_data$var1, 100), count = n())
Now, I am trying to add a "rank" variable that will assign the smallest quantile range as 1 ... all the way to the max value:
final = my_data %>%
  summarise(quants = quantcut(my_data$var1, 100), count = n()) %>%
  group_by(quants) %>%
  mutate(rank = row_number())
The code runs, but every rank is 1:
> table(final$rank)
1
100
Can someone please show me how to fix this?
Thanks!
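I think you want to use mutate and not summarise. mutate keeps one row per observation, and because quantcut returns a factor whose levels run from the lowest interval to the highest, as.numeric(quants) gives each bin's position, which works as the rank.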
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(gtools)
set.seed(123)
my_data = data.frame(var1 = rnorm(10000,100,100)) %>%
  as_tibble()
final = my_data %>%
  mutate(quants = quantcut(var1, 100),
         rank = as.numeric(quants))
final %>% count(quants)
#> # A tibble: 100 × 2
#>    quants            n
#>    <fct>         <int>
#>  1 [-285,-132]     100
#>  2 (-132,-105]     100
#>  3 (-105,-91.3]    100
#>  4 (-91.3,-75.4]   100
#>  5 (-75.4,-64.2]   100
#>  6 (-64.2,-55.7]   100
#>  7 (-55.7,-48.1]   100
#>  8 (-48.1,-41]     100
#>  9 (-41,-34.3]     100
#> 10 (-34.3,-27.9]   100
#> # … with 90 more rows
final %>% count(quants, rank)
#> # A tibble: 100 × 3
#>    quants         rank     n
#>    <fct>         <dbl> <int>
#>  1 [-285,-132]       1   100
#>  2 (-132,-105]       2   100
#>  3 (-105,-91.3]      3   100
#>  4 (-91.3,-75.4]     4   100
#>  5 (-75.4,-64.2]     5   100
#>  6 (-64.2,-55.7]     6   100
#>  7 (-55.7,-48.1]     7   100
#>  8 (-48.1,-41]       8   100
#>  9 (-41,-34.3]       9   100
#> 10 (-34.3,-27.9]    10   100
#> # … with 90 more rows
Created on 2023-01-19 with reprex v2.0.2
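If you want to double-check that the rank really follows the order of the values, one possible sanity check is sketched below. It rebuilds the same data and compares the largest value in each bin with the smallest value in the next one. The names bin_ranges, lo, hi and ranks_are_ordered are just illustrative, not anything from dplyr or gtools.

```r
library(dplyr)
library(gtools)

set.seed(123)
my_data <- tibble(var1 = rnorm(10000, 100, 100))

final <- my_data %>%
  mutate(quants = quantcut(var1, 100),
         rank = as.numeric(quants))

# Smallest and largest value that landed in each rank
bin_ranges <- final %>%
  group_by(rank) %>%
  summarise(lo = min(var1), hi = max(var1), .groups = "drop") %>%
  arrange(rank)

# Each bin should end before the next one begins;
# this should be TRUE if the ranks follow the value order
ranks_are_ordered <- all(head(bin_ranges$hi, -1) <= tail(bin_ranges$lo, -1))
ranks_are_ordered
```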
omario
3
Thank you so much! Is it possible to have the "count" variable in the same step?
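One way this might be done, assuming "count" means the number of rows that fall in each quantile bin, is to pipe the result into dplyr's add_count(), which appends the group size as a column without collapsing rows. If a single row per bin is preferred, count() can do the collapsing instead. This is only a sketch; per_bin is an illustrative name.

```r
library(dplyr)
library(gtools)

set.seed(123)
my_data <- tibble(var1 = rnorm(10000, 100, 100))

# Keep every observation and attach the size of its bin as "count"
final <- my_data %>%
  mutate(quants = quantcut(var1, 100),
         rank = as.numeric(quants)) %>%
  add_count(quants, name = "count")

# Alternative: one row per bin, with its rank and its count
per_bin <- my_data %>%
  mutate(quants = quantcut(var1, 100),
         rank = as.numeric(quants)) %>%
  count(quants, rank, name = "count")
```

The first version is handy when the count is needed next to each value; the second gives the compact summary table.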