I have a dataset of 66,527,460 observations, sorted in ascending order by their count values. Is there a way to write a function or loop in R that will tell me what the 1st through 10th deciles are, and what the min and max values of the 'count' variable are?
I am breaking up my deciles like this so that they can be defined in a loop, etc.
OK, if I need to create my own breaks, where decile 1 takes the values from the range stated below, is there a quick function or command to compute that for all the remaining groups?
Deciles would be a population broken up into 10 equally populated categories.
You have asked instead to divide a population into 10 equal-width ranges, within which the number of observations may vary.
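For contrast, here is a minimal sketch of true deciles, i.e. 10 equally populated groups. It uses the iris data as a stand-in for your data; the names x, decile_breaks and decile are only illustrative. Rank-based assignment is one way to keep the groups the same size even when values are tied.

```r
# stand-in data; replace with your own count vector
x <- sort(iris$Petal.Length)

# 11 cut points at 0%, 10%, ..., 100% bound the 10 deciles
decile_breaks <- quantile(x, probs = seq(0, 1, by = 0.1))
decile_breaks

# assign each observation to a decile by rank, so ties cannot unbalance the groups
decile <- ceiling(10 * rank(x, ties.method = "first") / length(x))

# every decile holds the same number of observations
table(decile)
```

With equally populated groups like these, it is the width of each range that varies rather than the group size.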
You might approach your task in the following way.
# some example data to work with
(plraw <- sort(iris$Petal.Length))
# get parameters of our operation
(range_plraw <- range(plraw))
# 10 partitions need 11 break points
(breaks_i_want <- seq(from = range_plraw[1],
                      to = range_plraw[2],
                      length.out = 11))
# apply the breaks to the main population
(plcut <- cut(plraw, breaks = breaks_i_want, include.lowest = TRUE))
# summary
table(plcut)
plot(plcut)
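To also get the min and max value inside each range, as asked in the original question, something like the following might work. It is a sketch on the same iris example; bin_summary is an illustrative name, and with your data the count column would take the place of plraw. Ranges that contain no observations show NA for min and max.

```r
# rebuild the example so this block runs on its own
plraw <- sort(iris$Petal.Length)
breaks_i_want <- seq(from = min(plraw), to = max(plraw), length.out = 11)
plcut <- cut(plraw, breaks = breaks_i_want, include.lowest = TRUE)

# one row per range: how many values fall in it, and their min and max
bin_summary <- data.frame(
  bin = levels(plcut),
  n   = as.vector(table(plcut)),
  min = as.vector(tapply(plraw, plcut, min)),
  max = as.vector(tapply(plraw, plcut, max))
)
bin_summary
```

tapply() applies min and max once per level of plcut, so no explicit loop is needed.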