Hi there,
I want to calculate the 5%, 25%, 50%, 75% and 95% quantiles (percentiles) for my data.table for every datetime (yellow). So I get five values for every time in a day.
I tried:
quantile(dt.data, c(.05, .25, .5, .75, .95))
but I get this Error: Error in [.data.frame
(x, i) : undefined columns selected
Maybe I have to use something like this:
out <- dt.data[ ,lapply(.SD, quantile, prob = c(.05, .25, .5, .75, .95), na.rm = TRUE), by = c("date")]
What's wrong?
Greetings, Christina
The error tells you the problem: you need to specify over which columns to apply the function.
You have edited the code while have been writing this, so in the absence of a reproducible example I think something like this should work:
out <- dt.data[ ,lapply(.SD, quantile, prob = c(.05, .25, .5, .75, .95), na.rm = TRUE), .SDcols = grep("V", names(dt.data), value = TRUE)]
There may be a shorter version of the .SDcols specification.
hohenc
July 1, 2021, 10:38am
4
martin.R:
out <- dt.data[ ,lapply(.SD, quantile, prob = c(.05, .25, .5, .75, .95), na.rm = TRUE), .SDcols = grep("V", names(dt.data), value = TRUE)]
Hm, with your code I get a table for the v's from 1 to 31.
I need it per quantiles, so it looks like this:
hohenc
July 1, 2021, 10:55am
6
Thanks, it worked with:
dt.data[, as.list(quantile(.SD, c(.05, .25, .5, .75, .95), na.rm = TRUE)), by c("date")]
system
Closed
July 22, 2021, 10:56am
7
This topic was automatically closed 21 days after the last reply. New replies are no longer allowed. If you have a query related to it or one of the replies, start a new topic and refer back with a link.