I am working with the R programming language. Suppose I have the following data frame:
var_1 = rnorm(100,10,10)
var_2 = rnorm(100,10,10)
var_3 = rnorm(100,10,10)
d = data.frame(var_1, var_2, var_3)
head(d)
var_1 var_2 var_3
1 14.251923 14.877801 22.636207
2 7.325137 8.513718 21.021522
3 3.400001 -3.400397 11.274797
4 16.400597 8.623980 9.366115
5 7.065583 13.155570 17.891432
6 21.297912 4.341385 -11.337330
My Question: For each element in each variable, I want to replace the element with the percentile (e.g. 5th, 10th, 15th, etc.) it belongs to.
For example:
probs = seq(0.05, 1, by = 0.05)
a = quantile(d$var_1, probs)
b = quantile(d$var_2, probs)
c = quantile(d$var_3, probs)
new = data.frame(a,b,c)
a b c
5% -0.8806901 -7.40560488 -4.7353920
10% 0.3595086 -3.77910527 -0.6874766
15% 1.1201300 -2.91946322 0.9584040
20% 3.0581928 0.05127097 2.1457693
25% 5.0901641 1.91719913 4.6997966
30% 7.0056228 2.56215345 6.2691894
35% 7.6089831 3.58688942 7.1900823
40% 8.9853805 5.00957881 7.8488446
45% 9.9264540 5.73653135 8.6135093
50% 10.2235212 7.43425669 9.6063344
55% 11.5707533 8.54160196 10.9239040
60% 13.2422940 9.65006232 11.7036647
65% 15.1076889 11.07081528 13.2440004
70% 16.5354881 12.38804922 15.2585324
75% 17.9336020 13.16121940 17.6656208
80% 19.5312682 15.31472178 18.4820207
85% 21.9264905 17.99689941 19.3347983
90% 24.4511364 20.47478783 22.0647173
95% 26.6820271 25.27082341 24.4473033
100% 41.4419744 39.75848302 34.5105183
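The same table of cut-offs can also be built for every column in one call. The following is only a minimal sketch: the `set.seed(1)` line and the simulated data are assumptions added so that the snippet runs on its own, so the numbers will not match the output shown above.

```r
# Toy data with the same shape as `d` in the question
# (the seed is an assumption, added only for reproducibility)
set.seed(1)
d <- data.frame(var_1 = rnorm(100, 10, 10),
                var_2 = rnorm(100, 10, 10),
                var_3 = rnorm(100, 10, 10))

# Percentile levels: 5%, 10%, ..., 100%
probs <- seq(0.05, 1, by = 0.05)

# sapply() runs quantile() on each column and binds the results into a
# matrix with one row per percentile and one column per variable
new <- as.data.frame(sapply(d, quantile, probs = probs))
head(new)
```

Here the columns of `new` are named after the original variables (`var_1`, `var_2`, `var_3`) rather than `a`, `b`, `c`.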
Now, whenever a value falls within one of these percentile ranges, I would like to make the following replacement:
- if d$var_1 < -0.8806901, then d$var_1 = as.factor("5th percentile")
- if d$var_1 > -0.8806901 & d$var_1 < 0.3595086, then d$var_1 = as.factor("10th percentile")
...
- if d$var_1 > 15.1076889 & d$var_1 < 16.5354881, then d$var_1 = as.factor("70th percentile")
etc
- if d$var_2 < -7.40560488, then d$var_2 = as.factor("5th percentile")
etc
- if d$var_3 < -4.7353920, then d$var_3 = as.factor("5th percentile")
etc
Can someone please show me how to do this?
Thanks!
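The replacement rules in the question can be sketched with base R's `cut()`, which assigns each value to the interval it falls into. This is one possible approach under stated assumptions, not a definitive answer: `to_percentile` is a hypothetical helper name, the seed and simulated data are added only so the snippet runs on its own, a value exactly equal to a cut-off is assigned to the lower bin (the rules above leave that case open), and the quantile cut-offs are assumed to be distinct (with heavily tied data `cut()` would stop with a "breaks are not unique" error).

```r
# Toy data with the same shape as `d` in the question
# (the seed is an assumption, added only for reproducibility)
set.seed(1)
d <- data.frame(var_1 = rnorm(100, 10, 10),
                var_2 = rnorm(100, 10, 10),
                var_3 = rnorm(100, 10, 10))

# Percentile levels: 5%, 10%, ..., 100%
probs <- seq(0.05, 1, by = 0.05)

# Hypothetical helper: label each value with the first percentile
# cut-off that it does not exceed
to_percentile <- function(x, probs) {
  # -Inf as the lowest break so the minimum lands in the first bin
  breaks <- c(-Inf, unname(quantile(x, probs)))
  labels <- paste0(round(probs * 100), "th percentile")
  # right = TRUE gives intervals of the form (lower, upper]
  cut(x, breaks = breaks, labels = labels, right = TRUE)
}

# Apply the helper to every column; each column becomes a factor
d_pct <- as.data.frame(lapply(d, to_percentile, probs = probs))
head(d_pct)
```

Because `lapply()` runs over the columns, the same code works for any number of numeric variables. The result is stored in a new data frame, `d_pct`, so the original values in `d` stay available for checking.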