I am trying to use tapply to calculate several statistics of one variable (Diameter) grouped by another variable (Specie).
For the mean I write
r <- tapply(X = Diameter, INDEX = list(Specie), FUN = mean)
and it works, but when I try to calculate a quantile (the 0.10 quantile, for example) I can't find the right way to do it.
Please, any help will be welcome.
Thanks a lot.
Hi @JMSantiago ,
Could you please provide a sample dataset, which we can use to help you?
Thank you.
Thank you very much. Here is a short dataset:
Bank_Distance h_Bank Band Specie Diameter
0 0 C3 Salix_salvifolia 23
0 0 C3 Salix_salvifolia 14
0.1 0.001 C3 Alnus_glutinosa 20
0.1 0.001 C3 Alnus_glutinosa 28
1 0.023 C3 Alnus_glutinosa 23
1 0.023 C3 Salix_salvifolia 5
1.2 0.037 C3 Alnus_glutinosa 13
1.2 0.037 C3 Alnus_glutinosa 11
1.4 0.052 C3 Alnus_glutinosa 9
1.4 0.052 C3 Alnus_glutinosa 10
1.4 0.052 C3 Alnus_glutinosa 5.5
2.2 0.119 C3 Alnus_glutinosa 17
2.2 0.119 C3 Alnus_glutinosa 17
2.2 0.119 C3 Alnus_glutinosa 7.5
2.2 0.119 C3 Alnus_glutinosa 7.5
2.2 0.119 C3 Alnus_glutinosa 11.5
2.5 0.148 C3 Salix_salvifolia 5
5.5 0.609 C3 Salix_salvifolia 16
6 0.678 C3 Fraxinus_angustifolia 12
6.5 0.743 C3 Fraxinus_angustifolia 13.5
7.7 0.746 C3 Fraxinus_angustifolia 11.5
8.4 0.779 C3 Fraxinus_angustifolia 9
8.8 0.855 C3 Fraxinus_angustifolia 8
10 1.097 C3 Fraxinus_angustifolia 9.5
11.5 1.303 C3 Fraxinus_angustifolia 9.5
13.2 1.671 C3 Fraxinus_angustifolia 9.7
13.3 1.697 C3 Fraxinus_angustifolia 4
13.5 1.75 C3 Fraxinus_angustifolia 5.5
14 1.889 C3 Fraxinus_angustifolia 2.5
15.3 2.344 C3 Fraxinus_angustifolia 9.5
15.7 2.487 C3 Fraxinus_angustifolia 3
16.3 2.653 C3 Fraxinus_angustifolia 2.3
17.9 2.973 C3 Fraxinus_angustifolia 8.5
18.4 3.08 C3 Fraxinus_angustifolia 10
19.3 3.207 C3 Fraxinus_angustifolia 4
19.5 3.227 C3 Fraxinus_angustifolia 7.5
21.3 3.545 C3 Salix_angustifolia 1
22 3.641 C3 Fraxinus_angustifolia 6.5
22.5 3.725 C3 Fraxinus_angustifolia 6
23 3.816 C3 Salix_angustifolia 4
34 4.812 C3 Fraxinus_angustifolia 4.5
34 4.812 C3 Fraxinus_angustifolia 4
42 5.412 C3 Fraxinus_angustifolia 5.5
42 5.412 C3 Fraxinus_angustifolia 5
0 0 C3 Populus_nigra 7
0 0 C3 Salix_salvifolia 10
0 0 C3 Salix_salvifolia 6.5
0 0 C3 Populus_nigra 17.5
0.5 0.019 C3 Salix_salvifolia 4
0.5 0.019 C3 Populus_nigra 11.5
1 0.037 C3 Populus_nigra 13
1 0.037 C3 Salix_salvifolia 20
2.3 0.113 C3 Fraxinus_angustifolia 6
3 0.16 C3 Salix_salvifolia 6
3.7 0.191 C3 Fraxinus_angustifolia 10.5
4 0.215 C3 Populus_nigra 19
Your code suggests that you are using attach(your_data), which is not an advisable practice. It takes a bit more typing, but always reference your dataset explicitly whenever you want to access data in it. You may want to refer to the answers here: dataframe - Why is it not advisable to use attach() in R, and what should I use instead? - Stack Overflow
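As a rough sketch of what referencing the data explicitly can look like, the snippet below uses a small made-up data frame (toy_data is hypothetical, not the dataset posted above). It shows two common alternatives to attach(): indexing columns with $ and wrapping the call in with(), which evaluates the expression using the data frame's columns without attaching anything.

```r
# Hypothetical toy data, only for illustration (not the poster's dataset)
toy_data <- data.frame(
  Specie   = c("A", "A", "B", "B"),
  Diameter = c(10, 20, 5, 7)
)

# Option 1: reference the columns explicitly with $
means_dollar <- tapply(X = toy_data$Diameter, INDEX = toy_data$Specie, FUN = mean)

# Option 2: with() looks up Diameter and Specie inside toy_data for this call only
means_with <- with(toy_data, tapply(X = Diameter, INDEX = Specie, FUN = mean))

print(means_dollar)
```

Both calls should give the same per-group means; with() mainly saves typing when many columns are involved.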
The solution to your problem is to define a custom function inside tapply(). Assuming that your dataset is stored in raw_data, you can do it like this:
tapply(X = raw_data$Diameter, INDEX = raw_data$Specie, FUN = function(x) quantile(x, 0.1))
Alnus_glutinosa Fraxinus_angustifolia Populus_nigra Salix_angustifolia Salix_salvifolia
7.5 3.5 8.8 1.3 4.9
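Since the original question asked about several statistics, here is a minimal sketch of two related variations, again on a made-up data frame (toy_data and the statistic names mean and q10 are assumptions chosen for illustration). The first passes probs straight through tapply() to quantile() instead of writing a wrapper function; the second returns more than one statistic per group and binds the results into a table.

```r
# Hypothetical toy data, only for illustration (not the poster's dataset)
toy_data <- data.frame(
  Specie   = c("A", "A", "A", "B", "B", "B"),
  Diameter = c(10, 20, 30, 4, 6, 8)
)

# Extra arguments after FUN are passed on to FUN, so no wrapper is needed
q10 <- tapply(toy_data$Diameter, toy_data$Specie, quantile, probs = 0.1)

# Several statistics at once: FUN returns a named vector for each group,
# so tapply() returns one vector per species instead of a single number
stats_by_specie <- tapply(
  toy_data$Diameter, toy_data$Specie,
  function(x) c(mean = mean(x), q10 = quantile(x, 0.1, names = FALSE))
)

# Bind the per-species vectors into a matrix with one row per species
stats_table <- do.call(rbind, stats_by_specie)
print(stats_table)
```

Note that quantile() uses its default interpolation method here (type = 7); other types can give slightly different values on small groups.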
Thanks a lot, your comments have been very useful to me.
I'm glad I could help. Consider marking my answer as the solution if it solved your problem.