I have a very simple question, but I cannot find the answer.
I have a dataset of 135 observations described by 5 variables. One of them is the "Sample" variable, which has three levels: month 0, month 2 and month 4.
Another variable is a numeric value (proportion of gonads).
I would like to write a for loop that selects, for each sample level (month 0, month 2 and month 4), the observations that are above that sample's average.
In other words, for each sample (month 0, month 2 and month 4), I need to average the proportion of gonads, then select the observations greater than this average within that sample.
I managed to write a for loop that tells me, for each sample, whether or not each observation's proportion of gonads is greater than the average:
for (sample in unique(data$Sample)) {
  print(data$GT[data$Sample == sample] > mean(data$GT[data$Sample == sample]))
}
but I am unable to retrieve the other information on each of these observations (the 4 other variables). In fact, I would like R to show me all the information about the above-average observations in each sample, that is, every column of the rows where the comparison is TRUE.
First, thank you for your answer.
Yes, I know it would be better with tidyverse, but I have to use a for loop because the homework requires it ("Please use for loop"...).
So, I will repeat my question as a reprex using the well-known iris data:
data <- iris
for (species in unique(data$Species)) {
  print(data$Petal.Length[data$Species == species] > mean(data$Petal.Length[data$Species == species]))
}
For comparison, this is what R returns when I filter on the overall mean:
data[data$Petal.Length > mean(data$Petal.Length),]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
51 7.0 3.2 4.7 1.4 versicolor
52 6.4 3.2 4.5 1.5 versicolor
53 6.9 3.1 4.9 1.5 versicolor
54 5.5 2.3 4.0 1.3 versicolor
55 6.5 2.8 4.6 1.5 versicolor
.......
I would like this kind of output, but with the observations greater than the mean of Petal.Length within each species (and not the mean over all observations).
Homework is usually out of bounds. But your error is simple, so go back and look at the non-tidyverse solution I suggested. Simply put BOTH of your criteria in the [square brackets].
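The one-line version being described is not reproduced in the thread, so what follows is only a sketch of what putting both conditions inside the brackets might look like, assuming data <- iris as in the question. The above list is a hypothetical addition, used here only to keep each species' rows around after printing.

```r
data <- iris
above <- list()  # hypothetical container, not part of the original answer
for (species in unique(data$Species)) {
  # Both criteria inside the brackets: above the species mean AND of that species
  above[[species]] <- data[data$Petal.Length > mean(data$Petal.Length[data$Species == species]) & data$Species == species, ]
  print(above[[species]])
}
```

The trailing comma inside the brackets matters: the logical vector selects rows, and leaving the column position empty keeps all five columns.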
Written as a single line, that is hideous and prone to error. So you might want to pull the mean out beforehand, like I did in my example, and you may want to use attach() (tidyverse just screamed at me in shock that I would even say that!); remember to detach() afterwards.
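data <- iris
for (species in unique(data$Species)) {
  # Mean petal length for the current species only
  speciesMean <- mean(data$Petal.Length[data$Species == species])
  # Keep all columns of the rows that belong to this species and exceed its mean
  print(data[data$Petal.Length > speciesMean & data$Species == species, ])
  print("========")
}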
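If the selected rows need to be kept rather than only printed, one possible variation is to collect each species' subset inside the loop and stack them at the end. This is a sketch under assumptions: picked and result are hypothetical names, and do.call(rbind, ...) is just one base-R way to combine the pieces.

```r
data <- iris
picked <- list()  # hypothetical: one data frame per species
for (species in unique(data$Species)) {
  speciesMean <- mean(data$Petal.Length[data$Species == species])
  picked[[species]] <- data[data$Petal.Length > speciesMean & data$Species == species, ]
}
# Stack the per-species pieces into a single data frame with all 5 columns
result <- do.call(rbind, picked)
head(result)
```

The same pattern should carry over to the original data by swapping in Sample and GT for Species and Petal.Length.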
Or, if tidyverse were allowed, you could still loop over the species in a for statement and filter:
# tidyverse way
library(tidyverse)
for (species in unique(data$Species)) {
  data %>%
    filter(Species == species) %>%             # rows of the current species
    filter(Petal.Length > mean(Petal.Length)) %>%  # above that species' mean
    print()
  print("========")
}