Hello,
your error occurs because you wrote if (myData[,1] ...). This basically means: compare column 1 of myData with a value. The result is a vector of TRUE/FALSE entries, not a single TRUE or FALSE value, which is what if() needs in order to work.
Regardless, here is a solution you could use in base R:
# create the data
myData <- iris[,1:5]
myData$meanSplit <- myData[,1] # these are replicates of column 1, not 2 (indexing starts at 1)
myData$Quantiles <- myData[,1]
mean_val <- mean(myData[,1])
# insert values in meanSplit with if - else
for (i in seq_len(nrow(myData))){
  if (myData[[i,1]] > mean_val){
    # mean is less than value
    myData$meanSplit[[i]] <- 'Less'
  } else {
    # mean is greater than (or equal to) value
    myData$meanSplit[[i]] <- 'More'
  }
}
# quantiles
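myData$Quantiles <- cut(
  myData[,1],
  breaks = quantile(myData[,1])[c(1,2,4,5)],
  labels = c('Low','middle','High'),
  include.lowest = TRUE) # without this, the minimum value falls outside the first interval and becomes NA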
head(myData, n = 10)
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species meanSplit
#> 1           5.1         3.5          1.4         0.2  setosa      More
#> 2           4.9         3.0          1.4         0.2  setosa      More
#> 3           4.7         3.2          1.3         0.2  setosa      More
#> 4           4.6         3.1          1.5         0.2  setosa      More
#> 5           5.0         3.6          1.4         0.2  setosa      More
#> 6           5.4         3.9          1.7         0.4  setosa      More
#> 7           4.6         3.4          1.4         0.3  setosa      More
#> 8           5.0         3.4          1.5         0.2  setosa      More
#> 9           4.4         2.9          1.4         0.2  setosa      More
#> 10          4.9         3.1          1.5         0.1  setosa      More
#>    Quantiles
#> 1        Low
#> 2        Low
#> 3        Low
#> 4        Low
#> 5        Low
#> 6     middle
#> 7        Low
#> 8        Low
#> 9        Low
#> 10       Low
Created on 2022-09-01 by the reprex package (v2.0.1)
The second part (the Quantiles column) uses base::cut(), which is a more appropriate way than using a bunch of if and else statements in a chain. If necessary, you can work out for yourself how to rewrite it with if and else statements, given the way I filled the meanSplit column.
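For comparison, here is a minimal sketch of what the chained if / else version of the quantile split could look like. The names x, q and quantile_label are only chosen for this illustration, and the sketch assumes that the minimum value should count as 'Low'.

```r
myData <- iris[, 1:5]
x <- myData[, 1]
q <- quantile(x)  # named vector: 0%, 25%, 50%, 75%, 100%

# one label per row, filled in the same loop style as meanSplit
quantile_label <- character(nrow(myData))
for (i in seq_len(nrow(myData))) {
  if (x[i] <= q[["25%"]]) {
    quantile_label[i] <- "Low"      # up to and including the first quartile
  } else if (x[i] <= q[["75%"]]) {
    quantile_label[i] <- "middle"   # above the first, up to the third quartile
  } else {
    quantile_label[i] <- "High"     # above the third quartile
  }
}
table(quantile_label)
```

Every additional category needs another else if branch here, which is why cut() scales better.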
You might consider changing the labels for meanSplit, however, since More is kind of confusing if More means "mean is greater than value".
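As a sketch of one possible alternative (the labels 'AboveMean' and 'BelowMean' and the helper name is_above are just suggestions of mine, not something from your code), the loop can also be replaced by the vectorised ifelse(), which is designed for exactly this kind of TRUE/FALSE vector:

```r
myData <- iris[, 1:5]
mean_val <- mean(myData[, 1])

# the comparison is vectorised: it returns one TRUE/FALSE per row,
# which is why it cannot be used directly as the condition of if()
is_above <- myData[, 1] > mean_val
length(is_above)

# ifelse() picks a label for each element, so no loop is needed
myData$meanSplit <- ifelse(is_above, "AboveMean", "BelowMean")
head(myData$meanSplit)
```

With labels like these, the value in the column describes the row itself rather than the mean, which should be easier to read later on.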
I hope this answers a) the error message and b) how to solve your issue to get working code with the expected results.
Kind regards