I'm pretty new to R and I am a bit stuck. I have to create a variable that categorises study hours into low, medium, and high. I already have a column with the study hours, and I created a new variable called Study_Effort, but now I don't know how to categorise the hours into low, medium, and high.
You can use the cut() function for this. For example:
library(tidyverse)
# Fake data
set.seed(3)
d = tibble(study.hours = rnorm(100, 20, 8))
hist(d$study.hours)
d = d %>%
  mutate(study.hours.category = cut(study.hours,
                                    breaks=c(0,15,30,Inf),
                                    labels=c("low","medium","high"),
                                    include.lowest=TRUE))
d
#> # A tibble: 100 × 2
#>    study.hours study.hours.category
#>          <dbl> <fct>
#>  1        12.3 low
#>  2        17.7 medium
#>  3        22.1 medium
#>  4        10.8 low
#>  5        21.6 medium
#>  6        20.2 medium
#>  7        20.7 medium
#>  8        28.9 medium
#>  9        10.2 low
#> 10        30.1 high
#> # … with 90 more rows
ggplot(d, aes(study.hours.category, study.hours)) +
  geom_point() +
  expand_limits(y=0)
Created on 2022-01-05 by the reprex package (v2.0.1)
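
One thing worth checking is where values that sit exactly on a break end up. Below is a minimal base-R sketch, assuming the same cut points (0, 15, 30) as above; the data frame and the column names study_hours and Study_Effort_left are made up for illustration, and Study_Effort is borrowed from the question. By default cut() uses right-closed intervals, so exactly 15 hours counts as "low". With right=FALSE it counts as "medium". Anything outside the breaks (for example a negative value) becomes NA.

```r
# Hypothetical data with values on and around the break points.
# Column names are assumptions for illustration, not from the original data.
df <- data.frame(study_hours = c(0, 15, 15.5, 30, 30.1, -2))

# Default right = TRUE: intervals are [0,15], (15,30], (30,Inf]
# (include.lowest = TRUE closes the first interval at 0).
df$Study_Effort <- cut(df$study_hours,
                       breaks = c(0, 15, 30, Inf),
                       labels = c("low", "medium", "high"),
                       include.lowest = TRUE)

# right = FALSE: intervals are [0,15), [15,30), [30,Inf]
df$Study_Effort_left <- cut(df$study_hours,
                            breaks = c(0, 15, 30, Inf),
                            labels = c("low", "medium", "high"),
                            right = FALSE,
                            include.lowest = TRUE)

print(df)

# Values outside the breaks (here -2) become NA, so count them as a check.
table(df$Study_Effort, useNA = "ifany")
```

Which side to close is a judgment call about your own categories. The main point is to decide it on purpose and to look for unexpected NA values afterwards.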