Bar chart for data in case of duplicates in a variable

Hi,
I have a dataset from a training program in which we want to know when school teachers started using a particular pedagogic method. There are three blocks: circle time, language and numeracy. The response for each block is a binary variable (1 or 0). The challenge I am facing is that the variable "teacherid" has duplicates. I want a bar chart with the "week_no_p" variable on the X-axis, but "week_no_p" has to be derived from "day_no_block": days 1 to 5 are week 1, days 6 to 10 are week 2, and so on.
The Y-axis will show the number of teachers taking the class (i.e., a response of 1 means the teacher is participating). So basically the bar chart has to show the number of teachers adopting the teaching practices. To sum up, there are 2 issues here:

  1. Create the "week_no_p" variable depending on the "day number" variable.
  2. Get a bar chart of number of teachers participating week-wise.
library(tidyverse)
library(janitor)
#> 
#> Attaching package: 'janitor'
#> The following objects are masked from 'package:stats':
#> 
#>     chisq.test, fisher.test
data<-tibble::tribble(
                 ~starttime, ~mastertraid,                ~en_name,                                         ~schoolid,               ~teacherid, ~day_no_block, ~week_no_p, ~circle_time, ~language, ~numeracy,
  "2022-09-25 20:19:14 UTC",      "Nagma",     "Sushmita Bankapur",            "SCH303-KPS SCHOOL BIDNAL-29090201101", "Premalata karibhimagol",           14L,   "week 3",           1L,        0L,        1L,
  "2022-09-25 20:24:21 UTC",      "Nagma",     "Sushmita Bankapur",            "SCH303-KPS SCHOOL BIDNAL-29090201101", "Premalata karibhimagol",           15L,   "week 3",           0L,        0L,        0L,
  "2022-09-25 20:37:20 UTC",      "Nagma",     "Sushmita Bankapur",            "SCH303-KPS SCHOOL BIDNAL-29090201101",    "Shashikala Kulkarni",           14L,   "week 3",           1L,        1L,        0L,
  "2022-09-25 20:44:15 UTC",      "Nagma",     "Sushmita Bankapur",            "SCH303-KPS SCHOOL BIDNAL-29090201101",    "Shashikala Kulkarni",           15L,   "week 3",           0L,        0L,        0L,
  "2022-09-26 20:18:21 UTC",      "Nagma",      "Shruthi Annigeri",               "SCH076-GHPS MALLIGWAD-29090606102",       "Hanamvva Talawar",            1L,   "week 1",           1L,        0L,        0L,
  "2022-09-26 20:40:30 UTC", "Madhumathi", "Pushpalatha Chinagudi", "SCH004-GLPS S.M.KRISHNA NAGAR HUBLI-29090606704", "Jayashree S Shiraguppi",            1L,   "week 1",           0L,        0L,        1L,
  "2022-09-27 13:58:07 UTC",      "Nagma",      "Shruthi Annigeri",               "SCH076-GHPS MALLIGWAD-29090606102",       "Hanamvva Talawar",            2L,   "week 1",           0L,        1L,        0L,
  "2022-09-27 14:07:37 UTC",      "Nagma",      "Shruthi Annigeri",               "SCH076-GHPS MALLIGWAD-29090606102",       "Hanamvva Talawar",            3L,   "week 1",           1L,        0L,        0L,
  "2022-09-27 14:12:42 UTC",      "Nagma",      "Shruthi Annigeri",               "SCH076-GHPS MALLIGWAD-29090606102",       "Hanamvva Talawar",            4L,   "week 1",           1L,        0L,        1L,
  "2022-09-27 14:17:00 UTC",      "Nagma",      "Shruthi Annigeri",               "SCH076-GHPS MALLIGWAD-29090606102",       "Hanamvva Talawar",            5L,   "week 1",           0L,        1L,        0L
  )
Created on 2022-10-14 by the reprex package (v2.0.1)

Hi @kuttan98,
In case you have not solved this problem, here is some code that (I think) does what you want:

suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(janitor))
suppressPackageStartupMessages(library(lubridate))

# Desired output for the first stage of OP's question

my_data <- tibble::tribble(
                 ~starttime, ~mastertraid,                ~en_name,                                         ~schoolid,               ~teacherid, ~day_no_block, ~week_no_p, ~circle_time, ~language, ~numeracy,
  "2022-09-25 20:19:14 UTC",      "Nagma",     "Sushmita Bankapur",            "SCH303-KPS SCHOOL BIDNAL-29090201101", "Premalata karibhimagol",           14L,   "week 3",           1L,        0L,        1L,
  "2022-09-25 20:24:21 UTC",      "Nagma",     "Sushmita Bankapur",            "SCH303-KPS SCHOOL BIDNAL-29090201101", "Premalata karibhimagol",           15L,   "week 3",           0L,        0L,        0L,
  "2022-09-25 20:37:20 UTC",      "Nagma",     "Sushmita Bankapur",            "SCH303-KPS SCHOOL BIDNAL-29090201101",    "Shashikala Kulkarni",           14L,   "week 3",           1L,        1L,        0L,
  "2022-09-25 20:44:15 UTC",      "Nagma",     "Sushmita Bankapur",            "SCH303-KPS SCHOOL BIDNAL-29090201101",    "Shashikala Kulkarni",           15L,   "week 3",           0L,        0L,        0L,
  "2022-09-26 20:18:21 UTC",      "Nagma",      "Shruthi Annigeri",               "SCH076-GHPS MALLIGWAD-29090606102",       "Hanamvva Talawar",            1L,   "week 1",           1L,        0L,        0L,
  "2022-09-26 20:40:30 UTC", "Madhumathi", "Pushpalatha Chinagudi", "SCH004-GLPS S.M.KRISHNA NAGAR HUBLI-29090606704", "Jayashree S Shiraguppi",            1L,   "week 1",           0L,        0L,        1L,
  "2022-09-27 13:58:07 UTC",      "Nagma",      "Shruthi Annigeri",               "SCH076-GHPS MALLIGWAD-29090606102",       "Hanamvva Talawar",            2L,   "week 1",           0L,        1L,        0L,
  "2022-09-27 14:07:37 UTC",      "Nagma",      "Shruthi Annigeri",               "SCH076-GHPS MALLIGWAD-29090606102",       "Hanamvva Talawar",            3L,   "week 1",           1L,        0L,        0L,
  "2022-09-27 14:12:42 UTC",      "Nagma",      "Shruthi Annigeri",               "SCH076-GHPS MALLIGWAD-29090606102",       "Hanamvva Talawar",            4L,   "week 1",           1L,        0L,        1L,
  "2022-09-27 14:17:00 UTC",      "Nagma",      "Shruthi Annigeri",               "SCH076-GHPS MALLIGWAD-29090606102",       "Hanamvva Talawar",            5L,   "week 1",           0L,        1L,        0L
  )

# str(my_data)

my_data %>%
  mutate(start_date = as.Date(starttime),
         week_num = cut(day_no_block,
                        breaks=c(0, 5.5, 10.5, 15.5, 20.5),
                        labels=c("1","2","3","4"))) %>%
  select(starttime, start_date, week_num, everything()) -> new_data

new_data %>%
  pivot_longer(cols = c(circle_time, language, numeracy),
               names_to = "method",
               values_to = "Using_method") %>%
  group_by(week_num, method) %>%
  summarise(using_freq = sum(Using_method)) %>%
  mutate(week_num = factor(week_num),
         method = factor(method)) -> summ_data
#> `summarise()` has grouped output by 'week_num'. You can override using the
#> `.groups` argument.

summ_data
#> # A tibble: 6 × 3
#> # Groups:   week_num [2]
#>   week_num method      using_freq
#>   <fct>    <fct>            <int>
#> 1 1        circle_time          3
#> 2 1        language             2
#> 3 1        numeracy             2
#> 4 3        circle_time          2
#> 5 3        language             1
#> 6 3        numeracy             1

# Second part: draw a grouped bar chart
ggplot(summ_data) +
  aes(x=week_num, y=using_freq, fill=method) +
  geom_bar(stat= "identity", position="dodge")

Created on 2022-10-18 with reprex v2.0.2
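
A note on the duplicates mentioned in the question: `sum()` in the summary above counts every row with a response of 1, so a teacher observed on several days in the same week is counted once per day. If the goal is the number of distinct teachers per week, one possible variation is to keep only the rows with a response of 1 and count unique `teacherid` values. The sketch below is an assumption about what the OP wants, not part of the original answer. It uses a trimmed copy of the sample data, and the names `obs`, `teacher_counts` and `n_teachers` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Trimmed copy of the sample data from the thread (only the columns used here)
obs <- tibble::tribble(
  ~teacherid,               ~day_no_block, ~circle_time, ~language, ~numeracy,
  "Premalata karibhimagol",           14L,           1L,        0L,        1L,
  "Premalata karibhimagol",           15L,           0L,        0L,        0L,
  "Shashikala Kulkarni",              14L,           1L,        1L,        0L,
  "Shashikala Kulkarni",              15L,           0L,        0L,        0L,
  "Hanamvva Talawar",                  1L,           1L,        0L,        0L,
  "Jayashree S Shiraguppi",            1L,           0L,        0L,        1L,
  "Hanamvva Talawar",                  2L,           0L,        1L,        0L,
  "Hanamvva Talawar",                  3L,           1L,        0L,        0L,
  "Hanamvva Talawar",                  4L,           1L,        0L,        1L,
  "Hanamvva Talawar",                  5L,           0L,        1L,        0L
)

teacher_counts <- obs %>%
  # Assumption: weeks are consecutive blocks of 5 days
  mutate(week_num = ceiling(day_no_block / 5)) %>%
  pivot_longer(c(circle_time, language, numeracy),
               names_to = "method", values_to = "using_method") %>%
  # Keep only observations where the method was used
  filter(using_method == 1) %>%
  group_by(week_num, method) %>%
  # Each teacher is counted once per week and method
  summarise(n_teachers = n_distinct(teacherid), .groups = "drop")

teacher_counts
```

With the sample data, week 1 circle time drops from 3 (teacher-days) to 1 (distinct teacher), because all three responses come from the same teacher. Week and method combinations with no participating teacher are dropped by the `filter()` step rather than shown as zero.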

Another way is case_when()

data %>% 
  mutate(week_no_p = case_when(
    day_no_block <= 5 ~ "week 1",
    day_no_block > 5 & day_no_block <= 10 ~ "week 2"
  ))

etc...
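
If the 5-day pattern continues indefinitely (an assumption based on the "and so on" in the question), the week label can also be computed arithmetically instead of listing every range. This is a minimal sketch on toy values; `days`, `weeks` and `week_index` are illustrative names, not from the thread.

```r
library(dplyr)

# Toy data: only the day counter matters here (values are illustrative)
days <- tibble::tibble(day_no_block = c(1L, 5L, 6L, 10L, 11L, 14L, 15L, 16L))

weeks <- days %>%
  mutate(
    # days 1-5 -> 1, days 6-10 -> 2, days 11-15 -> 3, ...
    week_index = ceiling(day_no_block / 5),
    week_no_p  = paste("week", week_index)
  )

weeks
```

This avoids adding a new `case_when()` branch for each extra week, at the cost of hard-coding the block length of 5.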

Thank you.. This works.

Thank you so much for the response. To match my actual requirement, though, it would be better to have 3 separate graphs, one each for circle time, language and numeracy. Is that possible?

Like this:

ggplot(summ_data) +
  aes(x=method, y=using_freq, fill=week_num) +
  geom_bar(stat= "identity", position="dodge")

Or this:

ggplot(summ_data) +
  aes(x=week_num, y=using_freq) +
  geom_bar(stat= "identity", position="dodge") +
  facet_grid(~ method)
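
If fully separate chart objects are needed instead of facets (for example, to save each one to its own file), one option is to split the summary by method and build one plot per piece. The sketch below rebuilds the summary values shown earlier in the thread so that it runs on its own; the list name `plots` and the axis labels are assumptions for illustration.

```r
library(ggplot2)

# Same summary values as summ_data in the thread, rebuilt so the block runs alone
summ_data <- data.frame(
  week_num   = factor(c("1", "1", "1", "3", "3", "3")),
  method     = c("circle_time", "language", "numeracy",
                 "circle_time", "language", "numeracy"),
  using_freq = c(3L, 2L, 2L, 2L, 1L, 1L)
)

# One ggplot object per method, stored in a named list
plots <- lapply(split(summ_data, summ_data$method), function(d) {
  ggplot(d, aes(x = week_num, y = using_freq)) +
    geom_col() +  # geom_col() is shorthand for geom_bar(stat = "identity")
    labs(title = as.character(d$method[1]), x = "Week", y = "Responses of 1")
})

# Print one chart; plots$language and plots$numeracy work the same way
plots$circle_time
```

Each element of `plots` can then be passed to `ggsave()` individually.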

Awesome.. Thanks a lot for this..
