Hi everyone,
I'm trying to find more streamlined ways of doing things but don't know how to:
- add up several columns ready for plotting into a bar graph;
- create a new variable using conditional logic across multiple columns;
- combine the results of 1 and 2 in a stacked bar graph/geom_bar (although, I think I might have this one and just want to check that I'm on the right track)
My sample dataframe is:
                    ID = c(1L, 2L, 3L, 4L, 5L, 6L),
             More.Info = c(1L, 1L, NA, 1L, NA, 1L),
  Better.Understanding = c(NA, 1L, 1L, 1L, NA, 1L),
            Reg.Reform = c(NA, 1L, 1L, NA, NA, 1L),
         More.capacity = c(NA, 1L, NA, NA, 1L, 1L),
                 Other = c(1L, NA, NA, NA, NA, NA),
               Group.A = c(1L, 3L, 2L, NA, 3L, 2L),
               Group.B = c(2L, 1L, NA, 2L, 2L, 2L),
               Group.C = c(1L, 1L, 3L, 1L, NA, NA),
               Group.D = c(1L, 1L, 2L, 1L, 3L, 1L),
               Group.E = c(2L, 3L, 3L, NA, NA, 3L),
               Group.F = c(3L, 3L, 3L, NA, 2L, 1L),
               Group.G = c(1L, 2L, 1L, 1L, 3L, 1L),
               Group.H = c(3L, 3L, 1L, 1L, 2L, 2L),
               Group.I = c(3L, 3L, NA, 1L, 1L, 3L),
               Group.J = c(3L, 2L, 2L, 3L, 2L, NA),
               Group.K = c(1L, 1L, 2L, 1L, 3L, 2L),
               Group.L = c(1L, 1L, 2L, 3L, NA, 3L),
               Group.M = c(1L, 3L, 3L, NA, 1L, 2L),
               Group.N = c(3L, 1L, NA, 3L, 3L, 1L),
               Group.O = c(3L, 3L, 2L, 2L, 2L, 3L)
                    )                    
The columns "More.Info", "Better.Understanding", "Reg.Reform", "More.capacity" and "Other" the types of assistance survey respondents selected. To plot this information, I would normally switch over to Excel and tally up each of these columns and create a manual dataframe e.g.
cols<-c("More.Info", "Better.Understanding", "Reg.Reform", "More.Capacity", "Other")
responses<-c(7, 6, 5, 4, 1)
df<-tibble(cols, responses)
head(df)
df %>%
  ggplot(aes(x = cols, y = responses))+
  geom_col()
Is there an easier way to do this in R? I have tried the following code (different iterations, not all one code) - none of them work:
tally<-sdat%>%
  summarise(count(Better.Understanding))%>%
  summarise(count(More.capacity))%>%
  summarise(count(More.Info))%>%
  summarise(count(Other))%>%
  summarise(count(Reg.Reform))
tally<-sdat%>%
  count(Better.Understanding)%>%
  count(More.capacity)%>%
  count(More.Info)%>%
  count(Other)%>%
  count(Reg.Reform)
tally<-sdat%>%
  summarise(sum(Better.Understanding))%>%
  summarise(sum(More.capacity))%>%
  summarise(sum(More.Info))%>%
  summarise(sum(Other))%>%
  summarise(sum(Reg.Reform))
tally<-sdat%>%
  cumsum(Better.Understanding)%>%
  cumsum(More.capacity)%>%
  cumsum(More.Info)%>%
  cumsum(Other)%>%
  cumsum(Reg.Reform)
I then want to calculate (from the remaining variables) the respondents level of engagement in professional organisations (Group A-0). To calculate a new column 'Engagement', a respondent would need to have a "1" (=Well Connected) or "2"(=Somewhat Connected) in any of the columns GroupA:GroupO.
I tried the following code, but it does not work:
df <- within(df, {
  Engagement <- NA
  Engagement["Group.A":"Group.O" < 3] <- "Engaged"
  Engagement["Group.A":"Group.O" >= 3] <- "Not Engaged"
})
Once I've calculated the Engagement variable, could I plot the types of assistance (i.e. "More.Info", "Better.Understanding", "Reg.Reform", "More.capacity" and "Other") along the x axis, grouped by "Engagement" using the following code:
df %>% 
  ggplot(aes(x = 2:6, y = tally, group = Engagement, fill = Engagement)) +
  geom_histogram(stat='identity', alpha = 0.4, width = 0.9)+
  theme(axis.text.x = element_text(angle = 60, hjust = 1))
Apologies for the multiple questions - I thought it might be more efficient than multiple posts given it is all related.
Thank you, in advance, for any help or guidance you can provide.


 .
.