Separate "check all that apply" in specific columns

Hi! I have a "check all that apply" question and am looking to sort the answers into specific columns. I saw the post that uses seperate_row but I would like to maintain the rows as they are. I have provided a working example below.

Students <- c(001, 002, 003)

Answer_1 <- c( "Understanding spatial relationships;Understanding text;Remembering yesterday's lessons ")
Answer_2 <- c("Understanding text;Remembering yesterdays lesson")
Answer_3 <- c("None of the above")

Answers <- c(Answer_1, Answer_2, Answer_3)

student_data <- tibble (Students, Answers )
 # A tibble: 3 × 2
 # Students Answers                                                              
#     <dbl> <chr>                                                                
#       1 "Understanding spatial relationships;Understanding text;Remembering …
#        2 "Understanding text;Remembering yesterdays lesson"                   
#        3 "None of the above" 

## The tibble below I need to extract the answers into appropriate column headers
## for instance:
## "Understanding spatial relationships" needs to be in a column titled "Skill_1"
## "Understanding text" in a column titled "Skill_2"
## "Remembering yesterdays lesson" in a column titled "Skill_3"
## "None of the above" in a column titled "Skill_4" 
## but when I use seperate I get:

student_data_seperate <- student_data %>% separate(Answers, c("Skill_1", "Skill_2", "Skill_3", "Skill_4"), sep = ";")

student_data_seperate

# A tibble: 3 × 5
##  Students Skill_1                             Skill_2                       Skill_3                    Skill_4
#     <dbl> <chr>                               <chr>                         <chr>                      <chr>  
#        1 Understanding spatial relationships Understanding text            "Remembering yesterday's … NA     
#        2 Understanding text                  Remembering yesterdays lesson  NA                        NA     
#        3 None of the above                   NA                             NA                        NA  

## When I print seperate splits the information in the order it comes, I'd like to be able to  
## flag that specific text belongs in a specific column OR take this result and reconfigure the
### the answers to the correct columns - which ever is best s
student_data_seperate

Hi there,

This would give you a deduplicated vector of all possible answers.

# Generate a list of possible answers
PotentialAnswers <- strsplit(Answers, ";") %>%
  unlist() %>% 
  trimws() %>%
  unique()

(I added the trimws() because there was a trailing space on one of the answers.)

From there you could you a combination of mutate() and grepl() to generate a TRUE/FALSE for each answer.

As an aside, how is this data coming to you? I'd be anxious with the person and their answers being separate and not keyed together, but that's a separate issue. :slight_smile:

Anyway, I hope that helps!

Hi!

Thank you! However I am not sure this is quite addressing the solution I am after - or at least I can't see it.

I'm not having trouble with duplicates.

I can get the answers to separate just fine but when they do (as seen in the above example) they do not land in the appropriate columns. For instance , I want to uncouple the answers in such a way that all "Understanding text" answers stay in the same row but land in "Skill_2" column. Right now they are getting separated based on sequence and that is causing an issue because not every student selected the same number of answers and separate doesn't understand that. So while they are all in line with the correct student they are not in the appropriate columns.

This data was initially generated from a google form collection , however I was not involved in the creation or export (coming in mid game ).

library(tidyverse)

(student_data <- tibble(
  Students = c(1, 2, 3),
  Answers = c(
    "Understanding spatial relationships;Understanding text;Remembering yesterday's lessons ", "Understanding text;Remembering yesterdays lesson",
    "None of the above"
  )
))

PotentialAnswers <- strsplit(Answers, ";") %>%
  unlist() %>%
  trimws() %>%
  unique()

names(PotentialAnswers) <- paste0("Skill_", seq_along(PotentialAnswers))
PotentialAnswers

(student_data_seperate_long <- student_data %>%
  separate(Answers, paste0("col_", seq_along(PotentialAnswers)), sep = ";") %>%
  pivot_longer(
    cols = -Students
  ) %>% mutate(vfac = factor(trimws(value),
    levels = PotentialAnswers
  )))

(student_data_wide <- student_data_seperate_long %>%
  select(Students, vfac) %>% filter(!is.na(vfac)) %>%
  pivot_wider(names_from = "vfac", values_from = "vfac") %>%
  rename_with(
    .cols = all_of(PotentialAnswers),
    .fn = function(x) names(which(x == PotentialAnswers))
  ))
1 Like

OMG. ^ brilliant - thank you. Just took me a bit to see it. Thank you so much.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.