I would appreciate some help with the following problem. I have a large data table in which vegetables are coded as character strings.
I want to perform computations on it, with one constraint.
Here is the data structure:
My problem is that the table contains several categories that I would like to merge. For example, there are two types of "Salade" ("Salade - laitue" and "Salade plein air ou abris bas"), and I would like to combine them by adding up the values in each column, so that I end up with a single "Salade" category.
My dataset is very large, so I would like to find an easy way to do this cleanly.
Roughly speaking, this would amount to "Salade" = "Salade - laitue" + "Salade plein air ou abris bas".
Here's what I've already started:
# The computation I want, and it works (filtered on the vegetables I want)
library(dplyr)

Merge_table %>%
  group_by(CODE) %>%
  filter(CODE %in% c(
    "Concombre plein air ou abri bas",
    "Concombre sous serre ou abri haut",
    "Courgette plein air ou abri bas",
    "Courgette sous serre ou abri haut",
    "Salade - laitue",
    "Melon plein air ou abri bas",
    "Melon sous serre ou abri haut",
    "Salade plein air ou abris bas"
  )) %>%
  # Count the rows in each CODE group
  summarise(nb = n()) %>%
  as.data.frame()
I saw that something like this is possible with the {stringr} package, but I did not manage to write the condition I want.
library(tidyverse)

df_0 <- data.frame(
  stringsAsFactors = FALSE,
  CODE = c(
    "Courgette sous serre ou abri haut", "Concombre sous serre ou abri haut",
    "Courgette sous serre ou abri haut", "Salade plein air ou abris bas",
    "Courgette plein air ou abri bas",
    "Concombre sous serre ou abri haut", "Concombre plein air ou abri bas",
    "Concombre sous serre ou abri haut",
    "Salade plein air ou abris bas", "Melon plein air ou abri bas",
    "Courgette plein air ou abri bas", "Melon sous serre ou abri haut",
    "Melon plein air ou abri bas", "Concombre sous serre ou abri haut",
    "Salade plein air ou abris bas",
    "Courgette sous serre ou abri haut", "Courgette sous serre ou abri haut",
    "Melon plein air ou abri bas",
    "Concombre sous serre ou abri haut", "Salade - laitue",
    "Courgette sous serre ou abri haut", "Salade - laitue",
    "Courgette sous serre ou abri haut", "Concombre sous serre ou abri haut",
    "Salade plein air ou abris bas", "Concombre sous serre ou abri haut",
    "Courgette plein air ou abri bas",
    "Salade plein air ou abris bas", "Melon sous serre ou abri haut",
    "Concombre plein air ou abri bas"
  )
)

# Group every CODE containing "Salade" under one label; keep other codes as they are
group_by(
  df_0,
  grp = case_when(
    stringr::str_detect(CODE, pattern = "Salade") ~ "Salade",
    TRUE ~ CODE
  )
) |>
  summarise(n = n())
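
The call above counts rows per group. Since the question also asks to add up the values in each column, here is a minimal sketch of how the same grouping could feed a column-wise sum. The table `df_vals` and its numeric columns `surface` and `volume` are made up for illustration, because the real column names are not shown in the thread.

```r
library(dplyr)
library(stringr)

# Hypothetical table: `surface` and `volume` are invented numeric columns
df_vals <- data.frame(
  CODE = c(
    "Salade - laitue", "Salade plein air ou abris bas",
    "Melon plein air ou abri bas"
  ),
  surface = c(10, 5, 3),
  volume = c(100, 50, 30),
  stringsAsFactors = FALSE
)

totals <- df_vals |>
  # Same idea as above: every CODE containing "Salade" gets one label
  mutate(grp = if_else(str_detect(CODE, "Salade"), "Salade", CODE)) |>
  group_by(grp) |>
  # Sum every numeric column within each group
  summarise(across(where(is.numeric), sum), .groups = "drop")

totals
```

With `across(where(is.numeric), sum)` the numeric columns do not have to be listed by name, which may help on a wide table. If the real data contain missing values, `\(x) sum(x, na.rm = TRUE)` could be passed instead of `sum`.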
And what if I wanted to do this for several products at once, for example rows containing "courgette" and rows containing "salade"? That would be just perfect!
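
One possible way to handle several products at once, sketched under the assumption that each product name appears in `CODE` with the same capitalisation as in the sample data ("Salade", "Courgette"): build a single regular expression from a vector of product names and use `stringr::str_extract()` to pull out whichever one matches. The names `df_demo`, `products`, and `res` are illustrative only.

```r
library(dplyr)
library(stringr)

# Small illustrative sample; the CODE values are taken from the thread
df_demo <- data.frame(
  CODE = c(
    "Salade - laitue", "Salade plein air ou abris bas",
    "Courgette plein air ou abri bas", "Courgette sous serre ou abri haut",
    "Courgette sous serre ou abri haut", "Melon plein air ou abri bas"
  ),
  stringsAsFactors = FALSE
)

# Products to collapse; extend this vector as needed
products <- c("Salade", "Courgette")

# One pattern with alternation: "Salade|Courgette"
pattern <- str_c(products, collapse = "|")

res <- df_demo |>
  # str_extract() returns the matched product name, or NA when nothing matches;
  # coalesce() then falls back to the original CODE
  mutate(grp = coalesce(str_extract(CODE, pattern), CODE)) |>
  count(grp, name = "n")

res
```

Adding a product then only means adding a string to `products`. An alternative would be to keep `case_when()` and add one `str_detect()` branch per product, which is more verbose but lets each product have its own label.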