Counting elements in a string from a data frame

Hello R community,

I have the following data frame:

library(tibble)

x <- tibble(
  name = c("A", "B", "C"),
  type = c("34;30;100;1;1;9;9;100;200;100;1;1", "34;30;34", "30;45;1;45;30;30")
)

How can I count the individual elements (separated by ';') in the column 'type' so that I get the total for each 'type' value across the whole data frame?
For example, I'm looking to get a result like this:

type                                      totals
<chr>                                      <int>
'34'                                          3
'30'                                          5
'45'                                          2
'1'                                           5
'100'                                         3
'9'                                           2
'200'                                         1

I was wondering whether there is a function (or functions) in R that performs this operation.

Many thanks in advance

# Given vector
type <- c("34;30;100;1;1;9;9;100;200;100;1;1", "34;30;34", "30;45;1;45;30;30")

# Split elements by semicolon and combine into a single vector
split_elements <- unlist(strsplit(type, ";"))

# Count the frequency of each element
element_count <- table(split_elements)

# Convert the table to a data frame
element_count_df <- as.data.frame(element_count)

# Rename the columns
colnames(element_count_df) <- c("Element", "Count")

# Display the data frame
element_count_df
#>   Element Count
#> 1       1     5
#> 2     100     3
#> 3     200     1
#> 4      30     5
#> 5      34     3
#> 6      45     2
#> 7       9     2

Created on 2023-07-03 with reprex v2.0.2
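
One thing to watch in the output above: the elements are strings, so the rows come out in text order (1, 100, 200, 30, ...). A minimal sketch of reordering, assuming you want the rows by numeric value or by frequency; `by_value` and `by_count` are just illustrative names, not part of the original answer:

```r
type <- c("34;30;100;1;1;9;9;100;200;100;1;1", "34;30;34", "30;45;1;45;30;30")

# Keep Element as character rather than factor so it can be converted cleanly
element_count_df <- as.data.frame(
  table(unlist(strsplit(type, ";"))),
  stringsAsFactors = FALSE
)
colnames(element_count_df) <- c("Element", "Count")

# Order rows by the numeric value of each element
by_value <- element_count_df[order(as.numeric(element_count_df$Element)), ]

# Or order rows by frequency, most frequent first
by_count <- element_count_df[order(-element_count_df$Count), ]
```

This only changes the row order; the counts themselves are the same as in the reprex.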

Would this work for you?

library(dplyr)
library(stringr)
library(tidyr)

x |>
  mutate(type_split = str_split(type, ";")) |>
  unnest(type_split) |>
  count(type_split)
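
If you want the result to carry the `type` / `totals` column names from the question, a variant of the tidy approach might look like the sketch below. It assumes tidyr 1.3.0 or later for `separate_longer_delim()` (on older versions `separate_rows(type, sep = ";")` should do the same job); `totals` as an object name is only for illustration:

```r
library(dplyr)
library(tidyr)
library(tibble)

x <- tibble(
  name = c("A", "B", "C"),
  type = c("34;30;100;1;1;9;9;100;200;100;1;1", "34;30;34", "30;45;1;45;30;30")
)

totals <- x |>
  # One row per element of each semicolon-separated string
  separate_longer_delim(type, delim = ";") |>
  # Count occurrences; name the count column and sort most frequent first
  count(type, name = "totals", sort = TRUE)

totals
```

The `type` column stays character here; wrap it in `as.integer()` inside a `mutate()` if you need numbers.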

If the goal is simply to get the information, and the presentation or form matters less, then I think

table(unlist(strsplit(x$type,";")))

is nice and succinct code:

  1 100 200  30  34  45   9 
  5   3   1   5   3   2   2

...and you can add as.data.frame.table(), like so:

as.data.frame.table(table(unlist(strsplit(x$type,";"))))

and then get:

  Var1 Freq
1    1    5
2  100    3
3  200    1
4   30    5
5   34    3
6   45    2
7    9    2

Could also be written as:

# Note: using `_` as the head of an extraction chain requires R >= 4.3.0
x |>
  _[["type"]] |>
  strsplit(";") |>
  unlist() |>
  table() |>
  as.data.frame.table()

But then it's not too far from the tidy solution I proposed. I suppose it's a matter of taste.

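
For the base R route, the default `Var1` / `Freq` column names can be replaced without a separate renaming step. A minimal sketch, assuming the `type` / `totals` names from the question are what you want; `type_strings`, `tbl` and `totals` are illustrative names:

```r
type_strings <- c("34;30;100;1;1;9;9;100;200;100;1;1", "34;30;34", "30;45;1;45;30;30")

# Naming the argument to table() sets the name of the resulting dimension
tbl <- table(type = unlist(strsplit(type_strings, ";")))

# responseName sets the count column; stringsAsFactors = FALSE keeps type as character
totals <- as.data.frame(tbl, responseName = "totals", stringsAsFactors = FALSE)

totals
```

With the data frame from the question you would pass `x$type` in place of `type_strings`.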
