RStudio: looking for a way to create a chart based on keywords

Hi, I am still new to RStudio, and I am looking for a way to create a chart from keywords in a single column.
I know it works in Pandas, but I cannot find a way to do it in RStudio.

For example, I have a data frame read from a CSV file, "headwear":

Name        Color
Winter Hat  Blue
Bunny Ears  Grey
Summer Hat  Pink
Woolen Cap  Purple

I want to make a bar chart showing how many "Hats", "Caps", and "Ears" there are, ignoring anything else in the cell.
I tried different approaches with count(), but each attempt produced a bigger error than the last.
Could someone help me in the right direction?

Thanks in advance.

The following code generates a rather bland-looking bar chart from some example data.

library(dplyr)    # to manipulate the data frame
library(stringr)  # for string operations
library(ggplot2)  # for the bar chart

# Incoming data set (from CSV or wherever).
headgear <- data.frame(Name = c("Winter Hat", "Bunny Ears", "Summer hat", "Woolen Cap"),
                       Color = c("Blue", "Grey", "Pink", "Purple"))

# Categories for the bar chart.
types <- c("Hat", "Ears", "Cap", "Fedora")

# Add a "Type" column to the data frame by comparing "Name" entries to known types.
# (Comparison is case-insensitive here; paste0() keeps a stray space out of the pattern.)
headgear <- headgear |> rowwise() |> mutate(Type = types[str_detect(Name, paste0("(?i)", types))]) |> ungroup()

# Plot the bar chart.
ggplot(data = headgear, mapping = aes(x = Type)) + geom_bar()

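It is fragile in that it assumes each entry in the "Name" column contains exactly one match to one of the predefined types (hat, cap, ears, ...). If an entry matches none of the types, or more than one, the subsetting no longer yields a single value per row and mutate() should stop with an error. The comparison is case-insensitive, so if someone enters "Bunny ears", the entry will be recognized as matching "Ears".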
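One way to soften that fragility is to pick the first matching type and fall back to a catch-all label when nothing matches. The sketch below is only an illustration: `first_type()` and the "Other" label are hypothetical names, and the two extra rows ("Scarf", "Hat with Ears") are made up to exercise the edge cases.

```r
library(stringr)
library(ggplot2)

# Hypothetical helper: returns the first known type found in `name`,
# or `fallback` when none of the types match.
first_type <- function(name, types, fallback = "Other") {
  hits <- types[str_detect(name, paste0("(?i)", types))]
  if (length(hits) == 0) fallback else hits[1]
}

# Example data with two made-up edge cases: no match, and two matches.
headgear <- data.frame(
  Name = c("Winter Hat", "Bunny Ears", "Summer hat", "Woolen Cap",
           "Scarf", "Hat with Ears"),
  stringsAsFactors = FALSE
)
types <- c("Hat", "Ears", "Cap", "Fedora")

# Classify every name; vapply() guarantees exactly one string per row.
headgear$Type <- vapply(headgear$Name, first_type, character(1),
                        types = types, USE.NAMES = FALSE)

ggplot(headgear, aes(x = Type)) + geom_bar()
```

Taking the first hit means the order of `types` decides ties, so "Hat with Ears" is counted as a "Hat" here. Whether that is the right rule depends on your data.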

Thank you so much for helping me out with this.

Assumptions:

  1. The keyword of interest (Cap, Hat, etc.) is the last word in Name; otherwise you will have to use some form of regex in str_extract().
  2. The words in Name are separated by spaces; otherwise change the arguments to str_split_i().

I used the data frame from the earlier reply but added a few more rows.

library(dplyr)    # mutate(), count(), %>%
library(stringr)  # str_split_i() requires stringr >= 1.5.0
library(ggplot2)  # for the bar chart

headgear <- data.frame(Name = c("Winter Hat", "Bunny Ears", "Summer hat", "Woolen Cap", "Bunny2 Ears", "Bunny3 Ears", "Winter2 Hat"),
                       Color = c("Blue", "Grey", "Pink", "Purple", "Purple", "Purple", "Green"))
headgear %>%
  mutate(new_name = str_to_title(str_split_i(Name, " ", -1))) %>%  # picks the last word, formats to uniform case
  count(new_name, sort = TRUE, name = "total_counts") %>%  # counts the rows for each new_name value
  ggplot() +
  aes(new_name, total_counts) +
  geom_col()
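If the keyword is not always the last word (assumption 1), a str_extract() variant might look roughly like the sketch below. The keyword list in `keyword_pattern` and the sample names are assumptions made up for illustration, not taken from the original data.

```r
library(dplyr)
library(stringr)
library(ggplot2)

# Made-up names where the keyword is not always the last word.
headgear <- data.frame(
  Name = c("Winter Hat", "Hat for Summer", "Bunny Ears (large)", "Woolen cap"),
  stringsAsFactors = FALSE
)

# Assumed keyword list. \\b limits matches to whole words,
# so "hat" would not match inside a word like "Chatty".
keyword_pattern <- regex("\\b(hat|cap|ears)\\b", ignore_case = TRUE)

keyword_counts <- headgear %>%
  mutate(new_name = str_to_title(str_extract(Name, keyword_pattern))) %>%  # first keyword found, uniform case
  count(new_name, sort = TRUE, name = "total_counts")

ggplot(keyword_counts) +
  aes(new_name, total_counts) +
  geom_col()
```

Names without any keyword end up as NA in `new_name`; you could drop them with filter(!is.na(new_name)) before counting. Plurals such as "Hats" would need the pattern extended, for example to "hats?".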

Hi Vinaychuri,

Thank you so much for this code. I will look into this. :slight_smile:

You can categorize the keywords in your "Name" column using pattern matching (e.g., detecting "Hat," "Cap," or "Ears"), then count occurrences and visualize them with a bar chart using ggplot2. Use functions like grepl() to classify items and dplyr to summarize counts before plotting. This approach will help you analyze keyword frequency efficiently in RStudio.
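As a rough sketch of that grepl() route, assuming the same example data as above and a hypothetical "Other" bucket for names that match nothing:

```r
library(dplyr)
library(ggplot2)

headgear <- data.frame(
  Name  = c("Winter Hat", "Bunny Ears", "Summer hat", "Woolen Cap"),
  Color = c("Blue", "Grey", "Pink", "Purple"),
  stringsAsFactors = FALSE
)

# Classify each name with grepl(); case_when() uses the first condition that is TRUE.
type_counts <- headgear %>%
  mutate(Type = case_when(
    grepl("hat",  Name, ignore.case = TRUE) ~ "Hat",
    grepl("cap",  Name, ignore.case = TRUE) ~ "Cap",
    grepl("ears", Name, ignore.case = TRUE) ~ "Ears",
    TRUE ~ "Other"  # assumed catch-all label
  )) %>%
  count(Type, name = "n_items")  # one row per type with its count

ggplot(type_counts, aes(x = Type, y = n_items)) + geom_col()
```

Because grepl() matches substrings, a name like "Chatty Parrot" would also be classed as a "Hat"; a pattern such as "\\bhat\\b" is one way to tighten that.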