Hello. I have received amazing help from this community before. I really appreciate it!
I have two datasets. The first, "Advertised", is a list of toothpaste brands that have been advertised. The second, "Sold", is a list of toothpaste products that have been sold. My goal is to match each item in "Advertised" to its best match in "Sold". Note that in the real data, "Sold" has more records than "Advertised".
Here is my example:
```r
Advertised <- data.frame(BrandVariant = c("Crest Cavity Protection",
                                          "Colgate Cavity Protection",
                                          "Pepsodent Clean Mint USA"))

Sold <- data.frame(ID = c(1, 2, 3),
                   Ultimate.Company = c("Colgate-Palmolive", "Procter & Gamble-Crest", "Unilever"),
                   Product = c("Colgate 360", "Crest Cavity", "Pepsodent Mint"),
                   Product.Description = c("Colgate's first 360 degree whitening toothpaste",
                                           "Fast acting and whiteness with Crest's cavity buster and protector",
                                           "A clean mint taste for healthy gums"))
```
There are multiple columns in "Sold" that I would like to match against. Ideally, each record in "Advertised" would be lined up with the ID value of its best match in "Sold", along with a similarity score.
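To make the desired output concrete, here is a rough sketch of one possible approach in base R. It is not meant as the right way to do this. It assumes that pasting the "Sold" text columns into a single string and scoring with `adist()` (generalized Levenshtein distance, with `partial = TRUE` so the advertised name can match a substring) is a reasonable starting point. The function name `best_match` and the `Similarity` column are hypothetical names I made up for illustration.

```r
# Hypothetical helper: for each advertised name, find the closest "Sold" row.
# Assumes the column names from the example data and non-empty BrandVariant values.
best_match <- function(advertised, sold) {
  # Combine the text columns of "sold" into one lowercase string per row.
  sold_text <- tolower(paste(sold$Ultimate.Company,
                             sold$Product,
                             sold$Product.Description))
  adv_text <- tolower(advertised$BrandVariant)

  # Rows = advertised names, columns = sold records.
  # partial = TRUE measures distance to the best-matching substring.
  d <- adist(adv_text, sold_text, partial = TRUE)

  # Scale each row by the length of the advertised name so that
  # higher means more similar; clamp at 0 as a safeguard.
  score <- pmax(0, 1 - d / nchar(adv_text))

  # Column index of the highest score in each row.
  best <- apply(score, 1, which.max)

  data.frame(BrandVariant = advertised$BrandVariant,
             ID = sold$ID[best],
             Similarity = score[cbind(seq_along(best), best)])
}
```

Calling `best_match(Advertised, Sold)` on the example data should return one row per advertised brand with the matched ID and a score between 0 and 1. I have not checked how well this scales to the real data, and a token-based or weighted per-column score might well be better.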