I am trying to use grepl to match a pattern in a column of a data frame. The column contains Irish surnames, and I want to return the first letter of each surname. However, some surnames start with Mc, Mac or O'. In those cases I want to return the prefix plus the next letter after it.
I have some code that successfully does this for the names that have the Mc and Mac prefixes. But I can't get it to work for cases where the name begins with O'.
I have the following code:
ifelse(grepl("^Mc", DF$surname), substr(DF$surname, 1, 3),
  ifelse(grepl("^Mac", DF$surname), substr(DF$surname, 1, 4),
    ifelse(grepl("^O'", DF$surname), substr(DF$surname, 1, 3),
      substr(DF$surname, 1, 1))))
This code works if I run it on a vector I created myself, such as surnames <- c("O'Connell", "O'Callaghan"),
but it doesn't work on the data frame column.
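For reference, here is a minimal sketch of the vector case that does behave as expected. The surnames are made up for illustration, and the helper name surname_initial is hypothetical, just a wrapper around the same nested ifelse so it can be reused on any character vector:

```r
# Hypothetical helper wrapping the nested ifelse from above.
# x is assumed to be a character vector of surnames.
surname_initial <- function(x) {
  ifelse(grepl("^Mc", x), substr(x, 1, 3),
    ifelse(grepl("^Mac", x), substr(x, 1, 4),
      ifelse(grepl("^O'", x), substr(x, 1, 3),
        substr(x, 1, 1))))
}

# Hand-typed test vector (illustrative names, typed with a plain ASCII apostrophe)
surnames <- c("O'Connell", "O'Callaghan", "McCarthy", "MacDonald", "Murphy")

result <- surname_initial(surnames)
print(result)
```

On this hand-typed vector the O' names come back with the prefix and the following letter, which is the output I am after for the data frame column as well.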
What is the difference between a vector and a data frame column?
Any help would be appreciated.
Thanks!