I have a data frame with an empty column (e.g., X) and a full character column (e.g., Y).
I would like to number the rows in the empty column according to runs of alphabetically ordered values in the character column.
For example, the first set of rows in column Y starts with a string beginning with the letter A and ends with a string beginning with P. The next row begins with C, which breaks the alphabetical order, and that second set ends with a string beginning with M. The row after that begins with B, and the third set ends with a string beginning with X. The data frame has thousands of rows.
How can I fill column X with 1 for the first set of rows (A-P), 2 for the second set (C-M), 3 for the third set (B-X), and so on, starting a new number every time a row breaks the alphabetical order?
# Use `=` (not `<-`) inside data.frame() so the columns are named column_x and column_y
example <- data.frame(
  column_x = rep(NA, 20),  # a real missing value, not the string "NA"
  column_y = c("A ward", "A word", "B word", "C word", "D word", "P word", "C word", "K word", "L word", "M word", "B word", "C word", "D ward", "D word", "K word", "P word", "X word", "A word", "K word", "X word"))
print(example)
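To make the desired output concrete, here is a rough sketch of the logic in base R. It assumes a new set starts whenever a row's first letter comes earlier in the alphabet than the previous row's first letter, and that every string begins with an ASCII letter. The names `first_letter`, `letter_pos`, and `new_set` are only for illustration.

```r
# Rebuild the example data so this block runs on its own
example <- data.frame(
  column_x = rep(NA, 20),
  column_y = c("A ward", "A word", "B word", "C word", "D word", "P word",
               "C word", "K word", "L word", "M word", "B word", "C word",
               "D ward", "D word", "K word", "P word", "X word", "A word",
               "K word", "X word"),
  stringsAsFactors = FALSE
)

# Position of each row's first letter in the alphabet (A = 1, ..., Z = 26)
first_letter <- toupper(substr(example$column_y, 1, 1))
letter_pos <- match(first_letter, LETTERS)

# A row opens a new set when its letter position drops below the previous row's;
# the first row always opens set 1
new_set <- c(TRUE, diff(letter_pos) < 0)

# The running count of set starts gives the set number for every row
example$column_x <- cumsum(new_set)

print(example$column_x)
# [1] 1 1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 4 4 4
```

Only the first letter is compared here, so ties such as "D ward" followed by "D word" stay in the same set. Comparing whole strings instead would depend on the locale's sort order.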
Thanks!