I have values in several columns of a df and want to create a new column that holds, as a categorical variable, the name of whichever column has the greatest value in each row. I'll try to demonstrate.
I'm trying to create the column "Cat". I was able to do something similar, but it was based on which column was above 50. I used mutate() and case_when() to achieve that.
The values in my actual "A-D" columns are percentages that add to roughly 100 (with rounding error). I also have 7 of those columns rather than 4, so there is not always one percentage with the majority, and the >50 threshold creates a lot of NAs in the "Cat" column.
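To make the problem concrete, here is a rough sketch of the >50 approach described above. The data frame and its values are made up for illustration, and the real code may have looked different. Any row where no column exceeds 50 falls through every case_when() condition and ends up NA.

```r
library(dplyr)

# Hypothetical data: each row's percentages sum to roughly 100
df <- data.frame(
  A = c(60, 30, 10),
  B = c(20, 40, 45),
  C = c(10, 20, 25),
  D = c(10, 10, 20)
)

# Label each row with the column that exceeds 50, if any
df_cat <- df %>%
  mutate(Cat = case_when(
    A > 50 ~ "A",
    B > 50 ~ "B",
    C > 50 ~ "C",
    D > 50 ~ "D"
  ))

# Rows 2 and 3 have no column above 50, so Cat is NA there
print(df_cat)
```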
@startz
Thanks for the suggestion! It almost works. I should've said this originally, but these are a range of columns inside a larger df.
Trying out ways to put a range of columns in max.col() now
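For reference, one possible way to hand max.col() only a range of columns is to subset the data frame first and then index the subset's names with the result. This is a minimal sketch with invented column names (id, note) standing in for the rest of the larger df. Note that max.col() breaks ties at random by default, so ties.method = "first" is set here to make the result reproducible.

```r
# Hypothetical larger df: A-D are the percentage columns, id and note are extras
df <- data.frame(
  id   = c("x", "y", "z"),
  A    = c(60, 30, 10),
  B    = c(20, 40, 45),
  C    = c(10, 20, 25),
  D    = c(10, 10, 20),
  note = c("p", "q", "r")
)

# Restrict to the columns of interest before calling max.col()
cols <- c("A", "B", "C", "D")
sub  <- df[, cols]

# max.col() returns the position of the row-wise maximum within `sub`,
# so index names(sub), not names(df)
df$Cat <- names(sub)[max.col(sub, ties.method = "first")]

print(df)
```

Indexing names(sub) rather than names(df) matters because the positions returned by max.col() are relative to the subset, not to the full data frame.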
Here's one way to identify the maximum column from a subset of current columns. I feel like there must be a less verbose approach, but this is what I've got for now.
In the original version, the df inside mutate is referring to the version of df in the global environment, rather than the current version of the data at that point in the chain. Switching to . ensures that you're operating on the current version of the data within the chain, rather than the (possibly different) version of the data in the global environment.
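A small sketch of why this matters, using made-up data and an invented filter() step: once an earlier step in the chain changes the number of rows, the global df and the data at that point in the chain no longer match. Using . inside mutate() keeps the lookup aligned with the current rows.

```r
library(dplyr)

# Hypothetical data with an extra id column alongside A-D
df <- data.frame(
  id = 1:3,
  A  = c(60, 30, 10),
  B  = c(20, 40, 45),
  C  = c(10, 20, 25),
  D  = c(10, 10, 20)
)

result <- df %>%
  filter(id != 2) %>%  # the chain now has 2 rows; the global df still has 3
  mutate(
    # `.` is the filtered data at this point in the chain, so the row-wise
    # maximum is computed over the same 2 rows that mutate() is working on
    Cat = names(select(., A:D))[max.col(select(., A:D), ties.method = "first")]
  )

print(result)
```

Replacing . with df in the mutate() call above would compute three values from the unfiltered data for a two-row result, which mutate() rejects as a size mismatch.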