I would like some advice on fast ways to rename columns that work within dplyr and are pipeable.
I am going to use iris as an example here but my actual data set is much larger with 120 columns which is why some of the known solutions for renaming columns don't work for me.
If I have a vector v = c('s_length', 's_width', 'p_length', 'p_width', 'species'), I want to replace all the column names of iris with the names in v.
I know the following is very easy and accomplishes what I want:
v = c('s_length', 's_width', 'p_length', 'p_width', 'species')
colnames(iris) <- v
but it's not a pipeable statement. Or at least not that I am aware of.
Passing a named vector such as replace_names to rename() is pipeable, which is good, but my actual data frame has 120 columns. I don't want to manually write out something like replace_names for 120 columns. I know I could easily get a vector of the current names and a vector of the names I want (i.e. v above), but I don't know how to combine them programmatically into the same form as the replace_names vector, which could then be fed into rename.
I would be open to any of the following solutions:
Being able to feed in a vector of new names without defining their relationship to the old column names in dplyr
Forcing the construction colnames(data frame)<-vector_new_col_names to be pipeable
Being able to easily construct a vector of the form c(new_col_1_name = old_col_1_name, ...) from a vector of old names and vector of new names to then be passed to rename
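For reference, here is a minimal sketch of what each of the three options above could look like on iris. It assumes dplyr 1.0 or later and that v is in the same order as the existing columns; the names renamed_1, renamed_2 and renamed_3 are only for illustration.

```r
library(dplyr)

v <- c("s_length", "s_width", "p_length", "p_width", "species")

# Option 1: replace every name positionally, with no old-to-new mapping.
# The function passed to rename_with() ignores the old names and returns v.
renamed_1 <- iris %>% rename_with(~ v)

# Option 2: stats::setNames() is a pipeable equivalent of colnames(df) <- v
renamed_2 <- iris %>% setNames(v)

# Option 3: build a named vector of the form c(new_name = "old_name")
# and pass it to rename() through all_of()
replace_names <- setNames(names(iris), v)
renamed_3 <- iris %>% rename(all_of(replace_names))
```

Options 1 and 2 rely purely on position, so they only make sense when v has exactly one entry per column. Option 3 keeps the old-to-new mapping explicit, which may be easier to check with 120 columns.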
That seems a lot harder, and also unreadable of course.
Actually I am tempted to ask about the context: if you need to rename all these columns based on an external vector, maybe you're asking the wrong questions (of course I have no idea in your particular case). Maybe these 120 columns should have been rows, and the right approach would be to pivot_longer() before renaming anything. Maybe you're reading it from a csv file with bad names, and the column renaming step should be part of a readr call. Maybe you should really modify the existing names with rename_with() rather than replace them all at once.
Or maybe I'm just wasting time thinking too much about a trivial problem
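To illustrate the rename_with() suggestion, here is a small sketch that modifies the existing names inside a pipe instead of replacing them from an external vector. The lower-case-and-underscore rule is only an assumed example of a transformation, and tidy_iris is a made-up name.

```r
library(dplyr)

# Lower-case every column name and replace dots with underscores
tidy_iris <- iris %>%
  rename_with(~ tolower(gsub(".", "_", .x, fixed = TRUE)))

names(tidy_iris)
```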
library(tidyverse)

(orig_names <- names(iris))
(new_names <- str_replace_all(orig_names,
                              fixed("."),
                              "_"))

# if the new_names are in the same order as the original names they replace,
# then you can relate them by setting them as names of the orig_names vec
names(orig_names) <- new_names

# see this
orig_names

# use this
rename(head(iris),
       all_of(orig_names))
Also, the context was that I was pulling a slice of data out of a NetCDF file. It was very easy to get an array of the data I wanted as the values of my data frame, and the vectors of dimension values that would end up being the column and row names. I was just having trouble combining them. And yes, I will eventually pivot longer, but I want to already have the column names associated with the values before I do so.
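In case it helps anyone with the same workflow, here is a rough sketch of that combination step. The matrix and the two dimension vectors are made-up stand-ins for what a NetCDF read would return, and time, lon and value are hypothetical names.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-ins for the array slice and its dimension vectors
vals     <- matrix(1:6, nrow = 2)             # values
row_vals <- c("t1", "t2")                     # one label per row
col_vals <- c("lon_10", "lon_20", "lon_30")   # one label per column

long <- vals %>%
  as.data.frame() %>%
  setNames(col_vals) %>%                      # attach the column names in the pipe
  mutate(time = row_vals, .before = 1) %>%    # keep the row labels as a column
  pivot_longer(-time, names_to = "lon", values_to = "value")
```

The names are attached before pivot_longer() so that each value keeps its column label once the data is in long form.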