I have a series of data frames: USA, Canada, Mexico, and so on. How can I structure a loop in R so that, no matter how many data frames there are, the same data cleaning steps are applied to each one? For example, the step below should be applied to USA, Canada, and Mexico in a loop.
USA <- df %>%
  gather(key = "Year", value = "Volume", Jan:Dec)
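A minimal sketch of one way to do this, assuming the data frames are first collected in a named list and that `purrr::map` is used to apply the step to each element. The toy data frames and the shortened `Jan:Mar` range are hypothetical stand-ins for the real data.

```r
library(tidyr)
library(purrr)

# Hypothetical toy data: three months instead of Jan:Dec to keep it short
USA    <- data.frame(Product = c("A", "B"), Jan = c(1, 2), Feb = c(3, 4), Mar = c(5, 6))
Canada <- data.frame(Product = c("A", "B"), Jan = c(7, 8), Feb = c(9, 10), Mar = c(11, 12))

# Collect the data frames in a named list
dfList <- list(USA = USA, Canada = Canada)

# Apply the same step to every element; the result is a list of the same length
long_list <- map(dfList, ~ gather(.x, key = "Year", value = "Volume", Jan:Mar))

long_list$USA
```

Adding another country then only means adding another element to `dfList`; the `map` call stays the same.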
Thanks @nirgrahamuk!
I didn't have a list of them earlier, but I have now created one as follows. I am very new to looping in R, so any help will be appreciated. Thank you!
Thanks @nirgrahamuk!
When I use gather alone, as you suggested, it works. But when I add other cleaning steps, the result is unchanged. Here is what I was trying to do:
The result, Northen_Market, is the same as the original USA, Canada, and Mexico; none of the changes in the function are applied. I am sure I am missing something important in the function, or perhaps pipes cannot be used inside a function. Can you please help? How should I structure it correctly?
Thank you so much @nirgrahamuk! I opted for the first option, as that's what I am used to in terms of assignments. One last question on this subject: after I am done cleaning these lists, how can I convert them back to data frames named USA, Canada, and so on? Would it be unlist(dfList)? And in that case, will it preserve the original data frame names (USA, etc.)?
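The code in question is not shown here, so the following is only a sketch of a common pattern, not a diagnosis. Pipes do work inside functions; what usually matters is that the function returns the end of the pipeline and that the output of `map` is assigned. The function name `clean_one`, the toy data, and the extra cleaning steps are hypothetical.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical cleaning function: the last evaluated expression is what gets returned
clean_one <- function(df) {
  df %>%
    gather(key = "Year", value = "Volume", Jan:Mar) %>%
    filter(!is.na(Volume)) %>%            # drop rows with a missing volume
    mutate(Volume = as.numeric(Volume))   # make sure Volume is numeric
}

# Hypothetical toy data with one missing value
USA    <- data.frame(Product = c("A", "B"), Jan = c(1, NA), Feb = c(3, 4), Mar = c(5, 6))
Canada <- data.frame(Product = c("A", "B"), Jan = c(7, 8), Feb = c(9, 10), Mar = c(11, 12))
dfList <- list(USA = USA, Canada = Canada)

# map() does not modify dfList in place, so the result has to be assigned
Northen_Market <- map(dfList, clean_one)
```

If the pipeline inside the function is assigned to a variable as its last statement, or the result of `map` is never stored, the cleaned data can appear to be missing even though the steps ran.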
They are data frames already; they are just data frames that are stored in a list.
Northen_Market[[1]] is the transformed USA.
Northen_Market[[2]] is the transformed Canada.
If you want to pick them out by name, you should name them as you insert them into your original dfList.
For example:
library(purrr)

# Name each data frame as it is added: the name is on the left of the equals sign, the data frame on the right
dfList <- list(iris = iris,
               mtcars = mtcars)
# map() keeps the names of its input, so new_list also has elements named iris and mtcars
new_list <- map(dfList,
                ~ head(.))
# Access them one at a time by name
new_list$iris
new_list$mtcars
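On the question of getting separate variables back out of the list, a small sketch using base R's `list2env`, which creates one variable per named list element. The `target` environment is a hypothetical stand-in used to keep the example tidy; passing `.GlobalEnv` instead would create top-level variables.

```r
dfList <- list(iris = head(iris), mtcars = head(mtcars))

# Each element is already a data frame
is.data.frame(dfList$iris)

# unlist() flattens everything into a single vector, so the data frames are lost
is.data.frame(unlist(dfList))

# list2env() creates one variable per list element, using the element names
target <- new.env()  # hypothetical environment; use .GlobalEnv for top-level variables
list2env(dfList, envir = target)
ls(target)
```

This only works by name, which is another reason to name the elements when the list is built.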
I also wanted to create a new column called Country while looping through these lists with their list names to differentiate, but it doesn't seem to work with lists the way it works with variable names.
Northen_Market <- Northen_Market%>%
mutate(Country = ifelse(dfList == Northen_Market$USA, "USA",
ifelse(dfList == Northen_Market$Canada, "Canada",
ifelse(dfList == Northen_Market$Mexico, "Mexico", ""
))))
Can we fill in the column value based on each data frame's name while iterating? Or is the only way to mutate each data frame separately?
Thanks again for your help!
A list cannot contain a column; only data frames contain columns, and Northen_Market is a list of data frames.
Here I go through a list of two data frames by their names, and add a column to each one that records what that name was:
library(dplyr)    # mutate(), as_tibble()
library(ggplot2)  # provides the mpg dataset
list_of_frames <- list(iris = as_tibble(iris), mpg = mpg)
altered_list_of_frames <- purrr::map(names(list_of_frames), # 'iris' then 'mpg'
  ~ mutate(list_of_frames[[.]], src_col = .) # the . symbol will be replaced first with 'iris' then 'mpg'
)
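Because the call above maps over a plain character vector of names, the returned list is not itself named. As an alternative sketch, `purrr::imap` passes both the element and its name and keeps the names on the result, and `dplyr::bind_rows` with `.id` can add the name column while stacking the frames. The variable names `with_src` and `stacked` are hypothetical, and `mtcars` is used here in place of `mpg` only to avoid the ggplot2 dependency.

```r
library(dplyr)
library(purrr)

list_of_frames <- list(iris = as_tibble(iris), mtcars = as_tibble(mtcars))

# imap() passes the element as .x and its name as .y, and keeps the names on the result
with_src <- imap(list_of_frames, ~ mutate(.x, src_col = .y))
names(with_src)

# bind_rows() can add the name column itself while stacking the frames into one
stacked <- bind_rows(list_of_frames, .id = "src_col")
```

For the country data, the same pattern would be `imap(Northen_Market, ~ mutate(.x, Country = .y))`, assuming the list elements are named USA, Canada, and Mexico.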