In R, what is the best way to split a dataframe, join the split with another df, run several functions, and then combine the results?

DownEastAaron · March 21, 2023, 3:54am

I am having a bit of trouble getting started. I have a df with a large number of rows. I need to split this df by the groups in a certain column. Each group will contain 2 rows of data. I then need to join each group with another df and perform a number of mutations and calculations. The same process repeats until all groups have been processed. Once all rows have been processed, I would like to combine everything into a single df. Is there a tidyverse package that will accomplish this? Perhaps purrr, or do I need to write a more customized function? I am not asking for a full solution, just a pointer in the right direction. Thank you!

williaml · March 21, 2023, 4:03am

Perhaps base::split() and then one of the purrr::map_df() family, for example as described here: Learn to purrr (rebeccabarter.com)
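
A minimal sketch of that split-then-map route, under some assumptions: the data frames `readings` and `lookup`, their columns, and the helper `process_group()` are all made up for illustration. It also uses `map()` followed by `list_rbind()`, which superseded the `map_df()` family in purrr 1.0.0, so it assumes purrr 1.0.0 or later and R 4.1 or later for the native pipe.

```r
library(dplyr)
library(purrr)

# Hypothetical data: `readings` has two rows per `site`,
# `lookup` has one row per `site`.
readings <- data.frame(
  site  = c("a", "a", "b", "b"),
  value = c(1, 3, 10, 30)
)
lookup <- data.frame(
  site   = c("a", "b"),
  factor = c(2, 10)
)

# Hypothetical per-group step: join one piece to the lookup table,
# then run whatever calculations are needed.
process_group <- function(piece) {
  piece |>
    left_join(lookup, by = "site") |>
    mutate(scaled = value * factor)
}

result <- readings |>
  split(readings$site) |>  # named list of data frames, one per site
  map(process_group) |>    # apply the step to every piece
  list_rbind()             # stack the pieces back into one data frame
```

If the per-group work is only a join plus column-wise calculations, it may be enough to do a single `left_join()` on the whole data frame and then `group_by()` and `mutate()`, with no explicit split at all.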

Otherwise perhaps dplyr::group_split() and dplyr::group_map():

Split data frame by groups — group_split • dplyr (tidyverse.org)

Apply a function to each group — group_map • dplyr (tidyverse.org)

split-apply-combine with group_map - coolbutuseless
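
The dplyr route could look roughly like the sketch below. Again, `readings`, `lookup`, and their columns are invented for illustration and are not from the original question. `group_map()` calls the function once per group with the group's rows and a one-row tibble holding the group key, and returns a list, so `bind_rows()` does the final combine.

```r
library(dplyr)

# Hypothetical data: two rows per `site` in `readings`,
# one row per `site` in `lookup`.
readings <- data.frame(
  site  = c("a", "a", "b", "b"),
  value = c(1, 3, 10, 30)
)
lookup <- data.frame(
  site   = c("a", "b"),
  factor = c(2, 10)
)

pieces <- readings |>
  group_by(site) |>
  group_map(
    function(rows, key) {
      # `rows` holds one group's data, `key` is a one-row tibble with its site
      rows |>
        left_join(lookup, by = "site") |>
        mutate(scaled = value * factor)
    },
    .keep = TRUE  # keep the grouping column in `rows` so the join can use it
  )

# group_map() returns a list of tibbles; stack them into a single one
result <- bind_rows(pieces)

# group_split() alone gives the list of pieces without applying anything
split_only <- readings |> group_split(site)
```

`group_modify()` is a close relative that expects a data frame back from each call and does the combining itself.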

DownEastAaron · March 21, 2023, 11:36am

Thank you, williaml. I will read through these articles.
