Using a for loop in R to loop through the name of dataframes

mohitarora · January 4, 2021, 1:41am

I have data on mergers for 20 years for various firms. I used a for loop in R to split the data by year, which gives me 20 data frames in the global environment. Each data frame is identified by its year: Merger2000 to Merger2019. Now I want to write another for loop to find the unique companies in each data frame (that is, the unique firms in each year). Each company is identified by a unique company code (co_code). I know how to do this for each year separately. For example, for the year 2000, I would do something like:

uniquemerger2000 <- Merger2000 %>% distinct(co_code, .keep_all = TRUE)

How do I run a for loop that performs this operation for all years (that is, from 2000 to 2019)? Some indexing is required in the code, but I am not sure how to implement it in a loop.

Any help would be appreciated. Thanks!

andresrcs · January 4, 2021, 2:02am

Hi, welcome!

There is nothing fundamentally wrong with using a for loop, but it is rarely the best way of doing things in R. If you could ask this with a minimal REPRoducible EXample (reprex) illustrating your issue, very likely someone will come up with a better (or at least more idiomatic) solution.

If you've never heard of a reprex before, you might want to start by reading this FAQ:

FAQ: How to do a minimal reproducible example (reprex) for beginners (Guides & FAQs)

A minimal reproducible example consists of the following items:

A minimal dataset, necessary to reproduce the issue

The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, and including the necessary information on the used packages.

Let's quickly go over each one of these with examples:

Minimal Dataset (Sample Data)

You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue. Let's say, as an example, that you are working with the iris data frame

head(iris)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 5.1 3.5 1.4 0.…

mohitarora · January 4, 2021, 12:27pm

Thanks. I will keep that in mind when I post a question next time.

nirgrahamuk · January 4, 2021, 2:09pm

library(tidyverse)

# Make example data from dplyr::storms, renaming columns to mimic company data
(example_data <- dplyr::storms %>%
    filter(day == 28) %>%
    select(name, year, month, day, wind, pressure) %>%
    rename(company_name = name,
           stat_1 = wind,
           stat_2 = pressure) %>%
    distinct())

# Write a for loop to split the example data by year,
# and then write more for loops to process those?

# Or query the data directly:
# count the rows per company/year combination and
# keep the combinations that occur exactly once

example_data %>%
  group_by(company_name, year) %>%
  count(name = "entries_per_year") %>%
  filter(entries_per_year == 1)
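
For the setup described in the question (separate Merger2000 to Merger2019 data frames already sitting in the environment), one possible sketch is to gather them into a named list with mget() and apply the same distinct() call to each element. The two toy data frames and the deal_value column below are made up for illustration; only co_code and the Merger<year> naming come from the original post. An equivalent for loop is included for comparison.

```r
library(dplyr)

# Toy stand-ins for the yearly data frames (hypothetical values;
# deal_value is an invented column, co_code is from the question)
Merger2000 <- data.frame(co_code = c(1, 1, 2), deal_value = c(10, 12, 7))
Merger2001 <- data.frame(co_code = c(2, 3, 3), deal_value = c(5, 8, 9))

years <- 2000:2001  # would be 2000:2019 for the full data

# Collect the data frames into one named list by building their names
mergers <- mget(paste0("Merger", years), envir = environment())

# Apply the same distinct() call to every data frame in the list
unique_mergers <- lapply(
  mergers,
  function(df) distinct(df, co_code, .keep_all = TRUE)
)

# The same thing written as an explicit for loop over the list names
unique_mergers_loop <- list()
for (nm in names(mergers)) {
  unique_mergers_loop[[nm]] <- distinct(mergers[[nm]], co_code, .keep_all = TRUE)
}

# Optionally stack the results, recording which data frame each row came from
all_unique <- bind_rows(unique_mergers, .id = "source")
```

Each result can then be pulled out by name, for example unique_mergers$Merger2000. Keeping the yearly results in a list avoids creating 20 more objects with assign(), though that is a design preference rather than a requirement.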
