Loop over data frames

Hello,
I would like to run a loop over my data frames which are enumerated and add to each dataframe a new column. My problem ist as follows:

dataframes

df2000
df2001
df2002
df2003

add for each df a new column "year" with the same values as df, i.e.

df2000$ year <- 2000
df2001$ year <- 2001
...

Has anyone a clue how to run a loop?
THANKS!
EV

years <- 2000:2003
for (i in years) {
  df <- paste0("df", i)
  assign(df, cbind(get(df)), year = i))
}

Thanks a lot, mduvekot!
However, i receive the following error message: "Error in get(df) : invalid first argument"?
Do you have any clue what's wrong?
Thanks again! Ehsan

Could you run this:

ls(pattern = "df")

and tell us what the output looks like?

Thanks dromano!

this is the output:

ls(pattern = "df")
[1] "df" "df2015" "df2016" "df2017" "df2018" "df2019" "df2020" "df2021" "df2022"

1 Like

Your range is from 2015 to 2022, so you need to change

years <- 2000:2003

to

years <- 2015:2022
1 Like

Thanks! Indeed I adjusted the range. but still get the error.

Could you do this?

  1. run your code, then paste it here, and then
  2. copy the full error message (including any red text) from the console and paste that here, too.

Also, I'm curious: Is there a reason why you originally created one table per year? Would you prefer a single object that contains all the tables?

This is my code:

 year <- 2015:2022

 for (i in year) {
 df <- assign( paste0("df", i), read_excel("data/mydata.xlsx", sheet = paste0(i),  range = "A4:H187")) 
  assign(df, cbind(get(df)), year = i)

"Error in assign(df, cbind(get(df)), year = i) : 
 unused argument (year = i)"

So I basically import Excel sheets and save as a df2015, df2016, etc.
than for each df2015 etc. I want to add a new column defining the year = 2015 etc.
at the end I want to combine all df2015-2022 into one dataframe.

Does this make sense?

This would seem to be what you want to do:

(You may need to edit the code slightly as purrr has changed its syntax since the article was written).

Does this do what you want?

library(tidyverse)
library(readxl)
2015:2022 |> 
  map(
    \(n) {
      sheet_name <- as.character(n)
      read_excel("data/mydata.xlsx", sheet = sheet_name,  range = "A4:H187") |> 
        mutate(year = n)
    }
  ) |> 
  list_rbind()

(The result should be the single, combined table you're after.)

the year part is out of place, it should have been part of the cbind

assign(df, cbind(get(df), year = i))
1 Like

Thanks dromano! seems this is what I want, butI get the following error:

> 2015:2022 |> 
+   map(
+     \(n) {
+     sheet_name <- as.character(n)
+     read_excel("data/mydata.xlsx", sheet = sheet_name,  range = "A4:H187") |> 
+         mutate(year = n)
+     }
+   ) |> 
+   list_rbind()
"Error in list_rbind(map(2015:2022, function(n) { : 
  could not find function "list_rbind" "

Thanks nirgrahamuk!
Still gives the same error message

"Error in get(df) : invalid first argument"

list_rbind comes from recent purrr package, if you have an older purrr package you would need to update it.

I can trigger the same error by failing to have a data.frame called df , because df is also the name of a function in stats.

I strongly urge you to never use df as your own variable name, so as to reduce confusions such as as that.

Be sure to run library(tidyverse) first — do you have that package installed?

Unfortunately, I can't update the package, because I have no admin rights on the R-Server.
So I guess this is not an option.

Thanks for the hint. This is what I did and seems to work.

year <- 2015:2022

for (i in year) {
  azryr <- assign( paste0("azr", i), read_excel("data/fdz.xlsx", sheet = paste0(i),  range = "A4:H187")) 
  assign("azr", cbind(get("azryr"), year = i))
}

however, this creates only in data file "azr" the variable year == 2022. There is also a list of data files azr2015, azr2016, etc. but without generating the variable "year"
do you know what is missing?

Thanks again!