I am trying to add a column to each data frame in a list, filled with the name of that list element, and I am struggling to do it. Any help would be appreciated. Thanks!
I tried to create sample data for this post, but I was unable to produce it with datapasta or dput, so the code I am running is below instead.
I am looking for a column named "Category" whose values are the name of the list element, CategoryA and CategoryB respectively. These are the original sheet names from the Excel workbook, which was read in as a list.
library(tidyverse)
library(readxl)
library(datapasta)
path <- "Excel Files/Test_Data_202305.xlsx"
data <- path %>%
  excel_sheets() %>%
  set_names() %>%
  map(read_excel, path = path)
df <- dput(data)
# Adding variable "Category" to each data frame using the name of the list element
category_fn <- function(x){
  # option1
  mutate(Category = x$x)
  # option2
  x$Category <- x[[x]]
  x <- x %>%
    select(Category, everything())
  return(x)
}
final_data <- map(df, ~category_fn(.))
If I understand your question correctly, you want to add a column to each data frame in the list, where that column contains the name of the list element. For that, you can use imap(), which works like map() but passes two arguments to the function: the list element and its name:
library(tidyverse)
# example data
data <- set_names(paste("Category", letters[1:3])) %>%
  map(~ tibble(mycol = rnorm(5),
               othercol = rnorm(5)))
data
#> $`Category a`
#> # A tibble: 5 × 2
#> mycol othercol
#> <dbl> <dbl>
#> 1 0.669 0.524
#> 2 1.18 -1.09
#> 3 -1.39 -0.646
#> 4 1.53 2.20
#> 5 1.53 -0.178
#>
#> $`Category b`
#> # A tibble: 5 × 2
#> mycol othercol
#> <dbl> <dbl>
#> 1 0.553 -0.480
#> 2 -1.49 1.37
#> 3 0.433 -1.06
#> 4 0.231 1.32
#> 5 -1.85 -0.809
#>
#> $`Category c`
#> # A tibble: 5 × 2
#> mycol othercol
#> <dbl> <dbl>
#> 1 1.52 -0.510
#> 2 -0.455 0.0529
#> 3 -0.949 0.281
#> 4 0.125 1.59
#> 5 0.199 -0.107
# Adding variable "Category" to each data frame using the name of the list element
category_fn <- function(df, name){
  mutate(df,
         Category = name,
         .before = 1)
}
# use imap to pass both the data frame and its name
imap(data, ~ category_fn(.x, .y))
#> $`Category a`
#> # A tibble: 5 × 3
#> Category mycol othercol
#> <chr> <dbl> <dbl>
#> 1 Category a 0.669 0.524
#> 2 Category a 1.18 -1.09
#> 3 Category a -1.39 -0.646
#> 4 Category a 1.53 2.20
#> 5 Category a 1.53 -0.178
#>
#> $`Category b`
#> # A tibble: 5 × 3
#> Category mycol othercol
#> <chr> <dbl> <dbl>
#> 1 Category b 0.553 -0.480
#> 2 Category b -1.49 1.37
#> 3 Category b 0.433 -1.06
#> 4 Category b 0.231 1.32
#> 5 Category b -1.85 -0.809
#>
#> $`Category c`
#> # A tibble: 5 × 3
#> Category mycol othercol
#> <chr> <dbl> <dbl>
#> 1 Category c 1.52 -0.510
#> 2 Category c -0.455 0.0529
#> 3 Category c -0.949 0.281
#> 4 Category c 0.125 1.59
#> 5 Category c 0.199 -0.107
# or directly as an anonymous function:
imap(data, ~ mutate(.x, Category = .y, .before = 1))
#> $`Category a`
#> # A tibble: 5 × 3
#> Category mycol othercol
#> <chr> <dbl> <dbl>
#> 1 Category a 0.669 0.524
#> 2 Category a 1.18 -1.09
#> 3 Category a -1.39 -0.646
#> 4 Category a 1.53 2.20
#> 5 Category a 1.53 -0.178
#>
#> $`Category b`
#> # A tibble: 5 × 3
#> Category mycol othercol
#> <chr> <dbl> <dbl>
#> 1 Category b 0.553 -0.480
#> 2 Category b -1.49 1.37
#> 3 Category b 0.433 -1.06
#> 4 Category b 0.231 1.32
#> 5 Category b -1.85 -0.809
#>
#> $`Category c`
#> # A tibble: 5 × 3
#> Category mycol othercol
#> <chr> <dbl> <dbl>
#> 1 Category c 1.52 -0.510
#> 2 Category c -0.455 0.0529
#> 3 Category c -0.949 0.281
#> 4 Category c 0.125 1.59
#> 5 Category c 0.199 -0.107
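For the list in the question, where the element names are the sheet names CategoryA and CategoryB, the same imap() call should carry over unchanged. Here is a minimal sketch under that assumption. The `sheets` list is a hypothetical stand-in for the result of map(read_excel, path = path), and its columns `id` and `value` are invented for illustration, since the real workbook is not available:

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for the named list read from the workbook;
# the columns and values are made up for this sketch.
sheets <- list(
  CategoryA = tibble(id = 1:2, value = c(10, 20)),
  CategoryB = tibble(id = 3:5, value = c(30, 40, 50))
)

# imap() passes each data frame as .x and its list name as .y
final_data <- imap(sheets, ~ mutate(.x, Category = .y, .before = 1))

# Each data frame now starts with a Category column holding its sheet name
names(final_data$CategoryA)
unique(final_data$CategoryB$Category)
```

Because imap() takes the names from the list itself, this relies on the list being named, which the set_names step in the question's pipeline already takes care of.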