Hello,
I need to import many excel files from a directory.
To perform the task I wrote a code like:
all_excels<-map_df( paths, read_excel )
But it fails because some xlsx files some columns are stored as char and others as dbl.
The reason is because the same column in some file is using the "." as decimal separator, and others spreadsheets are using the "," as decimal separator.
The columns with this issue of different decimal separator always contains the symbol "$".
However, this code always renames the columns containing "$" with the suffix "_to_correct", even when the column is recognized as db or integer. I thought the code: columns_to_rename was the cause, but I can see the right names with select and contains. Now I'm a bit lost. I intend to write the code always using tidyverse, but again, I failed.
What's the issue, people? I can't solve It.
Anyway, thanks for your time, patience and help.
Have a nice weekend.
When I run a data frame with column names containing $ through rename_all, it replaces the $ with .. Can you post an example of a data frame where the $ survives rename_all()?
library(tidyverse)
DF1 <- data.frame(type = "A", `Ball$` = "B", D = 3.4, `Call$IT` = 3.0)
DF1 |> rename_all(tolower)
#> type ball. d call.it
#> 1 A B 3.4 3
The $ becomes a casualty not from the application of rename_all(), but on the creation of the data frame. I ran the first line of your code and the result was:
> data.frame(type = "A", `Ball$` = "B", D = 3.4, `Call$IT` = 3.0)
type Ball. D Call.IT
1 A B 3.4 3
You can see that the $ disappears immediately as the result of the data.frame() function.
I tried a little experimentation using the tidyverse. Instead of creating a data frame, I created a tibble:
> DF1 <- tibble(type = "A", `Ball$` = "B", D = 3.4, `Call$IT` = 3.0)
> DF1
# A tibble: 1 × 4
type `Ball$` D `Call$IT`
<chr> <chr> <dbl> <dbl>
1 A B 3.4 3
You can see here that the $ in the column names is preserved.
I tired the rename_all(tolower) next, but was warned that this has been superseded. It doesn't fail in every circumstance, but it does fail when applied to a tibble object. So I proceeded with the suggested function which is designed to replace it.
Here's what I got. It does what you asked for
library(tidyverse)
> tibble(type = "A", `Ball$` = "B", D = 3.4, `Call$IT` = 3.0) %>% rename_with(tolower)
# A tibble: 1 × 4
type `ball$` d `call$it`
<chr> <chr> <dbl> <dbl>
1 A B 3.4 3
Wow, I'm sorry to have wasted your time making such a silly mistake.
If I run your function with some toy data frames instead of reading Excel files, I get the result I would expect. Columns with $ that are characters get tagged with "_to_correct" and columns with $ that are numeric preserve their names. The final data frame is rather messy because all the different names are included. Do you agree that this is the expected result? I can't easily make xlsx files, so I can't test with those.
library(tidyverse)
DF1 <- tibble(type = "A", `Ball$` = "B", D = 3.4, `Call$IT` = 3.0)
DF2 <- tibble(type = "E", `Ball$` = 2.3, D = 7.2, `Call$IT` = "C")
DF3 <- tibble(type = "R", `Ball$` = 2.3, D = 15.2, `Call$IT` = 4)
InList = list(D1 = DF1, D2 = DF2, D3 = DF3)
columns_to_process <- function(file_excel) {
df <- file_excel |> rename_all(tolower) |> #read_excel(file_excel) %>%
select(type, contains("$"))
columns_to_rename <- df |>
select(contains("$") & where(is.character)) |>
names()
if (length( columns_to_rename > 0)) {
df <- df |>
rename_with(~ paste0(., "_to_correct"), all_of(columns_to_rename ))
}
return(df)
}
columns_to_process(DF1)
#> # A tibble: 1 × 3
#> type `ball$_to_correct` `call$it`
#> <chr> <chr> <dbl>
#> 1 A B 3
columns_to_process(DF2)
#> # A tibble: 1 × 3
#> type `ball$` `call$it_to_correct`
#> <chr> <dbl> <chr>
#> 1 E 2.3 C
columns_to_process(DF3)
#> # A tibble: 1 × 3
#> type `ball$` `call$it`
#> <chr> <dbl> <dbl>
#> 1 R 2.3 4
test2<-map_df(InList ,columns_to_process, .id="id")
test2
#> # A tibble: 3 × 6
#> id type `ball$_to_correct` `call$it` `ball$` `call$it_to_correct`
#> <chr> <chr> <chr> <dbl> <dbl> <chr>
#> 1 D1 A B 3 NA <NA>
#> 2 D2 E <NA> NA 2.3 C
#> 3 D3 R <NA> 4 2.3 <NA>
At a glance, that looks right. If you are open to writing a file to disk then reading it back you can use R to create XLSX files. Check out the writexl library. If you do that you can simulate your actual use case to test without risk to your Excel files.
I really don't know how to say thank you, guys.
I couldn't figure what went wrong. The rename and $ symbol, the tibble option, the rename_all are minor mistakes difficult to solve at the same time.
I corrected my code and yes, now It works.
You still are way above chatgpt/gemini with these errors.
Have a nice week, kilonovember and FJCC.