Hayk
May 6, 2020, 7:03pm
1
Hi,
I have a tibble with dates in wide format. I transposed this tibble so that I have a column of dates as a variable and a column of values with melt function (reshape package) as well as with pivot_longer function (tidyr package). In both cases the character values which were written in my native language (Armenian) messed up and now there are weird symbols in their place.
How do you come to have the tibble in question?
You constructed it by typing it into your code? If so confirm that the script encoding is UTF-8.
If you imported from a file such as a CSV. Confirm that the CSV is in UTF-8
Hayk
May 7, 2020, 5:53pm
3
I have read it from an excel file. okay and how can I confirm script encoding or check it?
what R command did you use ?
Hayk
May 7, 2020, 6:11pm
5
R command? R studio? Do you mean that?
Hayk
May 8, 2020, 11:58am
7
Ah yes, sorry. here is my code.
library(readxl)
library(reshape2)
setwd("D:/M&L_Risks/Bank_Performance_Analysis")
excel_sheets("Working_X.xlsx")
df_db <- read_excel("Working_X.xlsx", sheet = "Accounts_DB")
col <- colnames(df_db[,1:18])
m_db <- iconv(df_db,from = "Latin-1", to = "UTF-8")
m_db <- melt(df_db, col, na.rm = TRUE, value.name = "Value" )
Hayk
May 8, 2020, 12:01pm
8
And by the way the code written as
iconv(df_db,from = "Latin-1", to = "UTF-8") doesn't work.
It generates an error - unsupported conversion from Latin-1 to UTF-8
Hayk
May 10, 2020, 4:30pm
9
library(readxl)
library(reshape2)
setwd("C:/R-studio")
df_db <- read_excel("Working_X.xlsx", sheet = "Accounts_DB")
col <- colnames(df_db[,1:18])
m_db <- melt(df_db, col, na.rm = TRUE, value.name = "Value" )
Created on 2020-05-10 by the reprex package (v0.3.0)
After reading an excel file I get a tibble on which I can see the characters written in my native language(Armenian). But after I melt the tibble, the characters become messed. I guess the issue is about encoding, and I need to change encoding to UTF-8, but whatever I searched for and tried I could not manage to conduct the change.
Here is the topic I have created on this issue.
iconv()
wants to convert character vectors, it wont work for dataframes like you are trying to use it.
Were you trying iconv because you know your excel file is in Latin-1 encoding ? or was that a guess on your part ?
If you have excel perhaps you can open the file and resave it with UTF-8 encoding ?
Perhaps you can raise an issue, or add to an existing issue on readxl github.
opened 06:54AM - 02 Apr 19 UTC
closed 03:28AM - 19 Mar 22 UTC
bug
xls 👵
libxls
I want to read xls files in loop but the **read_xls** seems can only read the **… first** xls file, but when I open the second or other xls file in EXCEL2016 window(PS:NO file broken or other infomation occured), then close it **without** any changes.
Rerun the code below , then the read_xls can read the second xls file. I am so confused.
I have put the data here [Data](https://drive.google.com/open?id=1q5gcHeeeXNv_rZqOfdr9zZ-J88rj_yUw)
My code show as below
``` r
library(readxl)
library(reprex)
setwd("C:\\Users\\len\\Desktop\\Data")
files<-list.files(recursive = T)
files<-files[grepl(".XLS",files)]
files
#> [1] "20190326.seq/33.0000.XLS" "20190327.seq/01.0000.XLS"
#> [3] "20190328.seq/04.0000.XLS" "20190329.seq/15.0000.XLS"
#> [5] "20190330.seq/09.0000.XLS" "20190331.seq/03.0000.XLS"
packageVersion("readxl")
#> [1] '1.3.1'
for(i in 1:length(files))
{
#dat<-read_xls(files[i],sheet = 1,na = "n.a.",skip = 0,col_names = F)[1:52,]
dat<-read_xls(files[i])
message(paste("Read",files[i]))
print(head(dat))
}
#> New names:
#> * Area -> Area...3
#> * Area -> Area...4
#> * Area -> Area...5
#> * Area -> Area...6
#> * Area -> Area...7
#> * ... and 13 more problems
#> Read 20190326.seq/33.0000.XLS
#> # A tibble: 6 x 20
#> No. Time Area...3 Area...4 Area...5 Area...6 Area...7 Area...8
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 <NA> <NA> "\xa6\x~ "\xa6\x~ "\xa6\x~ "\xa6\x~ "\xa6\x~ "\xa6\x~
#> 2 <NA> <NA> AOC Cl NO2 SO4 NO3 NH4
#> 3 <NA> <NA> ECD_1 ECD_1 ECD_1 ECD_1 ECD_1 ECD_2
#> 4 1 4355~ n.a. 0.11860~ 0.02600~ 0.36162~ 0.36177~ 0.54646~
#> 5 2 4355~ n.a. 0.13981~ 0.03278~ 0.36993~ 0.38382~ 0.56742~
#> 6 3 4355~ n.a. n.a. n.a. 0.35741~ 0.34220~ 0.56133~
#> # ... with 12 more variables: Area...9 <chr>, Area...10 <chr>,
#> # Area...11 <chr>, Area...12 <chr>, Area...13 <chr>, Area...14 <chr>,
#> # Area...15 <chr>, Area...16 <chr>, Area...17 <chr>, Area...18 <chr>,
#> # Area...19 <chr>, Area...20 <chr>
#> Read 20190327.seq/01.0000.XLS
#> # A tibble: 0 x 0
#> Read 20190328.seq/04.0000.XLS
#> # A tibble: 0 x 0
#> Read 20190329.seq/15.0000.XLS
#> # A tibble: 0 x 0
#> Read 20190330.seq/09.0000.XLS
#> # A tibble: 0 x 0
#> Read 20190331.seq/03.0000.XLS
#> # A tibble: 0 x 0
```
rerun the code after I open the second XLS file
``` r
library(readxl)
library(reprex)
setwd("C:\\Users\\len\\Desktop\\Data")
files<-list.files(recursive = T)
files<-files[grepl(".XLS",files)]
files
#> [1] "20190326.seq/33.0000.XLS" "20190327.seq/01.0000.XLS"
#> [3] "20190328.seq/04.0000.XLS" "20190329.seq/15.0000.XLS"
#> [5] "20190330.seq/09.0000.XLS" "20190331.seq/03.0000.XLS"
packageVersion("readxl")
#> [1] '1.3.1'
for(i in 1:length(files))
{
#dat<-read_xls(files[i],sheet = 1,na = "n.a.",skip = 0,col_names = F)[1:52,]
dat<-read_xls(files[i])
message(paste("Read",files[i]))
print(head(dat))
}
#> New names:
#> * Area -> Area...3
#> * Area -> Area...4
#> * Area -> Area...5
#> * Area -> Area...6
#> * Area -> Area...7
#> * ... and 13 more problems
#> Read 20190326.seq/33.0000.XLS
#> # A tibble: 6 x 20
#> No. Time Area...3 Area...4 Area...5 Area...6 Area...7 Area...8
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 <NA> <NA> "\xa6\x~ "\xa6\x~ "\xa6\x~ "\xa6\x~ "\xa6\x~ "\xa6\x~
#> 2 <NA> <NA> AOC Cl NO2 SO4 NO3 NH4
#> 3 <NA> <NA> ECD_1 ECD_1 ECD_1 ECD_1 ECD_1 ECD_2
#> 4 1 4355~ n.a. 0.11860~ 0.02600~ 0.36162~ 0.36177~ 0.54646~
#> 5 2 4355~ n.a. 0.13981~ 0.03278~ 0.36993~ 0.38382~ 0.56742~
#> 6 3 4355~ n.a. n.a. n.a. 0.35741~ 0.34220~ 0.56133~
#> # ... with 12 more variables: Area...9 <chr>, Area...10 <chr>,
#> # Area...11 <chr>, Area...12 <chr>, Area...13 <chr>, Area...14 <chr>,
#> # Area...15 <chr>, Area...16 <chr>, Area...17 <chr>, Area...18 <chr>,
#> # Area...19 <chr>, Area...20 <chr>
#> New names:
#> * Area -> Area...3
#> * Area -> Area...4
#> * Area -> Area...5
#> * Area -> Area...6
#> * Area -> Area...7
#> * ... and 13 more problems
#> Read 20190327.seq/01.0000.XLS
#> # A tibble: 6 x 20
#> No. Time Area...3 Area...4 Area...5 Area...6 Area...7 Area...8
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 <NA> <NA> "\xa6\x~ "\xa6\x~ "\xa6\x~ "\xa6\x~ "\xa6\x~ "\xa6\x~
#> 2 <NA> <NA> AOC Cl NO2 SO4 NO3 NH4
#> 3 <NA> <NA> ECD_1 ECD_1 ECD_1 ECD_1 ECD_1 ECD_2
#> 4 1 4355~ n.a. 0.05765~ 0.01526~ 0.43807~ 0.37693~ 0.57607~
#> 5 2 4355~ n.a. 0.05745~ 0.01683~ 0.42131~ 0.36903~ 0.57463~
#> 6 3 4355~ n.a. 0.05793~ 0.01444~ 0.42888~ 0.38235~ 0.58119~
#> # ... with 12 more variables: Area...9 <chr>, Area...10 <chr>,
#> # Area...11 <chr>, Area...12 <chr>, Area...13 <chr>, Area...14 <chr>,
#> # Area...15 <chr>, Area...16 <chr>, Area...17 <chr>, Area...18 <chr>,
#> # Area...19 <chr>, Area...20 <chr>
#> Read 20190328.seq/04.0000.XLS
#> # A tibble: 0 x 0
#> Read 20190329.seq/15.0000.XLS
#> # A tibble: 0 x 0
#> Read 20190330.seq/09.0000.XLS
#> # A tibble: 0 x 0
#> Read 20190331.seq/03.0000.XLS
#> # A tibble: 0 x 0
```
system
Closed
May 18, 2020, 8:41am
11
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.