Convert datatype chr to num in a dataset

Hello,
I have a dataset which recognizes my input as character, but it needs to be numeric.
In R it looks like this:
str(example1)
'data.frame': 9185 obs. of 7 variables:
$ ccc2 : chr "0.280611855" "0.063328681" "0.0558188246153846" "0.0258890675" ...
$ ccc5 : chr "0.3690021275" "0.0738335925555555" "0.0573284499230769" "0.069981407" ...
$ ccc12: chr "0.2402121975" "0.0804443753333333" "0.0580245564615385" "0.03491928175" ...
$ ccc23: chr "0.3530686075" "0.095604075125" "0.0562225292142857" "0.051274448" ...
$ ccc34: chr "0.278558275" "0.0726508113333333" "0.0640484183846154" "0.03975575525" ...
$ ccc63: chr "0.29702648" "0.072651336" "0.0657946802307692" "0.031911788" ...
$ ccc71: chr "0.51959915" "0.07053381125" "0.0582691238461538" "0.125750736666667" ...

If I try to convert it, it doesn't work:
test <- as.numeric(example1)
Error: 'list' object cannot be coerced to type 'double'

I need the columns above to be numeric, like in my other data frame, which R recognizes correctly as numeric:
str(example2)
'data.frame': 9185 obs. of 7 variables:
$ ccc2 : num 1.96 9.52 10.3 8.73 10.62 ...
$ ccc5 : num 2.89 10.35 10.83 7.93 10.44 ...
$ ccc12: num 3.38 10.46 10.59 8.09 10.21 ...
$ ccc23: num 2.98 9.85 10.87 8.43 10.15 ...
$ ccc34: num 2.49 9.58 10.08 7.71 10.46 ...
$ ccc63: num 2.42 9.99 9.66 8.96 10.41 ...
$ ccc71: num 2.67 10.46 10 8.67 10.85 ...

Hi,

Welcome to the RStudio community!

You are getting this error because as.numeric() works on vectors, and you are applying it to the whole data frame (which is a list of columns) instead of to the individual columns. Below I provide two ways of fixing this:

#Generate dummy data
set.seed(1) #Only needed for reproducibility 
myData = data.frame(
  col1 = as.character(runif(5)),
  col2 = as.character(runif(5)),
  col3 = as.character(runif(5))
)

#Check the class
sapply(myData, class)
#>        col1        col2        col3 
#> "character" "character" "character"

# Transform the columns using dplyr (Tidyverse) ...
library(dplyr)
myData = myData %>% mutate(across(everything(), as.numeric))

# OR Transform the columns using base R
for(column in colnames(myData)){
  # [[ ]] always extracts the column as a vector (also works if myData is a tibble)
  myData[[column]] = as.numeric(myData[[column]])
}

#Check again
sapply(myData, class)
#>      col1      col2      col3 
#> "numeric" "numeric" "numeric"

Created on 2022-05-17 by the reprex package (v2.0.1)

In both the Tidyverse and base R approaches, you can apply the function to only a subset of the columns by replacing everything() or colnames(myData) with a vector of column names, e.g. c("col1", "col3").
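As a rough sketch of the subset idea, the snippet below converts only col1 and col3 and leaves col2 as character. The variable names cols, baseResult and dplyrResult are made up for illustration, and the base R part uses lapply() instead of the for loop, which is just a more compact variant. The dplyr part wraps the vector in all_of(), which is the recommended way to pass a character vector stored in a variable, and it only runs if dplyr is installed.

```r
# Hypothetical example data: three character columns, as in the reprex above
set.seed(1)
myData <- data.frame(
  col1 = as.character(runif(5)),
  col2 = as.character(runif(5)),
  col3 = as.character(runif(5)),
  stringsAsFactors = FALSE  # keeps the columns as character on R < 4.0 too
)

# Columns to convert (col2 is deliberately left out)
cols <- c("col1", "col3")

# Base R: convert only the selected columns
baseResult <- myData
baseResult[cols] <- lapply(baseResult[cols], as.numeric)

# col1 and col3 are now numeric, col2 is still character
sapply(baseResult, class)

# dplyr: same selection via all_of(), skipped when dplyr is not installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyrResult <- dplyr::mutate(
    myData,
    dplyr::across(dplyr::all_of(cols), as.numeric)
  )
  print(sapply(dplyrResult, class))
}
```

Note that as.numeric() returns NA (with a warning) for any value it cannot parse as a number, so it is worth checking for unexpected NAs after converting real data.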

Hope this helps,
PJ

Thank you very much Pieterjanvc. Your solution worked.

Best MP
