Actually, I am trying a semi_join now. I was thinking about it from a filter perspective, but I'll give semi_join() a try. Thanks so much for your help! Do you suggest I use collect() at the end of the code block with semi_join()?
If you want to get the dataframe back into your R session you can use collect(), but since the result needs to fit in memory in R, the data frame has to be small enough.
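As a rough sketch of that pattern (the table and column names here are placeholders, not taken from your actual code): the semi_join() runs inside Spark, and collect() is only called once at the end on the filtered result.

```r
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")

# Hypothetical Spark tables standing in for your two data frames
df1_tbl <- copy_to(sc, df1, "df1", overwrite = TRUE)
df2_tbl <- copy_to(sc, df2, "df2", overwrite = TRUE)

# Keep rows of df1 whose key also appears in df2 -- this all executes in Spark
result_sdf <- df1_tbl %>%
  semi_join(df2_tbl, by = "id")

# Only bring the (hopefully much smaller) result back into R at the very end
result_df <- result_sdf %>% collect()
```

If the result is still too large to collect, you can keep working with result_sdf inside Spark, or write it out with something like spark_write_parquet() instead.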
With Spark, as I understand it, each step in a pipeline creates a lazy operation; everything only actually runs when you call collect(). So I wonder whether the issue doesn't come from somewhere earlier, when you create df2_n2_s...
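To illustrate that point with a small sketch (df2_tbl and the columns n2 / n2_s are hypothetical stand-ins, not from the original code): the pipeline below only builds a query plan, so a mistake in an intermediate step typically doesn't surface until the data is actually requested.

```r
library(sparklyr)
library(dplyr)

# Nothing executes here: this only builds up a lazy Spark SQL plan
lazy_sdf <- df2_tbl %>%
  mutate(n2_s = n2 * 2) %>%   # hypothetical intermediate column
  filter(n2_s > 10)

# The query -- and any error hiding in an earlier step -- only runs
# when results are demanded, e.g. via collect(), compute(), or printing
lazy_sdf %>% head() %>% collect()
```

That's why an error reported at collect() can actually originate from a step defined much earlier in the chain.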
The connection looks like it is working alright; I'm thinking it doesn't like filtering by a matching column from a different df. When I used semi_join() it appeared to work, but I'm still figuring out how to look at the structure of the dataframe in the connection, haha. I'm still getting used to the difference between working in a Spark context vs. with R objects in the global environment.
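For inspecting the structure of a table that lives in Spark, a few sparklyr/dplyr helpers work without pulling the full data back into R (result_sdf here is a placeholder for whatever tbl_spark you're looking at):

```r
library(sparklyr)
library(dplyr)

# Column names and Spark data types, without moving any rows to R
sdf_schema(result_sdf)

# Row and column counts, computed inside Spark
sdf_dim(result_sdf)

# Preview a handful of rows -- head() limits the query before collecting
result_sdf %>% head(10) %>% collect()
```

Unlike str() on a regular R data frame, these operate on the remote table, so they stay cheap even when the underlying data is large.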