Hello,
I would like to ask for your help, please.
We recently migrated our data from SQL Server to Snowflake, and we are now adapting all our processes, including the modeling workflows of our data miners.
For their daily work, they need to ingest huge tables (approximately 7M rows and hundreds of integer columns).
Their aim is to transfer this data from our Snowflake environment to their R/H2O solution for modeling purposes.
Using JDBC (jdbc-3.12.11) with dplyr, they can successfully query Snowflake data; however, ingesting the data into R takes a long time, around 15 minutes, whereas the query itself only takes seconds to run according to the Snowflake history tab.
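For reference, here is roughly what the ingestion code looks like (a minimal sketch; the account URL, credentials, driver jar path, and table name below are placeholders, not our real values):

```r
# Minimal sketch of the current ingestion path.
# Assumptions: account, credentials, jar path and table name are placeholders.
library(DBI)
library(RJDBC)
library(dplyr)

# Snowflake JDBC driver, version 3.12.11
drv <- JDBC(driverClass = "net.snowflake.client.jdbc.SnowflakeDriver",
            classPath   = "/path/to/snowflake-jdbc-3.12.11.jar")

con <- dbConnect(drv,
                 "jdbc:snowflake://<account>.snowflakecomputing.com/",
                 user     = "<user>",
                 password = "<password>")

# The query finishes in seconds on the Snowflake side, but collect()
# takes roughly 15 minutes to pull the ~7M rows into R.
modeling_data <- tbl(con, "MY_BIG_TABLE") %>%
  collect()

dbDisconnect(con)
```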
It is even worse with an ODBC driver, which seems to fetch only a few thousand rows per second.
I would rule out a network issue, since our Internet connection speed is approximately 2 Gbit/s.
I think we are missing a setting somewhere in R, but which one?
Your help would be much appreciated.
Thanks a lot.
François