Work with big data reasonably quickly without destroying RAM over time

Hello everyone,

Please excuse me if this is the wrong place for this, but I have a fairly general question: is R a good choice for writing code that works with large tables, or are there better-suited languages? The reason I ask is that I have worn out two sticks of RAM by repeatedly running an analysis that requires loading a very large table into memory just to subset from it. I have tried read.csv.sql(), but it takes so long that it is unusable. Are there quicker SQL-based methods?
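One pattern worth trying, as a sketch: do a one-time import of the CSV into an on-disk SQLite database, so later sessions can query subsets without ever loading the whole table into RAM. The file, table, and column names below are made up for illustration; RSQLite's dbWriteTable() can (per its documentation) take a file path as the value, so the CSV never has to pass through an R data frame.

```r
library(DBI)

# One-time setup: create an on-disk SQLite database and import the CSV
# directly into it. "occurrences.csv" and the table name are hypothetical.
con <- dbConnect(RSQLite::SQLite(), "occurrences.sqlite")
dbWriteTable(con, "occurrences", "occurrences.csv", overwrite = TRUE)

# From then on, pull only the rows you need; the full 7 GB never enters RAM.
few <- dbGetQuery(con, "SELECT * FROM occurrences WHERE species = 'Anas crecca'")
dbDisconnect(con)
```

The import is paid once; every subsequent subset is a small query against the file on disk.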



Sounds like you could take a look at dbplyr :+1:
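For context, dbplyr lets you write ordinary dplyr verbs against a database table; the filtering is translated to SQL and run in the database, and only the result comes back when you call collect(). A minimal sketch, assuming an SQLite file with a table and column named as below (both hypothetical):

```r
library(DBI)
library(dplyr)
library(dbplyr)

con <- dbConnect(RSQLite::SQLite(), "occurrences.sqlite")
occ <- tbl(con, "occurrences")  # lazy reference; nothing is loaded yet

# The filter is translated to SQL and executed in the database;
# collect() pulls back only the matching rows.
subset <- occ %>%
  filter(species %in% c("Anas crecca", "Anas platyrhynchos")) %>%
  collect()

dbDisconnect(con)
```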

Thanks for the response. I looked into this, but collecting data from the database once it has been created is far too slow. I have a 7 GB table of species occurrence data, and it takes more than 20 minutes to pull just a few species from it, which doesn't work for my workflow.
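A 20-minute lookup on a 7 GB table often means the database is scanning every row. One common fix, sketched here under the assumption that the table and column are named as below, is to add an index on the column you filter by, so lookups become near-instant:

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), "occurrences.sqlite")

# One-time: index the column used in WHERE clauses. Without this,
# every species lookup is a full scan of the 7 GB table.
dbExecute(con, "CREATE INDEX IF NOT EXISTS idx_species ON occurrences (species)")

dbDisconnect(con)
```

Building the index takes a while once, but queries filtering on that column afterwards touch only the matching rows.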

What seems to be the case: SQL queries are just slow for this, most RAM is fragile, and most computers aren't smart enough to keep the component from frying. The solution? Money and time for a large desktop setup plus hardware and software modifications.

This doesn't sound normal. Maybe your SQL server needs some fine-tuning to speed up queries, or you need to move to a more specialized big-data solution for on-disk data storage, e.g. Spark + Apache Arrow.
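On the Arrow side, a common approach is to convert the data once to Parquet and then query it lazily with the arrow package, which reads only the columns and row groups a filter needs. A sketch, assuming the data has been written to a (hypothetical) Parquet directory:

```r
library(arrow)
library(dplyr)

# Point at a directory of Parquet files; nothing is read yet.
ds <- open_dataset("occurrences_parquet/")

# The filter is pushed down into the Parquet scan, so only
# matching rows are materialized by collect().
res <- ds %>%
  filter(species == "Anas crecca") %>%
  collect()
```

Converting a 7 GB CSV to Parquet is itself a one-time cost (e.g. via arrow::write_dataset()), after which repeated subsetting is typically seconds rather than minutes.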

Another option, which costs less, is to rent a large virtual machine just for the time it takes to process your data and pull the results out.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.