Hello RStudio Community,
For my thesis, I am currently working with the Dominick's Finer Foods dataset.
I am trying to merge two datasets: one with about 800 obs. of 7 variables and one with about 3,990,000 obs. of 11 variables.
Some processing difficulties are to be expected with data of this size, but I don't think the following problems are among them.
First, I tried to merge them with the base merge() function on one common variable (first with all=TRUE, then with only all.x and only all.y; none of these worked). R kept processing for what felt like forever. After at most 30 minutes I stopped the processing, and then R crashed and some data was lost (previous entries in a table). The messages were along the lines of "previous session crashed...", "may have lost workspace data..." and so on.
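For reference, the three merge() variants mentioned above look roughly like this. This is only a sketch on tiny made-up data: the data frames `small` and `large` and the key column `id` are placeholders, not the actual Dominick's table or column names.

```r
# Placeholder data; the real tables have about 800 and about 3,990,000 rows
small <- data.frame(id = c(1L, 2L, 3L), label = c("a", "b", "c"))
large <- data.frame(id = c(1L, 1L, 2L, 4L), value = c(10, 20, 30, 40))

# all = TRUE: full outer join, keeps unmatched rows from both tables
merged_all <- merge(large, small, by = "id", all = TRUE)

# all.x = TRUE: keeps every row of the first table (large)
merged_left <- merge(large, small, by = "id", all.x = TRUE)

# all.y = TRUE: keeps every row of the second table (small)
merged_right <- merge(large, small, by = "id", all.y = TRUE)

# Compare row counts of the inputs and the results
c(large = nrow(large), small = nrow(small),
  all = nrow(merged_all), left = nrow(merged_left), right = nrow(merged_right))
```

Comparing the row counts of the result with those of the inputs is a cheap way to see whether a merge produced more rows than expected.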
So I tried it with the "dplyr" package. To my surprise, the new table was generated in about a second and all was good, UNTIL I tried viewing the table with View(table). Then the same thing happened and the same errors appeared again: the session crashed, data was lost, and the new table disappeared.
Is there any way I can deal with this? Should I simply not use View() on the data table, and use str() or something similar instead? Is this a common problem with datasets this large?
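To make the question concrete, here is a minimal sketch of the dplyr route followed by inspection without View(). It runs on tiny made-up data. The names `small`, `large`, and `id` are placeholders, and left_join() is only an assumption about which dplyr join fits, since the exact function used is not stated above.

```r
library(dplyr)

# Placeholder data standing in for the two real tables
small <- data.frame(id = c(1L, 2L, 3L), label = c("a", "b", "c"))
large <- data.frame(id = c(1L, 1L, 2L, 4L), value = c(10, 20, 30, 40))

# Assumed join type: keep every row of the large table
joined <- left_join(large, small, by = "id")

# Inspect the result in the console instead of opening the viewer
dim(joined)        # number of rows and columns
str(joined)        # column types and the first few values
head(joined, 10)   # first rows only
glimpse(joined)    # dplyr's compact column overview

# If the viewer is still wanted, it could be given a small slice only:
# View(head(joined, 100))
```

These console functions print only a summary or a handful of rows, which is why I am considering them as replacements for View() on the full table.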
Thank you for your help!
Best regards,
Emin