How to import and merge data in RStudio?

Hello Posit Members,

I am having some issues importing and merging data. I uploaded external data and imported it in the R script, but it does not allow me to import all of the data. I loaded the tidyverse library again and again and imported it, but now when I try to merge the data it shows some errors. I am uploading a screenshot. If there is any mistake in how I am importing the data, please tell me what I should do.

Thanks

This is likely the same issue as discussed here, arising from the RAM limitations of the cloud version. The desktop version on any reasonably provisioned laptop or desktop should overcome this. If installing the desktop version is not an option, you can try revising the script to minimize the memory load. Something along these lines:

# omit loading libraries for now
# import jan and feb
all_trips <- dplyr::bind_rows(jan, feb)
rm(jan, feb)
# import mar and apr
all_trips <- dplyr::bind_rows(all_trips, mar, apr)
rm(mar, apr)
# continue through end of year, removing each pair once it is bound
rm(nov, dec)
# save for reuse later
saveRDS(all_trips, file = "all_trips.Rds")
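
The pattern above can also be written as a loop. This is only a minimal sketch: the monthly file names and the two-column layout are invented so the example is self-contained, and it uses base R (`read.csv()` and `rbind()`) so it runs without extra packages. `dplyr::bind_rows()` can be swapped in for `rbind()`, as in the script above.

```r
# Hypothetical stand-in data: three small monthly CSV files in a temp folder.
data_dir <- file.path(tempdir(), "trips_demo")
dir.create(data_dir, showWarnings = FALSE)
months <- c("jan", "feb", "mar")
for (m in months) {
  write.csv(
    data.frame(ride_id = paste0(m, "_", 1:4), duration_min = c(5, 12, 7, 30)),
    file.path(data_dir, paste0(m, ".csv")),
    row.names = FALSE
  )
}

files <- file.path(data_dir, paste0(months, ".csv"))

all_trips <- NULL
for (f in files) {
  month_df <- read.csv(f)                  # import one month only
  all_trips <- rbind(all_trips, month_df)  # append it to the accumulated data frame
  rm(month_df)                             # drop the monthly copy right away
}

# save for reuse later
saveRDS(all_trips, file = file.path(data_dir, "all_trips.Rds"))
print(nrow(all_trips))
```

With this shape, only one month's data sits in memory next to the accumulated data frame at any time.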

I tried it, but it is also not working. I loaded the library as well, but it still does not work. What should I do?

How far did you get?

Based on your screenshot, it seems you did not have enough RAM to load all files in one go (as mentioned by @technocrat https://forum.posit.co/t/how-to-import-and-merge-data-in-r-studio/169088/2?u=yiy).

Would you like to try releasing some RAM after each load by calling the gc() function?

The RAM should be big enough to hold the entire final data frame. gc() may help if the OS cooperates (which doesn't always happen), but the operations shown don't seem likely to leave much mess that needs collection. More important, I think, is to eliminate the monthly data frames as soon as they are added to the growing accumulated data frame by passing them to rm().
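
As a small illustration, the sketch below creates a throwaway vector (a stand-in for one monthly data frame; the name and size are arbitrary), checks how much memory it takes, removes it with rm(), and then calls gc().

```r
tmp_month <- numeric(5e6)                   # stand-in for a large monthly data frame
print(object.size(tmp_month), units = "MB") # memory taken by this one object

rm(tmp_month)    # remove the name so the memory becomes reclaimable
invisible(gc())  # ask R to collect now; R also collects automatically when needed

print(exists("tmp_month"))  # FALSE: the object is gone from the workspace
```

The rm() call is what makes the memory reclaimable; gc() only asks R to do the collection immediately.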

Glad to see you here.

I am stuck on merging the data. I am sharing a screenshot.

That looks like a doubled underscore in bind_rows()

Thank you for helping; it was helpful.

I tried to import the datasets using the method below:

# import jan and feb
all_trips <- dplyr::bind_rows(jan,feb)
rm(jan,feb)
# import mar apr
all_trips <- dplyr::bind__rows(all_trips,mar,apr)
rm(mar,apr)
# continue through end of year, removing
rm(nov,dec)

But I can import only 6 months of datasets. After that, when I try to import more, it shows that the RAM is full, and the datasets I had already imported are also removed automatically. What should I do in this situation?

You should use a single underscore.

Yes, I have corrected this issue. But what I described in my previous post is the real issue. Please help me resolve it.

Maybe read it in using sqldf?

Reading large data files in R • Bart Aelterman (inbo.github.io)

Adding on to @williaml's suggestion, you may need to consider a database solution. You might have a look at the duckdb package on CRAN and the DuckDB database. It seems to be getting some good reviews.
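
For a rough idea of what the database route might look like, here is a minimal sketch, assuming the DBI and duckdb packages are installed. The folder, file names, table name and columns are hypothetical stand-ins. The idea is that DuckDB reads the CSV files into an on-disk database itself, so the full data set does not have to sit in R's memory and only query results come back to R.

```r
library(DBI)

# Hypothetical stand-in data: two small monthly CSV files.
data_dir <- file.path(tempdir(), "trips_duckdb_demo")
dir.create(data_dir, showWarnings = FALSE)
for (m in c("jan", "feb")) {
  write.csv(
    data.frame(ride_id = paste0(m, "_", 1:3), duration_min = c(5, 12, 7)),
    file.path(data_dir, paste0(m, ".csv")),
    row.names = FALSE
  )
}

# On-disk database file, so the table is stored outside R's memory.
db_file <- file.path(data_dir, "trips.duckdb")
con <- dbConnect(duckdb::duckdb(), dbdir = db_file)

# Let DuckDB read every CSV in the folder straight into one table.
csv_glob <- file.path(normalizePath(data_dir, winslash = "/"), "*.csv")
dbExecute(con, "DROP TABLE IF EXISTS all_trips")
dbExecute(con, sprintf(
  "CREATE TABLE all_trips AS SELECT * FROM read_csv_auto('%s')", csv_glob
))

# Only the small summary result comes back into R.
summary_df <- dbGetQuery(
  con,
  "SELECT COUNT(*) AS n_trips, AVG(duration_min) AS mean_min FROM all_trips"
)
print(summary_df)

dbDisconnect(con, shutdown = TRUE)
```

Whether this fits within the cloud version's limits would still need to be checked; the sketch only shows the general shape of the approach.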

I assume that this is for the Google Data Analytics Certificate capstone project. If so, I suggest you post a new topic with that in the title. Hopefully people who were successful importing the data will have some sage advice.

Okay, thank you for the suggestion. I will create a new post.

Thank you all for helping me.
