Which AWS instance to run RStudio on for a text mining project?

Apologies if this is not the right place, and for the rather general question, but I was wondering whether anyone here can share their experience with selecting instances to run RStudio on AWS.

I am working on a (by my standards) fairly extensive text mining project (lots of regexes, one file of around 300 MB). Running it on my laptop (16 GB RAM, i5-8xx) is possible, but slow.

Aside from trying to optimize the code, I would like to try running RStudio in the cloud on a more powerful machine. As far as I can see, AWS provides a rather convenient service for this. However, one has to know which instance best serves one's purpose. Since I am pretty new to AWS, I was wondering whether anyone has experience with AWS instances and RStudio. Here is a link to the myriad of EC2 instances

What are people using? What would you recommend?

Many thanks! And apologies for the rather general question.

I think you can answer your own question once you identify which computational resource is limiting you.

For example, if your 16 GB of RAM is not enough, choose an instance with more RAM. If RAM is fine but the process is slow because your code is parallelized and you have too few cores, choose an instance with more cores, and so on.
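To make that decision rule concrete, here is a rough sketch in Python. The function name, the 90% memory threshold, and the hint texts are my own illustrative assumptions, not AWS guidance. The inputs are numbers you would read off your system monitor while the job runs on the laptop.

```python
# Hypothetical helper: thresholds and wording are assumptions for illustration.
HINTS = {
    "memory": "look at memory optimized instances (e.g. the r family)",
    "cores": "look at instances with more vCPUs (e.g. the c family)",
    "single-thread": "more cores will not help until the code is parallelized",
}


def suggest_bottleneck(peak_mem_gb, total_ram_gb, busy_cores, total_cores,
                       mem_threshold=0.9):
    """Guess which resource limits a job from rough monitoring numbers."""
    if total_ram_gb <= 0 or total_cores <= 0:
        raise ValueError("total_ram_gb and total_cores must be positive")

    # Memory use close to the ceiling: swapping is the likely cause of slowness.
    if peak_mem_gb / total_ram_gb >= mem_threshold:
        return "memory"

    # Every core is busy: parallel code that could use more cores.
    if busy_cores >= total_cores:
        return "cores"

    # Only some cores are busy: the job is effectively serial.
    return "single-thread"


if __name__ == "__main__":
    label = suggest_bottleneck(peak_mem_gb=6, total_ram_gb=16,
                               busy_cores=1, total_cores=4)
    print(label, "->", HINTS[label])
```

Regex-heavy code often ends up in the last case, which is worth knowing before paying for a large multi-core instance.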

The nice thing is that once you have set up your AMI, you can change the EC2 instance type by simply changing a setting (with the instance stopped), without having to reinstall anything. That way you can keep trying until you get the desired performance within your budget.
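If you prefer to script the switch rather than click through the console, a minimal sketch using boto3 (the AWS SDK for Python) could look like this. It assumes an EBS-backed instance and configured AWS credentials. The function name, the instance ID, and the target type in the usage comment are placeholders of mine.

```python
def change_instance_type(ec2, instance_id, new_type):
    """Stop an EBS-backed instance, switch its type, and start it again.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # The instance type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Only the hardware changes; what is installed on the EBS volume stays.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Usage (needs boto3 and AWS credentials; ID and type are placeholders):
# import boto3
# change_instance_type(boto3.client("ec2"), "i-0123456789abcdef0", "r5.xlarge")
```

Keep in mind that the public IP address usually changes after a stop and start unless you attach an Elastic IP, so the address you use to reach RStudio may be different afterwards.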

Great. Many thanks. I didn't know that I could change the instance type without reinstalling everything.
