Could anyone please help me? I have RStudio on my local machine and am running into memory issues when reading a huge data file. I'm planning to use SparkR, but first I'd like to know the following:
1. Is there any difference between sparklyr and SparkR?
2. Since both are used for Spark integration from R, when should each package be used?
3. Since both have active communities, which one is the better choice?
4. Are there any limitations or challenges with either of them?

If anyone could provide details on this, that would be great.

Things may have changed since May, but I thought I'd point to @kevinushey's outline of the differences and motivations here:

On this subject, there is another topic on the community that may have interesting information for you.

Also, since the two topics are related, it is good to link them.

I recently wrote a blog post comparing sparklyr and SparkR on a range of criteria.

The table at the top of the post provides a rough overview of my conclusions. Essentially, sparklyr is already nicer to use than SparkR, even at this early stage of its development.


Great job on the article! I recently read it and shared it internally. Thanks!

