Trying to get the gist of how the Apache Arrow project can be used in R.
I see it has already been implemented in sparklyr, but we don't use Spark.
From everything I've read, it has many benefits, including these:
- efficient memory representation. R is quite memory-hungry - could Arrow be used as the backend for arbitrary R objects?
- Gandiva - language-agnostic data querying, i.e. once the data is in memory in Arrow format, non-R libraries can be used to compile/execute the query
- querying across larger-than-memory datasets
Does anyone know what work there is to integrate Arrow beyond sparklyr?
Calvin