How to Read Huge Files With R

If you have gigabytes of data and run into import problems because memory is limited, this is the same problem I faced many years ago. Adding more memory is the ultimate and best solution, but I want to share some other approaches as well.

  1. Loading: read the file in chunks and store each loaded chunk on disk, in a format that is faster to read than CSV (for example a database, data.table, or RData).
  2. Analysis: if memory is already a limit when loading, analyzing the data in memory will be hard too. You can put the data into a database and use dbplyr to analyze it, combining the power of the database with the strengths of R, working in chunks.
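
The loading step can be sketched in base R as follows. This is only a minimal illustration, not a definitive implementation: the demo CSV, the `chunk_size` value, and the `chunk_%05d.rds` file naming are assumptions made for the example, and reading line by line assumes that no quoted field contains an embedded newline.

```r
# Hypothetical demo file; replace csv_path with the path to your real CSV.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:25, value = (1:25) * 2), csv_path, row.names = FALSE)

chunk_size <- 10L                 # rows per chunk; tune to the memory you have
out_dir <- tempfile("chunks_")    # hypothetical output directory for the chunks
dir.create(out_dir)

con <- file(csv_path, open = "r")
# Read the header line once and strip the quotes that write.csv adds.
header <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])

chunk_files <- character(0)
repeat {
  lines <- readLines(con, n = chunk_size)   # only chunk_size rows are in memory
  if (length(lines) == 0) break             # end of file reached
  chunk <- read.csv(text = lines, header = FALSE, col.names = header)
  path <- file.path(out_dir, sprintf("chunk_%05d.rds", length(chunk_files) + 1L))
  saveRDS(chunk, path)                      # RDS reloads faster than re-parsing CSV
  chunk_files <- c(chunk_files, path)
}
close(con)

# Later analysis can visit one chunk at a time instead of the whole file.
total <- sum(vapply(chunk_files,
                    function(f) as.numeric(sum(readRDS(f)$value)),
                    numeric(1)))
```

Each chunk is parsed once and saved in R's binary format, so later passes over the data skip the CSV parsing cost.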

That's the approach. R plus a powerful database (such as PostgreSQL) can handle data of any size, as long as it is not larger than the disk space.
The disadvantage is slower computation, since data has to be moved from disk to memory.
The advantage is that you can persist your intermediate analysis results and reuse them, which makes the work reproducible.
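
A rough sketch of the database route might look like this. It assumes the DBI, RSQLite, dplyr, and dbplyr packages are installed, and it uses an in-memory SQLite database only as a stand-in for PostgreSQL so that the example is self-contained. The table names `measurements` and `grp_summary` and the demo data are hypothetical.

```r
library(DBI)
library(dplyr)   # dbplyr must be installed; dplyr uses it for database tables

# Assumption: SQLite stands in for PostgreSQL here. For a real server you
# would pass a different driver and connection details to dbConnect().
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical demo data, appended to the database 25 rows at a time,
# the same way chunks read from a large file would be appended.
demo <- data.frame(grp = rep(c("a", "b"), times = 50), value = 1:100)
row_groups <- split(seq_len(nrow(demo)), ceiling(seq_len(nrow(demo)) / 25))
for (idx in row_groups) {
  dbWriteTable(con, "measurements", demo[idx, ], append = TRUE)
}

# dplyr verbs on a database table are translated to SQL by dbplyr,
# so the aggregation runs inside the database, not in R's memory.
summary_query <- tbl(con, "measurements") %>%
  group_by(grp) %>%
  summarise(total = sum(value, na.rm = TRUE), n = n())

# Only the small aggregated result is pulled into R.
summary_df <- collect(summary_query)

# Persist the intermediate result so later steps can reuse it.
dbWriteTable(con, "grp_summary", summary_df, overwrite = TRUE)
result <- dbReadTable(con, "grp_summary")

dbDisconnect(con)
```

Saving the summary back as its own table is one way to keep intermediate results on disk for reuse. With a file-backed or server database, they survive across R sessions.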

That's all. I hope it is useful for your case.

WangYong
