How to free memory from overwritten .RDA variable?

Hey guys,

I have to load some really big data frames stored as .rda files. In doing so, I noticed the following behavior on my RStudio Server (and the same behavior on a local RStudio installation).

Whenever I load multiple variables, the memory does not seem to be freed after deleting a variable with rm() and calling gc(). Example:

load("path_to_2gb.rda")
rm(data)
gc()
load("path_to_4gb.rda")

At this point, the session still has over 6 GB of RAM allocated. My expectation would be roughly 4 GB, since the first loaded variable is no longer required.

Is this known behavior, or is there another way to free memory? I really need to load all my data sequentially, but I don't have enough RAM to hold it all at once.
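
In case it matters for reproducing this: load() invisibly returns the names of the objects it creates, so rm() can target exactly what each file brought in rather than a hard-coded name (the file paths are placeholders):

loaded <- load("path_to_2gb.rda")   # e.g. returns "data"
rm(list = loaded)                   # drop everything that file created
gc()                                # memory is now free as far as R is concerned
loaded <- load("path_to_4gb.rda")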

Can you post your output if you do the following?

gc(full = TRUE, verbose = TRUE)
load("path_to_2gb.rda")
gc(full = TRUE, verbose = TRUE)
rm(data)
gc(full = TRUE, verbose = TRUE)
> gc(full = TRUE,verbose = TRUE)
Garbage collection 16 = 12+1+3 (level 2) ... 
43.7 Mbytes of cons cells used (60%)
14.0 Mbytes of vectors used (22%)
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  817740 43.7    1368111 73.1  1368111 73.1
Vcells 1833753 14.0    8388608 64.0  2206687 16.9
> load("path_to_4gb_file.rda")
> gc(full = TRUE,verbose = TRUE)
Garbage collection 36 = 12+1+23 (level 2) ... 
62.0 Mbytes of cons cells used (56%)
4099.9 Mbytes of vectors used (75%)
            used   (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells   1160025   62.0    2080672  111.2   1368111   73.1
Vcells 537370475 4099.9  712269123 5434.2 537370640 4099.9
> rm(data)
> gc(full = TRUE,verbose = TRUE)
Garbage collection 37 = 12+1+24 (level 2) ... 
43.7 Mbytes of cons cells used (39%)
14.0 Mbytes of vectors used (0%)
          used (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells  817734 43.7    2080672  111.2   1368111   73.1
Vcells 1833743 14.0  569815299 4347.4 537370640 4099.9

After the end of the code snippet:
[screenshot of session memory usage]

OK, it's what I expected. The current used is just 14 MB, but the max used recorded was 4099.9 MB.
I think the situation is something like this: R gets memory from the OS so that it can allocate it to your objects.
R tracks your objects and will garbage-collect them when you drop them, but it won't release that memory back to your OS; it retains it for future objects (so it doesn't need to go beg your OS for more memory).
So this is a double-edged sword.
The good: you can expect to load another file of equivalent size successfully, because you really did release the memory as far as R is concerned; the used (Mb) column went from 14 MB up to 4099.9 MB and back down to 14 MB, so going back up to 4099.9 MB should be no issue.
The bad: it's not released as far as the OS is concerned, so if you are running other applications outside of R, they won't get the memory that R is hanging onto for you...
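
You can reproduce that high-water-mark behaviour in miniature without any big files (the allocation size here is made up for illustration):

x <- numeric(1e8)   # ~800 MB of doubles
rm(x)
gc()                # "used" drops back down, but "max used" keeps the peak
gc(reset = TRUE)    # resets the "max used" statistic; it does not force memory back to the OS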

I'm not an expert here and would be happy for someone who is to clarify, and also to speak up if there are methods to force R to release memory back to the OS without closing or restarting the session...
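
One workaround I do know of (a sketch, not a full answer — it assumes the callr package is installed and that each .rda file holds a single object you can boil down to a small result) is to do each load in a short-lived child R process: when the child exits, the OS reclaims everything it allocated.

library(callr)

# Do the heavy load in a throwaway child process; only the small
# summary crosses back into the parent session.
summarise_rda <- function(path) {
  callr::r(function(p) {
    obj_name <- load(p)        # load() returns the names it created
    df <- get(obj_name)
    list(file = basename(p), rows = nrow(df), cols = ncol(df))
  }, args = list(path))
}

res <- lapply(c("path_to_2gb.rda", "path_to_4gb.rda"), summarise_rda)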

That would mean that as soon as the R session reaches the system's RAM limit, all available RAM is reserved for that R session and cannot be used by other processes until the session is terminated?

This forum has a minimum post length of 20 characters;
to answer your question: yes
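
(For what it's worth, the standard workaround is to restart the R session when you need the memory back: in RStudio that's Session > Restart R, or programmatically, assuming the rstudioapi package is available:

rstudioapi::restartSession()   # replaces the R process, so the OS reclaims its memory

Since the old process exits, everything it was holding goes back to the OS.)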
