Preserve and restore data in R

Hi Everyone,

I am new to RStudio. I would like to ask how to use the preserve and restore functions. I am cleaning and editing a data frame (old_data) and need to save the result as a new data frame (new_data). However, the data in memory (old_data) should remain unchanged after saving. I am using this code:

library(genvar)

use(old_data, clear=TRUE)
p <- preserve()
old_data <- subset(old_data, systemsizedc >= 10,)
setwd("~/Dropbox/Rlang/03_raw_data")
save.image(file = "newdata.RData")
restore(p, replace=TRUE)

But the old_data frame changes and is not preserved.

Thank you for your help

You can assign a new variable for the data and run that through cleaning. Then the new variable will point to a clean version, but old_data will be unchanged.

new_data <- subset(old_data, systemsizedc >= 10)
setwd("~/Dropbox/Rlang/03_raw_data")
save(new_data, file = "newdata.RData")

The save.image function is more like a "pause" during work. It saves every object in your session. If you want to save specific objects, use the save or saveRDS functions.
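To make the difference concrete, here is a minimal sketch. The small data frame is a made-up stand-in for old_data (only the systemsizedc column name comes from the question), and the files go to temporary paths purely for illustration:

```r
# Hypothetical stand-in for old_data; only the column name is from the question
old_data <- data.frame(id = 1:5, systemsizedc = c(4, 12, 9, 25, 10))
rows_before <- nrow(old_data)

# Filtering into a new name leaves old_data untouched
new_data <- subset(old_data, systemsizedc >= 10)
stopifnot(nrow(old_data) == rows_before)

# save() stores named objects; load() brings them back under the same names
rdata_path <- tempfile(fileext = ".RData")  # temporary path, illustration only
save(new_data, file = rdata_path)

# saveRDS() stores a single object; readRDS() returns it, so you pick the name
rds_path <- tempfile(fileext = ".rds")
saveRDS(new_data, file = rds_path)
reloaded <- readRDS(rds_path)
stopifnot(isTRUE(all.equal(reloaded, new_data)))
```

With saveRDS you decide what the object is called when you read it back, which avoids accidentally overwriting something that already exists in your session.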

I could be mistaken, but are you a Stata user? I have the impression that you are trying to replicate Stata's preserve and restore commands. If that is the case, I would like to complement @nwerth's answer to clarify some important differences between the way you store data in R and Stata.

First, unlike with Stata, you can have multiple data frames (or objects) loaded at once in R. This means that you can perform any transformation you want to a data frame and give it a new name, like so:

setwd("~/Dropbox/Rlang/03_raw_data")

old_data <- iris # iris comes with R
new_data <- subset(old_data, Sepal.Length > 5)

# And you can save them wherever you want
save(old_data, file = "old_data.RData")
save(new_data, file = "new_data.RData")

# Both old_data and new_data remain available in your environment and
# you can continue to work with either of them. For example, to print the
# first six observations:
head(old_data)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa
head(new_data)
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1           5.1         3.5          1.4         0.2  setosa
#> 6           5.4         3.9          1.7         0.4  setosa
#> 11          5.4         3.7          1.5         0.2  setosa
#> 15          5.8         4.0          1.2         0.2  setosa
#> 16          5.7         4.4          1.5         0.4  setosa
#> 17          5.4         3.9          1.3         0.4  setosa

In Stata, the same operations may look something like this:

cd "~/Dropbox/Rlang/03_raw_data"
use http://www.stata-press.com/data/r10/iris.dta, clear

preserve
  keep if seplen > 5
  save "new_data.dta"
restore

save "old_data.dta"

list in 1/6 
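If you really do want to keep the preserve/restore workflow in R, one rough equivalent is to hold a copy under another name while you edit. This is only a sketch in base R: the name snapshot is made up for illustration, the output goes to a temporary file, and none of this is the genvar API.

```r
old_data <- iris        # iris ships with R
snapshot <- old_data    # "preserve": keep a copy under another (hypothetical) name

# Edit old_data in place, as `keep if` would in Stata
old_data <- subset(old_data, Sepal.Length > 5)
out_path <- tempfile(fileext = ".rds")  # temporary path, illustration only
saveRDS(old_data, out_path)

old_data <- snapshot    # "restore": put the original back
rm(snapshot)
stopifnot(identical(old_data, iris))
```

In practice, assigning the filtered result to a new name is simpler, because then the original never has to be restored.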

Hi Sir,

Thank you for your reply.

Indeed, I have been a Stata user for 5 years. You are right, I was trying to mimic the same Stata commands in R.

However, your answer is clear and helpful!

Best regards,
Al

Hi,

Many thanks for your tips! I tried that and it worked better.

You're welcome. FYI I'm a woman :slight_smile:

