Hello,
Some time ago I asked this same question with a Stata file as the source.
Stata can read the variable names and then import only the selected ones.
For example, if the dta file has 500 columns, including the variables SEX and AGE, I can import only those two columns.
Only those two columns are read and loaded into memory.
This is very fast when you are looping through big datasets.
Is it possible to do the same thing with an RData file?
I mean, importing only the specified columns/variable names inside a loop?
I read the help file and it doesn't show that capability.
If I write a loop over RData files, I always end up reading and importing all the variables.
Using select from dplyr doesn't improve the processing time, because I'm still forced to read many variables that aren't relevant.
I hope I made myself clear. It's a strange question in some way.
Thanks for your time and interest.
Have a nice day.
What you suggested is a good option.
But sometimes DTA files come with labels.
CSV files are a poor source in that respect, since they can't store labels.
I recently read that haven can read only selected columns with its col_select option.
But there is no such option for RData files yet.
I joined 12 dta files of 80 MB each.
An RData file holding all of them amounts to only 70 MB.
I just solved it by using CSV or DTA files instead.
RData files don't support anything like the col_select or select options.
data.table::fread (with select) or haven::read_dta (with col_select) can perform the desired task.
fread was the best option.
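For anyone who finds this later, here is a minimal sketch of the column-selective reads described above. The table, its column names other than SEX and AGE, and the temporary file paths are made up for illustration. The haven part only runs if haven is installed and recent enough to have col_select.

```r
library(data.table)

# Hypothetical wide table standing in for the real data (illustration only)
wide <- data.table(
  ID     = 1:5,
  SEX    = c(1L, 2L, 2L, 1L, 2L),
  AGE    = c(34L, 51L, 27L, 45L, 62L),
  INCOME = c(1200, 1850, 990, 2100, 1500)
)

# CSV route: write the table once, then read back only SEX and AGE
csv_path <- tempfile(fileext = ".csv")
fwrite(wide, csv_path)
subset_csv <- fread(csv_path, select = c("SEX", "AGE"))

# DTA route: same idea with haven, which keeps Stata labels when present
if (requireNamespace("haven", quietly = TRUE)) {
  dta_path <- tempfile(fileext = ".dta")
  haven::write_dta(wide, dta_path)
  subset_dta <- haven::read_dta(dta_path, col_select = c(SEX, AGE))
}
```

In a loop over many files, the same select or col_select argument can be passed on each iteration, so the unwanted columns never get loaded into memory.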
Thanks for your ideas.