I know how to import a CSV file into R, but the file has no variable or value labels (codes). How can I import a file into R together with its variable and value labels, similar to an SPSS .sav file?
Should I have two CSV files, one for the data and one for the metadata? Or a single CSV file containing both data and metadata?
How do I relate the data to the metadata?
For example, imagine that one CSV file has a column called gender with values 1 and 2, and another CSV file holds the codes: 1 -> male, 2 -> female.
It seems that you have the authority to define the structure of the raw data. Congratulations, this can save you some hassle!
There are multiple ways to do this; here is my personal recommendation: structure first!
You may want to have one file per dataset, with all the data and all related metadata in it.
The "classical structure" is one rectangular table with 1 variable per column and 1 entity per row.
The first row holds the variable names.
You may also want to look up the difference between long and wide data sets beforehand. In a nutshell, "long" is easier for machines and "wide" is easier for human readers, but each can be converted into the other.
This rectangular structure is easy to import and to process.
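To illustrate the long and wide layouts mentioned above, here is a minimal sketch using base R's reshape(). The data frame, its column names (id, score_2020, score_2021) and its values are made up for illustration only.

```r
# Hypothetical wide table: one row per entity, one column per measurement.
wide <- data.frame(
  id         = 1:3,
  score_2020 = c(10, 20, 30),
  score_2021 = c(11, 21, 31)
)

# Convert to long: one row per entity and year.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("score_2020", "score_2021"),  # columns to stack
  v.names   = "score",                        # name of the stacked value column
  timevar   = "year",                         # new column identifying the source column
  times     = c(2020, 2021),
  idvar     = "id"
)

print(long)
```

reshape() also accepts direction = "wide" for the reverse conversion; packages such as tidyr offer alternatives if you prefer them.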
Have fun!
My raw data is in several Excel files, some for data and others for metadata. The simplest solution is to import the data and metadata into SPSS, but I wanted to know whether there is a similar solution in R.
Here is an example using two CSV files. However, I have a problem with the value labels.
> data
# A tibble: 6 × 2
  se    ctr
  <chr> <chr>
1 1     1
2 1     2
3 2     3
4 2     2
5 1     1
6 2     3
> metadata
# A tibble: 2 × 3
  var   var_label val_lab
  <chr> <chr>     <chr>
1 se    sex       (1,'Female'),(2,'Male')
2 ctr   country   (1,'UK'),(2,'USA'),(3,'France')
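One possible way to apply the value labels, sketched in base R under some assumptions: the val_lab strings always follow the (code,'label') pattern shown above, value labels are represented as factor levels, and the variable label is stored in a "label" attribute (a convention some packages also use, not something base R enforces). The helper parse_val_lab() is hypothetical and written only for this example.

```r
# Data and metadata as printed above, rebuilt as plain data frames.
data <- data.frame(
  se  = c("1", "1", "2", "2", "1", "2"),
  ctr = c("1", "2", "3", "2", "1", "3"),
  stringsAsFactors = FALSE
)
metadata <- data.frame(
  var       = c("se", "ctr"),
  var_label = c("sex", "country"),
  val_lab   = c("(1,'Female'),(2,'Male')",
                "(1,'UK'),(2,'USA'),(3,'France')"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split "(1,'Female'),(2,'Male')" into codes and labels.
parse_val_lab <- function(x) {
  # Extract each "(code,'label')" pair from the string.
  pairs <- regmatches(x, gregexpr("\\(([^,]+),'([^']*)'\\)", x))[[1]]
  data.frame(
    code  = trimws(sub("^\\(([^,]+),.*$", "\\1", pairs)),
    label = sub("^\\([^,]+,'([^']*)'\\)$", "\\1", pairs),
    stringsAsFactors = FALSE
  )
}

labelled_data <- data
for (i in seq_len(nrow(metadata))) {
  v   <- metadata$var[i]
  map <- parse_val_lab(metadata$val_lab[i])
  # Value labels: recode the column as a factor with codes mapped to labels.
  labelled_data[[v]] <- factor(data[[v]], levels = map$code, labels = map$label)
  # Variable label: keep it as a "label" attribute on the column (assumed convention).
  attr(labelled_data[[v]], "label") <- metadata$var_label[i]
}

print(labelled_data)
```

Codes that do not appear in the metadata become NA in this sketch, so it is worth checking for unexpected NA values afterwards. If you need SPSS-style labelled vectors that keep the numeric codes, the haven package's labelled() class may be worth a look.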