Hello, I'm new to the R community and am trying to use R to prepare some data for analysis. I have two slightly different Excel spreadsheets, which I have imported into RStudio without problems. Now I want to make the two data frames consistent before I merge them for analysis. I'm using the code below in RStudio:
#add missing column to data1
missing_in_data1 <- setdiff( colnames(data2), colnames(data1)) for (col in missing_in_data1) { data1[[col]] <- NA }
And I keep getting this error in RStudio:
Error: unexpected 'for' in "missing_in_data1 <- setdiff(colnames(data2),colnames(data1)) for"
I have crosschecked the code many times and rewritten it over and over and keep getting the same error message.
What am I doing wrong? Please help me, I'm on a deadline.
It is difficult to tell. The code does not seem to make sense in R terms.
This part will work, though I am not sure it is doing what you want:
setdiff( colnames(DT), colnames(DT1))
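However, you cannot use for as you are trying to do, on the same line as the assignment. R needs a line break (or a semicolon) between the two statements.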
It looks like you are trying to add a column of "NA" to a file but I don't think this is how you want to go about it. I think you may be over-complicating things. It is easy to do when beginning to use R.
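For what it's worth, here is a minimal sketch of how the loop could be laid out so that it parses. It assumes data1 and data2 are ordinary data frames; the toy data frames and their column names (id, age, city) are made up for illustration and are not your real data.

```r
# Toy data frames standing in for the real spreadsheets (hypothetical data)
data1 <- data.frame(id = 1:3, age = c(25, 31, 47))
data2 <- data.frame(id = 4:5, age = c(52, 38), city = c("Leeds", "York"))

# Columns present in data2 but absent from data1
missing_in_data1 <- setdiff(colnames(data2), colnames(data1))

# The for loop starts on its own line, separate from the assignment above
for (col in missing_in_data1) {
  data1[[col]] <- NA  # new column filled with NA
}

# Same treatment in the other direction, so both frames share all columns
missing_in_data2 <- setdiff(colnames(data1), colnames(data2))
for (col in missing_in_data2) {
  data2[[col]] <- NA
}

# Check that both data frames now have the same set of column names
setequal(colnames(data1), colnames(data2))
```

Whether filling the gaps with NA is the right thing to do before merging depends on the real data, which is why seeing its structure would help.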
I think what we need is some sample data and a clear description of the problem. If the data files are not too large and are not confidential, the best thing to do is to upload them here.
Perhaps the best way to supply data is to use the dput() function. Run dput(data1), where "data1" is the name of your dataset. For really large datasets, dput(head(data1, 100)) is probably enough. Paste the output here between
```
```
Repeat for "data2".
If the data is confidential, just mock up a couple of data sets with the same structure as data1 and data2. We don't really need the real data; what we need is the structure of the two data sets.
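As a rough illustration of what a mock-up could look like, the sketch below builds two small made-up data frames and prints them with dput(). The column names (id, score, group) and values are invented; swap in names and types that match your own spreadsheets.

```r
# Made-up data frames with a similar kind of structure (hypothetical columns)
data1 <- data.frame(id = 1:3, score = c(10.5, 8.2, 9.9))
data2 <- data.frame(id = 4:5, score = c(7.1, 6.4), group = c("a", "b"))

# dput() prints R code that others can paste to recreate the object
dput(data1)

# head() keeps only the first rows, which limits the output for large data
dput(head(data2, 100))
```

Anyone reading the thread can then assign the pasted output to a name and work with the same structure you have.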