Hi, I would like to create a new column in my dataset X and fill this column with values from dataset Y. Two conditions have to be met (same region and same date). I try to do this by forloop: for (i in length(X[,1])) { X$Newcolumn <- sapply(1:nrow(X), function(i) Y$Value[Y$region==X$region[i] & Y$date==X$date[i]]) }
But I got this error message: Error in Ops.factor(Y$date, X$date[i]) : level sets of factors are different date in both datasets has the same format ymd.
What does it mean? I don`t understand why should factors be the same? How can I fix it please? Thanks a lot!
To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:
Well, you could programmatically generate a large dataset, but nevertheless, I don't see the need to reinvent the wheel, instead of a for-loop you could simply use dplyr::left_join() or even base::merge(), see this example.
library(dplyr)
Y <- data.frame(
date = c("2005-01-01", "2005-01-01", "2005-01-01", "2005-01-02", "2005-01-02"),
region = c(0, 1, 2, 0, 1),
Value = c(12, 12, 11, 5, -8)
)
X <- data.frame(
date = c("2005-01-01", "2005-01-01", "2005-01-01", "2005-01-01", "2005-01-01", "2005-01-02", "2005-01-02"),
region = c(0, 0, 1, 2, 3, 0, 1)
)
X %>%
left_join(Y, by = c("date", "region"))
#> date region Value
#> 1 2005-01-01 0 12
#> 2 2005-01-01 0 12
#> 3 2005-01-01 1 12
#> 4 2005-01-01 2 11
#> 5 2005-01-01 3 NA
#> 6 2005-01-02 0 5
#> 7 2005-01-02 1 -8
Created on 2020-01-18 by the reprex package (v0.3.0.9000)