I have two tbls based on a odbc connection:
> glimpse(order_discrepancy)
Observations: ??
Variables: 6
$ cqfescioqy.shopper_id <chr> "1234", "5678", "9101112", "12141516", "65758946", "8756745346", "76543456", "8765", "10536771", "98765", "264957...
$ cqfescioqy.tos_signed_date <date> NA, 2015-03-19, NA, 2016-01-21, NA, NA, 2015-10-07, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2016-10-03, NA, NA, NA, NA, 2017-02-2...
$ cqfescioqy.order_date <date> 2018-12-07, NA, 2018-12-07, NA, 2018-12-07, 2018-12-07, NA, 2018-09-17, 2018-12-09, 2018-12-09, 2018-12-08, 2018-12-10, 2018-12-10, 2...
$ cqfescioqy.date_diff_days <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ cqfescioqy.inorder <lgl> TRUE, NA, TRUE, NA, TRUE, TRUE, NA, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, NA, TRUE, TRUE, TRUE, TRUE, NA, ...
$ cqfescioqy.intos <lgl> NA, TRUE, NA, TRUE, NA, NA, TRUE, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, TRUE, NA, NA, NA, NA, TRUE, NA, NA, NA, NA, NA, NA, TRUE...
and
> glimpse(merges)
Observations: ??
Variables: 13
$ okmalpjbbd.merge_progressid <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,...
$ okmalpjbbd.gainingid <chr> "5843743", "5843743", "3779857", "3779857", "4899246", "4899246", "4377979", "4377979", "2972481", "2972481", "3720865", "3720865"...
$ okmalpjbbd.losingid <chr> "5626534", "5626534", "5625603", "5625603", "5755277", "5755277", "1683489", "1683489", "2976089", "2976089", "5380632", "5380632"...
These are tbls pre collect()
. I do not know why the variable names are prepended with cqfescioqy and okmalpjbbd. If I apply regular dplyr functions such as filter, I can just type the regular feature name and ignore these prepended field strings e.g. merges %>% filter(losingid = 5626534) # works
However, I'm trying to join the two tbls like so:
order_discrepancy <- order_discrepancy %>% left_join(merges, by = c("shopper_id" = "losingid"))
This gives an error:
"Error: by
can't contain join column shopper_id
which is missing from LHS. "
However, using glimpse() I can see that it does exist on the left hand side.
I tried adding explain() after my left join but received the same error.
I know that the behavior of dplyr will differ with TBLs based on a live connection to our database, however this looks like a fairly standard task so presumably I'm not doing it right. I cannot tell if the fact that glimpse shows the feature names prepended with these odd strings is relevant to this problem or not.
How can I left join these twi TBLs on shopper_id = losingid?