ply
1
I have two data frames with the same structure, but the second one may contain updated values as well as new records.
How can I display which rows are new and which rows have been updated?
Example: two data frames, where the second has one new row added and one column value changed in an existing row.
a1 <- structure(list(
key = c("1", "2", "3"),
town = c("Crewe", "Sandbach", "Middlewich"),
area = c("Cheshire","Cheshire", "Cheshire"),
total_pop = c(100, 400, 120)),
row.names = c(NA, -3L),
class = "data.frame")
a2 <- structure(list(
key = c("1", "2", "3","4"),
town = c("Crewe", "Sandbach", "Middlewich","Nantwich"),
area = c("Cheshire","Cheshire", "Cheshire","Cheshire"),
total_pop = c(100, 400, 100,200)),
row.names = c(NA, -4L),
class = "data.frame")
cheers
When I want to see differences, I usually reach for waldo:
waldo::compare(a1,a2)
Unfortunately this forum doesn't colourise output the way waldo does. In the console the colours highlight the differences, which you won't see in the text below:
`attr(old, 'row.names')`: 1 2 3
`attr(new, 'row.names')`: 1 2 3 4
`old$key`: "1" "2" "3"
`new$key`: "1" "2" "3" "4"
`old$town`: "Crewe" "Sandbach" "Middlewich"
`new$town`: "Crewe" "Sandbach" "Middlewich" "Nantwich"
`old$area`: "Cheshire" "Cheshire" "Cheshire"
`new$area`: "Cheshire" "Cheshire" "Cheshire" "Cheshire"
`old$total_pop`: 100 400 120
`new$total_pop`: 100 400 100 200
ply
3
Thanks for that. It's nice, but the formatting leaves a lot to be desired.
Is there a way of exporting these differences rather than trying to work them out from the console?
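library(tidyverse)
# rows of a2 that have no exact match in a1 (i.e. new or updated rows)
dplyr::setdiff(a2, a1) %>%
  mutate(key_in_first = key %in% pull(a1, key)) # TRUE = updated, FALSE = new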
This returns the rows of the second data frame that differ from the first. You can then tell additions from updates by the key: if the key already exists in a1 the row is an update, otherwise it is a new record.
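To get something exportable, one option is to turn that result into a small report and write it to a file. The following is only a sketch, assuming `key` uniquely identifies a row in both data frames. The `status` column, the `_old` suffix and the `out_file` path are names made up for this illustration, and the data is rebuilt with data.frame() so the block runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt so this block is self-contained
a1 <- data.frame(
  key = c("1", "2", "3"),
  town = c("Crewe", "Sandbach", "Middlewich"),
  area = "Cheshire",
  total_pop = c(100, 400, 120),
  stringsAsFactors = FALSE
)
a2 <- data.frame(
  key = c("1", "2", "3", "4"),
  town = c("Crewe", "Sandbach", "Middlewich", "Nantwich"),
  area = "Cheshire",
  total_pop = c(100, 400, 100, 200),
  stringsAsFactors = FALSE
)

changes <- dplyr::setdiff(a2, a1) %>%
  # label each differing row by whether its key already existed in a1
  mutate(status = if_else(key %in% a1$key, "updated", "new")) %>%
  # attach the previous values; they are NA for rows that are new
  left_join(a1, by = "key", suffix = c("", "_old"))

# out_file is a hypothetical location; point it wherever the report should go
out_file <- file.path(tempdir(), "a2_changes.csv")
write.csv(changes, out_file, row.names = FALSE)
```

Each row of `changes` then carries the current values, the label, and the old values side by side, so an updated row can be checked column by column in a spreadsheet instead of in the console.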