I have two data frames similar to the ones below, and I want to create a data frame showing the differences between them. How easy is this in R?
Hello,
I'm sure you shared this image with the best intentions, but perhaps you didn't realise what it implies.
If someone wished to use example data to test code against, they would type it out from your screenshot...
This is very unlikely to happen, and so it reduces the likelihood you will receive the help you desire.
Therefore, please see this guide on how to reprex data. The key is to use either datapasta or dput() to share your data as code.
Sorry about that. Here are the two dput() outputs for the two data frames I want to compare:
structure(list(town = c("Crewe", "Sandbach", "Middleiwch"), area = c("Cheshire",
"Cheshire", "Cheshire"), total_pop = c(100, 400, 120)), row.names = c(NA,
-3L), class = "data.frame")
structure(list(town = c("Crewe", "Sandbach", "Middleiwch"), area = c("Cheshire",
"Cheshire", "Cheshire"), total_pop = c(120, 350, 100)), row.names = c(NA,
-3L), class = "data.frame")
and this is the result I want:
structure(list(town = c("Crewe", "Sandbach", "Middleiwch"), area = c("Cheshire",
"Cheshire", "Cheshire"), total_pop = c(20, -50, -20)), row.names = c(NA,
-3L), class = "data.frame")
a1 <- structure(list(town = c("Crewe", "Sandbach", "Middleiwch"), area = c("Cheshire",
"Cheshire", "Cheshire"), total_pop = c(100, 400, 120)), row.names = c(NA,
-3L), class = "data.frame")
b2 <- structure(list(town = c("Crewe", "Sandbach", "Middleiwch"), area = c("Cheshire",
"Cheshire", "Cheshire"), total_pop = c(120, 350, 100)), row.names = c(NA,
-3L), class = "data.frame")
goal <- structure(list(town = c("Crewe", "Sandbach", "Middleiwch"), area = c("Cheshire",
"Cheshire", "Cheshire"), total_pop = c(20, -50, -20)), row.names = c(NA,
-3L), class = "data.frame")
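# Copy a1 so the town and area columns carry over unchanged
c3 <- a1
# Row-by-row difference; assumes a1 and b2 list the towns in the same order
c3$total_pop <- b2$total_pop - a1$total_pop
# Check the result against the desired output
identical(c3, goal)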
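One caveat with the subtraction above: it pairs rows purely by position, so it only gives the right answer when both data frames list the towns in the same order. If that can't be guaranteed, a minimal sketch with base R's merge() matches rows on town and area first. The shuffled row order in b2 and the "_old"/"_new" suffixes are assumptions made up for illustration, not part of the original data.

```r
a1 <- data.frame(town = c("Crewe", "Sandbach", "Middleiwch"),
                 area = "Cheshire",
                 total_pop = c(100, 400, 120))

# Same values as the second data frame, but deliberately in a
# different row order (an assumption, to show why matching matters)
b2 <- data.frame(town = c("Sandbach", "Middleiwch", "Crewe"),
                 area = "Cheshire",
                 total_pop = c(350, 100, 120))

# Match rows on town and area; the clashing total_pop columns get suffixes
merged <- merge(a1, b2, by = c("town", "area"), suffixes = c("_old", "_new"))

# New minus old, then keep only the columns of interest
merged$total_pop <- merged$total_pop_new - merged$total_pop_old
result <- merged[, c("town", "area", "total_pop")]
result
```

Note that merge() sorts the output by the key columns by default, so the rows may come back in a different order than in a1.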
Here's a different way to also achieve the desired end-result.
library(tidyverse)
df_1 <- structure(list(town = c("Crewe", "Sandbach", "Middleiwch"), area = c("Cheshire",
"Cheshire", "Cheshire"), total_pop = c(100, 400, 120)), row.names = c(NA,
-3L), class = "data.frame")
df_2 <- structure(list(town = c("Crewe", "Sandbach", "Middleiwch"), area = c("Cheshire",
"Cheshire", "Cheshire"), total_pop = c(120, 350, 100)), row.names = c(NA,
-3L), class = "data.frame")
# Stack both data frames, then take df_2 minus df_1 within each town
df_3 <-
  rbind(df_1, df_2) %>%
  group_by(town, area) %>%
  summarise(total_pop = diff(total_pop), .groups = "drop")
Created on 2022-02-28 by the reprex package (v2.0.1)
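For anyone without the tidyverse installed, the same stack-then-diff idea can be sketched in base R with aggregate(). This is only an illustration under the same assumption as the pipeline above: every town appears exactly once in each data frame, so each group holds exactly two values and diff() returns the second minus the first.

```r
df_1 <- data.frame(town = c("Crewe", "Sandbach", "Middleiwch"),
                   area = "Cheshire",
                   total_pop = c(100, 400, 120))
df_2 <- data.frame(town = c("Crewe", "Sandbach", "Middleiwch"),
                   area = "Cheshire",
                   total_pop = c(120, 350, 100))

# Stack the two data frames; df_1 rows come first, so within each
# town/area group diff() computes df_2 minus df_1
stacked <- rbind(df_1, df_2)
df_3 <- aggregate(total_pop ~ town + area, data = stacked, FUN = diff)
df_3
```

As with the grouped summary, the rows come back ordered by the grouping columns rather than in the original order.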
Thanks for your answer