Handling nested dplyr mutates

Hello, I have the following very simple code:

library("dplyr")

I have the following data frame:

cars = data.frame(
  brand = c("Ford", "Mercedes Benz", "Honda", "Ford", "Honda"),
  color = c("Grey", "Black", "Blue", "Red", "Blue")
)
cars
##           brand color
## 1          Ford  Grey
## 2 Mercedes Benz Black
## 3         Honda  Blue
## 4          Ford   Red
## 5         Honda  Blue

Also, I have the following mapping_car_country :

mapping_car_country = data.frame(
  brand = c("Ford", "Mercedes Benz", "Honda"),
  country = c("USA", "Germany", "Japan")
)
mapping_car_country
##           brand country
## 1          Ford     USA
## 2 Mercedes Benz Germany
## 3         Honda   Japan

My goal is: mutate data frame: cars by adding a new column: country accordingly with the mapping: mapping_car_country .

I tried the following which is not correct, but probably it is an starting point:

# hardcoding "Ford" where it should be the brand on data frame: `cars`
# I need to get a reference to the outer brand somehow
cars = cars %>% mutate(country = (mapping_car_country %>% dplyr::filter(brand == "Ford"))[["country"]])
cars
##           brand color country
## 1          Ford  Grey     USA
## 2 Mercedes Benz Black     USA
## 3         Honda  Blue     USA
## 4          Ford   Red     USA
## 5         Honda  Blue     USA

Thanks!

Is this what you want?

library(dplyr)

cars <- data.frame(
    brand = c("Ford", "Mercedes Benz", "Honda", "Ford", "Honda"),
    color = c("Grey", "Black", "Blue", "Red", "Blue")
)

mapping_car_country <-  data.frame(
    brand = c("Ford", "Mercedes Benz", "Honda"),
    country = c("USA", "Germany", "Japan")
)

cars %>% 
    left_join(mapping_car_country)
#> Joining, by = "brand"
#>           brand color country
#> 1          Ford  Grey     USA
#> 2 Mercedes Benz Black Germany
#> 3         Honda  Blue   Japan
#> 4          Ford   Red     USA
#> 5         Honda  Blue   Japan

Created on 2019-09-27 by the reprex package (v0.3.0)

1 Like

Thanks @andresrcs, I think I have to rephrase my question, since what I posted on my original question was a simplification of my problem and I'm afraid that the answer doesn't apply to it.

Here there is a second try:

library("dplyr")

I have the following data frame:

countries = data.frame(
  country = c("USA", "United Kingdom", "China"),
  president = c("Donald Trump", "Boris Johnson", "Xi Jinping")
)
countries
##          country     president
## 1            USA  Donald Trump
## 2 United Kingdom Boris Johnson
## 3          China    Xi Jinping

Also, I have the following mapping_car_country :

mapping_country_population_area = data.frame(
  country = c("USA", "United Kingdom", "China"),
  population = c("329963026", "66435600", "1399285880"),
  area_sqm = c("3796742", "93788", "3722342")
)
mapping_country_population_area
##          country population area_sqm
## 1            USA  329963026  3796742
## 2 United Kingdom   66435600    93788
## 3          China 1399285880  3722342

My goal is: mutate data frame: countries by adding a new column: popdensity accordingly with the mapping: mapping_country_population_area .

I tried the following which is not correct, but probably it is an starting point:

# hardcoding "USA" where it should be the country on data frame: `countries`
# I need to get a reference to the outer country somehow
countries = countries %>% mutate(
  popdensity =
    (mapping_country_population_area %>% dplyr::filter(country == "USA"))[["population"]] /
    (mapping_country_population_area %>% dplyr::filter(country == "USA"))[["area_sqm"]]
)
## Warning in Ops.factor((mapping_country_population_area %>%
## dplyr::filter(country == : '/' not meaningful for factors
countries
##          country     president popdensity
## 1            USA  Donald Trump         NA
## 2 United Kingdom Boris Johnson         NA
## 3          China    Xi Jinping         NA

Thanks!

Based on the example you provided, it seems to me the left_join approach provided by @andrescs is still the way to go: countries %>% left_join(mapping_country_population_area). If not, please show us what the expected output data frame would look like. That will help us understand exactly what you want to achieve.

1 Like

Thank you guys, you helped me!

Here I post the code that did the trick, probably it helps somebody else:

countries_info =
  countries %>%
  left_join(mapping_country_population_area, by = "country") %>%
  mutate(popdensity = population / area_sqm) %>%
  select(-c("population", "area_sqm"))
countries_info

Thanks!

1 Like

For the select you can also use bare column names:

select(-population, -area_sqm)
1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.