Is there a way to group by 2 columns or a custom parameter ?

I have 2 tables:

  1. Master_table has data below;
City Dependency Product Type
Area1 Payment PRD876 D
Area1 Collection PRD543 A
Area2 Collection PRD109 C
Area3 Payment PRD598 S
Others1 Debit PRD120 A
Others2 e-Payment PRD501 G
Others3 DD PRD409 X
  1. Dependency_table
City Dependency Start Date
Area1 Payment 9/29/2020
Area1 Collection 9/29/2020
Area2 Collection 9/30/2020
Area3 Payment 10/1/2020
Debit 9/29/2020
e-Payment 9/30/2020
DD 10/4/2020

What I need:

City Dependency Product Type Start Date
Area1 Payment PRD876 D 9/29/2020
Area1 Collection PRD543 A 9/29/2020
Area2 Collection PRD109 C 9/30/2020
Area3 Payment PRD598 S 10/1/2020
Others1 Debit PRD120 A 9/29/2020
Others2 e-Payment PRD501 G 9/30/2020
Others3 DD PRD409 X 10/4/2020

Basically I would like to group 2 tables by columns City and Dependency.

Much appreciated

If I assume that the blank City values in Dependency_table are a typo, then this should work.

library(dplyr)
NewTable <- inner_join(Master_table, Dependency_table, by = c("City", "Dependency")

If they really are blank, then I do not understand the logic of the process.

@FJCC, they are really blank. But we can expand the table by adding "Global" on those blank cells.

City Dependency Start Date
Area1 Payment 9/29/2020
Area1 Collection 9/29/2020
Area2 Collection 9/30/2020
Area3 Payment 10/1/2020
Global Debit 9/29/2020
Global e-Payment 9/30/2020
Global DD 10/4/2020

Do you think this would work ?

Edit: Logic: only payment & collection has city dependency. Remaining rows have availability globally, so their city fields were kept blank.

If the City column in both tables is labeled as Global when Dependency is not Payment or Collection, then the code I suggested should work. If the labeling is not identical, then you are faced with joining some rows based on City and Dependency and other rows on only Dependency. I do not know how to do that in one step. If it must be done, I would make subsets of the data, one for Payment and Collection rows and another for all other rows, join those separately, and then combine them.

Maybe I'm oversimplifying your issue, but from your explanation, I can infer that (Others1, Others2, Others3) = Global, if that is the case, then you can simply rename them before making the join, see this example:

library(dplyr)
library(stringr)

# Sample data on a copy/paste friendly format
Master_table <- data.frame(
  stringsAsFactors = FALSE,
              City = c("Area1","Area1","Area2",
                       "Area3","Others1","Others2","Others3"),
        Dependency = c("Payment","Collection",
                       "Collection","Payment","Debit","e-Payment","DD"),
           Product = c("PRD876","PRD543","PRD109",
                       "PRD598","PRD120","PRD501","PRD409"),
              Type = c("D", "A", "C", "S", "A", "G", "X")
)

Dependency_table <- data.frame(
  stringsAsFactors = FALSE,
              City = c("Area1","Area1","Area2",
                       "Area3","Global","Global","Global"),
        Dependency = c("Payment","Collection",
                       "Collection","Payment","Debit","e-Payment","DD"),
        Start_Date = c("9/29/2020","9/29/2020",
                       "9/30/2020","10/1/2020","9/29/2020","9/30/2020",
                       "10/4/2020")
)

Master_table %>% 
    mutate(City = if_else(str_detect(City, "^Others"), "Global", City)) %>% 
    left_join(Dependency_table, by = c("City", "Dependency"))
#>     City Dependency Product Type Start_Date
#> 1  Area1    Payment  PRD876    D  9/29/2020
#> 2  Area1 Collection  PRD543    A  9/29/2020
#> 3  Area2 Collection  PRD109    C  9/30/2020
#> 4  Area3    Payment  PRD598    S  10/1/2020
#> 5 Global      Debit  PRD120    A  9/29/2020
#> 6 Global  e-Payment  PRD501    G  9/30/2020
#> 7 Global         DD  PRD409    X  10/4/2020

Created on 2020-09-30 by the reprex package (v0.3.0)

1 Like

Thanks for your help @andresrcs

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.