Hello!
Trying to find Product Ids that have different Product Types in the following sample data. In this case, looking for solution with all rows where Product ID is M12
df <- data.frame(
stringsAsFactors = FALSE,
check.names = FALSE,
Date = c("2022-01-01", "2022-01-01", "2022-01-01"),
Country = c("Canada", "Canada", "USA"),
`Product Type` = c("AFC", "FBF", "CBF"),
`Product ID` = c("M10", "M12", "M12")
)%>%
mutate(Date = ymd(Date))
Thanks for the help!
FJCC
August 25, 2022, 4:16pm
2
This is conveniently done with the filter() function from the dplyr package.
library(dplyr)
library(lubridate)
df <- data.frame(
stringsAsFactors = FALSE,
check.names = FALSE,
Date = c("2022-01-01", "2022-01-01", "2022-01-01"),
Country = c("Canada", "Canada", "USA"),
`Product Type` = c("AFC", "FBF", "CBF"),
`Product ID` = c("M10", "M12", "M12")
)%>%
mutate(Date = ymd(Date))
dfFiltered <- df |> filter(`Product ID` == "M12")
dfFiltered
#> Date Country Product Type Product ID
#> 1 2022-01-01 Canada FBF M12
#> 2 2022-01-01 USA CBF M12
Created on 2022-08-25 by the reprex package (v2.0.1)
Well! I am looking for solution without knowing Product ID to look at al those Ids that have different Types. I am trying to check those and find data discrepancy.
Thanks!
You could something like this:
library(tidyverse)
df |>
group_by(`Product ID`) |>
summarise(
different_prodTypes = n_distinct(`Product Type`)
) |>
ungroup() |>
filter(different_prodTypes > 1)
#> # A tibble: 1 × 2
#> `Product ID` different_prodTypes
#> <chr> <int>
#> 1 M12 2
Created on 2022-08-25 by the reprex package (v2.0.1)
This will show you all Product IDs with more than one Product Type.
Kind regards
Thanks @FactOREO !
Is there a way to see Product Types along with it. It will make it easier for us to clear discrepancies in data.
You could do something like this:
library(tidyverse)
df |>
group_by(`Product ID`) |>
summarise(
different_prodTypes = n_distinct(`Product Type`),
the_prodTypes = list(unique(`Product Type`))
) |>
ungroup() |>
filter(different_prodTypes > 1) |>
unnest(the_prodTypes)
#> # A tibble: 2 × 3
#> `Product ID` different_prodTypes the_prodTypes
#> <chr> <int> <chr>
#> 1 M12 2 FBF
#> 2 M12 2 CBF
Created on 2022-08-25 by the reprex package (v2.0.1)
It's in long format, for wide format use a counter along the different Product Types and reshape it with e.g. pivot_wider
or another function that does this and you like to use.
Kind regards
system
Closed
September 1, 2022, 5:22pm
8
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed. If you have a query related to it or one of the replies, start a new topic and refer back with a link.