code
name
123
A
123
B
666
X
999
M
999
M
555
P
555
Q
555
R
I want to mutate a variable wanted
, which will count the number of different names for each code
. For example, for code 123
the new variable will take the value 2 but for code 999 the value will be 1 since names are the same.
Here is the desired output:
code
name
wanted
123
A
2
123
B
2
666
X
1
999
M
1
999
M
1
555
P
3
555
Q
3
555
R
3
library(dplyr)
# toy data
df <- tibble(
code = c(123, 123, 666, 999, 999, 555, 555, 555),
name = c("A", "B", "X", "M", "M", "P", "Q", "R"),
wanted = c(2, 2, 1, 1, 1, 3, 3, 3)
)
FJCC
November 16, 2021, 11:15pm
2
I think this is what you want, though the count of 999 is 2 not 1.
library(dplyr)
# toy data
df <- tibble(
code = c(123, 123, 666, 999, 999, 555, 555, 555),
name = c("A", "B", "X", "M", "M", "P", "Q", "R")#,
#wanted = c(2, 2, 1, 1, 1, 3, 3, 3)
)
df
# A tibble: 8 x 2
code name
<dbl> <chr>
1 123 A
2 123 B
3 666 X
4 999 M
5 999 M
6 555 P
7 555 Q
8 555 R
df <- df |> group_by(code) |> mutate( wanted = n())
df
# A tibble: 8 x 3
# Groups: code [4]
code name wanted
<dbl> <chr> <int>
1 123 A 2
2 123 B 2
3 666 X 1
4 999 M 2
5 999 M 2
6 555 P 3
7 555 Q 3
8 555 R 3
Thanks for the reply. Unfortunately, I want the count of 999 to be 1 because it has only one distinct name - M.
Until googling just now I would have gone through 2 group_by/summarise layers. (please excuse the change of pipes)
df <- df %>% group_by(code, name) %>% summarise() # get distinct rows
df <- df %>% group_by(code) %>% summarise(wanted = n()) # get countss
But it turns out you can have n_distinct in summarise:
df <- df %>% group_by(code) %>% summarise(wanted = n_distinct())
df <- df %>% group_by(code) %>% summarise() # get distinct rows
Thanks for the tip on ´n_distinct()´.
The following code solves the problem
library(dplyr)
# toy data
df <- tibble(
code = c(123, 123, 666, 999, 999, 555, 555, 555),
name = c("A", "B", "X", "M", "M", "P", "Q", "R")#,
#wanted = c(2, 2, 1, 1, 1, 3, 3, 3)
)
df %>% group_by(code) %>%
mutate(wanted = n_distinct(name))
#> # A tibble: 8 x 3
#> # Groups: code [4]
#> code name wanted
#> <dbl> <chr> <int>
#> 1 123 A 2
#> 2 123 B 2
#> 3 666 X 1
#> 4 999 M 1
#> 5 999 M 1
#> 6 555 P 3
#> 7 555 Q 3
#> 8 555 R 3
Created on 2021-11-17 by the reprex package (v2.0.1)
system
Closed
December 8, 2021, 11:50am
6
This topic was automatically closed 21 days after the last reply. New replies are no longer allowed. If you have a query related to it or one of the replies, start a new topic and refer back with a link.