Frequency of buy alternatives or options mutate()

Hello,

I have three clients: "90-1", "90-2", "90-3".
How can I order the rows for each client by descending probability?

  library(dplyr)

  df <- data.frame(id = c("1-1","1-1","1-1","1-1","1-1","1-1","1-1","1-1",
                            "1-1","1-1", "1-1","1-1","1-1","1-1","1-1",
                            "2-2","2-2","2-2","2-2","2-2"),
                     group = c(1,1,1,2,2,3,3,3,4,4,5,
                               5,5,5,5,
                               5,5,5,7,8),
                     product = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,
                                 15,16,17, 18, 19, 20),
                     client = c("90-1", "90-1","90-1","90-1","90-1","90-1","90-1",
                              "90-1","90-1","90-1","90-1","90-1","90-2","90-2",
                              "90-2","90-2","90-2","90-3","90-3","90-3"),
                     freq = c(2,2,2,4,5,6,1,1,2,8,11,1,3,4,
                                 5,6,1, 3, 7, 6)) %>% mutate(prob = 1/freq)
  
  df

How can I count all the freq values and add another column with mutate() in R to obtain the probability, like this or in some other way?

  mutate(probability = n / sum(n))

I would really appreciate some alternatives or ways to calculate the probability. Thanks!

I need the rows ordered by client (rut), but this code doesn't order them correctly by client:

  > df %>% group_by(client, df[order(df$freq),])
  
     id    group product client  freq   prob
     <chr> <dbl>   <dbl> <chr>  <dbl>  <dbl>
   1 1-1       3       7 90-1       1 1     
   2 1-1       3       8 90-1       1 1     
   3 1-1       5      12 90-1       1 1     
   4 2-2       5      17 90-2       1 1     
   5 1-1       1       1 90-1       2 0.5   
   6 1-1       1       2 90-1       2 0.5   
   7 1-1       1       3 90-1       2 0.5   
   8 1-1       4       9 90-1       2 0.5   
   9 1-1       5      13 90-2       3 0.333 
  10 2-2       5      18 90-3       3 0.333 
  11 1-1       2       4 90-1       4 0.25  
  12 1-1       5      14 90-2       4 0.25  
  13 1-1       2       5 90-1       5 0.2   
  14 1-1       5      15 90-2       5 0.2   
  15 1-1       3       6 90-1       6 0.167 
  16 2-2       5      16 90-2       6 0.167 
  17 2-2       8      20 90-3       6 0.167 
  18 2-2       7      19 90-3       7 0.143 
  19 1-1       4      10 90-1       8 0.125 
  20 1-1       5      11 90-1      11 0.0909

Hi, I'm not sure I understood all your questions correctly, but here are some tips:

df <- df %>% ... Putting df and the assignment arrow at the start saves the result of the pipe on the right-hand side back into df.
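To illustrate the difference, here is a minimal sketch on a hypothetical two-row data frame called toy (not the data from the question). The names unchanged and stored are only there to record the column names at each step.

```r
library(dplyr)

# Hypothetical toy data, not the data from the question
toy <- data.frame(freq = c(2, 4))

# Without assignment the result is only printed; toy itself is unchanged
toy %>% mutate(prob = 1 / freq)
unchanged <- names(toy)   # still just "freq"

# With assignment the new column is stored back into toy
toy <- toy %>% mutate(prob = 1 / freq)
stored <- names(toy)      # now "freq" and "prob"
```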

df %>% mutate(...) %>% arrange(client, prob) Adding arrange() with several sorting columns orders the rows by client and then by ascending prob (wrap a column in desc() for descending order), producing the following table:

library(dplyr)

df <- data.frame(id = c("1-1","1-1","1-1","1-1","1-1","1-1","1-1","1-1",
                        "1-1","1-1", "1-1","1-1","1-1","1-1","1-1",
                        "2-2","2-2","2-2","2-2","2-2"),
                 group = c(1,1,1,2,2,3,3,3,4,4,5,
                           5,5,5,5,
                           5,5,5,7,8),
                 product = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,
                             15,16,17, 18, 19, 20),
                 client = c("90-1", "90-1","90-1","90-1","90-1","90-1","90-1",
                            "90-1","90-1","90-1","90-1","90-1","90-2","90-2",
                            "90-2","90-2","90-2","90-3","90-3","90-3"),
                 freq = c(2,2,2,4,5,6,1,1,2,8,11,1,3,4,
                          5,6,1, 3, 7, 6))

df <- df %>%
  mutate(prob = 1/freq) %>%
  arrange(client, prob)

df %>% as_tibble()

   id    group product client  freq   prob
 1 1-1       5      11 90-1      11 0.0909
 2 1-1       4      10 90-1       8 0.125
 3 1-1       3       6 90-1       6 0.167
 4 1-1       2       5 90-1       5 0.2
 5 1-1       2       4 90-1       4 0.25
 6 1-1       1       1 90-1       2 0.5
 7 1-1       1       2 90-1       2 0.5
 8 1-1       1       3 90-1       2 0.5
 9 1-1       4       9 90-1       2 0.5
10 1-1       3       7 90-1       1 1
11 1-1       3       8 90-1       1 1
12 1-1       5      12 90-1       1 1
13 2-2       5      16 90-2       6 0.167
14 1-1       5      15 90-2       5 0.2
15 1-1       5      14 90-2       4 0.25
16 1-1       5      13 90-2       3 0.333
17 2-2       5      17 90-2       1 1
18 2-2       7      19 90-3       7 0.143
19 2-2       8      20 90-3       6 0.167
20 2-2       5      18 90-3       3 0.333
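Since the question asks for descending probability, the same pipeline can be sketched with desc() from dplyr. The data frame small below is a hypothetical subset with the question's column names, used only for illustration.

```r
library(dplyr)

# Small hypothetical subset of the question's columns, for illustration only
small <- data.frame(client = c("90-1", "90-1", "90-2", "90-2", "90-2"),
                    freq   = c(2, 11, 3, 6, 1))

sorted_desc <- small %>%
  mutate(prob = 1 / freq) %>%
  arrange(client, desc(prob))   # clients in order, highest prob first within each

sorted_desc
```

On the full df the only change needed should be arrange(client, desc(prob)) in place of arrange(client, prob).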

For the other question about sum, I would use a simple loop:

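df$probability <- NA_real_                 # new column to fill in
for (i in seq_len(nrow(df))) {
  t <- df$client[i]                        # client of the current row
  s <- sum(df$prob[df$client == t])        # total prob for that client
  df$probability[i] <- df$prob[i] / s      # row's share of that total
}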
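A loop-free alternative is to let dplyr do the per-client sum with group_by() and mutate(). The sketch below assumes the probability wanted is each row's share of its client's total freq, in the spirit of the n / sum(n) idea in the question; that is an assumption, and swapping freq for prob would mirror the loop above instead. The data frame small is hypothetical.

```r
library(dplyr)

# Hypothetical small example with the same column names as the question
small <- data.frame(client = c("90-1", "90-1", "90-2", "90-2", "90-2"),
                    freq   = c(2, 6, 3, 6, 1))

result <- small %>%
  group_by(client) %>%                          # sums below are taken per client
  mutate(probability = freq / sum(freq)) %>%    # share of the client's total freq
  ungroup() %>%
  arrange(client, desc(probability))            # highest probability first

result
```

Within each client the probability column sums to 1, which is a quick way to check the grouping did what was intended.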

I hope this answers your questions.
