Average of transposed counts in dplyr, is it possible?

Slavek · August 22, 2023, 8:50am

Hi, I have this simple data frame with DealerName, Region and Drive.Time. So far I have counted the rows per dealer and region, then spread the counts into a wide table:

source <- data.frame(
  stringsAsFactors = FALSE,
  DealerName = c("aaa","aaa","aaa","bbb",
                 "bbb","bbb","ccc","ccc","ccc","ccc","ccc","ccc",
                 "ddd","ddd","ddd","ddd","ddd","ddd","ddd","ddd"),
  Region = c("East Midlands",
             "East Midlands","East Midlands","East of England","East of England",
             "East of England","East of England","East of England",
             "East of England","East of England","Greater London",
             "Greater London","Greater London","Greater London",
             "Greater London","Greater London","Greater London",
             "Greater London","Greater London","Greater London"),
  Drive.Time = c(20,15,20,18,12,15,20,15,
                 20,10,18,12,15,20,15,20,18,12,15,10)
)

source

library(dplyr)
result <- source %>%
  group_by(DealerName, Region) %>%
  summarise(cnt = n(), .groups = "drop")  # drop grouping so later steps start ungrouped
result

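library(tidyr)
# pivot_wider() is the current replacement for the superseded spread()
result.table <- result %>%
  select(DealerName, Region, cnt) %>%
  pivot_wider(names_from = Region, values_from = cnt)
result.table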

But is it possible to find the average count for each Region?
It is very easy in Excel: East Midlands is 3, East of England is 3.5 and Greater London is 5.

Leon · August 22, 2023, 9:22am

Yes. Regroup the counts by Region and take the mean of cnt, like so:

> result |> ungroup() |> group_by(Region) |> summarise(mu = mean(cnt))
# A tibble: 3 × 2
  Region             mu
  <chr>           <dbl>
1 East Midlands     3  
2 East of England   3.5
3 Greater London    5
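For what it's worth, here is a rough sketch of two other ways to reach the same averages, in case they are useful. The first goes straight from the raw rows with count(). The second takes column means of the wide (transposed) table and skips the NA cells. The names dealers, avg_by_region, wide and avg_from_wide are made up for this example, and dealers only rebuilds the DealerName and Region columns of the data frame from the question, so treat it as an illustration rather than the one right way to do it.

```r
library(dplyr)
library(tidyr)

# Rebuild the two relevant columns from the question (same values, shorter to type).
dealers <- data.frame(
  stringsAsFactors = FALSE,
  DealerName = rep(c("aaa", "bbb", "ccc", "ddd"), times = c(3, 3, 6, 8)),
  Region = rep(c("East Midlands", "East of England", "Greater London"),
               times = c(3, 7, 10))
)

# Option 1: count rows per dealer and region, then average the counts per region.
avg_by_region <- dealers %>%
  count(DealerName, Region, name = "cnt") %>%
  group_by(Region) %>%
  summarise(mu = mean(cnt), .groups = "drop")
avg_by_region

# Option 2: build the wide table first, then average each region column.
# Dealers that are absent from a region show up as NA, so na.rm = TRUE skips them.
wide <- dealers %>%
  count(DealerName, Region, name = "cnt") %>%
  pivot_wider(names_from = Region, values_from = cnt)
avg_from_wide <- colMeans(wide[-1], na.rm = TRUE)
avg_from_wide
```

Both options ignore dealers with no rows in a region. If such dealers should count as zero instead, the averages would come out lower, so it is worth deciding which of the two you actually want.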
