Dynamic grid which summarizes


Welcome to the RStudio community!

Let me start by saying that I have not heard or used PowerBI, so my approach below is just the solution as if it were performed in a regular R script.

I'd also like to point you to the Reprex guide. A reprex consists of the minimal code and data needed to recreate the issue/question you're having. You can find instructions how to build and share one here:

The reason for this is that the way you shared your dataset here is not easy to transfer to R just by copy-paste. Reading the guide, you can learn how to share the data like this in future:

  stringsAsFactors = FALSE,
             Sales = c(20L, 15L, 25L, 10L, 19L, 11L, 10L),
            Profit = c(10L, 12L, 18L, 10L, 15L, 20L, 30L),
          Category = c("FISH", "FISH", "FISH", "MEAT", "MEAT", "MEAT", "VEG"),
            Region = c("UK", "US", "AUS", "UK", "US", "AUS", "US")

OK, now for the solution. Here is my approach, assuming you have the dataset as a variable in R and the column of interest as one too:

library(dplyr) #make sure you have the latest version, dplyr 1.0.0+

> myData
  Sales Profit Category Region
1    20     10     FISH     UK
2    15     12     FISH     US
3    25     18     FISH    AUS
4    10     10     MEAT     UK
5    19     15     MEAT     US
6    11     20     MEAT    AUS
7    10     30      VEG     US

colOfInterest = "Region"

myData = myData %>% group_by(across(all_of(colOfInterest))) %>% 
  summarise(Sales = sum(Sales), Profit = sum(Profit), .groups = "drop")

> myData
# A tibble: 3 x 3
  Region Sales Profit
  <chr>  <int>  <int>
1 AUS       36     38
2 UK        30     20
3 US        44     57

I used the tidyverse implementation here. If you are unfamiliar with this, please read some more here

The group_by and summarise function are basic tidyverse functions (part of dplyr package). The first groups the data by a certain column, the second allows to summarise the other columns. In this case, I assumed that Sales and Profit are always to be summarised.

The more difficult to grasp part is the across(all_of(colOfInterest)) bit. Normally, you would just type the column name in the group_by function like this: myData %>% group_by(Region). However, you don't know beforehand which column to group by, so you need to convert a string given by the user's selection (from PowerBI) to a column name that the tidyverse language accepts.

To do this, you use the new across function (dplyr 1.0.0) combined with the all_of function. Here you can have a vector of one or more strings that convert to column names.

I know this all sounds a bit confusing, and I still struggle with it myself as it's all new in dplyr 1.0.0, so maybe someone else comes up with even a more elegant solution :slight_smile:

Hope this helps,