Select only the highest value for every client in each month

Omar · February 17, 2022, 11:25am

Hello guys,
I have a dataset ("mydf") with 3 columns: month, reseller name, and net sales. For each client, I'm trying to find its highest net sales in every month.

I found a solution, but it's really slow: I first had to filter for one specific month and then, as you can see in my code, select the top n ...

Is there any way to speed this up, please?

library(dplyr)  # provides %>%, group_by() and top_n()

mydf <- read.csv("C:/Users/HmissiOm/Downloads/my_df.csv")

mydf202101 <- mydf[mydf$month == 202101,]

mydf202101 %>% group_by(mydf202101$Reseller_name) %>% 
  top_n(1, Net_Sales)

nirgrahamuk · February 17, 2022, 11:30am

This syntax is incorrect; inside group_by() you refer to the column by its bare name, without the data frame prefix:

mydf202101 %>% group_by(mydf202101$Reseller_name) %>% 
  top_n(1, Net_Sales)

It should be:

mydf202101 %>% group_by(Reseller_name) %>% 
  top_n(1, Net_Sales)

How slow is your code?
Does it take hours? Minutes? Seconds? Microseconds?

Omar · February 17, 2022, 11:35am

It's not slow in that way. I mean that I have to filter first for every single month, which is time-consuming and keeps me from finishing the work as quickly as possible.

nirgrahamuk · February 17, 2022, 12:30pm

mydf %>%
  group_by(month, Reseller_name) %>%
  top_n(1, Net_Sales)
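
Grouping by both month and Reseller_name removes the need to filter one month at a time. As a side note, top_n() has been superseded by slice_max() since dplyr 1.0.0, so here is a minimal sketch of the same idea with the newer verb. The toy data frame is invented for illustration; only the column names come from the thread, and your real data may behave differently (for example, if there are ties or missing values).

```r
library(dplyr)

# Toy data, invented for illustration; column names follow the thread
toy <- data.frame(
  month         = c(202101, 202101, 202101, 202102, 202102),
  Reseller_name = c("A", "A", "B", "A", "B"),
  Net_Sales     = c(100, 250, 80, 300, 120)
)

# Highest Net_Sales for each reseller within each month.
# slice_max() keeps tied rows by default (with_ties = TRUE).
top_sales <- toy %>%
  group_by(month, Reseller_name) %>%
  slice_max(Net_Sales, n = 1) %>%
  ungroup()

top_sales
```

If you want exactly one row per reseller and month even when there are ties, passing with_ties = FALSE to slice_max() should do it.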
