I have a dataframe called table4 with many columns including the diff_charge column. i need to group the dataframe .
The only column that can added to the parenthesis of group_by is the diff_charge column which is found by (diff_charge = amount_courier - total_charges). But here's the thing it is possible for 2 diffrent order_id to have same diff_charge. In this case how will i should i group the dataframe.
a. You can group-by multiple multiple variables in the 'group-by' argument, something like: table4 %>% group_by(order_id, diff_charge).
b. For removing rows that are complete duplicates of each other, you could simply use the 'distinct()' function from dplyr.
Hey thanks for the reply.
Is there a way to view the final dataframe after group_by is used. When I tried using the group_by, R reads it but never displays the grouped data
Yes, you only need to assign the resultant data frame to a variable.
ex. newdf <- table4 %>% group_by(order_id, diff_charge)
View(newdf)
It will be useful to check the documentation for 'dplyr' on CRAN for all kinds of data manipulation.
What I want the data to look like:
First row has the values corresponding to diff_charge = 50.1. the second row has values corresponding to diff_charge =84.9. and so, on
Are you expecting grouping to change the total number of rows in your set ?
if so you want grouped aggregation, or grouped summarisation ... in R/tidyverse simply grouping things changes the behaviour of verbs you might apply to the frame but other than that its simply metadata and does not summarise in itself.