From a standard table function, i want to add a % column and sort on this % criteria.
My main problem is I'd like this column appears with the % symbol, so i have to transform to numeric into character type. But of course, i can't sort properly because character variable.
Is there a way to sort this character variable "like" a numeric variable ?
dataset <- data.frame(x=c(rep("Crit1",43),rep("Crit2",10),rep("Crit3",4)))
dataset_table <- as.data.frame(table(dataset[,1]))
dataset_table[,"Pourcentage"] <- paste(round(dataset_table[,2]/sum(dataset_table[,2])*100, digits=2), "%", sep = " ")
sort(dataset_table[,3]) # shows bad order
I'm proposing two solutions but there are many more. The short answer is no, you can't directly sort those characters and expect them to behave like numbers. It will sort alphabetically and the first "letter" of 7.02% is "7" which comes after the "1" in 17.54%
The numeric column Freq has a direct proportion with the percentage, so if you order by that column it will have the same effect, see this example using the tidyverse
When you append "%" with mutate it changes the class to character, arranging according to character . Work around solutions is to arrange first, then paste "%" at the end.
This is the best advice when you have to format values for presentation. Do your analysis with the numbers first and don't mind "ugly" values until you get to the end and need to share results*. Like writing reports in Markdown, appearance comes last.
* IMO, keep them as "ugly" values if the person you're sharing with will also use them in analysis software.
In case of a shiny app application, i use the datatable fonction, so i would like the user can filter as he wants, clicking on the "percentage" column, frequency column or Var1 column..
If he clicks on the percentage column, the unique way seems to use the zero pad percentage in order to have a good order with a character type.
I would suggest providing the numerical values and putting the heading of the table column as "blah (%)" and only have the percentage symbol in the table column, not in the values.