Sort a column table by "Percentage" adding text "%"

Hi there and happy X-Mas to all :slight_smile:,

From a standard table function, i want to add a % column and sort on this % criteria.

My main problem is I'd like this column appears with the % symbol, so i have to transform to numeric into character type. But of course, i can't sort properly because character variable.

Is there a way to sort this character variable "like" a numeric variable ?

dataset <- data.frame(x=c(rep("Crit1",43),rep("Crit2",10),rep("Crit3",4)))
dataset_table <- as.data.frame(table(dataset[,1]))
dataset_table[,"Pourcentage"] <- paste(round(dataset_table[,2]/sum(dataset_table[,2])*100, digits=2), "%", sep = " ")
sort(dataset_table[,3]) # shows bad order

I'm proposing two solutions but there are many more. The short answer is no, you can't directly sort those characters and expect them to behave like numbers. It will sort alphabetically and the first "letter" of 7.02% is "7" which comes after the "1" in 17.54%

dataset <- data.frame(x=c(rep("Crit1",43),rep("Crit2",10),rep("Crit3",4)))

# Option 1 - make numeric variable, sort by that, then remove it
dataset_table <- as.data.frame(table(dataset[,1]))
dataset_table[,"PourcentageNum"] <- dataset_table[,2]/sum(dataset_table[,2])*100
dataset_table[,"Pourcentage"] <- paste(round(dataset_table$PourcentageNum, digits=2), "%", sep = " ")
dataset_table <- dataset_table[order(dataset_table$PourcentageNum),]
dataset_table[,-3]
#>    Var1 Freq Pourcentage
#> 3 Crit3    4      7.02 %
#> 2 Crit2   10     17.54 %
#> 1 Crit1   43     75.44 %

# Option 2 - zero pad your percentage
# need the package stringr

dataset_table <- as.data.frame(table(dataset[,1]))
dataset_table[,"Pourcentage"] <- stringr::str_pad(paste(round(dataset_table[,2]/sum(dataset_table[,2])*100, digits=2), "%", sep = " "), 7, "left", "0")
dataset_table <- dataset_table[order(dataset_table$Pourcentage),]
dataset_table
#>    Var1 Freq Pourcentage
#> 3 Crit3    4     07.02 %
#> 2 Crit2   10     17.54 %
#> 1 Crit1   43     75.44 %

Created on 2019-12-27 by the reprex package (v0.3.0)

The numeric column Freq has a direct proportion with the percentage, so if you order by that column it will have the same effect, see this example using the tidyverse

library(tidyverse)

dataset <- data.frame(x=c(rep("Crit1",43),rep("Crit2",10),rep("Crit3",4)))
dataset_table <- as.data.frame(table(dataset[,1]))

dataset_table %>% 
    arrange(Freq) %>% 
    mutate(Percentage = paste(round(Freq / sum(Freq) * 100, 2), "%"))
#>    Var1 Freq Percentage
#> 1 Crit3    4     7.02 %
#> 2 Crit2   10    17.54 %
#> 3 Crit1   43    75.44 %

Created on 2019-12-27 by the reprex package (v0.3.0.9000)

When you append "%" with mutate it changes the class to character, arranging according to character . Work around solutions is to arrange first, then paste "%" at the end.


library(tidyverse)
# data

dataset <- data.frame(x=c(rep("Crit1",43),rep("Crit2",10),rep("Crit3",4)))
dataset_table <- as.data.frame(table(dataset[,1]))



# Sort ascending


dataset_table %>% 
         mutate(prcnt = (round(Freq/sum(Freq),4) * 100)) %>% 
         as_tibble() %>% 
         arrange(prcnt) %>% 
         mutate(prcnt = paste0(prcnt, "%"))
# A tibble: 3 x 3
  Var1   Freq prcnt 
  <fct> <int> <chr> 
1 Crit3     4 7.02% 
2 Crit2    10 17.54%
3 Crit1    43 75.44%

#Sort descending
 dataset_table %>% 
              mutate(prcnt = (round(Freq/sum(Freq),4) * 100)) %>% 
              as_tibble() %>%
              arrange(desc(prcnt)) %>% 
             mutate(prcnt = paste0(prcnt, "%"))
# A tibble: 3 x 3
  Var1   Freq prcnt 
  <fct> <int> <chr> 
1 Crit1    43 75.44%
2 Crit2    10 17.54%
3 Crit3     4 7.02%
1 Like

This is the best advice when you have to format values for presentation. Do your analysis with the numbers first and don't mind "ugly" values until you get to the end and need to share results*. Like writing reports in Markdown, appearance comes last.

* IMO, keep them as "ugly" values if the person you're sharing with will also use them in analysis software.

Thanks for all your answers,

In case of a shiny app application, i use the datatable fonction, so i would like the user can filter as he wants, clicking on the "percentage" column, frequency column or Var1 column..

If he clicks on the percentage column, the unique way seems to use the zero pad percentage in order to have a good order with a character type.

Thanks again :slight_smile:

I would suggest providing the numerical values and putting the heading of the table column as "blah (%)" and only have the percentage symbol in the table column, not in the values.

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.