Sorting a vector using a partial order

I need some help sorting a vector.

A DB query returns a vector like condition_choices below, and condition_order gives the order I want for the known values:
condition_choices <- c("Good", "Very Good", "Fair", "Poor", "Very Poor", "Not Assessed", "Discontinued")
condition_order <- c("Very Good", "Good","Fair", "Poor", "Very Poor")

The desired order should be:
condition_choices_ordered <- c("Very Good", "Good","Fair", "Poor", "Very Poor","Discontinued", "Not Assessed")

This should be done dynamically: any values of condition_choices that are not present in condition_order should go at the end of the vector (in no particular order among themselves).

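library(forcats)

# Example input, with "Fair" and "Discontinued" duplicated
condition_choices <- c("Good", "Fair", "Discontinued", "Very Good", "Fair", "Poor", "Very Poor", "Not Assessed", "Discontinued")
# Known ordering, stored as a factor whose levels follow the order given
condition_order <- as_factor(c("Very Good", "Good", "Fair", "Poor", "Very Poor"))

# Values covered by condition_order, and values it does not mention
(overlap <- intersect(condition_choices, condition_order))
(leftout <- setdiff(condition_choices, condition_order))

# Known values get the desired level order; leftovers get default (alphabetical) levels
(overlap_f <- factor(overlap, levels = condition_order))
(leftout <- factor(leftout))

# Combine both level sets (known order first), then sort by level
(condition_choices_ordered_new <- sort(factor(condition_choices, levels = lvls_union(list(overlap_f, leftout)))))
table(condition_choices_ordered_new)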

Note that I made the condition_choices more 'interesting' by doubling up Fair and Discontinued.
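
For comparison, the same factor-based idea can be sketched in base R alone. This is only a minimal sketch, not the code from the reply above: the helper name sort_by_partial_order is made up for illustration, and it assumes condition_order is a plain character vector. Unlike the forcats version, values missing from the known order are appended in order of first appearance rather than alphabetically.

```r
# Hypothetical helper (not part of the original reply): sort x so that values
# listed in known_order come first, in that order, and everything else follows.
sort_by_partial_order <- function(x, known_order) {
  # Known levels first, then any values of x not mentioned in known_order
  # (setdiff keeps them in order of first appearance in x)
  all_levels <- c(known_order, setdiff(x, known_order))
  # Sorting a factor sorts by level position, not alphabetically
  as.character(sort(factor(x, levels = all_levels)))
}

condition_choices <- c("Good", "Fair", "Discontinued", "Very Good", "Fair",
                       "Poor", "Very Poor", "Not Assessed", "Discontinued")
condition_order <- c("Very Good", "Good", "Fair", "Poor", "Very Poor")

result <- sort_by_partial_order(condition_choices, condition_order)
print(result)
```

Duplicates are kept, so "Fair" and "Discontinued" each still appear twice in the result.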

Another solution along the same lines, but without factors: wrap match() in a separate function that can then be used in, e.g., dplyr:

condition_order <- c("Very Good", "Good","Fair", "Poor", "Very Poor")

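# Rank each value: known values by their position in condition_order,
# all other values after them (in order of first appearance)
order_f <- function(condition_choices, condition_order) {
  condition_order2 <- c(condition_order,
                        setdiff(condition_choices, condition_order))
  match(condition_choices, condition_order2)
}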

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(magrittr)

df1 <- data.frame(
  condition_choices = c("Good", "Good", "Very Good", "Fair", "Poor", "Very Poor", "Not Assessed", "Discontinued"),
  value = 1:8
) %>%
  print()
#>   condition_choices value
#> 1              Good     1
#> 2              Good     2
#> 3         Very Good     3
#> 4              Fair     4
#> 5              Poor     5
#> 6         Very Poor     6
#> 7      Not Assessed     7
#> 8      Discontinued     8

df1 %>%
  mutate(my_order = order_f(condition_choices, condition_order)) %>%
  arrange(my_order) %>%
  select(-my_order) %>% 
  print()
#>   condition_choices value
#> 1         Very Good     3
#> 2              Good     1
#> 3              Good     2
#> 4              Fair     4
#> 5              Poor     5
#> 6         Very Poor     6
#> 7      Not Assessed     7
#> 8      Discontinued     8
Created on 2021-11-18 by the reprex package (v2.0.0)
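
If only a plain vector needs sorting, the ranks returned by order_f can probably be used without dplyr as well. The snippet below is a small sketch of that: it repeats order_f so that it runs on its own and passes the ranks to base R's order(). The vector is the one from the original question.

```r
# order_f as in the reply above, repeated here so the snippet is self-contained
order_f <- function(condition_choices, condition_order) {
  condition_order2 <- c(condition_order,
                        setdiff(condition_choices, condition_order))
  match(condition_choices, condition_order2)
}

condition_order <- c("Very Good", "Good", "Fair", "Poor", "Very Poor")
condition_choices <- c("Good", "Very Good", "Fair", "Poor", "Very Poor",
                       "Not Assessed", "Discontinued")

# order() turns the ranks into a permutation of positions
condition_choices_ordered <- condition_choices[order(order_f(condition_choices, condition_order))]
print(condition_choices_ordered)
```

Here "Not Assessed" ends up before "Discontinued" because leftover values keep their order of first appearance; the question states that no specific order is required for them.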
