Is there a way to alter how `arrange` sorts strings? I ran into a discrepancy between `arrange` and `sort`, illustrated by the following example.
# Demonstrate sorting discrepancy between `arrange` and `sort`.
# Create sample data.
df <- data.frame(Label = c("bama", "mama", "1000x", "BAnn", "10:00x"), Index = 1:5)
# Sort the rows into ascending label order using `dplyr::arrange`.
df |> dplyr::arrange(Label) |> print()
#>    Label Index
#> 1  1000x     3
#> 2 10:00x     5
#> 3   BAnn     4
#> 4   bama     1
#> 5   mama     2
# Order: 1000x, 10:00x, BAnn, bama, mama.
# Sort the rows into ascending label order using `sort`.
df[sort(df$Label, index.return = TRUE)$ix, ] |> print()
#>    Label Index
#> 5 10:00x     5
#> 3  1000x     3
#> 1   bama     1
#> 4   BAnn     4
#> 2   mama     2
# Order: 10:00x, 1000x, bama, BAnn, mama.
Created on 2024-06-26 with reprex v2.1.0
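One observation that may help narrow this down: the order `arrange` produces above looks like plain byte-value (C locale) collation, where digits sort before `:` and uppercase letters sort before lowercase ones. The sketch below reproduces that order in base R by forcing `sort` to use `method = "radix"`, which collates character vectors in the C locale. The variable names `labels` and `c_order` are only illustrative, and the conclusion that `arrange` collates this way is an inference from the output, not something confirmed here.

```r
# Same labels as in the example above.
labels <- c("bama", "mama", "1000x", "BAnn", "10:00x")

# method = "radix" collates character vectors in the C locale,
# i.e. by byte value, regardless of the session locale.
c_order <- sort(labels, method = "radix")
print(c_order)
#> [1] "1000x"  "10:00x" "BAnn"   "bama"   "mama"

# For reference: the collation locale that plain sort() uses in this session.
print(Sys.getlocale("LC_COLLATE"))
```

If that reading is right, the difference is between C-locale collation and the session's locale-aware collation, not anything specific to the data.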
The original problem involves reading data from a spreadsheet and presenting results in the same order as the spreadsheet rows. Both Google Sheets and LibreOffice Calc use the same sort order that `sort` does, making `arrange` the outlier. Is there a setting somewhere (an argument to `arrange`, or a global option for dplyr) that would make it use the same sorting rules as `sort`?