So I'm wondering.. I have a table of several colums that have names consisting of A to H and consecutive numbers and a looong annex xxx, e.g. "A1 xxx", "A2 xxx", ..., "H11 xxx", "H12 xxx" and the first one containing the time. Since I want to adress the columns calculating mean and standard deviation from them, it would be really helpful to rename them to something shorter, so only the "A1" is still standing, whithout the appndix.
I was thinking of using a loop to do it, but I feel there might be a smarter solution using tidyverse/dplyr. Is there a possibility of using the same loop for mean and standard deviation? It would be like "A1" to "A3" and "A4" to "A6", ... to "H10" to "H12" belong together and give data for one mean and standard deviation each.
I was trying to start with
head(data) %>% rename_with(~ substring())
But I'm not sure how to proceed and if "substring" is really what I am looking for.
I'm happy for any suggestions.
I think you're on the right lines with dplyr::rename_with()- I've used it with stringr::str_extract to get the part of the column name you're interested in, e.g. 'A1'.
data <- tibble::tribble(
~`Time (h) 2025-01-24T 14.19.33.188`, ~`A A1 OTR (mmol/L/h)`, ~`A A2 OTR (mmol/L/h)`, ~`A A3 OTR (mmol/L/h)`,
0.33333, 0.639942, 2.309817, -1.577695,
0.66666, 1.188565, 0.487093, 1.854478
)
data_renamed <-
data |>
dplyr::rename_with(
.cols = 2:length(names(data)), # select all columns except the first one
.fn = ~stringr::str_extract(
string = .x,
pattern = '\\h([[:alnum:]]+)\\h' # pick out the alphanumeric portion between two spaces, e.g. ' A1 '
) |> trimws() # remove the spaces
)
# see the result
data_renamed
#> # A tibble: 2 × 4
#> `Time (h) 2025-01-24T 14.19.33.188` A1 A2 A3
#> <dbl> <dbl> <dbl> <dbl>
#> 1 0.333 0.640 2.31 -1.58
#> 2 0.667 1.19 0.487 1.85