Partial pivot wider to extract a subset of rows

I have a column that contains values, I want to extract a subset of rows into a new column. In other words, I'd like to do a partial pivot_wider, only for some rows.

Here is a simplified example:

library(tidyverse)


set.seed(53)
dat <- tibble(descriptor1 = 1:3 |> rep(each = 3),
              metric = c("aa", "bb", "descriptor2") |> rep(times = 3),
              value = rnorm(3*3))

dat
#> # A tibble: 9 × 3
#>   descriptor1 metric       value
#>         <int> <chr>        <dbl>
#> 1           1 aa           0.200
#> 2           1 bb          -1.34 
#> 3           1 descriptor2  0.643
#> 4           2 aa          -1.51 
#> 5           2 bb          -0.799
#> 6           2 descriptor2  1.16 
#> 7           3 aa           0.299
#> 8           3 bb          -0.992
#> 9           3 descriptor2  1.48

dat |>
  pivot_wider(id_cols = descriptor1,
              names_from = "metric",
              values_from = "value") |>
  pivot_longer(-starts_with("descriptor"),
               names_to = "metric",
               values_to = "value")
#> # A tibble: 6 × 4
#>   descriptor1 descriptor2 metric  value
#>         <int>       <dbl> <chr>   <dbl>
#> 1           1       0.643 aa      0.200
#> 2           1       0.643 bb     -1.34 
#> 3           2       1.16  aa     -1.51 
#> 4           2       1.16  bb     -0.799
#> 5           3       1.48  aa      0.299
#> 6           3       1.48  bb     -0.992

Created on 2024-03-26 with reprex v2.0.2
Initially descriptor2 is just one of the metrics, but I decided to put it in its own column.

I can do that with two pivots as above, but that quickly becomes unreadable (in the real problem I have a lot more columns), with two pivots in a row the goal of the operation becomes really obfuscated.

Is there a way to simplify this code, or make its intention more clear?

Does this approach take you in the direction you're hoping for?

dat |> 
  filter(metric == 'descriptor2') |> 
  pivot_wider(names_from = metric, values_from = value) |> 
  inner_join(
    dat |> 
      filter(metric != 'descriptor2')
  )
#> Joining with `by = join_by(descriptor1)`
#> # A tibble: 6 × 4
#>   descriptor1 descriptor2 metric  value
#>         <int>       <dbl> <chr>   <dbl>
#> 1           1       0.643 aa      0.200
#> 2           1       0.643 bb     -1.34 
#> 3           2       1.16  aa     -1.51 
#> 4           2       1.16  bb     -0.799
#> 5           3       1.48  aa      0.299
#> 6           3       1.48  bb     -0.992

Created on 2024-03-26 with reprex v2.0.2

1 Like

That's a great idea!

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.