Create new dataframe (long to wide)

Schermafbeelding 2020-05-04 om 20.43.04

Hi all,

I want to create a new dataframe where every row is 1 version (0-100 versions) , and all the attribute information (Att1 Att2 Att3) of this version are extra columns of that specific row. So create a new wide dataframe from a long dataframe.

Could you help me out? I hope I was clear enough.

Best,
Boris

Check out the pivot_wider function in the tidyr package, which is part of tidyverse.

2 Likes

Thank you. Unfortunately i'm not able to get the code working.

df_orderd <- designcbc_now %>%
pivot_wider(names_from = version,
values_from = c(att1, att2, att3),
values_fn = list)

One of the problems is, is that I lose the column "version" and I need that column to match it with another dataframe.

Is it also possible to create matrices or vectors of the attribute columns with the version number as condition?

Your syntax isn't quite right. Please read the documentation for pivot_longer() to learn how to use it correctly.

Here is a working example using the first 6 rows of your data as a sample.

library(tidyverse)

data <- tribble(~ version, ~ task, ~ concept, ~ attr1, ~ attr2, ~ attr3,
                0, 1, 1, 1, 1, 2,
                0, 1, 2, 3, 2, 1,
                0, 1, 3, 2, 3, 3,
                1, 1, 1, 1, 1, 2,
                1, 1, 2, 3, 2, 3,
                1, 1, 3, 2, 3, 2)

pivot_longer(data, cols = c(attr1, attr2, attr3))
#> # A tibble: 18 x 5
#>    version  task concept name  value
#>      <dbl> <dbl>   <dbl> <chr> <dbl>
#>  1       0     1       1 attr1     1
#>  2       0     1       1 attr2     1
#>  3       0     1       1 attr3     2
#>  4       0     1       2 attr1     3
#>  5       0     1       2 attr2     2
#>  6       0     1       2 attr3     1
#>  7       0     1       3 attr1     2
#>  8       0     1       3 attr2     3
#>  9       0     1       3 attr3     3
#> 10       1     1       1 attr1     1
#> 11       1     1       1 attr2     1
#> 12       1     1       1 attr3     2
#> 13       1     1       2 attr1     3
#> 14       1     1       2 attr2     2
#> 15       1     1       2 attr3     3
#> 16       1     1       3 attr1     2
#> 17       1     1       3 attr2     3
#> 18       1     1       3 attr3     2

Created on 2020-05-05 by the reprex package (v0.3.0)

Many thanks for the clear example. However, I read the file multiple times but still not able to create a wide dataframe.

Could you help me a little bit further please?

Sorry, I misread your question. You want to convert a long data frame to a wide data frame (the solution I gave was for the inverse transformation).

Can you please post a sample of what your long data frame looks like?

No worries, it is like the picture above. Is that enough for you?

I'm now trying this syntax:
pivot_wider(data, names_from = concept, values_from = c(attr1, attr2, attr3))

If the picture is your original data, then it is already in wide form. May I ask you to post a sample of what your expected output should look like?

You could take the first 3 records from your data frame that have version 0 and illustrate what your table should look like after the transformation.

Schermafbeelding 2020-05-05 om 14.22.34

I need all the attributes information as columns, so I only have 1 version number for each row.
Then I can combine my two datasets based on version number.

I hope this is a good explanation.

Maybe a looped matrix can work?

OK, that helped. I think you want something like this. Here I've prefixed the resulting columns with t and c (for task and concept respectively) and their values from the original data frame.

library(tidyverse)

data <- tribble(~ version, ~ task, ~ concept, ~ attr1, ~ attr2, ~ attr3,
                0, 1, 1, 1, 1, 2,
                0, 1, 2, 3, 2, 1,
                0, 1, 3, 2, 3, 3,
                1, 1, 1, 1, 1, 2,
                1, 1, 2, 3, 2, 3,
                1, 1, 3, 2, 3, 2)

data %>% 
  pivot_longer(cols = c(attr1, attr2, attr3)) %>% 
  mutate(new_name = str_c("t", task, "_c", concept, "_", name)) %>% 
  pivot_wider(id_cols = version, names_from = new_name, values_from = value)
#> # A tibble: 2 x 10
#>   version t1_c1_attr1 t1_c1_attr2 t1_c1_attr3 t1_c2_attr1 t1_c2_attr2
#>     <dbl>       <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
#> 1       0           1           1           2           3           2
#> 2       1           1           1           2           3           2
#> # ... with 4 more variables: t1_c2_attr3 <dbl>, t1_c3_attr1 <dbl>,
#> #   t1_c3_attr2 <dbl>, t1_c3_attr3 <dbl>

Created on 2020-05-05 by the reprex package (v0.3.0)

1 Like

Outstanding! Thank you very much, it works perfect.

1 Like

Great! If I solved your problem, would you mind marking my post as a solution?

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.