Split information in cells (line breaking)

dhveening · November 3, 2021, 3:59pm

I have an issue splitting information, separated with soft returns (link breaking) in cells to obtain a new row below it containing the information from all the other cells, and only one of the information in the line breaked cell. I am a new user, and after a whole day on internet trying to find a solution I gave up

I looked into the following packages: stringr, tidyverse and tidyr
But maybe that is the wrong place

FJCC · November 3, 2021, 4:33pm

I do not understand your goal. Can you give an example of how the data look before and after the transformation?

dhveening · November 3, 2021, 5:03pm

I would like to have 1 funding detail per row, so the rest of the row should be copied.
E.g. line 2 all information should stay the same, except only 1 funding organization
New line should contain all information, exept second funding organization of line 2, etc.
Dataset example

AlexisW · November 8, 2021, 4:43am

This seems to be Excel. Is the goal to load this Excel file into R, process it, then export it back to Excel? Then this should work:

library(tidyverse)

my_excel <- readxl::read_excel("Book1.xlsx")

my_excel
#>  A tibble: 2 x 5
#>   DOI   Funding.Details          Pubmed.ID Source EID  
#>   <chr> <chr>                    <chr>     <chr>  <chr>
#> 1 a     "fa\r\nfb\r\n\r\nfc\r\n" b         c      d    
#> 2 e     "ff\r\n"                 g         h      i   

my_excel <- my_excel |>
  tidyr::separate_rows(Funding.Details) |>
  filter(nchar(Funding.Details) > 0 )


my_excel
#>  A tibble: 4 x 5
#>   DOI   Funding.Details Pubmed.ID Source EID  
#>   <chr> <chr>           <chr>     <chr>  <chr>
#> 1 a     fa              b         c      d    
#> 2 a     fb              b         c      d    
#> 3 a     fc              b         c      d    
#> 4 e     ff              g         h      i    

writexl::write_xlsx(my_excel, "Book1_expanded.xlsx")

Where \r\n is the newline on Windows, it would look slightly different on MacOS (\r) or Linux (\n).

Not sure you can do that purely in Excel.

dhveening · November 8, 2021, 3:37pm

Thank you very much for your solution, reads like it might work as well.

In the meantime, someone else suggested another solution (enclosed), which I tried and it worked.
The original dataframe consists of 23 columns, >10.000 entries ....

system · November 15, 2021, 3:38pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.