Split information in cells (line breaking)

I have an issue splitting information, separated with soft returns (link breaking) in cells to obtain a new row below it containing the information from all the other cells, and only one of the information in the line breaked cell. I am a new user, and after a whole day on internet trying to find a solution I gave up :frowning:

I looked into the following packages: stringr, tidyverse and tidyr
But maybe that is the wrong place

I do not understand your goal. Can you give an example of how the data look before and after the transformation?

I would like to have 1 funding detail per row, so the rest of the row should be copied.
E.g. line 2 all information should stay the same, except only 1 funding organization
New line should contain all information, exept second funding organization of line 2, etc.
Dataset example

This seems to be Excel. Is the goal to load this Excel file into R, process it, then export it back to Excel? Then this should work:


my_excel <- readxl::read_excel("Book1.xlsx")

#>  A tibble: 2 x 5
#>   DOI   Funding.Details          Pubmed.ID Source EID  
#>   <chr> <chr>                    <chr>     <chr>  <chr>
#> 1 a     "fa\r\nfb\r\n\r\nfc\r\n" b         c      d    
#> 2 e     "ff\r\n"                 g         h      i   

my_excel <- my_excel |>
  tidyr::separate_rows(Funding.Details) |>
  filter(nchar(Funding.Details) > 0 )

#>  A tibble: 4 x 5
#>   DOI   Funding.Details Pubmed.ID Source EID  
#>   <chr> <chr>           <chr>     <chr>  <chr>
#> 1 a     fa              b         c      d    
#> 2 a     fb              b         c      d    
#> 3 a     fc              b         c      d    
#> 4 e     ff              g         h      i    

writexl::write_xlsx(my_excel, "Book1_expanded.xlsx")

Where \r\n is the newline on Windows, it would look slightly different on MacOS (\r) or Linux (\n).

Not sure you can do that purely in Excel.

Thank you very much for your solution, reads like it might work as well.

In the meantime, someone else suggested another solution (enclosed), which I tried and it worked.
The original dataframe consists of 23 columns, >10.000 entries ....

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.