I need to group the values of a data frame column according to their names.
To illustrate the issue, here is an example. Let's imagine the data frame column "section", which is the column I would like to group:
section <- c("ALICE_!AAAA", "!AAAA_!AAAB", "!AAAB_NADIR", "NADIR_!AAAC", "!AAAC_MANDI")
Here I have 3 names that represent sections of one line and 2 that represent sections of another line. The line name is what I would like to identify from those sections; in other words, I would like to create another column that groups the sections by their line name, that is to say:
- ALICE_!AAAA, !AAAA_!AAAB, !AAAB_NADIR <- Line ALICE_NADIR (I know it's the ALICE_NADIR line because the intermediate names within a line are preceded by an exclamation mark, e.g. !AAAA, while the names at the two ends of the line are not)
- NADIR_!AAAC, !AAAC_MANDI <- Line NADIR_MANDI
In the data frame, the sections of a line are listed one after another and are not mixed with sections of other lines. For example, !AAAA_!AAAB is a section of line ALICE_NADIR because it lies between ALICE_!AAAA and !AAAB_NADIR; those two names mark the beginning and the end of the line.
What I want R to do is read the section column as explained above and write, in a new column, the line to which each section belongs. It is important to note that R has to identify the line names itself, as I don't have a list of them, and there are more than 2,000,000 sections and around 56,000 lines to be identified. The logic I was thinking about is something like this: if the first name in a section is NOT preceded by an "!", that name is the beginning of the line name. After that, the next name that appears without an "!" is the end of the line name. Therefore, all sections from that beginning to that end belong to that line.
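
To make the rule concrete, here is a minimal base R sketch of what I have in mind. It is not a tested solution for the full data. The names df, start_pt, end_pt, grp and line are made up for illustration. It assumes that every section name contains exactly one underscore, that the first row opens a line, and that every line that is opened is also closed.

```r
# Example data from above; `df` is a hypothetical name for the data frame.
df <- data.frame(
  section = c("ALICE_!AAAA", "!AAAA_!AAAB", "!AAAB_NADIR",
              "NADIR_!AAAC", "!AAAC_MANDI"),
  stringsAsFactors = FALSE
)

# Split each section name at the underscore into its two point names
# (assumption: exactly one underscore per section name).
start_pt <- sub("_.*$", "", df$section)
end_pt   <- sub("^.*_", "", df$section)

# A section opens a line when its first name has no leading "!",
# and closes a line when its second name has no leading "!".
is_first <- !startsWith(start_pt, "!")
is_last  <- !startsWith(end_pt, "!")

# Running group id: increases by one every time a new line starts.
grp <- cumsum(is_first)

# One start name and one end name per line, in order of appearance.
line_start <- start_pt[is_first]
line_end   <- end_pt[is_last]

# Guard for the assumptions above: every opened line is closed,
# and the first row opens a line.
stopifnot(length(line_start) == length(line_end), is_first[1])

# Build the line names and spread them back over the sections.
df$line <- paste(line_start, line_end, sep = "_")[grp]
```

Everything here is vectorised (no loop over rows), which I assume matters with around 2,000,000 sections.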