I'm replying with a solution because I thought this was an interesting problem. I think the best solution would involve using a nested list to store the data and recursively simplifying the deepest level of the list and promoting it to its parent level. But that's not the solution I took because I was curious about how I might to do this dplyr.
My approach does what a proper recursive solution might do (make the deepest level into flattened HTML and promoted to its parent level) but using group_by()
to "nest" the data and successive summarise()
to peel off a level of grouping.
library(tidyverse)
d <- readr::read_csv(
"
bullet_1, bullet_2, bullet_3, bullet_4
Big Bullet 1, Sample 1, Event 1,
Big Bullet 1, Sample 1, Event 2, Detail 1
Big Bullet 1, Sample 1, Event 2, Detail 2
Big Bullet 2, Sample 1,,
Big Bullet 2, Sample 2,,
Big Bullet 2, Sample 3,,
Big Bullet 3, Sample 1, Event 1, Detail 1
Big Bullet 3, Sample 1, Event 1, Detail 2
Big Bullet 3, Sample 1, Event 2, Detail 1
Big Bullet 3, Sample 1, Event 2, Detail 2
Big Bullet 3, Sample 1, Event 3, Detail 1
Big Bullet 3, Sample 1, Event 3, Detail 2
Big Bullet 3, Sample 2, Event 1,
Big Bullet 3, Sample 2, Event 2,
"
)
I also want to note that I like to make simple functions to make a miniature language to help with a problem. So the first part of the solution is make functions to produce HTML tags, flatten lists together, and flatten strings between pairs of columns.
# make a function that creates html tags
make_html_tag <- function(tag) {
left <- sprintf("<%s>", tag)
right <- sprintf("</%s>", tag)
function(xs) {
ifelse(!is.na(xs), paste0(left, xs, right), NA_character_)
}
}
ul <- make_html_tag("ul")
li <- make_html_tag("li")
p <- make_html_tag("p")
# tag elements as li's and flatten
li_flatten <- function(xs) {
stringr::str_flatten(li(xs), "\n")
}
# flatten pairs of items together over two vectors
str_flatten2 <- function(xs, ys, collapse = "\n") {
f2 <- function(x, y) stringr::str_flatten(c(x, y), collapse = collapse)
purrr::map2_chr(xs, ys, f2)
}
So here is the data.
d
#> # A tibble: 14 x 4
#> bullet_1 bullet_2 bullet_3 bullet_4
#> <chr> <chr> <chr> <chr>
#> 1 Big Bullet 1 Sample 1 Event 1 <NA>
#> 2 Big Bullet 1 Sample 1 Event 2 Detail 1
#> 3 Big Bullet 1 Sample 1 Event 2 Detail 2
#> 4 Big Bullet 2 Sample 1 <NA> <NA>
#> 5 Big Bullet 2 Sample 2 <NA> <NA>
#> 6 Big Bullet 2 Sample 3 <NA> <NA>
#> 7 Big Bullet 3 Sample 1 Event 1 Detail 1
#> 8 Big Bullet 3 Sample 1 Event 1 Detail 2
#> 9 Big Bullet 3 Sample 1 Event 2 Detail 1
#> 10 Big Bullet 3 Sample 1 Event 2 Detail 2
#> 11 Big Bullet 3 Sample 1 Event 3 Detail 1
#> 12 Big Bullet 3 Sample 1 Event 3 Detail 2
#> 13 Big Bullet 3 Sample 2 Event 1 <NA>
#> 14 Big Bullet 3 Sample 2 Event 2 <NA>
We want to make each element in bullet_4
an <li>
tag unless it is NA
. Then for each group in bullet_3
, we want to combine (flatten) together the values bullet_4
into a single string and tag the the string with <ul>
. That's what the first line does. We use unique(bullet_3)
so that we only get one row per group in bullet_3
. Finally, we need to combine the bullet_3
and bullet_4
together into a single string. But if bullet_4
is NA
, then str_flatten2(b3, b4)
returns NA
. So we use coalesce()
replace the NA values instr_flatten2(b3, b4)
with original value in b3
.
d_working <- d %>%
group_by(bullet_1, bullet_2, bullet_3) %>%
summarise(
# Make lists at deepest point
b4 = ul(li_flatten(bullet_4)),
b3 = unique(bullet_3),
# Promote deepest point to parent, using parent if deepest is NA
b3 = coalesce(str_flatten2(b3, b4), b3),
.groups = "drop_last"
) %>%
select(-b4, -bullet_3) %>%
print()
#> # A tibble: 10 x 3
#> # Groups: bullet_1, bullet_2 [6]
#> bullet_1 bullet_2 b3
#> <chr> <chr> <chr>
#> 1 Big Bullet 1 Sample 1 "Event 1"
#> 2 Big Bullet 1 Sample 1 "Event 2\n<ul><li>Detail 1</li>\n<li>Detail 2</li></ul~
#> 3 Big Bullet 2 Sample 1 <NA>
#> 4 Big Bullet 2 Sample 2 <NA>
#> 5 Big Bullet 2 Sample 3 <NA>
#> 6 Big Bullet 3 Sample 1 "Event 1\n<ul><li>Detail 1</li>\n<li>Detail 2</li></ul~
#> 7 Big Bullet 3 Sample 1 "Event 2\n<ul><li>Detail 1</li>\n<li>Detail 2</li></ul~
#> 8 Big Bullet 3 Sample 1 "Event 3\n<ul><li>Detail 1</li>\n<li>Detail 2</li></ul~
#> 9 Big Bullet 3 Sample 2 "Event 1"
#> 10 Big Bullet 3 Sample 2 "Event 2"
Now we are where we started but with a simplified dataframe so we do the same steps to simplify it further. In the final step, we tag bullet_1
with <p>
tags.
d_working <- d_working %>%
summarise(
# Make lists at deepest point.
b3 = ul(li_flatten(b3)),
b2 = unique(bullet_2),
# Promote deepest point to parent, using parent if deepest is NA
b2 = coalesce(str_flatten2(b2, b3), b2),
.groups = "drop_last"
) %>%
select(-b3, -bullet_2) %>%
print()
#> # A tibble: 6 x 2
#> # Groups: bullet_1 [3]
#> bullet_1 b2
#> <chr> <chr>
#> 1 Big Bullet 1 "Sample 1\n<ul><li>Event 1</li>\n<li>Event 2\n<ul><li>Detail 1</~
#> 2 Big Bullet 2 "Sample 1"
#> 3 Big Bullet 2 "Sample 2"
#> 4 Big Bullet 2 "Sample 3"
#> 5 Big Bullet 3 "Sample 1\n<ul><li>Event 1\n<ul><li>Detail 1</li>\n<li>Detail 2<~
#> 6 Big Bullet 3 "Sample 2\n<ul><li>Event 1</li>\n<li>Event 2</li></ul>"
d_working <- d_working %>%
summarise(
# Make lists at deepest point.
b2 = ul(li_flatten(b2)),
# Make parent a paragraph item
b1 = unique(p(bullet_1)),
# Promote deepest point to parent, using parent if deepest is NA
b1 = coalesce(str_flatten2(b1, b2), b1),
.groups = "drop_last"
) %>%
select(-b2, -bullet_1) %>%
print()
#> # A tibble: 3 x 1
#> b1
#> <chr>
#> 1 "<p>Big Bullet 1</p>\n<ul><li>Sample 1\n<ul><li>Event 1</li>\n<li>Event 2\n<u~
#> 2 "<p>Big Bullet 2</p>\n<ul><li>Sample 1</li>\n<li>Sample 2</li>\n<li>Sample 3<~
#> 3 "<p>Big Bullet 3</p>\n<ul><li>Sample 1\n<ul><li>Event 1\n<ul><li>Detail 1</li~
Finally, we can flatten and print.
content <- stringr::str_flatten(d_working$b1)
cat(content)
#> <p>Big Bullet 1</p>
#> <ul><li>Sample 1
#> <ul><li>Event 1</li>
#> <li>Event 2
#> <ul><li>Detail 1</li>
#> <li>Detail 2</li></ul></li></ul></li></ul><p>Big Bullet 2</p>
#> <ul><li>Sample 1</li>
#> <li>Sample 2</li>
#> <li>Sample 3</li></ul><p>Big Bullet 3</p>
#> <ul><li>Sample 1
#> <ul><li>Event 1
#> <ul><li>Detail 1</li>
#> <li>Detail 2</li></ul></li>
#> <li>Event 2
#> <ul><li>Detail 1</li>
#> <li>Detail 2</li></ul></li>
#> <li>Event 3
#> <ul><li>Detail 1</li>
#> <li>Detail 2</li></ul></li></ul></li>
#> <li>Sample 2
#> <ul><li>Event 1</li>
#> <li>Event 2</li></ul></li></ul>
knitr::asis_output(content)
Big Bullet 1
Big Bullet 2
-
Sample 1
-
Sample 2
-
Sample 3
Big Bullet 3
Created on 2021-06-16 by the reprex package (v2.0.0)