I'm by no means an expert in ggplot2 internals, but here are some observations. The build output has three components:
> names(plot_build1)
[1] "data" "layout" "plot"
The most promising one, in my opinion, is layout, but:
> waldo::compare(plot_build1$layout, plot_build2$layout)
✔ No differences
so there is no hope there.
The plot component contains the original plot, so it does have the information you seek:
> plot_build1$plot$scales$scales[[3]]$aesthetics
[1] "colour"
> plot_build1$plot$scales$scales[[3]]$call
discrete_scale(aesthetics = aesthetics, scale_name = "hue", palette = hue_pal(h,
c, l, h.start, direction), na.value = na.value)
> plot_build2$plot$scales$scales[[3]]$aesthetics
[1] "colour"
> plot_build2$plot$scales$scales[[3]]$call
continuous_scale(aesthetics = aesthetics, scale_name = "gradient",
palette = seq_gradient_pal(low, high, space), na.value = na.value,
guide = guide)
In a way this is cheating, as it is the plot structure rather than the build result, but this part of the structure is indeed computed by ggplot_build(), so maybe it counts.
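If you go that route, you don't have to index the scales by position. Here is a minimal sketch, assuming the scales list's get_scales() method and the scale's is_discrete() method behave as they do in recent ggplot2 releases (both are internals, so they may change between versions). colour_scale_type() is a hypothetical helper name, not part of ggplot2:

```r
library(ggplot2)

# Hypothetical helper: report whether the colour scale of a plot is
# discrete or continuous, based on the scales stored in the built plot.
# Relies on ggplot2 internals (get_scales(), is_discrete()).
colour_scale_type <- function(p) {
  built <- ggplot_build(p)
  # Look the scale up by aesthetic instead of by position in the list
  sc <- built$plot$scales$get_scales("colour")
  if (is.null(sc)) {
    return(NA_character_)  # no colour scale in this plot
  }
  if (sc$is_discrete()) "discrete" else "continuous"
}
```

Calling colour_scale_type() on each of your two plots should then tell them apart without relying on the scale being the third element.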
Finally, the first component:
> waldo::compare(plot_build1$data, plot_build2$data)
old[[1]]$colour | new[[1]]$colour
[1] "#F8766D" - "#234A6D" [1]
[2] "#F8766D" - "#244C6F" [2]
[3] "#F8766D" - "#255074" [3]
[4] "#F8766D" - "grey50" [4]
[5] "#F8766D" - "#1D3F5E" [5]
[6] "#F8766D" - "#234B6E" [6]
[7] "#F8766D" - "#22496C" [7]
[8] "#F8766D" - "#234A6D" [8]
[9] "#F8766D" - "#17344F" [9]
[10] "#F8766D" - "#29587F" [10]
... ... ... and 334 more ...
`attr(old[[1]]$group, 'n')`: 3
`attr(new[[1]]$group, 'n')`: 1
`old[[1]]$group`: 1 1 1 1 1 1 1 1 1 1 and 334 more...
`new[[1]]$group`: -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
So the data does not contain that information explicitly, but you can see that with a discrete scale, the colour column corresponds to the group column, whereas with a continuous scale they are not related. Things could get a bit more complicated if you have several grouping factors:
p3 <- penguins |>
  ggplot() +
  geom_point(aes(x = flipper_length_mm, y = body_mass_g, colour = species, shape = island))
dat3 <- layer_data(p3) |> tibble()
table(dat3$colour, dat3$group)
            1   2   3   4   5
  #00BA38   0   0   0  68   0
  #619CFF   0   0   0   0 124
  #F8766D  44  56  52   0   0
but it should still be possible to tell the two cases apart: each group still maps to a single colour.
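That observation can be turned into a rough check on the layer data alone. This is only a heuristic sketch, assuming the layer actually maps colour: it tests whether every group has exactly one colour, which holds for discrete colour scales and generally fails for continuous ones (where all rows fall into group -1). colour_constant_within_group() is a hypothetical helper name, and a continuous variable that happens to take a single value would be misclassified:

```r
library(ggplot2)

# Hypothetical heuristic: TRUE when every group in the layer data has a
# single colour, as expected when colour is driven by a discrete scale.
colour_constant_within_group <- function(p, layer = 1) {
  d <- layer_data(p, layer)
  # Is there exactly one distinct colour within each group?
  one_colour <- tapply(d$colour, d$group, function(x) length(unique(x)) == 1)
  all(one_colour)
}
```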
That said, this does sound a bit like an XY problem. Are you sure this is the right approach? What is the context?