Strategies for Optimizing/Speeding up geom_sf

katchamp · July 13, 2023, 10:01pm

Hi there

I am trying to use ggplot and geom_sf to create a map but the performance is terrible. It takes over a minute to generate one basic plot. I am looking for ways to make this run quicker so I am not wasting so much time as I iterate through plots.

I am working with a multi polygon shape file that is around 600 MB and has ~24,000 polygons in it. I have tried running st_simplify(map_file, dTolerance = 1000) which bring the size down to 400 MB, but ggplot still take a minute to render and I am getting visible distortions in my map.

Here is my super basic plot code that take 60 seconds to return

ggplot(map_file) +
  geom_sf(aes(fill = weight)) +
  theme_void()

Any ideas on how to make geom_sf run faster would be much appreciated.

technocrat · July 14, 2023, 12:01am

What proportion of the small polygons can you throw overboard before losing too much visual quality? You've already tried as much dTolerance as will help.

The next step is to see whether there's something you can do with creating the plot object to speed things up, possibly, or whether it's all down to the device driver. This might help seeing where to point the finger.

library(ggplot2)

# Create a sample plot
p <- ggplot(mtcars, aes(x = mpg, y = disp)) + geom_point()

# Measure the time taken to build the ggplot2 object (grid)
grid_time <- system.time({
  g <- ggplotGrob(p)
})

# Measure the time taken to open the device and render the plot
device_time <- system.time({
  png(filename = "output.png", width = 800, height = 600)
  grid::grid.draw(g)
  dev.off()
})

# Print the time taken for ggplot2/grid and the device driver
paste("Time taken for ggplot2/grid: ", grid_time[3], "seconds")
#> [1] "Time taken for ggplot2/grid:  0.034 seconds"
paste("Time taken for the device driver: ", device_time[3], "seconds")
#> [1] "Time taken for the device driver:  0.0209999999999999 seconds"

^{Created on 2023-07-13 with reprex v2.0.2}

technocrat · July 14, 2023, 2:10am

Should have added that png will be the fastest option to file, being a raster.

williaml · July 14, 2023, 5:14am

Maybe you could simplify the polygon?

rmapshaper works better than sf for this in terms of keeping the shape intact:

rmapshaper Basics (r-project.org)

katchamp · July 17, 2023, 9:12pm

This was a good idea, but unfortunately everything time I tried rmapshaper on my shape file it crashed my R code.

katchamp · July 17, 2023, 9:20pm

technocrat:

p <- ggplot(mtcars, aes(x = mpg, y = disp)) + geom_point()

# Measure the time taken to build the ggplot2 object (grid)
grid_time <- system.time({
  g <- ggplotGrob(p)
})

# Measure the time taken to open the device and render the plot
device_time <- system.time({
  png(filename = "output.png", width = 800, height = 600)
  grid::grid.draw(g)
  dev.off()
})

# Print the time taken for ggplot2/grid and the device driver
paste("Time taken for ggplot2/grid: ", grid_time[3], "seconds")
#> [1] "Time taken for ggplot2/grid:  0.034 seconds"
paste("Time taken for the device driver: ", device_time[3], "seconds")

Unfortunately I can not remove my small entirely polygons because those are the polygons I am trying to map. I ran your code and it looks like my main issue is with the rendering of the plot.

system · August 7, 2023, 9:20pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.