How to trim/crop spatial vector based on another one?

Hello,

I am trying to crop/trim a spatial vector based on another. The code is as follows, and comments explain what is the purpose:

# libraries we need
libs <- c(
  "tidyverse", "sf", "osmdata",
  "terra"
)

# install missing libraries
installed_libs <- libs %in% rownames(installed.packages())
if (any(installed_libs == F)) {
  install.packages(libs[!installed_libs])
}

# load libraries
invisible(lapply(libs, library, character.only = T))

# 2. GET LISBON BOUNDARIES FROM OSM DATA - Retrieve OSM data
#--------------------------------------
city <- "Lisbon, Portugal"
# define longlat projection
crsLONGLAT <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"

# retrieves a list
lisbon_city_border <- osmdata::getbb(
  city,
  format_out = "sf_polygon"
)

lisbon_border <- lisbon_city_border[[1]] |> # lisbon_city_border[[2]] : Lisbon district
  sf::st_set_crs(crsLONGLAT) |>
  sf::st_transform(crsLONGLAT)

# check if its everything right
terra::plot(lisbon_border)

# Problems to solve:
# Problem 1 - If I use trim_osmdata(lisbon_border) certain polygons disappear where they were intended to exist
# Problem 2 - It was suppose to have only "park", "nature_reserve", "golf_course, but "playground", "fitness", pitch, dog, also appear

# Get the data from OSM
greensp_osm_query <-
  # start query, input bounding box
  opq(bbox = lisbon_city_border) %>%
  # we want to extract "leisure" features that are park, nature reserve or a golf course
  add_osm_feature(key = "leisure",
                  value = c("park", "nature_reserve", "golf_course")) %>%
  # query OSM and return as simple features (sf)
  osmdata_sf() 

# Convert POLYGON to MULTIPOLYGON and bind into one sf object. Both contain information.
greensp_sf_box <- bind_rows(st_cast(greensp_osm_query$osm_polygons, "MULTIPOLYGON"),
                        greensp_osm_query$osm_multipolygons) %>%
  select(name, osm_id, leisure)

terra::plot(greensp_sf_box["leisure"])

# How to limit data to Lisbon boundaries instead of the rectangular bounding box?
# ----> Some polygons disappear .......

greensp_osm <- greensp_osm_query %>%
  trim_osmdata(lisbon_border) # Applying makes polygons disappear

# Convert POLYGON to MULTIPOLYGON and bind into one sf object. Both contain information.
greensp_sf <- bind_rows(st_cast(greensp_osm$osm_polygons, "MULTIPOLYGON"),
                        greensp_osm$osm_multipolygons) %>%
  select(name, osm_id, leisure)

terra::plot(greensp_sf["leisure"]) # More than the values selected are presented?

Basically, the plots I get are the following:
1)


2) While trimmed there are polygons that disappear. As you may notice by this plot:

What am I doing wrong?
Another question: while retrieving values from OSM, more specifically, "park", "nature_reserve", "golf_course", why does other values appear ?

I would greatly appreciate your help in solving these problems.
Thank you very much,
Luis Barqueira

It's been just over 40 years since I've been to Lisbon, so I'm hazy on the ground truth—seems like there should be more within the city boundaries.

# libraries we need
libs <- c("sf", "osmdata","ggplot2")

# load libraries
invisible(lapply(libs, library, character.only = TRUE))
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
#> Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright

# 2. GET LISBON BOUNDARIES FROM OSM DATA - Retrieve OSM data
city <- "Lisbon, Portugal"
# longlat projection already set in getbb()

# retrieves a list
lisbon_city_border <- osmdata::getbb(
  city,
  format_out = "sf_polygon"
)

lisbon_border <- lisbon_city_border[[1]]

# Get the data from OSM
greensp_osm_query <-
  # start query, input bounding box
  opq(bbox = lisbon_city_border) %>%
  # we want to extract "leisure" features that are park, nature reserve or a golf course
  add_osm_feature(key = "leisure",
                  value = c("park", "nature_reserve", "golf_course")) %>%
  # query OSM and return as simple features (sf)
  osmdata_sf()
#> bb_poly has more than one polygon; the first will be selected.

# Convert POLYGON to MULTIPOLYGON and bind into one sf object. Both contain information.
greensp_sf_box <- dplyr::bind_rows(st_cast(greensp_osm_query$osm_polygons, "MULTIPOLYGON"),
  greensp_osm_query$osm_multipolygons) |>
  dplyr::select(name, osm_id, leisure)

# limit to park and nature reserve features;
# otherwise problem with leisure coded NA 

limited <- greensp_sf_box$leisure == "park" | 
           greensp_sf_box$leisure == "nature_reserve"
short_green <- greensp_sf_box[limited,]
lisbon_boundary <- st_boundary(lisbon_border)
sf_use_s2(FALSE)
#> Spherical geometry (s2) switched off
sel_logical <- st_intersects(short_green,lisbon_boundary,sparse = FALSE)[, 1]
#> although coordinates are longitude/latitude, st_intersects assumes that they
#> are planar

combo <- short_green[sel_logical, ]
ggplot(lisbon_boundary) + 
  geom_sf() +
  geom_sf(mapping = aes(fill = leisure), data = combo) + 
  theme_minimal()

Created on 2023-05-29 with reprex v2.0.2

Thanks @technocrat
Lisbon is very different today than it was 40 years ago, but it's still very beautiful.

It seems like you are getting only the "nature_reserve" and "park" that intersects with the boundary of Lisbon. There are parks that exist inside the boundary but do not show up.

Thank you

1 Like

Ah, I see now. You might try st_within. Didn’t think there could be that few parks.

I`ve tried with:

sel_logical <- st_within(short_green,lisbon_boundary,sparse = FALSE)[, 1]

and unfortunately does not work.

There seems to be an issue with spatial validity of some of your polygons; this unfortunately can happen with OSM results. Usually sf::st_make_valid() helps - although not always, and a care (+ a visual check) is required.

Consider this piece of code; what it does is:

  • makes sure the greensp_sf_box object is spatially valid
  • adds a new column to the data frame called "in_lisbon", it is TRUE if any part of the polygon intersects (even partially, though it seems not to be an issue) with city borders and FALSE otherwise
  • I use the new variable for plotting (the said visual check) but it can be easily fed to a dplyr::filter() to get rid of unwanted greenery...
# Convert POLYGON to MULTIPOLYGON and bind into one sf object. Both contain information.
greensp_sf_box <- bind_rows(st_cast(greensp_osm_query$osm_polygons, "MULTIPOLYGON"),
                            greensp_osm_query$osm_multipolygons) %>%
  select(name, osm_id, leisure) %>% 
  st_make_valid()


greensp_sf_box$in_lisbon <- st_intersects(greensp_sf_box, lisbon_border, sparse = F)[,1]

plot(lisbon_border, border = "red")
plot(greensp_sf_box["in_lisbon"], add = T, border = NA)

2 Likes

@jlacko Thank you very much

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.