Weird behavior of magrittr and sf

I feel like I must be missing something really simple, but I do not understand why these seemingly equivalent constructions are giving different results. Pulling my hair out...

> Quakes_2 %>% sf::st_buffer(Radius)
Error in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle = endCapStyle,  : 
  object 'Radius' not found
> sf::st_buffer(Quakes_2, Quakes_2$Radius)
Simple feature collection with 3 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -97.39488 ymin: 31.32648 xmax: -94.1949 ymax: 32.68898
Geodetic CRS:  WGS 84
# A tibble: 3 × 5
  Date       Magnitude Depth                                                                                 geometry Radius
* <date>         <dbl> <dbl>                                                                            <POLYGON [°]>  <dbl>
1 2018-05-19       4     7.1 ((-97.03103 32.34148, -97.02834 32.34163, -97.02565 32.34178, -97.02565 32.3446, -97.02… 21300 
2 2018-09-04       3.8   8.6 ((-94.69155 31.91214, -94.69155 31.90647, -94.69155 31.90079, -94.69155 31.89511, -94.6… 19817.
3 2019-01-20       3.6   5.6 ((-94.1949 31.4229, -94.1949 31.42575, -94.1949 31.42717, -94.19618 31.42713, -94.19618…  9669.

Perhaps I'm missing something. The code

Quakes_2 %>% sf::st_buffer(Radius)

is equivalent to

sf::st_buffer(Quakes_2, Radius)

The function st_buffer will look for an object named Radius in the environment from which the function was called and not find it. You could call

Quakes_2 %>% sf::st_buffer(.$Radius)

where the . is a placeholder for the object on the left hand side of %>%. I'm not positive that will work, but give it a try.
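
For what it's worth, here is a minimal sketch you could use to test that form; the points and the Radius values below are made-up stand-ins for your Quakes_2 object:

library(sf)
library(magrittr) # provides %>% and the . placeholder

pts <- st_as_sf(
   data.frame(lon = c(-97, -94.7), lat = c(32.3, 31.9), Radius = c(21300, 19817)),
   coords = c("lon", "lat"), crs = 4326
)

pts %>% st_buffer(.$Radius) # . is the piped-in object, so .$Radius is its column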

I believe you are confusing sf::st_buffer() with functions from {dplyr} and its tidyverse ilk. Those implement a data masking approach, which is, technically speaking, non-standard evaluation, even though "non-standard" is a bit of a tricky word given how widespread the tidyverse has become compared to base R.
Given that we are talking in Posit's house, I will stop here :slight_smile:

In any case this will work; note that it was necessary to access the mag (short for magnitude) column via the dollar sign notation, which is the "standard" evaluation approach.

library(sf)
library(magrittr) # provides the %>% pipe and the . placeholder

sf_quakes <- quakes %>% # data frame that lives in {datasets}
   head() %>% 
   st_as_sf(coords = c("long", "lat"), crs = 4326) # from regular data frame to sf object

# .$mag is the mag column of the piped-in object ("standard" evaluation)
sf_quakes %>% 
   st_buffer(.$mag * 10000) %>% 
   mapview::mapview()

Most sf functions work just fine in a pipeline. I use them all the time, and Robin Lovelace's book has several examples of sf functions being used as part of a pipeline.

I fully agree, but with a comment: your issue is not with the use of the pipeline per se, but with the fact that sf::st_buffer() does not subscribe to the {dplyr} idea of data masking (as discussed in my earlier reply), so when given Radius as the buffer size it will look for an object named Radius in your environment, not for a column named Radius in the data frame that was piped to it.

Which is why both of these work:

  • sf::st_buffer(Quakes_2, Quakes_2$Radius) = the Radius column of the Quakes_2 data frame accessed via the dollar sign, as per base R semantics
  • Quakes_2 %>% sf::st_buffer(.$Radius) = the Radius column of the data frame piped in via the {magrittr} pipe operator, using the dot shortcut for the current left-hand-side value, as per {magrittr} semantics
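
And if you would rather stay in the data masking world, one more option (a sketch, assuming your geometry column is called geometry, as in your printout) is to do the buffering inside dplyr::mutate(), where Radius is resolved against the data frame:

library(sf)
library(dplyr)

Quakes_2 %>% 
   mutate(geometry = st_buffer(geometry, Radius)) # Radius found via data masking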
