tidygraph: edge data as list column

josiahparry · August 17, 2022, 4:08pm

I have a tidygraph and I want to have each location's neighborhood edge data as a list column.

I can easily retrieve the neighborhood as a list column using igraph::neighborhood() but I cannot so easily retrieve edge data in a nested list form.

Take the following example, where I retrieve the neighbors as a list column:

library(sf, quietly = TRUE)
#> Linking to GEOS 3.9.1, GDAL 3.2.3, PROJ 7.2.1; sf_use_s2() is TRUE
library(sfnetworks, quietly = TRUE)
library(tidygraph, quietly = TRUE)
#> 
#> Attaching package: 'tidygraph'
#> The following object is masked from 'package:stats':
#> 
#>     filter

net <- as_sfnetwork(roxel)

# get neighbors as a list column
mutate(net, nb = igraph::neighborhood(.G()))
#> # A sfnetwork with 701 nodes and 851 edges
#> #
#> # CRS:  EPSG:4326 
#> #
#> # A directed multigraph with 14 components with spatially explicit edges
#> #
#> # Node Data:     701 × 2 (active)
#> # Geometry type: POINT
#> # Dimension:     XY
#> # Bounding box:  xmin: 7.522622 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#>              geometry nb            
#>           <POINT [°]> <list>        
#> 1 (7.533722 51.95556) <igrph.vs [5]>
#> 2 (7.533461 51.95576) <igrph.vs [4]>
#> 3 (7.532442 51.95422) <igrph.vs [5]>
#> 4  (7.53209 51.95328) <igrph.vs [4]>
#> 5 (7.532709 51.95209) <igrph.vs [4]>
#> 6 (7.532869 51.95257) <igrph.vs [5]>
#> # … with 695 more rows
#> #
#> # Edge Data:     851 × 5
#> # Geometry type: LINESTRING
#> # Dimension:     XY
#> # Bounding box:  xmin: 7.522594 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#>    from    to name                  type                                geometry
#>   <int> <int> <chr>                 <fct>                       <LINESTRING [°]>
#> 1     1     2 Havixbecker Strasse   residential (7.533722 51.95556, 7.533461 51…
#> 2     3     4 Pienersallee          secondary   (7.532442 51.95422, 7.53236 51.…
#> 3     5     6 Schulte-Bernd-Strasse residential (7.532709 51.95209, 7.532823 51…
#> # … with 848 more rows

GOAL: I want associated edge data. The only way I've been able to do this is by creating this custom semi-janky function.

# get a variable nested as a list from edges
sfn_var_from_edges <- function(var) {
  e_df <- .E()
  x <- rlang::ensym(var)
  res <- tapply(e_df[[x]], e_df[["from"]], FUN = c)
  names(res) <- NULL
  res
}

This works, but only when each node has a from value in the edge data frame. In this case, many of the edges are mutual, so a node won't necessarily be recorded as both a from and a to value, and the result of tapply() will have fewer elements than there are nodes.

Using the above structure doesn't work because the sizing is wrong.

mutate(net, sfn_var_from_edges(name))
#> Error in `stopifnot()`:
#> ! Problem while computing `..1 = sfn_var_from_edges(name)`.
#> ✖ `..1` must be size 701 or 1, not 494.
#> Run `rlang::last_error()` to see where the error occurred.
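
The size mismatch can be reproduced without any graph packages. The following is a minimal base R sketch on a made-up edge table (the data and the node count `n_nodes` are assumptions for illustration, not taken from roxel). It shows that grouping on the raw from column drops nodes that never occur there, while grouping on a factor with one level per node keeps one element per node.

```r
# Toy edge table: node 3 never appears in `from`, node 4 has no edges at all.
edges <- data.frame(
  from = c(1L, 1L, 2L),
  to   = c(2L, 3L, 3L),
  name = c("a", "b", "c")
)
n_nodes <- 4L  # assumed node count for this toy example

# Grouping on the raw column only yields groups for values present in `from`.
short <- tapply(edges$name, edges$from, FUN = c, simplify = FALSE)
length(short)

# Grouping on a factor with one level per node keeps every node;
# nodes without outgoing edges get a zero-length vector.
by_node <- unname(split(edges$name, factor(edges$from, levels = seq_len(n_nodes))))
length(by_node)
```

Like the helper above, this only collects edges where the node is the from end, so it fixes the sizing but not the missing incoming edges.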

Here's a working example with the above function:

  library(dplyr)
  library(sfdep)
  library(tidygraph)
  library(sfnetworks)
  
  nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"))
  
  # get a variable nested as list (weights) from edges
  sfn_var_from_edges <- function(var) {
    e_df <- .E()
    x <- rlang::ensym(var)
    res <- tapply(e_df[[x]], e_df[["from"]], FUN = c)
    names(res) <- NULL
    res
  }
  
  # cast nc as an sfnetwork
  sfn <- nc |> 
    mutate(nb = st_contiguity(geometry)) |> 
    st_as_graph(nb)
  
  e_cols <- sfn |> 
    activate(edges) |> 
    # calculate edge length
    mutate(e_len = edge_length()) |> 
    # activate nodes to create nb and wt columns
    activate(nodes) |> 
    # create nb and wt columns
    mutate(wt = sfn_var_from_edges(e_len)) |> 
    # cast to tibble
    as_tibble() 

  head(e_cols$wt, 3)
#> [[1]]
#> Units: [m]
#> [1] 32956.24 37369.89 26405.30
#> 
#> [[2]]
#> Units: [m]
#> [1] 32956.24 40467.33 29276.62
#> 
#> [[3]]
#> Units: [m]
#> [1] 40467.33 41036.56 46884.54 24304.08 49931.95

Is anyone aware of a way to get nested edge data for tidygraph objects?

josiahparry · August 17, 2022, 4:39pm

It appears that to_unfolded_tree() may be a way to do that, but I cannot determine what root is or how to provide it.

nirgrahamuk · August 17, 2022, 4:53pm

I think this code would do it?

net <- as_sfnetwork(roxel)

net2 <- net %>% 
  mutate(neighborhood_edges = map_local(.f = function(neighborhood, ...) {
    igraph::get.edgelist(neighborhood)
  }))

I adapted this from the 'mapping over neighborhoods' section of Introducing tidygraph · Data Imaginist (data-imaginist.com).

Hmm, I guess it's not so simple, as the edges seem to be labelled from a local rather than a global viewpoint. Is there a way to name the edges globally and see whether those names repeat here? I don't know; I never work with graphs.

josiahparry · August 17, 2022, 6:20pm

This was super helpful!
igraph::get.edge.attribute() can be used to iterate over indexes (nb values).

Thanks!

josiahparry · August 17, 2022, 6:22pm

The solution is:

net <- as_sfnetwork(roxel)

mutate(net, 
       nb = igraph::neighborhood(.G()),
       names = purrr::map(
         nb,
         ~igraph::get.edge.attribute(.G(), "name", .x)
         )
       )

#>             geometry nb             names    
#>          <POINT [°]> <list>         <list>   
#> 1 (7.533722 51.95556) <igrph.vs [5]> <chr [5]>
#> 2 (7.533461 51.95576) <igrph.vs [4]> <chr [4]>
#> 3 (7.532442 51.95422) <igrph.vs [5]> <chr [5]>
#> 4  (7.53209 51.95328) <igrph.vs [4]> <chr [4]>
#> 5 (7.532709 51.95209) <igrph.vs [4]> <chr [4]>
#> 6 (7.532869 51.95257) <igrph.vs [5]> <chr [5]>

This is wrong: nb holds vertex indices, not edge indices. You need to get the adjacent edge list with get.adjedgelist() and pass that in as the index.
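
As a rough illustration of that correction, here is a minimal sketch on a small made-up igraph graph rather than the roxel network (the node ids and edge names are assumptions for the example). It uses igraph::as_adj_edge_list(), the current name for get.adjedgelist(), and igraph::edge_attr() in place of get.edge.attribute().

```r
library(igraph)

# Toy directed graph: n4 is isolated, so it has no adjacent edges.
edges <- data.frame(
  from = c("n1", "n1", "n2"),
  to   = c("n2", "n3", "n3"),
  name = c("a", "b", "c")
)
nodes <- data.frame(id = c("n1", "n2", "n3", "n4"))
g <- graph_from_data_frame(edges, directed = TRUE, vertices = nodes)

# One edge sequence per vertex, covering both incoming and outgoing edges.
adj <- as_adj_edge_list(g, mode = "all")

# Use each edge sequence as the index into the "name" edge attribute.
edge_names <- lapply(adj, function(es) edge_attr(g, "name", index = es))
lengths(edge_names)
```

Inside mutate() on a tidygraph or sfnetwork object, the same idea would presumably use .G() in place of g, mirroring the earlier attempt; that variant is untested here.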