I have a list of edges and I want to group the related nodes into connected components, so that each node gets the same group number as every node it is connected to, whether directly or through a chain of several edges. For example, given the links A-B, B-C, and D-E, I would expect A, B, and C to be in group 1 and D and E in group 2.
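To make the expected grouping concrete, here is a minimal base-R sketch on the toy links A-B, B-C, and D-E. `label_components` is a hypothetical helper written only for this illustration; it is not a function from tidygraph or igraph.

```r
# Hypothetical helper: label connected components by merging the groups
# of the two endpoints of every edge.
label_components <- function(from, to) {
  from <- as.character(from)
  to <- as.character(to)
  nodes <- unique(c(from, to))
  # Start with every node in its own group.
  label <- setNames(seq_along(nodes), nodes)
  for (i in seq_along(from)) {
    a <- label[[from[i]]]
    b <- label[[to[i]]]
    if (a != b) {
      # Relabel the whole group with the larger id so the two groups merge.
      label[label == max(a, b)] <- min(a, b)
    }
  }
  # Renumber the groups 1, 2, ... in order of first appearance.
  setNames(match(label, unique(label)), nodes)
}

groups <- label_components(c("A", "B", "D"), c("B", "C", "E"))
print(groups)
# A, B, and C share group 1; D and E share group 2.
```

Because each merge relabels an entire group, a single pass over the edge list is enough for this sketch; it is not tuned for large graphs.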
I'm running into a hiccup with my analysis, where some linked nodes are getting assigned different groups from each other. This should not be!
my_edges <- structure(list(row.x = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 13L, 14L, 14L, 17L), row.y = c(2L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 18L)), row.names = c(NA,
-15L), class = c("tbl_df", "tbl", "data.frame"))
# A tibble: 15 x 2
#    row.x row.y
#    <int> <int>
#  1     1     2
#  2     2     4
#  3     3     5
#  4     4     6
#  5     5     7
#  6     6     8
#  7     7     9
#  8     8    10
#  9     9    11
# 10    10    12
# 11    11    13
# 12    13    14
# 13    14    15
# 14    14    16
# 15    17    18
I see here that in the 6th row 6 and 8 are joined, and in the 9th row 9 and 11 are joined.
But when I run tidygraph::group_fast_greedy() below, the group membership seems to ignore those edges. How can I fix that?
library(tidygraph)  # also provides %>%, mutate(), and activate()

my_edges %>%
  as_tbl_graph(directed = FALSE) %>%
  mutate(group = group_fast_greedy()) %>%
  activate(nodes) %>%
  data.frame()
The output puts 6 and 8 into different groups from each other, and likewise 9 and 11, even though each pair is directly linked and should therefore share a group.
Output
   name group
1     1     3
2     2     3
3     3     2
4     4     3
5     5     2
6     6     3  # Why is this group...
7     7     2
8     8     4  # different than this one?
9     9     2  # And why is this group...
10   10     4
11   11     1  # different than this one?
12   13     1
13   14     1
14   17     5
15   12     4
16   15     1
17   16     1
18   18     5
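For comparison, the result I expect looks like a connected-components labelling. As far as I understand, group_fast_greedy() is a community-detection method (it wraps igraph's cluster_fast_greedy(), which optimises modularity), so it may be free to split one connected chain into several groups. Below is a sketch, assuming tidygraph::group_components() is the appropriate replacement; the `edges` and `components` names are mine, and the edge list is rebuilt as a plain data frame so the snippet runs on its own.

```r
library(tidygraph)

# Same edges as my_edges, rebuilt here so the snippet is self-contained.
edges <- data.frame(
  row.x = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 13L, 14L, 14L, 17L),
  row.y = c(2L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 18L)
)

components <- edges %>%
  as_tbl_graph(directed = FALSE) %>%
  # Label connected components instead of detecting communities.
  mutate(group = group_components()) %>%
  activate(nodes) %>%
  as_tibble()

print(components, n = Inf)
```

With this labelling, any two nodes joined by a path of edges should get the same group, so 6 and 8 would end up together, as would 9 and 11.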