I need to find clusters based on the following adjacency matrix:
library(dplyr)
# importing the adjacency matrix
adj_matrix <- read.table('https://raw.githubusercontent.com/sergiocostafh/adjmatrix_example/main/adj_m.txt', header = TRUE, check.names = FALSE) %>% as.matrix()
row.names(adj_matrix) <- colnames(adj_matrix)
Using the igraph package, I turn the matrix into a graph to perform the clustering. The ggraph package helps with visualization.
library(ggplot2)
library(ggraph)  # provides ggraph(), geom_edge_link0() and geom_node_point()
library(igraph)
# turning into a graph
grafo <- graph_from_adjacency_matrix(adj_matrix,'undirected')
# detecting clusters
fc <- cluster_walktrap(grafo)  # grafo is already undirected
# results to data.frame
ms <- data.frame(id=membership(fc)%>%names(),cluster=as.character(as.vector(membership(fc))))
# plot
ggraph(grafo)+
  geom_edge_link0(edge_colour = "grey66") +
  geom_node_point(aes(fill = ms$cluster), size = 5, shape = 21)
The procedure above ignores the node weights, but I need to take them into account and impose some constraints on the clusters.
The weights vector can be imported as follows:
# weights
w <- read.table('https://raw.githubusercontent.com/sergiocostafh/adjmatrix_example/main/weights.txt')[[1]]  # first column as a plain numeric vector
# adding the weights column to the dataset
ms$weight <- w
# calculating the total weight of each cluster
ms %>% group_by(cluster) %>% summarise(weight = sum(weight)) %>% arrange(-weight)
# A tibble: 12 x 2
   cluster weight
   <chr>    <dbl>
 1 2        429.
 2 1        351.
 3 6        330.
 4 3        325.
 5 5        194.
 6 7        120.
 7 4         80.9
 8 11        68.9
 9 10        57.4
10 8         53.6
11 9         42.0
12 12        32.9
The cluster totals range from 429 (cluster 2) down to 32.9 (cluster 12), but I need every cluster to satisfy the following constraints:
- Maximum cluster total weight: 400
- Minimum cluster total weight: 50
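A minimal sketch of how these two bounds could be checked for any clustering, using only base R. The helper name check_cluster_weights and the toy input are hypothetical and only serve as illustration; the bounds default to the 50 and 400 stated above.

```r
# Hypothetical helper (not from any package): sums node weights per cluster
# and flags clusters whose total falls outside [min_w, max_w].
check_cluster_weights <- function(cluster, weight, min_w = 50, max_w = 400) {
  totals <- tapply(weight, cluster, sum)  # one total per cluster label
  data.frame(
    cluster   = names(totals),
    weight    = as.vector(totals),
    too_heavy = as.vector(totals > max_w),
    too_light = as.vector(totals < min_w),
    stringsAsFactors = FALSE
  )
}

# Toy input, made up for illustration (not the data from this question)
toy <- check_cluster_weights(
  cluster = c("a", "a", "b", "c", "c"),
  weight  = c(300, 150, 30, 100, 120)
)
toy
```

With the objects from this question, the call would presumably be check_cluster_weights(ms$cluster, ms$weight), assuming the weights are stored as a plain numeric vector.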
I know about the cutat function, which lets us set the number of clusters, but it does not guarantee that the constraints are met.
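To illustrate that limitation, here is a rough sketch that cuts a walktrap dendrogram at several cluster counts with cut_at (the current igraph name for cutat) and records whether each cut respects the weight bounds. The helper scan_cuts, the Zachary toy graph, the random weights and the toy bounds are all assumptions for illustration, not the data from this question.

```r
library(igraph)

# Hypothetical helper: for each candidate number of clusters k, cut the
# hierarchical community structure and summarise the cluster weight totals.
scan_cuts <- function(fc, weights, ks, min_w = 50, max_w = 400) {
  rows <- lapply(ks, function(k) {
    memb   <- cut_at(fc, no = k)          # membership vector with k clusters
    totals <- tapply(weights, memb, sum)  # total node weight per cluster
    data.frame(
      k         = k,
      min_total = min(totals),
      max_total = max(totals),
      feasible  = all(totals >= min_w & totals <= max_w)
    )
  })
  do.call(rbind, rows)
}

# Toy example: built-in Zachary karate club graph with made-up node weights
g <- make_graph("Zachary")
set.seed(1)
toy_w  <- runif(vcount(g), min = 1, max = 10)
toy_fc <- cluster_walktrap(g)
res <- scan_cuts(toy_fc, toy_w, ks = 2:8, min_w = 20, max_w = 80)
res
```

A scan like this only reports which cuts happen to be feasible. It cannot repair an infeasible one, because cut_at can only split or merge along the existing dendrogram.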
Perhaps there is another package better suited to this type of problem, but I am not aware of one.
Any help in solving this problem will be appreciated.