R freezes with cluster_infomap()

I am having trouble with the community detection algorithm in the igraph library.
I create a graph object of moderate size (under 100,00 nodes and the same order of magnitude of edges) from a data frame. Then I work with the strongly connected component of the network (GCC).
When I run cluster_infomap(), R freezes, whether in RStudio or on Posit Cloud: I cannot run anything, cannot stop it, and it never returns a value. After running other commands I get the error:

Error: Status code 405 returned by RStudio Server when executing 'console_input'

which I think is unrelated to the issue.
The memory usage shows there's plenty of RAM available, so it doesn't seem to be a memory issue.
Also, I previously used the same code and the same datasets without any problems; I am hitting this issue now that I want to reproduce the results and analyse them further.

What I do is the following:

library(readxl)
library(igraph)

setwd("/cloud/project/2006")

files <- list.files()
# Read the second sheet of one workbook
mergread <- function(file) {
    read_xlsx(file, sheet = 2)
}
DataList <- vector(mode = "list", length = 10)
DataList[[1]] <- do.call(rbind, lapply(files, mergread))

GraphList <- vector(mode = "list", length = 10)
for (j in 1) {
    nrows <- nrow(DataList[[j]])
    # Where column ...1 is NA, reuse the ID (column 3) from the last row where it was not NA
    BvDs <- c()
    for (i in 1:nrows) {
        if (is.na(DataList[[j]]$...1[i])) {
            BvDs[i] <- BvDfiller
        } else {
            BvDs[i] <- as.character(DataList[[j]][i, 3])
            BvDfiller <- as.character(DataList[[j]][i, 3])
        }
    }

    Edges <- as.data.frame(matrix(NA, ncol = 2, nrow = 0))

    # One edge per row that has a non-missing value in column 7
    for (i in 1:nrows) {
        if (!is.na(DataList[[j]][i, 7])) {
            Edges <- rbind(Edges, c(as.character(DataList[[j]][i, 7]), BvDs[i]))
        }
    }

    GraphList[[j]] <- graph_from_edgelist(as.matrix(Edges, ncol = 2))
}
# Remove the placeholder vertex
GraphList[[1]] <- GraphList[[1]] - "No data fulfill your filter criteria"

# Keep only the largest (weakly) connected component
GCCList <- vector(mode = "list", length = 10)
for (j in 1) {
    Comps <- components(GraphList[[j]], mode = "weak")
    GCC <- which.max(Comps$csize)
    GCCList[[j]] <- induced_subgraph(GraphList[[j]], which(Comps$membership == GCC))
}
Com2006 <- cluster_infomap(GCCList[[1]])

PS1: Previously I thought the problem was related to a command after the community detection; I have edited and updated the post accordingly.
PS2: The for (j) loops and the lists of length 10 are there to repeat the same steps for 10 years, but here it is done for only one year, in the interest of time.
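
As a side note on the code above: the two row-by-row loops (the forward fill of the ID and the repeated rbind) can be written without loops, which tends to be much faster on large tables. The following is only a minimal sketch in base R under stated assumptions: build_edges and the argument names marker, id and partner are hypothetical, standing in for column ...1, column 3 and column 7 of the merged sheet, and the toy vectors are made up for illustration.

```r
# Hypothetical helper: a loop-free version of the fill-and-collect step.
# marker  ~ column ...1 (NA on continuation rows)
# id      ~ column 3    (the ID to carry forward)
# partner ~ column 7    (the other end of the edge, NA if none)
build_edges <- function(marker, id, partner) {
  # Index of the most recent row whose marker is not NA (0 if none so far)
  last_seen <- cummax(ifelse(is.na(marker), 0L, seq_along(marker)))
  # Rows before the first marker get NA (the original loop would stop with an error there)
  last_seen[last_seen == 0L] <- NA
  filled <- as.character(id)[last_seen]
  keep <- !is.na(partner)
  cbind(from = as.character(partner)[keep], to = filled[keep])
}

# Made-up toy data
marker  <- c(1, NA, NA, 2, NA)
id      <- c("X1", NA, NA, "X2", NA)
partner <- c("P1", "P2", NA, "P3", "P4")

edges <- build_edges(marker, id, partner)
print(edges)
```

The result is a two-column character matrix, the same shape that the original code passes to graph_from_edgelist(), so it could be dropped in at that point. Whether this changes the freeze in cluster_infomap() is not something this sketch can show.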

The Posit Cloud link is not public, so we can't access it to take a look.

Maybe the data frame is too big for your PC to handle.

The network2.R script has a single cbind over standalone static code; it certainly works fine in isolation.
If there is some mysterious bug when you do other things first, we would need to be able to reproduce those things. The network2.R script is sadly not in a state to run out of the box (there is no library statement to facilitate the readxl calls). OK, I can add that in myself, but then:

> DataList[[1]] <- do.call (rbind, lapply(files, mergread) )
Error in utils::unzip(zip_path, list = TRUE) : zip file '/cloud/project/2006' cannot be opened

at this point my will to investigate is falling away....
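
One possible reading of the unzip error above (an assumption, not something confirmed in this thread) is that read_xlsx was handed a path that is not an .xlsx workbook, for example a directory. A minimal sketch of building the file list more defensively, using only base R; the folder and file names below are made up and created in a temporary directory purely for illustration:

```r
# Hypothetical folder layout, created in a temp directory for the example
data_dir <- file.path(tempdir(), "example_2006")
dir.create(data_dir, showWarnings = FALSE)
file.create(file.path(data_dir, c("a.xlsx", "b.xlsx", "notes.txt")))
dir.create(file.path(data_dir, "subfolder"), showWarnings = FALSE)

# Keep only .xlsx files and return full paths, so no setwd() is needed
files <- list.files(data_dir, pattern = "\\.xlsx$", full.names = TRUE)
print(basename(files))
```

With full paths in files, the later lapply(files, mergread) call would no longer depend on the working directory.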

Error: object 'graph_from_edgelist' not found

Perhaps you can run your early, time-consuming steps once and use the qs package to save their output to disk, from where it can be reloaded quickly later. This will cut down the time needed to get back to that stage of the data (which, I assume, runs OK if you have the correct libraries loaded).
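
A minimal sketch of that save-and-reload idea, assuming the qs package is installed (qs::qsave and qs::qread); if it is not, the sketch falls back to base R's saveRDS and readRDS. The edges data frame and the cache file name are made-up stand-ins for the real, slow-to-build object.

```r
# Made-up stand-in for the output of the slow steps
edges <- data.frame(from = c("A", "B", "C"),
                    to   = c("B", "C", "A"),
                    stringsAsFactors = FALSE)

# Hypothetical cache location
cache_file <- file.path(tempdir(), "edges_2006.qs")

if (requireNamespace("qs", quietly = TRUE)) {
  qs::qsave(edges, cache_file)       # write once, after the slow steps
  restored <- qs::qread(cache_file)  # read back in later sessions
} else {
  # Base R fallback when qs is not available
  rds_file <- sub("\\.qs$", ".rds", cache_file)
  saveRDS(edges, rds_file)
  restored <- readRDS(rds_file)
}

print(all.equal(edges, restored))
```

In a real session, the save would go right after the edge list or graph is built, and later sessions would start from the read step instead of re-reading the Excel files.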
