Subsetting on matrix field in R6 object is slow

Jwaage · January 29, 2020, 8:53pm

Hello,
I'm creating a matrix that holds a value for each combination of customers (rows) and features (cols). Let's call this matrix NBA. Based on data received through an API, this matrix needs to be updated many times each second, inserting a new value on each call with m[x,y] <- new_value. Subsequently, some matrix operations are carried out (not important here).
The matrix is part of an R6 object as a private field, and an update_matrix method allows updating a single cell of the matrix. However, this operation is very slow compared to updating a normal matrix object outside R6: on the order of milliseconds instead of nanoseconds.
Reprex:

library(bench)
library(ggplot2)
library(tidyr)

# Create NBA matrix, sparse
no_customers <- 1e6
no_features <- 30

NBA_matrix <-
  matrix(
    sample(c(rep(0, 1000), 1), size = no_customers * no_features, replace = TRUE),
    nrow = no_customers,
    ncol = no_features
  )

# Create NBA_like R6 object with matrix
library(R6)
NBA_lite <- R6Class("NBA_lite",
  public = list(
    mm = NULL,
    initialize = function(input_matrix) self$mm <- input_matrix,
    get_matrix = function() self$mm,
    modify = function(row, col, value) self$mm[row, col] <- value
  )
)
new_NBA_lite <- NBA_lite$new(input_matrix = NBA_matrix)

# Benchmark modifying single value, matrix vs R6 field
bench::mark(matrix     = NBA_matrix[234123, 10] <- 2,
            R6_field   = new_NBA_lite$modify(row = 234123, col = 10, value = 2))

expression	median	total_time
NBA_matrix[234123, 10] <- 2	804ns	8.84ms
new_NBA_lite$modify(row = 234123, col = 10, value = 2)	126ms	125.66ms

So, obviously, there's some overhead in using R6, but surely not on the order of 125ms. Am I missing something about copy semantics when a (semi-)large matrix is used as a field?
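
One way to probe the copy-semantics question is to time the same single-cell update at several matrix sizes. This is only a rough diagnostic sketch, not a confirmed explanation: the class name NBA_small, the helper time_update, and the chosen sizes are hypothetical and not part of my real code. If the per-update time grows roughly in proportion to the number of rows, that would suggest the whole matrix is being copied on each assignment; if it stays flat, the overhead lies elsewhere.

```r
library(R6)

# Hypothetical scaled-down version of the class from the reprex
NBA_small <- R6Class("NBA_small",
  public = list(
    mm = NULL,
    initialize = function(input_matrix) self$mm <- input_matrix,
    modify = function(row, col, value) {
      self$mm[row, col] <- value
      invisible(self)
    }
  )
)

# Average time of a single-cell update for a matrix with n_rows rows
time_update <- function(n_rows, n_cols = 30, reps = 5) {
  obj <- NBA_small$new(matrix(0, nrow = n_rows, ncol = n_cols))
  obj$modify(1, 1, 1)  # warm-up call, not timed
  elapsed <- system.time(
    for (i in seq_len(reps)) obj$modify(1, 1, i)
  )[["elapsed"]]
  elapsed / reps
}

# Assumed sizes, kept small so the check runs quickly
sizes <- c(1e3, 1e4, 1e5)
timings <- vapply(sizes, time_update, numeric(1))
print(data.frame(n_rows = sizes, seconds_per_update = timings))
```

The timings will vary by machine and R version, so only the trend across sizes is meaningful here.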

Thanks for any suggestions,
JW
