Error using MASS::boxcox within a function

ron · July 8, 2023, 7:09am

Hi,

I want to optimise a Box-Cox transformation on columns of a matrix (ie, a unique lambda for each column). So I wrote a function that includes the call to MASS::boxcox in order that it can be applied to each column easily. Except that I'm getting an error when calling the function. If I just extract a column of the matrix and run the code not in the function, it works. If I call the function either with an extracted column or in a call to apply I get an error (see the reprex below).

I'm sure I'm doing something silly, but I can't see what it is. Any help appreciated.

library(MASS)

# Find optimised Lambda for Boc-Cox transformation
BoxCoxLambda <- function(z){
    b <- boxcox(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out = 61), plotit = FALSE)
    b$x[which.max(b$y)]    # best lambda
}

mrow <- 500
mcol <- 2
set.seed(12345)
dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol = mcol)

# Try it not using the BoxCoxLambda function:
dd1 <- dd[,1] # 1st column of dd
bb <- boxcox(lm(dd1+1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
print(paste0("1st column's lambda is ", bb$x[which.max(bb$y)]))
#> [1] "1st column's lambda is 0.2"

# Calculate lambda for each column of dd
lambdas <- apply(dd, 2, BoxCoxLambda, simplify = TRUE)
#> Error in eval(predvars, data, env): object 'z' not found

^{Created on 2023-07-08 with reprex v2.0.2}

Thanks for your time and help.

Ron.

ron · July 9, 2023, 9:07am

Hi.

I'm afraid I was a bit naughty and cross-posted this to the r-help mailing list yesterday afternoon (desire for an answer overcame my reticence about etiquette).

Anyway, John Fox came up with a solution. The call to lm needed modifying to lm(z+1 ~ 1, y = TRUE) to ensure that the lm object contained the y variable (z) and didn't then go looking for it elsewhere.

library(MASS)

# Find optimised Lambda for Boc-Cox transformation
BoxCoxLambda <- function(z){
    b <- boxcox(lm(z+1 ~ 1, y = TRUE), lambda = seq(-5, 5, length.out = 61), plotit = FALSE)
    b$x[which.max(b$y)]    # best lambda
}

mrow <- 500
mcol <- 2
set.seed(12345)
dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol = mcol)

# Try it not using the BoxCoxLambda function:
dd1 <- dd[,1] # 1st column of dd
bb <- boxcox(lm(dd1+1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
print(paste0("1st column's lambda is ", bb$x[which.max(bb$y)]))
#> [1] "1st column's lambda is 0.2"

# Calculate lambda for each column of dd
lambdas <- apply(dd, 2, BoxCoxLambda, simplify = TRUE)
print(lambdas)
#> [1] 0.1666667 0.1666667

^{Created on 2023-07-09 with reprex v2.0.2}

Thanks to anybody that took a look at this for me.

Ron.

system · July 16, 2023, 9:07am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.