I want to optimise a Box-Cox transformation on columns of a matrix (ie, a unique lambda for each column). So I wrote a function that includes the call to MASS::boxcox in order that it can be applied to each column easily. Except that I'm getting an error when calling the function. If I just extract a column of the matrix and run the code not in the function, it works. If I call the function either with an extracted column or in a call to apply I get an error (see the reprex below).
I'm sure I'm doing something silly, but I can't see what it is. Any help appreciated.
library(MASS)
# Find optimised Lambda for Boc-Cox transformation
BoxCoxLambda <- function(z){
b <- boxcox(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out = 61), plotit = FALSE)
b$x[which.max(b$y)] # best lambda
}
mrow <- 500
mcol <- 2
set.seed(12345)
dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol = mcol)
# Try it not using the BoxCoxLambda function:
dd1 <- dd[,1] # 1st column of dd
bb <- boxcox(lm(dd1+1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
print(paste0("1st column's lambda is ", bb$x[which.max(bb$y)]))
#> [1] "1st column's lambda is 0.2"
# Calculate lambda for each column of dd
lambdas <- apply(dd, 2, BoxCoxLambda, simplify = TRUE)
#> Error in eval(predvars, data, env): object 'z' not found
I'm afraid I was a bit naughty and cross-posted this to the r-help mailing list yesterday afternoon (desire for an answer overcame my reticence about etiquette).
Anyway, John Fox came up with a solution. The call to lm needed modifying to lm(z+1 ~ 1, y = TRUE) to ensure that the lm object contained the y variable (z) and didn't then go looking for it elsewhere.
library(MASS)
# Find optimised Lambda for Boc-Cox transformation
BoxCoxLambda <- function(z){
b <- boxcox(lm(z+1 ~ 1, y = TRUE), lambda = seq(-5, 5, length.out = 61), plotit = FALSE)
b$x[which.max(b$y)] # best lambda
}
mrow <- 500
mcol <- 2
set.seed(12345)
dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol = mcol)
# Try it not using the BoxCoxLambda function:
dd1 <- dd[,1] # 1st column of dd
bb <- boxcox(lm(dd1+1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
print(paste0("1st column's lambda is ", bb$x[which.max(bb$y)]))
#> [1] "1st column's lambda is 0.2"
# Calculate lambda for each column of dd
lambdas <- apply(dd, 2, BoxCoxLambda, simplify = TRUE)
print(lambdas)
#> [1] 0.1666667 0.1666667