First, let's simplify a bit the example:

This works:

```
init_dta <- mtcars$disp
fit <- lm(init_dta ~ 1)
bc <- boxcox(fit)
lambda <- bc$x[bc$y == max(bc$y)]
result <- (init_dta^lambda-1)/lambda
```

But this doesn't work:

```
my_lm <- function(.x){
lm(.x)
}
init_dta <- mtcars$disp
fit2 <- my_lm(init_dta ~ 1)
bc <- boxcox(fit2)
#> Error in stats::model.frame(formula = .x, drop.unused.levels = TRUE): object '.x' not found
```

We can look at the traceback:

```
10. stats::model.frame(formula = .x, drop.unused.levels = TRUE)
9. eval(mf, parent.frame())
8. eval(mf, parent.frame())
7. lm(formula = .x, y = TRUE, qr = TRUE)
6. eval(call, parent.frame())
5. eval(call, parent.frame())
4. update.default(object, y = TRUE, qr = TRUE, ...)
3. update(object, y = TRUE, qr = TRUE, ...)
2. boxcox.lm(fit2)
1. boxcox(fit2)
```

So, the problem comes from the call to `update()`

, when running the boxcox function, it is trying to re-run the initial fit, but can not find the variables. You can check the source code of `boxcox.lm()`

to see that call. And we can look at our lm object:

```
fit2
#>
#> Call:
#> lm(formula = .x)
#>
#> Coefficients:
#> (Intercept)
#> 230.7
```

Here the important part is the `Call`

: upon running, `lm()`

captured the exact expression that was used; but since we called it within a function, the `formula`

argument is meaningless. Hence, the later call to `update()`

fails.

So, let's go back to the source code of `boxcox.lm()`

(you can display it with `View(boxcox)`

, or ctrl+click on function name, or F2 while on function name, use the menu on top to select `boxcox.lm`

). It looks like this:

```
function (object, lambda = seq(-2, 2, 1/10), plotit = TRUE,
interp = (plotit && (m < 100)), eps = 1/50, xlab = expression(lambda),
ylab = "log-Likelihood", ...)
{
m <- length(lambda)
if (is.null(object$y) || is.null(object$qr))
object <- update(object, y = TRUE, qr = TRUE, ...)
result <- NextMethod()
if (plotit)
invisible(result)
else result
}
```

So, the call to `update()`

only happens when `y`

and `qr`

are not present in the lm object. Let's look back at the arguments of the `lm()`

function:

```
lm(formula, data, subset, weights, na.action,
method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE,
singular.ok = TRUE, contrasts = NULL, offset, ...)
```

By default, `qr = TRUE`

, but `y = FALSE`

, so when calling boxcox, it can't find the data for `y`

, and thus tries to recompute the `lm`

fit. But since the `call`

contains references to variable names instead of the names in the data, it fails to rerun. We finally get our working solution:

```
library(MASS)
# Box-Cox-Normalization of a numerical vector
boxcox_transform <- function(x) {
# Note that x~1 counts as a linear model in R, so you can use the
# boxcox function from MASS
bc <- boxcox(lm(x ~ 1, y = TRUE))
# lambda that maximises the likelihood
lambda <- bc$x[bc$y == max(bc$y)]
# print lambda and 95% confidence limits
# lambda
# range(out$x[out$y > max(out$y)-qchisq(0.95,1)/2])
# Box-Cox-transformed variable
result <- (x^lambda-1)/lambda
# histogram
# histogram(result)
# return transformed variable
return(result)
}
res1 <- boxcox_transform(mtcars$disp)
init_dta <- mtcars$disp
fit <- lm(init_dta ~ 1)
bc <- boxcox(fit)
lambda <- bc$x[bc$y == max(bc$y)]
res2 <- (init_dta^lambda-1)/lambda
waldo::compare(res1, res2)
#> v No differences
```