Hi there, I'm attempting to use one model to calibrate another one. I need to use the data from the Vercoe site to validate the data on the Terapa site.
slm <- lm(observed ~ x, data = vercoe)
summary(slm)$coef
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.38060075 0.444198479 0.8568259 4.003808e-01
x 0.02305857 0.003339521 6.9047544 4.864890e-07
slm2 <- lm(observed ~ x, data = terapa)
summary(slm2)$coef
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.0329589 0.365102194 5.568191 6.936300e-07
x 0.0125283 0.001024675 12.226606 1.079644e-17
That adds a column with the correct predictions, but what I need is to change the fitted values of slm2 itself. I've also tried setting the coefficients manually, but the fitted values remain unchanged after that.
The other problem I face is that the Vercoe site has only 25 observations and the Terapa site has 60. How can I apply the coefficients from slm to slm2 to find the fitted values?
My way only gets me to the predicted values but the residuals are obviously not what I'm looking for.
Your code produces predicted values using the coefficients from slm, derived from the vercoe data set, but using the x values from terapa. The fact that vercoe has fewer data points than terapa does not matter. The fit coefficients are from slm and the x values are from terapa. You say that instead of those predicted values, you want to change the fitted values of slm2. That is the part I do not understand. You can make predictions with the slm fit (as you did) or with slm2. What is the third alternative you are looking for?
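To illustrate why the differing row counts do not matter, here is a minimal sketch. The real vercoe and terapa data are not shown in this thread, so vercoe_demo, terapa_demo and slm_demo are hypothetical simulated stand-ins with 25 and 60 rows.

```r
# Hypothetical stand-in data; the numbers are simulated, not the real site data.
set.seed(1)
vercoe_demo <- data.frame(x = runif(25, 50, 200))
vercoe_demo$observed <- 0.4 + 0.023 * vercoe_demo$x + rnorm(25, sd = 0.5)
terapa_demo <- data.frame(x = runif(60, 50, 600))
terapa_demo$observed <- 2.0 + 0.0125 * terapa_demo$x + rnorm(60, sd = 0.5)

# Fit on the 25-row data set only
slm_demo <- lm(observed ~ x, data = vercoe_demo)

# predict() only needs the x column of newdata and returns one value per row
pred <- predict(slm_demo, newdata = terapa_demo)
length(pred)  # 60

# The same predictions by hand: intercept + slope * x
b <- coef(slm_demo)
by_hand <- b[["(Intercept)"]] + b[["x"]] * terapa_demo$x
all.equal(unname(pred), by_hand)
```

The coefficients are just two numbers, so they can be applied to any number of x values.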
If I understand correctly, the fitted values are the predicted values of an lm. You're correct that my current method takes the slm coefficients and combines them with the "x" from the Terapa data to produce a prediction column, NOT new fitted values. Those remain the same in slm2. What I would like to do is force the slm2 model to actually use the coefficients from slm when deriving its fitted values.
My current method prevents me from using the residuals functions (residuals(slm2) and gglm::gglm(slm2)), because when I run them they keep using the original coefficient values, not the ones I'm trying to import from slm. For example, a residual would be the observed value minus the fitted value. The residuals function completely ignores the new prediction column.
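A small sketch of why this happens, on made-up data (d, fit and fit_edited are hypothetical names): fitted() and residuals() return values that were stored in the lm object when it was fit, so overwriting the coefficients afterwards does not change them, whereas predict() with newdata recomputes from the current coefficients.

```r
# Made-up data, for illustration only
set.seed(2)
d <- data.frame(x = 1:20)
d$observed <- 1 + 0.5 * d$x + rnorm(20)
fit <- lm(observed ~ x, data = d)

# Residuals are observed minus fitted, both taken from the stored fit
all.equal(residuals(fit), d$observed - fitted(fit))

# Overwrite the coefficients in a copy of the model object
before <- fitted(fit)
fit_edited <- fit
fit_edited$coefficients[] <- c(0, 1)

# The stored fitted values are untouched by the edit
identical(fitted(fit_edited), before)

# predict() with newdata does use the edited coefficients (here 0 + 1 * x)
all.equal(unname(predict(fit_edited, newdata = d)), as.numeric(d$x))
```

So a new column in the data frame, or edited coefficients, will not flow through to residuals() unless the model object itself is rebuilt.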
slm2 has the following coefficients and fitted values:
If your focus is purely on getting hold of the residuals one model would have had over other data, you can subtract the model's predictions from the real target, i.e. just arithmetic.
(in_1 <- data.frame(x = 1:10,
                    y = (1:10) * 3 + 7))
(in_2 <- data.frame(x = 1:10,
                    y = (1:10) * 4 + 2))
plot(in_1, col = "red")
points(x = in_2$x,
       y = in_2$y, col = "blue")
(lm1 <- lm(y ~ x, data = in_1))
(lm2 <- lm(y ~ x, data = in_2))
in_2$pred_from_1 <- predict(lm1, newdata = in_2)
lines(x = in_2$x,
      y = in_2$pred_from_1, col = "orange")
# residuals we would get if lm1 (fit on in_1) were applied to the in_2 data
round(in_2$y - in_2$pred_from_1)
# build an lm whose coefficients are fixed to those of lm1 (intercept 7, slope 3) via offsets
lm2x <- lm(y ~ 0 + offset(rep(7, nrow(in_2))) + offset(x * 3), data = in_2)
(pfake_2 <- predict(lm2x, newdata = in_2))
lines(x = in_2$x,
      y = pfake_2)
residuals(lm2x)
# the residuals from this fake lm object match the manual arithmetic above
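The offset trick can also be written without typing the coefficients in by hand. This is only a sketch of one possible variant, not something from the posts above: it stores the predictions from lm1 in a column and uses that column as the single offset term. The name lm2_fixed is made up for illustration.

```r
# Same toy data as in the example above
in_1 <- data.frame(x = 1:10, y = (1:10) * 3 + 7)
in_2 <- data.frame(x = 1:10, y = (1:10) * 4 + 2)
lm1 <- lm(y ~ x, data = in_1)

# Store lm1's predictions for the in_2 rows as a column
in_2$pred_from_1 <- predict(lm1, newdata = in_2)

# A model with no estimated terms: the offset fully determines the fitted values
lm2_fixed <- lm(y ~ 0 + offset(pred_from_1), data = in_2)

coef(lm2_fixed)       # empty, nothing was estimated
fitted(lm2_fixed)     # equal to the predictions from lm1
residuals(lm2_fixed)  # observed y minus the predictions from lm1
```

Because nothing is estimated in such a model, diagnostics that depend on leverage or on the coefficient table may not be meaningful for it; the residuals and fitted values are the parts that carry over.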