Hi,
When using the prediction
function to calculate predicted means (or "margins") I get the same predicted values in both R and Stata, but different standard errors (and different confidence intervals as a result). What is the reason for the difference? Am I calculating the standard errors incorrectly below with my call to summarize
(se = mean(se.fitted)
). Or, possibly, is Stata calculating the errors incorrectly?
Ideally, I would be able to reproduce the same results in any software package. I would greatly appreciate any advice for how to replicate the results Stata gives in R.
suppressWarnings(suppressPackageStartupMessages({
library(tidyverse)
library(prediction)
}))
# model
mod <- lm(mpg ~ cyl + vs + hp, data = mtcars)
# predicted values
pred <- prediction(mod, at = list(cyl = c(4, 6, 8)))
pred %>%
group_by(cyl) %>%
summarise(pred_mean = mean(fitted) ,
se = mean(se.fitted),
ci_low = pred_mean - (1.96 * se),
ci_high = pred_mean + (1.96 * se),
total_n = n()) %>%
as.data.frame() # to carry out decimals further for printing
#> cyl pred_mean se ci_low ci_high total_n
#> 1 4 25.60488 1.907449 21.86628 29.34348 32
#> 2 6 20.56328 1.415614 17.78867 23.33788 32
#> 3 8 15.52167 1.771571 12.04939 18.99395 32
# stata output, and code to replicate, below:
# library(haven)
# mtcars %>%
# write_dta("mtcars.dta"))
#
# use mtcars, clear
# reg mpg vs cyl hp
# margins, at(cyl = (4, 6, 8))
#
# Predictive margins Number of obs = 32
# Model VCE : OLS
#
# Expression : Linear prediction, predict()
#
# 1._at : cyl = 4
#
# 2._at : cyl = 6
#
# 3._at : cyl = 8
#
# | Delta-method
# | Margin Std. Err. t P>|t| [95% Conf. Interval]
#_at |
# 1 | 25.60488 1.619802 15.81 0.000 22.28687 28.9229
# 2 | 20.56328 .5809866 35.39 0.000 19.37318 21.75337
# 3 | 15.52167 1.379056 11.26 0.000 12.6968 18.34654
# for comparison, ggpredict gives yet a different (though closer) confidence interval
suppressWarnings(suppressPackageStartupMessages({
library(ggeffects)
}))
ggpredict(mod, terms = c("cyl [4, 6, 8]"))
#> # Predicted values of mpg
#>
#> cyl | Predicted | 95% CI
#> --------------------------------
#> 4 | 25.60 | [22.43, 28.78]
#> 6 | 20.56 | [19.42, 21.70]
#> 8 | 15.52 | [12.82, 18.22]
#>
#> Adjusted for:
#> * vs = 0.44
#> * hp = 146.69
Created on 2022-03-14 by the reprex package (v2.0.0)