I am currently following along with Wooldridge's Introductory Econometrics, in which he performs a 2SLS regression of the wage equation
\text{log(Wage)} = \beta_0+\beta_1\text{Educ}+\beta_2\text{Exper}+\beta_3\text{Exper}^2+u_1
where \text{Educ} is endogenous, whilst \text{Exper} and \text{Exper}^2 are exogenous. The mother's and father's education are also assumed to be uncorrelated with u_1, so we use both of these as instrumental variables for \text{Educ}. Hence, the reduced form equation for \text{Educ} is
\text{Educ}=\pi_0+\pi_1\text{Exper}+\pi_2\text{Exper}^2+\pi_3\text{FatherEduc}+\pi_4\text{MotherEduc}+v_2
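Written out, the second stage of 2SLS then regresses the log wage on the fitted value from this reduced form in place of \text{Educ}. The hat notation and the error label e are notation assumed here for illustration, not taken from the book:
\widehat{\text{Educ}}=\hat{\pi}_0+\hat{\pi}_1\text{Exper}+\hat{\pi}_2\text{Exper}^2+\hat{\pi}_3\text{FatherEduc}+\hat{\pi}_4\text{MotherEduc}
\text{log(Wage)} = \beta_0+\beta_1\widehat{\text{Educ}}+\beta_2\text{Exper}+\beta_3\text{Exper}^2+e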
If I use ivreg to perform the 2SLS estimation, then I get the same results as Wooldridge:
Reg2SLS <- ivreg(formula = log(wage) ~ educ + exper + expersq |
                   exper + expersq + fatheduc + motheduc,
                 data = mroz)
summary(Reg2SLS)
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0481003  0.4003281   0.120  0.90442
educ         0.0613966  0.0314367   1.953  0.05147 .
exper        0.0441704  0.0134325   3.288  0.00109 **
expersq     -0.0008990  0.0004017  -2.238  0.02574 *
However, if I perform the same process manually, like so:
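# Stage 1: regress educ on the exogenous regressors and the instruments
S1OLS <- lm(formula = educ ~ exper + expersq + fatheduc + motheduc, data = mroz)
# Collect the outcome, the stage 1 fitted values and the exogenous regressors
frame <- data.frame(Wage = mroz$wage, Educ = mroz$educ,
                    FittedEduc = S1OLS$fitted.values,
                    Exper = mroz$exper, ExperSq = mroz$expersq)
# Stage 2: regress log(Wage) on the fitted educ and the exogenous regressors
S2OLS <- lm(formula = log(Wage) ~ FittedEduc + Exper + ExperSq, data = frame)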
summary(S2OLS)
The results differ from those above:
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.1332094  0.3817364   0.349  0.72730
FittedEduc   0.0568605  0.0310692   1.830  0.06793 .
Exper        0.0421082  0.0142860   2.948  0.00338 **
ExperSq     -0.0008565  0.0004255  -2.013  0.04477 *
What is the reason for this? I am aware that performing this regression manually invalidates the standard errors and t statistics, but to the best of my understanding, the coefficients should not be affected. So why exactly does my manual regression fail?
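One possible check, sketched here on simulated data rather than on mroz: the manual two-step route should reproduce the 2SLS coefficients only when the first stage is fitted on the same rows that enter the second stage. Every name and number below (dat, d, x, z1, z2, the 200 missing outcomes) is hypothetical and chosen for illustration. Whether this applies to mroz depends on whether wage is missing for some rows, which is an assumption to verify, not something established above.

```r
# Simulated data (hypothetical, NOT mroz): the outcome y is missing for some
# rows, mimicking an outcome that is only observed for part of the sample.
set.seed(1)
n  <- 500
z1 <- rnorm(n); z2 <- rnorm(n)        # instruments
x  <- rnorm(n)                        # exogenous regressor
v  <- rnorm(n)
u  <- 0.5 * v + rnorm(n)              # error correlated with v, so d is endogenous
d  <- 1 + 0.5 * x + 0.8 * z1 + 0.6 * z2 + v
y  <- 2 + 0.1 * d + 0.3 * x + u
y[sample(n, 200)] <- NA               # outcome unobserved for 200 rows
dat <- data.frame(y, d, x, z1, z2)

# Reference: 2SLS coefficients by matrix algebra on the rows where y is observed
cc   <- dat[complete.cases(dat), ]
X    <- cbind(1, cc$d, cc$x)          # structural regressors
Z    <- cbind(1, cc$x, cc$z1, cc$z2)  # exogenous regressors plus instruments
Xhat <- Z %*% solve(crossprod(Z), crossprod(Z, X))
b_2sls <- drop(solve(crossprod(Xhat, X), crossprod(Xhat, cc$y)))

# Manual two-step, first stage fitted on the SAME rows as the second stage
s1_same  <- lm(d ~ x + z1 + z2, data = cc)
cc$d_hat <- fitted(s1_same)
b_same   <- unname(coef(lm(y ~ d_hat + x, data = cc)))

# Manual two-step, first stage fitted on ALL rows (including rows with y missing);
# the second-stage lm() then silently drops the rows where y is NA
s1_all    <- lm(d ~ x + z1 + z2, data = dat)
dat$d_hat <- fitted(s1_all)
b_all     <- unname(coef(lm(y ~ d_hat + x, data = dat)))

# Compare each manual route with the matrix-algebra 2SLS coefficients
print(isTRUE(all.equal(b_same, b_2sls)))
print(isTRUE(all.equal(b_all, b_2sls)))
```

If the same pattern holds for mroz, fitting the first stage only on rows with non-missing wage would be the thing to try in the manual version.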
PS: The following packages were used:
library(ivreg)
library(wooldridge)