Welcome to the forum, Júlio. The short answer is: remove treat from the zelig model formula. You're fitting the model only on the matched control observations, which in this case all have treat=0. As a result, treat shouldn't be an independent variable in the model. See below for more details.
I describe below how to make the ATT calculation work. First, though, your example needs a few changes in order to run. We'll need the following packages:
library(Zelig)
library(MatchIt)
We also need the correct variable names for the model specification: hisp should be hispan and nodegr should be nodegree.
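One quick way to catch name mismatches like these before fitting is to compare the variables in the formula against the columns of the data. The sketch below uses only base R. The toy data frame and the name missing_vars are made up for illustration and merely stand in for the real lalonde data.

```r
# Hypothetical stand-in for the real data: only the column names matter here.
toy <- data.frame(re78 = numeric(0), treat = numeric(0),
                  hispan = numeric(0), nodegree = numeric(0))

# Formula containing the two misspelled names from the original post.
f <- re78 ~ treat + hisp + nodegr

# all.vars() lists every variable in the formula; setdiff() keeps the ones
# that are not columns of the data.
missing_vars <- setdiff(all.vars(f), names(toy))
missing_vars
```

Anything this returns is a name the model function will not be able to find in the data.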
Now, to address the error you're getting: in z.out you have (after making the corrections described above):
z.out <- zelig(re78 ~ treat + age + educ + black + nodegree + hispan + married + re74 + re75,
               data = match.data(m.out, "control"), model = "ls")
This runs the model only on the matched data rows that have treat=0, yet treat is also included in the model formula. Since treat has only one value in that subset, no coefficient for treat can be estimated. That's what is ultimately causing the error you're getting.
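The same behavior can be reproduced with plain lm() on made-up data, which may make the cause easier to see. This is only a sketch: the data frame toy and its columns are hypothetical, and it uses base R rather than zelig.

```r
# Hypothetical toy data: every row is a control, so treat is constant at 0.
set.seed(1)
toy <- data.frame(treat = 0, x = rnorm(50))
toy$y <- 2 + 3 * toy$x + rnorm(50)

# treat carries no variation, so its column is redundant in the design
# matrix and lm() cannot estimate a coefficient for it.
fit <- lm(y ~ treat + x, data = toy)
coef(fit)  # the entry for treat is NA; the intercept and x are estimated
```

Dropping treat from the formula leaves the other estimates unchanged and removes the NA row.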
Here's what I get for z.out when I run your code (with the changes described at the beginning):
Coefficients: (1 not defined because of singularities)
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.714e+03  4.246e+03  -0.875   0.3830
treat               NA         NA      NA       NA
age         -1.791e+01  4.624e+01  -0.387   0.6991
educ         6.774e+02  2.634e+02   2.572   0.0109
black        1.444e+02  1.118e+03   0.129   0.8974
nodegree     2.435e+03  1.395e+03   1.746   0.0826
hispan       1.529e+03  1.294e+03   1.182   0.2390
married     -1.197e+03  1.203e+03  -0.995   0.3210
re74         1.574e-02  1.348e-01   0.117   0.9072
re75         4.253e-01  2.079e-01   2.046   0.0423
As described on pages 12-13 of the MatchIt vignette, to calculate the Average Treatment Effect on the Treated (ATT), we fit the model just for the matched control group observations, which is what you've done. However, we need to exclude treat from the z.out model formula, since treat has only one value. If the matching procedure has controlled for selection bias, then this model gives us the counterfactual (what re78 would be for the treated group if it had not been treated). Then we apply the coefficients from this model to the matched treated observations (the matched observations for which treat=1) to get the ATT:
library(Zelig)
#> Loading required package: survival
library(MatchIt)
m.out <- matchit(treat ~ educ + age + black + hispan + married + nodegree + re74 + re75,
                 data = lalonde, method = "nearest", ratio = 1)
z.out <- zelig(re78 ~ age + educ + black + nodegree + hispan + married + re74 + re75,
               data = match.data(m.out, "control"), model = "ls")
x.out <- setx(z.out, data = match.data(m.out, "treat"), cond = TRUE)
s.out <- sim(z.out, x = x.out)
m.out
#>
#> Call:
#> matchit(formula = treat ~ educ + age + black + hispan + married +
#> nodegree + re74 + re75, data = lalonde, method = "nearest",
#> ratio = 1)
#>
#> Sample sizes:
#>           Control Treated
#> All           429     185
#> Matched       185     185
#> Unmatched     244       0
#> Discarded       0       0
z.out
#> Model:
#>
#> Call:
#> z5$zelig(formula = re78 ~ age + educ + black + nodegree + hispan +
#> married + re74 + re75, data = match.data(m.out, "control"))
#>
#> Residuals:
#>    Min     1Q Median     3Q    Max
#>  -9411  -4362  -1854   2639  17392
#>
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -3.714e+03  4.246e+03  -0.875   0.3830
#> age         -1.791e+01  4.624e+01  -0.387   0.6991
#> educ         6.774e+02  2.634e+02   2.572   0.0109
#> black        1.444e+02  1.118e+03   0.129   0.8974
#> nodegree     2.435e+03  1.395e+03   1.746   0.0826
#> hispan       1.529e+03  1.294e+03   1.182   0.2390
#> married     -1.197e+03  1.203e+03  -0.995   0.3210
#> re74         1.574e-02  1.348e-01   0.117   0.9072
#> re75         4.253e-01  2.079e-01   2.046   0.0423
#>
#> Residual standard error: 5910 on 176 degrees of freedom
#> Multiple R-squared: 0.09413, Adjusted R-squared: 0.05296
#> F-statistic: 2.286 on 8 and 176 DF, p-value: 0.02365
#>
#> Next step: Use 'setx' method
x.out
#> setx:
#>   (Intercept)  age educ black nodegree hispan married re74 re75
#> 1           1 25.3 10.6  0.47    0.638  0.216   0.211 2342 1615
#>
#> Next step: Use 'sim' method
s.out
#>
#> sim x :
#> -----
#> ev
#>       mean       sd      50%     2.5%    97.5%
#> 1 5437.778 425.0799 5432.914 4611.991 6227.689
#> pv
#>         mean       sd      50%      2.5%    97.5%
#> [1,] 5285.28 5799.457 5368.345 -6447.408 15900.29
Created on 2019-07-13 by the reprex package (v0.3.0)
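To make the counterfactual logic concrete, here is a minimal sketch in base R on simulated data. It is not the Zelig workflow: it swaps in lm() and predict(), and the data frame matched, its columns, and the true effect of 1.5 are all assumptions chosen for illustration. The idea is the same, though: fit the outcome model on the controls only, predict the untreated outcome for the treated rows, and average the difference between observed and predicted outcomes.

```r
# Hypothetical "matched" data: 100 controls and 100 treated units,
# one covariate x, and a true treatment effect of 1.5 (an assumption).
set.seed(42)
n <- 200
matched <- data.frame(treat = rep(c(0, 1), each = n / 2), x = rnorm(n))
matched$y <- 1 + 2 * matched$x + 1.5 * matched$treat + rnorm(n)

controls <- matched[matched$treat == 0, ]
treated  <- matched[matched$treat == 1, ]

# Outcome model fitted on controls only; treat is left out of the formula
# because it is constant (0) in this subset.
fit0 <- lm(y ~ x, data = controls)

# Counterfactual: predicted outcome for treated units had they not been treated.
y0_hat <- predict(fit0, newdata = treated)

# ATT estimate: mean of observed minus predicted untreated outcome.
att <- mean(treated$y - y0_hat)
att
```

In the reprex above, the analogous quantity would be the mean of re78 among the matched treated observations minus the simulated expected value (ev). Treat that as a reading of the approach rather than a checked result, and compare it against the vignette before relying on it.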