Chapter 3 Difference estimators

The term “treatment effect” refers to the causal effect of a binary (0-1) variable on an outcome variable of scientific or policy interest (e.g., the effects of government programs and policies, such as those that subsidize training for disadvantaged workers, and the effects of individual choices like college attendance). The principal econometric problem in the estimation of treatment effects is selection bias, which arises from the fact that treated individuals differ from the non-treated for reasons other than treatment status. Treatment effects can be estimated using a variety of methodologies (e.g., social experiments, regression models, matching estimators, instrumental variables).
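
One common way to make the selection-bias problem concrete uses potential-outcomes notation. That notation is an assumption introduced here for illustration, not notation from this chapter: let \(D\) denote treatment status, and let \(Y(1)\) and \(Y(0)\) denote the outcomes a unit would realize with and without treatment. Under that setup, the raw difference in mean outcomes between treated and non-treated units splits into two pieces:

\[
E[Y \mid D=1] - E[Y \mid D=0]
= \underbrace{E[Y(1) - Y(0) \mid D=1]}_{\text{average effect on the treated}}
+ \underbrace{E[Y(0) \mid D=1] - E[Y(0) \mid D=0]}_{\text{selection bias}}
\]

The second term is zero only if treated and non-treated units would have had the same mean outcome in the absence of treatment, which is what it means for the two groups to differ for no reason other than treatment status.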

In a simple but well-designed experiment, a comparison of mean outcomes from treated subjects and mean outcomes from non-treated subjects (i.e., the “difference” estimator) can itself speak to efficacy. This is justified on the grounds that experimental randomization implies that systematic differences should not exist in any other pre-treatment variable.

“Difference” designs try to mimic an experimental setting by finding equivalents of “treatment” and “control” groups in which everything apart from the variable of interest (or other things that can be controlled for) is assumed to be the same. This is often a very difficult claim to defend, and it is largely impossible to achieve perfectly. As such, the researcher can easily fail to exclude other omitted factors that explain the observed differences between the two groups.

Define \(\mu_{it}\) to be the mean of the outcome in group \(i\) at time \(t\). Define \(i=0\) for the control group and \(i=1\) for the treatment group. Define \(t=0\) to be the pre-treatment period and \(t=1\) to be the post-treatment period (though only the \(i=1\) group receives the treatment). The single-difference estimator (or simply the difference estimator) uses the difference in post-treatment means between the treatment and control groups as the estimate of the treatment effect (i.e., it uses an estimate of \(\mu_{11} - \mu_{01}\)). However, this assumes that the treatment and control groups differ in no way other than the treatment, which is a very strong assumption with non-experimental data.
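
The following is a minimal simulation sketch of the single-difference estimator, not a definitive implementation. The data-generating process, the variable `ability` (standing in for unobserved heterogeneity), the function names, and the true effect of 2.0 are all hypothetical choices made for illustration. It compares the estimate of \(\mu_{11} - \mu_{01}\) when treatment is randomized with the estimate when units self-select into treatment.

```python
import numpy as np


def difference_estimator(y, d):
    """Difference in post-treatment means: an estimate of mu_11 - mu_01."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=bool)
    return y[d].mean() - y[~d].mean()


def simulate(n=100_000, true_effect=2.0, seed=0):
    """Return (estimate under randomization, estimate under self-selection).

    Everything here is an assumed, illustrative data-generating process.
    """
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)  # hypothetical unobserved heterogeneity
    noise = rng.normal(size=n)

    # Randomized assignment: treatment is independent of ability.
    d_random = rng.random(n) < 0.5
    # Self-selection: higher-ability units are more likely to be treated.
    d_select = (ability + rng.normal(size=n)) > 0

    # Post-treatment outcomes; ability affects the outcome directly.
    y_random = 1.0 + true_effect * d_random + ability + noise
    y_select = 1.0 + true_effect * d_select + ability + noise

    return (
        difference_estimator(y_random, d_random),
        difference_estimator(y_select, d_select),
    )


est_random, est_select = simulate()
print(f"randomized assignment: {est_random:.3f}")
print(f"self-selected assignment: {est_select:.3f}")
```

Under these assumptions, the randomized estimate should land close to the true effect, while the self-selected estimate should overstate it, because the treated group also has higher average `ability`.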

Here, inference rests on the assumption that no unobservable heterogeneity exists between \(i=0\) and \(i=1\) types, and healthy skepticism of that assumption is warranted.

  • Showing that these types are similar in observable ways is comforting, but it is not a test of the assumption. (Such a comparison is typically referred to as a balance test.)
  • Showing that they are different in observable ways is worrying: we can condition on these observables, but such differences make it very easy to imagine that \(i=0\) and \(i=1\) types also differ in unobservable ways (which we cannot condition on). This is also not a test.
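
As a rough illustration of why balance on observables is not a test, the sketch below computes a standardized mean difference, which is one common balance statistic and an assumption here rather than anything prescribed by the text. The simulated covariates `observed` and `unobserved`, the selection rule, and the function name are all hypothetical.

```python
import numpy as np


def standardized_mean_difference(x, d):
    """Difference in group means of x, scaled by the pooled standard deviation."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=bool)
    x1, x0 = x[d], x[~d]
    pooled_sd = np.sqrt((x1.var(ddof=1) + x0.var(ddof=1)) / 2.0)
    return (x1.mean() - x0.mean()) / pooled_sd


# Hypothetical data: selection into treatment depends only on a
# characteristic the researcher does not observe.
rng = np.random.default_rng(1)
n = 50_000
unobserved = rng.normal(size=n)
observed = rng.normal(size=n)  # unrelated to both selection and `unobserved`
d = (unobserved + rng.normal(size=n)) > 0

smd_observed = standardized_mean_difference(observed, d)
smd_unobserved = standardized_mean_difference(unobserved, d)
print(f"observed covariate SMD: {smd_observed:.3f}")
print(f"unobserved covariate SMD: {smd_unobserved:.3f}")
```

In this constructed example the observed covariate looks balanced across \(i=0\) and \(i=1\) types even though the two groups differ sharply on the unobserved one, so the balance check passes while the identifying assumption fails.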