Chapter 14 Manuscript review

Whether reviewing a manuscript or preparing to write one, it pays to consider the following five questions.

What is the research question?
Why do I (or, why should anyone else) care about it?
What methodology is adopted?
What is the paper’s key estimating equation?
What are the main findings?

Internal validity: Should the study be believed?

If the study does not claim to have identified a causal relationship, the nature of these questions changes a bit. That said, let us assume that a causal relationship is claimed.

A study has internal validity if one can conclude from statistical results that one variable causally affects another variable within the context of the study.

Is the model accurately measuring what it claims to measure?
Should the study be believed?

Below are nine general “threats to internal validity” that may undermine causal interpretations. When one reads empirical research, one should consider how each of these affects one’s ability to draw causal interpretations.

1. Omitted variables: When one or more explanatory variables have been omitted from a regression model there may be bias in the estimates.

Bias can cause one to misrepresent (i.e., overestimate or underestimate) the relationship between a variable of interest and the dependent variable.
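
To make the direction and size of this bias concrete, the following is a minimal simulation sketch, not a definitive implementation. The data-generating process is hypothetical: the variable names (ability, schooling, wage) and all coefficients are assumptions chosen for illustration. With these values, the population slope of the regression that omits ability is 1 + 2 × 0.8/1.64 ≈ 1.98, against a true effect of 1.0.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20_000

# Hypothetical data-generating process: the true effect of schooling is 1.0.
ability = [rng.gauss(0, 1) for _ in range(n)]
schooling = [0.8 * a + rng.gauss(0, 1) for a in ability]
wage = [1.0 * s + 2.0 * a + rng.gauss(0, 1) for s, a in zip(schooling, ability)]

# Short regression: ability is omitted, so schooling absorbs part of its effect.
short_slope = ols_slope(schooling, wage)

# Long regression via the Frisch-Waugh result: remove the part of schooling
# explained by ability, then regress wage on what is left.
gamma = ols_slope(ability, schooling)
mean_a = sum(ability) / n
mean_s = sum(schooling) / n
schooling_resid = [s - mean_s - gamma * (a - mean_a)
                   for s, a in zip(schooling, ability)]
long_slope = ols_slope(schooling_resid, wage)

print(f"ability omitted:  {short_slope:.2f}")
print(f"ability included: {long_slope:.2f}")
```

The bias here is upward because the omitted variable is positively correlated with both the regressor and the outcome; flipping the sign of either relationship in the sketch would flip the sign of the bias.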

2. Trends in outcomes: Many variables have a tendency to grow over time (e.g., wages, inflation, age). One must account for this time trend in order to draw, and to persuade others of, causal inferences that rest on time-series variation in the data.

Ignoring the fact that two series are trending in the same (or opposite) directions can lead to the false conclusion that changes in one variable are caused by changes in the other.
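
A small simulation can illustrate the point. The sketch below assumes two series that are unrelated by construction but share deterministic linear trends; the trend slopes and sample length are arbitrary choices, and removing a linear trend is only one of several ways one might account for trending data.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def detrend(series):
    """Residuals from regressing a series on a linear time index."""
    n = len(series)
    t = list(range(n))
    b = ols_slope(t, series)
    mt, ms = sum(t) / n, sum(series) / n
    return [s - ms - b * (ti - mt) for ti, s in zip(t, series)]

rng = random.Random(1)  # fixed seed so the sketch is reproducible
T = 5_000

# Two series that both trend upward but have no causal link (by assumption).
x = [0.05 * t + rng.gauss(0, 1) for t in range(T)]
y = [0.03 * t + rng.gauss(0, 1) for t in range(T)]

naive_slope = ols_slope(x, y)                         # ignores the common trend
detrended_slope = ols_slope(detrend(x), detrend(y))   # trend removed from both

print(f"ignoring the trend:   {naive_slope:.2f}")
print(f"after removing trend: {detrended_slope:.2f}")
```

Regressing the detrended series on one another is equivalent to including the time index as an additional regressor, so the second estimate is what one would obtain from a regression that controls for a linear trend.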

3. Misspecified variances: When outcomes for individuals within groups are correlated, but one runs an OLS regression using individuals as the unit of analysis, one risks biasing the standard errors downward. This can lead to overstating the significance of statistical tests and inferring causality when none exists. Typically, group error terms should be incorporated into the model to account for this.

By extension, if a treatment is applied at the state level, it might make more sense to analyze the data at the state level rather than at the individual level.
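
The size of the problem can be seen in a simulated placebo experiment. The sketch below is illustrative only: the number of groups, the group size, and the variance of the group-level shock are assumptions. It computes the conventional OLS standard error by hand and compares it with a cluster-robust (sandwich) standard error without small-sample correction, which is one common way of allowing for within-group correlation.

```python
import math
import random

rng = random.Random(2)  # fixed seed so the sketch is reproducible
n_groups, group_size = 50, 40

x, y, group = [], [], []
for g in range(n_groups):
    treated = 1.0 if g < n_groups // 2 else 0.0   # placebo policy set at group level
    group_shock = rng.gauss(0, 1)                 # shared by everyone in the group
    for _ in range(group_size):
        x.append(treated)
        y.append(group_shock + rng.gauss(0, 1))   # true treatment effect is zero
        group.append(g)

n = len(x)
mx, my = sum(x) / n, sum(y) / n
xd = [xi - mx for xi in x]                        # demeaned regressor
sxx = sum(v * v for v in xd)
slope = sum(v * (yi - my) for v, yi in zip(xd, y)) / sxx
resid = [(yi - my) - slope * v for v, yi in zip(xd, y)]

# Conventional standard error: treats all n observations as independent.
naive_se = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

# Cluster-robust standard error: sum the scores within each group
# before squaring, so within-group correlation is not averaged away.
score_by_group = [0.0] * n_groups
for v, e, g in zip(xd, resid, group):
    score_by_group[g] += v * e
clustered_se = math.sqrt(sum(s * s for s in score_by_group)) / sxx

print(f"conventional SE: {naive_se:.3f}")
print(f"clustered SE:    {clustered_se:.3f}")
```

Because the placebo treatment varies only across groups, the effective sample size is closer to the number of groups than to the number of individuals, which is why the clustered standard error comes out several times larger.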

4. Mismeasurement: When one does not have actual values for regression variables, one often relies on reported values collected through surveys. These values will often differ from actual values, being influenced by survey methods such as word choice or question order. One can attempt to mitigate this problem through improved survey techniques, but some amount of measurement error is likely to remain. Measurement error can lead to bias.
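
One well-known case is classical measurement error in an explanatory variable, which pulls the estimated slope toward zero. The sketch below assumes that the reporting error is independent of the true value and has the same variance as the true value; under that assumption the slope is expected to shrink by a factor of one half. The variable names and the true slope of 2.0 are hypothetical.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(3)  # fixed seed so the sketch is reproducible
n = 20_000

true_x = [rng.gauss(0, 1) for _ in range(n)]            # actual values (unobserved)
y = [2.0 * v + rng.gauss(0, 1) for v in true_x]         # true slope is 2.0
reported_x = [v + rng.gauss(0, 1) for v in true_x]      # survey value = truth + noise

slope_actual = ols_slope(true_x, y)
slope_reported = ols_slope(reported_x, y)

print(f"using actual values:   {slope_actual:.2f}")
print(f"using reported values: {slope_reported:.2f}")
```

Measurement error that is correlated with the true value or with the outcome need not attenuate the estimate; this sketch covers only the classical case.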

5. Political economy: Endogeneity of policy changes due to governmental responses to variables associated with past or expected future outcomes can bias estimated effects.

Did the policy cause the effect, or was the policy a response to an effect or expected effect?

6. Simultaneity: When at least one explanatory variable in a regression model is determined jointly with the dependent variable, bias can arise.

For example, this would likely arise if one were estimating the relationship between crime and the number of police officers.
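
The crime and police example can be sketched as a two-equation system. The structural coefficients below are hypothetical: police are assumed to reduce crime (a coefficient of -0.5), while crime is assumed to lead to more police hiring (a coefficient of 1.0). With unit-variance shocks, the OLS slope of crime on police converges to (1 - 0.5)/(1 + 1) = 0.25, which has the wrong sign.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(4)  # fixed seed so the sketch is reproducible
n = 20_000
beta = -0.5    # hypothetical true effect of police on crime
delta = 1.0    # hypothetical response of police hiring to crime

crime, police = [], []
for _ in range(n):
    u = rng.gauss(0, 1)   # shock to crime
    v = rng.gauss(0, 1)   # shock to police hiring
    # Solve the two structural equations jointly:
    #   crime  = beta * police + u
    #   police = delta * crime + v
    c = (u + beta * v) / (1 - beta * delta)
    p = delta * c + v
    crime.append(c)
    police.append(p)

ols_estimate = ols_slope(police, crime)

print(f"true effect of police on crime: {beta:.2f}")
print(f"OLS slope of crime on police:   {ols_estimate:.2f}")
```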

7. Selection: When the process that assigns people to treatment and control groups causes a correlation between treatment and the outcome, we have selection bias.

For example, people who enter a drug treatment program may be those who have already resolved to turn themselves around. Causality cannot be inferred in the presence of selection bias. Selection into the datasets and/or samples on which estimation is based is a similar source of concern.
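
The program example can be sketched as a comparison of two assignment rules applied to the same simulated population. Everything below is an assumption made for illustration: an unobserved "motivation" variable raises both the chance of enrolling and the outcome, and the true program effect is set to 1.0.

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is reproducible
n = 20_000
true_effect = 1.0       # hypothetical effect of the program on the outcome

def mean(values):
    return sum(values) / len(values)

def outcome(treated, motivation):
    """Outcome depends on treatment and on unobserved motivation."""
    return true_effect * float(treated) + 2.0 * motivation + rng.gauss(0, 1)

self_treated, self_control = [], []
rand_treated, rand_control = [], []
for _ in range(n):
    motivation = rng.gauss(0, 1)

    # Self-selection: more motivated people are more likely to enroll.
    enrolls = motivation + rng.gauss(0, 1) > 0
    (self_treated if enrolls else self_control).append(outcome(enrolls, motivation))

    # Benchmark: a coin flip decides treatment, independent of motivation.
    assigned = rng.random() < 0.5
    (rand_treated if assigned else rand_control).append(outcome(assigned, motivation))

selected_gap = mean(self_treated) - mean(self_control)
randomized_gap = mean(rand_treated) - mean(rand_control)

print(f"difference in means, self-selected: {selected_gap:.2f}")
print(f"difference in means, randomized:    {randomized_gap:.2f}")
```

Under self-selection the raw difference in means mixes the program effect with the difference in motivation between enrollees and non-enrollees; random assignment removes the second component.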

8. Attrition: When there is a loss of respondents over time, one must pause to consider the potential influence of this attrition.

When there is a differential loss of respondents from treatment and control groups, one’s concern should be greater. For example, patients may drop out of a study because of side effects of the intervention. Excluding these patients from the analysis could result in an overestimate of the effectiveness of the intervention.
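
The overestimate can be illustrated with a simulated trial. The sketch rests on a strong assumption that is not in the text above: treated patients who respond poorly are the ones who drop out, which stands in for dropout due to side effects. The true effect of 1.0 and the dropout threshold are arbitrary.

```python
import random

rng = random.Random(6)  # fixed seed so the sketch is reproducible
n = 20_000
true_effect = 1.0       # hypothetical effect of the intervention

def mean(values):
    return sum(values) / len(values)

control, treated_all, treated_completers = [], [], []
for i in range(n):
    if i % 2 == 0:
        control.append(rng.gauss(0, 1))
    else:
        response = rng.gauss(0, 1)
        result = true_effect + response
        treated_all.append(result)
        # Assumption: patients with a poor response leave the study,
        # so only the others are observed at follow-up.
        if response > -0.5:
            treated_completers.append(result)

full_sample_gap = mean(treated_all) - mean(control)
completers_gap = mean(treated_completers) - mean(control)

print(f"all randomized patients: {full_sample_gap:.2f}")
print(f"completers only:         {completers_gap:.2f}")
```

In practice the outcomes of dropouts are not observed, so the first number is not available to the analyst; it is shown here only because the simulation knows what those outcomes would have been.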

9. Omitted interactions: Differential trends in treatment and control groups, or omitted variables that change in different ways for treatment and control groups, can bias results. Difference-in-differences, for example, assumes that both groups would have followed the same trend in the absence of treatment.
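
The role of the common-trend assumption can be seen in a small worked calculation on group means. All numbers below are hypothetical: the true effect is set to 2.0, the control group grows by 1.0 between periods, and the treated group's underlying trend is either the same (1.0) or steeper (1.5).

```python
def diff_in_diff(treat_pre, treat_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treat_post - treat_pre) - (control_post - control_pre)

true_effect = 2.0                          # hypothetical treatment effect
control_pre, control_post = 10.0, 11.0     # control group trend: +1.0
treat_pre = 12.0                           # level differences are allowed

# Case 1: the treated group shares the control group's trend (+1.0).
treat_post_parallel = treat_pre + 1.0 + true_effect

# Case 2: the treated group was on a steeper trend anyway (+1.5).
treat_post_diverging = treat_pre + 1.5 + true_effect

did_parallel = diff_in_diff(treat_pre, treat_post_parallel,
                            control_pre, control_post)
did_diverging = diff_in_diff(treat_pre, treat_post_diverging,
                             control_pre, control_post)

print(f"common trend:       {did_parallel:.1f}")
print(f"differential trend: {did_diverging:.1f}")
```

In the second case the estimate equals the true effect plus the gap between the two trends (0.5), which the estimator cannot tell apart from the treatment itself.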

External validity: Can the study’s results be reasonably generalized?

A study has external validity if the results of an experiment can be generalized to different individuals, contexts and outcomes. In other words, it is the extent to which the results can/should be applied to the outside world. There are three general threats to external validity: people, place, and time.

1. Interaction of selection and treatment: The treatment group may not be representative of the population that one would like to examine. If this is the case, the particular characteristics of the selected group may bias their performance relative to the population of interest. The study’s results would then not be applicable to the population, or to any other group that more accurately represents the characteristics of the population.

2. Interaction of setting and treatment: The geographic or institutional setting may affect the estimated effect of the treatment. If the site of a study is associated with certain unobservable variables then outcomes may be affected.

For example, if an educational study were conducted in a college town with many high-achieving, educationally oriented youth, the treatment effect may be overestimated.

3. Interaction of history and treatment: The timing of a treatment may affect the estimated effect of that treatment. For example, a smoking cessation study conducted the week after the Surgeon General issues the well-publicized results of the latest smoking and cancer studies might produce different results than if it had been done the week before.

Literature

Articles on which one might practice applying this review framework (and which should be read anyway) include Bertrand and Mullainathan (2004), Duggan and Levitt (2002), Fisman and Miguel (2007), and Sacerdote (2007).