Ignore the title of this article; it is simply a nice rhetorical trick to get economists to start using modern tools of the type Judea Pearl has developed to discuss causality. Economists know Haavelmo (winner of the ’89 Nobel) for his “simultaneous equations” paper, in which he notes that regression cannot identify supply and demand simultaneously from a series of (price, quantity) bundles, for the simple reason that a regression on intersections of supply and demand can’t tell us whether the supply curve or the demand curve has shifted. Theoretical assumptions about what changes to the economy affect demand and what affect supply – that is, economics, not statistics – solve the identification problem. (A side note: there is some interesting history on why econometrics comes about as late as it does. Economists until the 40s or so, including Keynes, essentially rejected statistical work in social science. They may have done so with good reason, though! Theories of stochastic processes that were needed to make sense of inference on non-IID variables like an economic time series weren’t yet developed, and economists rightly noted the non-IIDness of their data.)
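To make the identification problem concrete, here is a minimal simulation sketch. Everything in it is my own illustrative assumption rather than anything from Haavelmo or Pearl: the linear curves, the slopes `b_d` and `b_s`, and an observed demand shifter `z` (think income) which theory tells us is excluded from the supply equation. Regressing quantity on price recovers neither slope, while using `z` as an instrument recovers the supply slope.

```python
import numpy as np

def simulate_market(n=200_000, b_d=1.0, b_s=0.5, seed=0):
    """Equilibrium (price, quantity) pairs from a hypothetical linear market.

    Demand: q = -b_d * p + z + u_d   (z is an observed demand shifter)
    Supply: q =  b_s * p + u_s       (z is excluded: the theoretical assumption)
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    u_d = rng.normal(size=n)
    u_s = rng.normal(size=n)
    # Setting demand equal to supply and solving for the equilibrium price.
    p = (z + u_d - u_s) / (b_d + b_s)
    q = b_s * p + u_s
    return p, q, z

def ols_slope(y, x):
    """Slope from regressing y on x."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]

def iv_slope(y, x, z):
    """Instrumental-variable slope of y on x, using z as the instrument."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

p, q, z = simulate_market()
ols = ols_slope(q, p)    # mixes supply and demand shifts
iv = iv_slope(q, p, z)   # uses only the demand shifts driven by z

print(f"true supply slope: 0.5, true demand slope: -1.0")
print(f"OLS of q on p: {ols:.3f}")
print(f"IV using z:    {iv:.3f}")
```

The data alone never tell you that `z` belongs in demand and not supply; that exclusion restriction is exactly the economics-not-statistics step.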

Haavelmo’s *other* famous paper is his 1944 work on the probabilistic approach to economics. He notes that a system of theoretical equations is of interest not because of the regression estimate itself, but because of the counterfactual where we vary one parameter while holding another fixed. That is, if we have in our data a joint distribution of X and Y, we are interested in more than simply that joint distribution; rather, we are interested in the counterfactual world where we could control one of those two variables. This is explicitly *outside* the statistical relationship between X and Y.

With Haavelmo 1944 as a suitable primer, Pearl presents the basic idea of his Structural Causal Models (SCM). An SCM analysis takes as inputs a model M (usually a set of structural equations), a set of assumptions A (omitted factors, exclusion restrictions, correlations, etc.), a set of queries for the model to answer, and some data which is perhaps generated in accordance with A. The outputs are the logical implications of A, a set of data-dependent claims concerning model-dependent magnitudes or likelihoods of each of the queries, and a set of testable implications of the model itself answering questions like “to what extent do the model assumptions match the data?” I’ll ignore in this post, and Pearl generally ignores in the present paper, the much broader question of when failures on that front matter, and further what the word “match” even means.

What’s cool about SCM and causal calculus more generally is that you can answer a bunch of questions without assuming anything about the functional form of relationships between variables – all you need are the causal arrows. Take a model of observed variables plus unobserved exogenous variables. Assume the latter to be independent. The model might be that X is a function of Y, W and an unobserved variable U1, Y is a function of V, W and U2, V is a function of U3 and W is a function of U4. You can draw a graph of causal arrows relating these variables. With that graph in hand, you can answer a huge number of questions of interest to the econometrician. For instance: what are the testable implications of the model if only X and W are measured? Which variables can be used together to get an unbiased estimate of the effect of any one variable on another? Which variables must be measured if we wish to measure the direct effect of any variable on any other? There are many more, with answers found in Pearl’s 2009 textbook. Pearl also comes down pretty harshly on experimentalists of the Angrist type. He notes correctly that experimental potential-outcome studies also rely on a ton of underlying assumptions – concerning external validity, in particular – and at heart structural models just involve stating those assumptions clearly.
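The four-variable model above (X from Y, W, U1; Y from V, W, U2; V from U3; W from U4) can be sketched in a few lines. To be clear about what is assumed: the binary variables and the particular probabilities below are made up by me purely so there is something to simulate, and none of it comes from Pearl's paper. The graph says W is a common cause of Y and X, so the raw difference in X across values of Y is confounded, while averaging the within-W differences over the distribution of W (the back-door adjustment) should track what happens when we actually set Y by intervention. Note that the adjustment step never uses the functional form, only the arrows.

```python
import numpy as np

def simulate(n=200_000, seed=0, do_y=None):
    """Draw from a toy SCM; do_y=0 or 1 replaces Y's equation (an intervention).

    The probabilities are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    u1, u2, u3, u4 = rng.random((4, n))  # independent exogenous variables
    w = (u4 < 0.5).astype(int)           # W = f(U4)
    v = (u3 < 0.5).astype(int)           # V = f(U3)
    if do_y is None:
        y = (u2 < 0.2 + 0.3 * v + 0.4 * w).astype(int)  # Y = f(V, W, U2)
    else:
        y = np.full(n, do_y)             # do(Y = do_y): cut the arrows into Y
    x = (u1 < 0.1 + 0.3 * y + 0.5 * w).astype(int)      # X = f(Y, W, U1)
    return x, y, v, w

x, y, v, w = simulate()

# Naive contrast: confounded by W, which drives both Y and X.
naive = x[y == 1].mean() - x[y == 0].mean()

# Back-door adjustment: within-W contrasts averaged over P(W).
adjusted = sum(
    (w == k).mean()
    * (x[(y == 1) & (w == k)].mean() - x[(y == 0) & (w == k)].mean())
    for k in (0, 1)
)

# Ground truth from actually intervening in the simulated world.
truth = simulate(do_y=1)[0].mean() - simulate(do_y=0)[0].mean()

# One testable implication of the graph: V and W are independent.
corr_vw = np.corrcoef(v, w)[0, 1]

print(f"naive: {naive:.3f}  adjusted: {adjusted:.3f}  do(): {truth:.3f}")
print(f"corr(V, W): {corr_vw:.3f}")
```

By construction the interventional effect here is about 0.3; the naive contrast overshoots it because units with W = 1 are more likely to have both Y = 1 and X = 1.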

Worth a look – and if you find the paper interesting, grab the 2009 book as well.

http://ftp.cs.ucla.edu/pub/stat_ser/r391.pdf (December 2011 working paper)

Well but we can do even better using simple non-parametric methods…

It’s not obvious to me what nonparametric methods – which I interpret as techniques allowing for arbitrary functional relationships between sets of variables – have to do with causality, which in the Lewis sense is explicitly linked to counterfactuals. That is, in traditional statistics, parametric or nonparametric, I fit an equation on a set of variables. But looking at the joint distribution from previously gathered data tells me nothing about what happens when I run a policy change, since the policy change can itself change the functional relationship. Think of the Lucas Critique applied more broadly.

Thanks for your answer. I have followed your blog for a while and I think it is great. As a first-year graduate student, I find your blog very useful.

In my view the Lucas critique was valid because the econometrics that economists use is mainly parametric, so it only tells us how well the relationship we are imposing fits. And that imposition is almost always wrong in some sense. So by changing policy we are not able to know what will happen, because we never knew in the first place what was going on.

As such, the inferences (causal relations) that we attribute to the parametric estimation are biased, wrong or both. Thus causal claims drawn from parametric estimations are not trustworthy, but if we use non-parametric methods we can trust our results.

Obviously the non-parametric approach alone is not enough; it must be followed by some semi-parametric technique in order to provide good forecasts.

Maybe I’m off topic, but I think that most of the confusion surrounding causality and econometric methods is due to the almost exclusive use of parametric methods.

Glad you enjoy the site.

I want to reiterate, though, that the argument above (and in the Lucas Critique) has *nothing* to do with parametrics. The principle behind parametric and nonparametric statistics is that I have seen draws (X,Y) from a population distribution. I then fit a function on my sample from that distribution. What Lucas and the above paper are stressing is that the function itself – which is to say, the underlying joint distribution – can change when we consider counterfactuals. That is, it was thought that the statistical link between inflation and unemployment allowed policymakers to choose a point along a perhaps non-parametric curve of possible inflation-unemployment bundles. This is false, because the very policies which the government would enact to move along that curve also change the shape of the curve.
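A toy simulation makes the inflation-unemployment point concrete. The model is my own stylized assumption, not something from Lucas or from Pearl's paper: unemployment responds only to *surprise* inflation, and agents' expected inflation equals the mean of whatever policy regime is in force. The constants `U_NAT` and `A` and the function `simulate_regime` are hypothetical. A curve fitted under the old regime, however flexibly, predicts a big drop in unemployment from raising average inflation; in the new regime expectations move too, and the drop never shows up.

```python
import numpy as np

U_NAT = 5.0  # hypothetical natural rate of unemployment
A = 0.5      # hypothetical response of unemployment to surprise inflation

def simulate_regime(mean_inflation, n=100_000, seed=0):
    """Inflation and unemployment under a policy regime with a given mean."""
    rng = np.random.default_rng(seed)
    inflation = mean_inflation + rng.normal(size=n)
    expected = mean_inflation  # assumption: agents know the policy rule
    unemployment = U_NAT - A * (inflation - expected) + 0.1 * rng.normal(size=n)
    return inflation, unemployment

# Fit the statistical relationship on data from the old regime (mean 2).
pi_old, u_old = simulate_regime(2.0)
fitted_slope, fitted_intercept = np.polyfit(pi_old, u_old, 1)

# Read the "menu" off the fitted curve: what if average inflation were 6?
predicted = fitted_intercept + fitted_slope * 6.0

# Actually run the new regime: expectations shift along with the policy.
pi_new, u_new = simulate_regime(6.0, seed=1)
actual = u_new.mean()

print(f"fitted slope under old regime: {fitted_slope:.3f}")
print(f"predicted unemployment at 6% inflation: {predicted:.2f}")
print(f"actual mean unemployment in new regime: {actual:.2f}")
```

Swapping the straight-line fit for a kernel regression would not help: the fit is fine, it is just a fit to a joint distribution that no longer exists once the policy changes.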

Yeah, you’re right; I misunderstood the paper. Thanks for your answer.