Today’s 2021 Clark Medal goes to the Harvard econometrician Isaiah Andrews, and no surprise. Few young econometricians have produced such a volume of work so quickly. And while Andrews has a number of papers on traditional econometric topics – how to do high-powered inference on non-linear models, for instance – I want to focus here on his work on what you might call “strategic statistics”.

To understand what we mean by that term, we need to first detour a bit and understand what econometrics is anyway. The great Joseph Schumpeter, in a beautiful short introduction to the new Econometric Society in 1933, argues that economics is not only the most mathematical of the social or moral sciences, but of *all* sciences. How can that be? Concepts in physics like mass or velocity are surely quantitative, but must be measured before we can put a number on them. However, concepts in economics are fundamentally quantitative: our basic building blocks are prices and quantities. From these numerical concepts comes the natural desire to investigate the relationship between them: estimates of demand curves go back at least to Gregory King and Charles D’Avenant in the 17th century! The issue is not merely that economics is amenable to theoretical investigation. Rather, from King on forward through von Thunen, Cournot, Walras, Fisher and many more, economics is a science where numerical data from the practice of markets is combined with theory.

Econometrics, then, is not simply statistics, a problem of computing standard errors and developing new estimators. Rather, as historians of thought like Mary Morgan and Roy Epstein have pointed out, econometrics is a unique subfield of statistics because of its focus on identification in models where different variables are simultaneously determined. Consider estimating how a change in the price of iron will affect sales. You observe a number of points of data from prior years: at 20 dollars per ton, 40 tons were sold; at 30 dollars per ton, 45 tons were sold. Perhaps you have similar data from different countries with disconnected markets. Even in the 1800s, the tool of linear regression with controls was known. Look at the numbers above: 40 tons sold at $20, and 45 tons at $30. The demand curve slopes up! The naïve analyst goes to Mr. Carnegie and suggests, on the basis of past data, that if he increases the price of iron, he will sell even more!

The problem with running this regression, though, should be clear if one starts with theory. The price of iron depends on the conjunction of supply and demand, on Marshall’s famous “scissors”. Our observational data cannot tell us whether changes in the price-quantity pairs observed happened because demand or supply shifted. This conflation is common in public reasoning: we observe that house prices are rising very quickly at the same time as many new condos are being built in the neighborhood, and think the latter is causing the former. Properly understood, however, both the price increase and the new construction can occur if demand for living in the neighborhood increases and the marginal cost of construction is increasing. Supply and demand is not the only simultaneous equations model in economics, of course: anything with strategic behavior that determines an equilibrium will be one as well.

This causal identification problem goes back at least to what Trygve Haavelmo pointed out in 1943. The past relationship between prices and quantities sold is not informative as to what will happen if I *choose* to raise prices. Likewise, though rising prices and new construction correlate, if we choose to increase construction in an area, prices will fall. Though there is an empirical link between rising prices and a strong economy, we cannot generate a strong economy in the long run just by inflating the currency. Econometrics, then, is largely concerned with the particular statistical problem of *identifying* certain parameters that explain what will happen if we change one part of the system through policy or when we place people with different known preferences, costs, and so on in the same strategic situation.

How we can do that is an oft-told story, but roughly we can identify a parameter in a simultaneously determined model with statistical assumptions or with structural assumptions. In the context of supply and demand, if we randomly increase the price a certain good is sold at in a bunch of markets, that experiment can help identify the price elasticity of demand, holding demand constant (but tells us nothing about what happens to price and quantity if consumer demand changes!). If we use “demand” or “supply” shifters – random factors that affect price only via their effect on firm costs or consumer demand – these “instrumental variables” allow us to split the supply and demand curve in past observational data. If we assume more structure, such as that there is a set of firms that price according to Cournot, then we can back out firm costs and look at counterfactuals like “what if a merger reduced the number of firms in this market by one”. The important thing to realize is that no matter where an empirical technique lies on the spectrum from purely statistical to heavily theory-driven, there are underlying assumptions being made by the econometrician to identify the parameters of interest. The exclusion restriction in an IV, that the shifter in question only affects price via the supply or demand side, is as much an untestable assumption as the argument that firms are profit-maximizing Cournot players.
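To make the iron example concrete, here is a minimal simulation sketch of why the naive regression slopes up and how a cost shifter can recover the demand slope. Everything in it is a hypothetical of my own choosing: the linear supply and demand curves, the parameter values, the normal shocks, and the function names `simulate_market`, `ols_slope`, and `iv_slope`. The cost shifter `z` satisfies the exclusion restriction only because the simulation builds it that way.

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
A, B = 100.0, 1.0           # demand: q = A - B*p + u
C, D, G = 10.0, 2.0, 1.0    # supply: q = C + D*p - G*z + v


def simulate_market(n, seed=0):
    """Draw n equilibrium (price, quantity, cost shifter) observations."""
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, 5.0, n)  # demand shocks (large)
    v = rng.normal(0.0, 1.0, n)  # supply shocks (small)
    z = rng.normal(0.0, 2.0, n)  # observed cost shifter; enters supply only
    # Equilibrium price solves A - B*p + u = C + D*p - G*z + v.
    p = (A - C + u + G * z - v) / (B + D)
    q = A - B * p + u
    return p, q, z


def ols_slope(p, q):
    """Slope from naively regressing quantity on price."""
    cov = np.cov(p, q)
    return cov[0, 1] / cov[0, 0]


def iv_slope(p, q, z):
    """Simple IV (Wald) estimate of the demand slope, using z as the instrument."""
    return np.cov(z, q)[0, 1] / np.cov(z, p)[0, 1]


p, q, z = simulate_market(200_000)
print(f"naive OLS slope: {ols_slope(p, q):+.2f}")
print(f"IV slope:        {iv_slope(p, q, z):+.2f}  (true demand slope is {-B:+.2f})")
```

Because demand shocks dominate in this setup, most observed price movement traces out the supply curve, and the naive slope is positive (about +1.5 in population under these parameters) even though the true demand slope is -1. The IV estimate uses only the price variation induced by `z`, which is why it lands near the demand slope.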

This brings us back to Isaiah Andrews. How do scientists communicate their results to the public, particularly when different impossible-to-avoid assumptions give different results? How can we ensure the powerful statistical tools we use for *internal validity*, meaning causally-relevant insight in the particular setting from which the empirical evidence is drawn, do not mislead about *external validity*, the potential for applying those estimates when participants have scope for self-selection or researchers select convenient non-representative times or locations for their study? When our estimation is driven by the assumptions of a model, what does it mean when we say our model “fits the data” or “explains key variation in the data”? These questions are interesting essentially because of the degrees of freedom the researcher holds in moving from a collection of observations to a “result”. Differences of opinion in economics are not largely about the precision of estimated data, à la high energy physics, but about the particular assumptions used by the analyst to move from data to estimated parameters of interest. Taking this seriously is what I mean above by “strategic statistics”: the fact that identification in economics requires choices by the analyst means we need to take the implications of those choices seriously. Andrews’ work has touched on each of the questions above in highly creative ways. I should also note that, by the standards of high-rigor econometrics, his papers tend to be quite readable and also quite concise.

Let’s begin with scientific communication. As we are all aware from the disastrous Covid-related public science in the past year (see Zeynep Tufekci’s writing for countless examples), there is often a tension between reporting results truthfully and the decisions taken based on those results. Andrews and Shapiro model this as a Wald-style game where scientists collect data and provide an estimate of some parameter, then a decision is made following that report. The estimate is of course imprecise: science involves uncertainty. The critical idea is that the “communications model” – where scientists report an estimate and different agents take actions based on that report – differs from the “decision model” where the scientist selects the actions (or, alternatively, the government chooses a common policy for all end-users on the basis of scientist recommendations). Optimal communication depends heavily on which setting you are in. Imagine that a costly drug is weakly known to improve health, but the exact benefit is unknown. When they can choose, users take the drug if the benefit exceeds their personal cost of taking it. In an RCT, because of sampling error, sometimes you’ll get that the drug is harmful when you try to estimate how “beneficial” it is. In a communications model, the readers adjust for sampling error, so you just report truthfully: there is still useful information in that “negative” estimate, because the more negative the point estimate, the more likely it is that the true effect of the drug is close to zero. No reason to hide that from readers! In a “decision model”, reporting the negative estimate would essentially force a tax on the drug just because of sampling error, even though you know this is harmful, so optimally you censor the reporting and just give “no effect” in your scientific communications. There is a really interesting link between decision theory and econometrics going back to Wald’s classic paper. The tension between open communication of results to users with different preferences, and recommended decisions to those same users, is well worth further investigation.
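A rough numerical sketch of the drug example may help. This is not the Andrews-Shapiro model itself. I assume squared error as a stand-in for the decision maker's loss, a normal estimate with standard deviation one, a hypothetical true benefit of 0.3, and, for the communication side, a reader with a uniform prior over benefits between 0 and 2. The function names are mine.

```python
import numpy as np


def censored(x):
    """Report 'no effect' whenever the estimate is negative."""
    return np.maximum(x, 0.0)


def mse(report, theta):
    """Mean squared error of a report around the true effect theta."""
    return float(np.mean((report - theta) ** 2))


def posterior_mean(x, theta_grid, prior, sigma=1.0):
    """Reader's posterior mean of theta after seeing estimate x ~ N(theta, sigma^2)."""
    like = np.exp(-0.5 * ((x - theta_grid) / sigma) ** 2)
    post = like * prior
    return float(np.sum(theta_grid * post) / np.sum(post))


rng = np.random.default_rng(0)
theta = 0.3                          # hypothetical true benefit, known to be >= 0
x = rng.normal(theta, 1.0, 100_000)  # noisy RCT estimates

# Decision model: the report is acted on directly, so censoring at zero helps.
print("MSE raw:     ", mse(x, theta))
print("MSE censored:", mse(censored(x), theta))

# Communication model: readers update on the raw number, so -0.1 and -2.0
# carry different information that censoring to "0" would throw away.
grid = np.linspace(0.0, 2.0, 201)
prior = np.ones_like(grid) / grid.size  # assumed uniform prior on [0, 2]
print("posterior mean after x = -0.1:", posterior_mean(-0.1, grid, prior))
print("posterior mean after x = -2.0:", posterior_mean(-2.0, grid, prior))
```

When the true effect is nonnegative, replacing a negative estimate with zero always moves the report closer to the truth, so the censored report does better when it is used directly as the decision. A reader who does their own updating, by contrast, ends up with a lower posterior mean after seeing -2.0 than after seeing -0.1, which is exactly the information censoring destroys.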

How to communicate results also hinges on internal versus external validity. A study done in Tanzania on schooling may find that paying parents to send kids to school increases attendance by 16%. This study may be totally randomized within the region. What would the effect be of paying parents in Norway? If the only difference across families depends on observables within the support of the data in the experiment, we can simply reweight results. This seems implausible, though – there are many unobservable differences between Oslo and Dodoma. In theory, though, if all those unobservables were known, we again just have a reweighting problem. Emily Oster and Andrews show that bounds on the externally valid effect of a policy can be constructed if you make an assumption about the share of covariance between selection/participation and the estimated treatment effects (the idea here is not far off from the well-known Oster bounds for omitted variable bias). For instance, in the Bloom et al. paper on working from home in China, call center workers who choose to work from home see a nontrivial increase in productivity. Perhaps they select into working from home because they know they can do so efficiently, however. Using the Oster-Andrews bound, to get a negative effect of work-from-home for this call center, unobservable differences across workers would have to be 14.7 times more informative about treatment effect heterogeneity than observables.
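The "simply reweight" case is easy to sketch. The numbers below are invented for illustration: I split the study population into two hypothetical income strata whose effects average to the 16% in the example, and I make up stratum shares for the target population. The function name `reweighted_effect` is also mine, and none of this is the Andrews-Oster bound, which deals with the harder case of unobservables.

```python
def reweighted_effect(effects, shares):
    """Average stratum-level effects using a population's stratum shares."""
    missing = set(shares) - set(effects)
    if missing:
        # Reweighting only works within the support of the experiment.
        raise KeyError(f"no experimental estimate for strata: {sorted(missing)}")
    total = sum(shares.values())
    return sum(effects[s] * shares[s] for s in shares) / total


# Hypothetical stratum-level effects on attendance from the experiment.
effects = {"low_income": 0.24, "high_income": 0.08}

# Hypothetical stratum shares in the study and in the target population.
study_shares = {"low_income": 0.5, "high_income": 0.5}
target_shares = {"low_income": 0.1, "high_income": 0.9}

print("effect in study population: ", reweighted_effect(effects, study_shares))
print("effect in target population:", reweighted_effect(effects, target_shares))
```

With these made-up shares the implied effect in the target population is smaller than in the study, purely because the target has fewer families in the stratum where the payment matters most.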

In addition to unobservables making our estimates hard to apply outside very specific contexts, structural assumptions can also “drive” results. Structural models often use a complex set of assumptions to identify a model, where “identify” means that distinct estimated outcomes of interest depend on distinct underlying data (the “traditional” definition). But which assumptions are critical? What changes if we modify one of them? This is a very hard question to answer: as every structural economist knows, we often don’t know how to “close” the model so that it can be estimated if we change the assumptions. Many authors loosely say that “x is identified by y” when the estimated x is very sensitive to changes in y, where y might be an a priori assumption, or a particular type of data. In that sense, “what is critical to the estimate in this structural model” is asking “how can I trust the author that y in fact identifies x”? In a paper in JBES, Andrews and coauthors sum up this problem in a guide to practical sensitivity analysis in structural models: “A reader who accepted the full list of assumptions could walk away having learned a great deal. A reader who questioned even one of the assumptions might learn very little, as they would find it hard or impossible to predict how the conclusions might change under alternative assumptions.” Seems like a problem! However, Andrews has shown, in a 2017 QJE with Gentzkow and Shapiro, that formal sensitivity analysis of structural estimates is in fact possible.

The starting point is that some statistical techniques are *transparent*: if you regress wages on education, we all understand that omitting skill biases this relationship upward, and that if we know the covariance of skill and education, we have some idea of the magnitude of the bias. Andrews’ idea is to extend the same idea to any *moment-based* estimate. If you have some guess about how an assumption affects particular moments of the data, then you can use a particular matrix to approximate how changes in those moments affect the parameters we care about. Consider this example. In a well-known paper, DellaVigna et al. find that door-to-door donations to charity are often just based on social pressure. That is, we give a few bucks to get this person off our doorstep without being a jerk, not because we care about the charity. The model uses variation in whether you knew the person was coming to your doorstep alongside an assumption that, basically, social pressure drives small donations with a different distribution from altruistic/warm glow donations. In particular, the estimate of social pressure turns out to be quite sensitive to donations of exactly ten dollars. Using the easy-to-compute matrix in Andrews et al., you can easily answer, as a reader, questions like “how does the estimate of social pressure change if 10% of households just default to giving ten bucks because it is a single bill, regardless of social pressure vs. warm glow?” I think there will be a much larger role for ex-post dashboard/webapp type analyses by readers in the future: why should a paper restrict to the particular estimates and robustness the authors choose? Just as open data is now often required, I wouldn’t be surprised if “open analysis” in the style of this paper becomes common as well.
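For the curious, here is a toy sketch of how such a matrix gets used. As I understand the Andrews-Gentzkow-Shapiro paper, the sensitivity matrix takes the form Λ = -(G'WG)⁻¹G'W, where G is the Jacobian of the moments with respect to the parameters and W is the weight matrix. The three-moment, two-parameter linear model, the numbers in it, and the function names below are all hypothetical; this has nothing to do with the actual charity-donation model.

```python
import numpy as np


def sensitivity(G, W):
    """Lambda = -(G' W G)^{-1} G' W, with G the Jacobian of the moments."""
    return -np.linalg.solve(G.T @ W @ G, G.T @ W)


def min_distance(m_hat, H, W):
    """Estimate theta by minimizing (m_hat - H theta)' W (m_hat - H theta)."""
    return np.linalg.solve(H.T @ W @ H, H.T @ W @ m_hat)


# Hypothetical linear model: three data moments explained by two parameters.
H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
W = np.eye(3)                       # identity weighting, for simplicity
m_hat = np.array([1.0, 3.0, 4.0])   # made-up sample moments

G = -H  # moment function g(theta) = m_hat - H @ theta, so dg/dtheta = -H
Lam = sensitivity(G, W)

# A reader's "what if": suppose the second moment is contaminated by +0.1.
delta = np.array([0.0, 0.1, 0.0])
approx_shift = Lam @ delta
exact_shift = min_distance(m_hat + delta, H, W) - min_distance(m_hat, H, W)

print("sensitivity matrix:\n", Lam)
print("predicted shift in estimates:", approx_shift)
print("shift from re-estimating:    ", exact_shift)
```

In this linear toy the predicted and re-estimated shifts coincide exactly. In a real nonlinear structural model the matrix gives only a local approximation, but the appeal is the same: the reader multiplies a published matrix by their own guess about misspecified moments instead of re-estimating the model.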

A few remaining bagatelles: Andrews’ work is much broader than just what has been discussed here, of course. First, in a very nice applied theory paper in the AER, Andrews and Dan Barron show how a firm like Toyota can motivate its suppliers to work hard even when output is not directly contractible. Essentially, recent high performers become “favored suppliers” who are chosen whenever the planner believes their productivity in the current period is likely quite high. Payoffs to the firm with this rule are strictly higher than just randomly choosing some supplier that is expected to be productive today, due to the need to dynamically provide incentives to avoid moral hazard. Second, in work with his dissertation advisor Anna Mikusheva, Andrews has used results from differential geometry to perform high-powered inference when the link between structural parameters and the outcome of interest is highly non-linear. Third, in work with Max Kasy, Andrews shows a much more powerful way to identify the effect of publication bias than simple comparisons of the distribution of p-values around “significance” cutoffs. Fourth, this is actually the second major prize for econometrics this year, as Silvana Tenreyro won the “European Clark”, the Yrjö Jahnsson Award, this year alongside Ricardo Reis. Tenreyro is well-known for the PPML estimator in her “log of gravity” paper with Santos Silva. One wonders who will be the next Nobel winner in pure econometrics, however: a prize has not gone to that subfield since Engle and Granger in 2003. I could see it going two ways: a more “traditional” prize to someone like Manski, Hausman, or Phillips, or a “modern causal inference” prize to any number of contributors to that influential branch. Finally, I realize I somehow neglected to cover the great Melissa Dell’s Clark prize last year – to be rectified soon!