How We Create and Destroy Growth: A Nobel for Romer and Nordhaus

Occasionally, the Nobel Committee gives a prize which is unexpected, surprising, yet deft in how it points out underappreciated research. This year, they did no such thing. Both William Nordhaus and Paul Romer have been running favorites for years in my Nobel betting pool with friends at the Federal Reserve. The surprise, if anything, is that the prize went to both men together: Nordhaus is best known for his environmental economics, and Romer for his theory of “endogenous” growth.

On reflection, the connection between their work is obvious. But it is the connection that makes clear how inaccurate many of today’s headlines – “an economic prize for climate change” – really are. Because it is not the climate that both winners build on, but rather a more fundamental economic question: economic growth. Why are some places and times rich and others poor? And what is the impact of these differences? Adam Smith’s “The Wealth of Nations” is formally titled “An Inquiry into the Nature and Causes of the Wealth of Nations”, so these are certainly not new questions in economics. Yet the Classical economists did not have the same conception of economic growth that we have; they largely lived in a world of cycles, of ebbs and flows, with income per capita facing the constraint of agricultural land. Schumpeter, who certainly cared about growth, notes that Smith’s discussion of the “different progress of opulence in different nations” is “dry and uninspired”, perhaps only a “starting point of a sort of economic sociology that was never written.”

As each generation became richer than the one before it – at least in a handful of Western countries and Japan – economists began to search more deeply for the reason. Marx saw capital accumulation as the driver. Schumpeter certainly saw innovation (though not invention, as he always made clear) as important, though he had no formal theory. It was two models that appeared during and soon after World War II – those of Harrod-Domar and Solow-Swan-Tinbergen – which began to make real progress. In Harrod-Domar, economic output is a function of capital Y=f(K), nothing is produced without capital f(0)=0, the economy is constant returns to scale in capital df/dK=c, and the change in capital over time depends on what is saved from output minus what depreciates dK/dt=sY-zK, where z is the rate of depreciation. Put those assumptions together and you will see that the growth rate of output is (dY/dt)/Y=sc-z. Since c and z are fixed, the only way to grow is to crank up the savings rate, Soviet style. And no doubt, capital deepening has worked in many places.
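
For concreteness, here is a tiny Python sketch – with invented parameter values, purely to check the arithmetic above – that simulates the Harrod-Domar accumulation rule and confirms the implied growth rate:

```python
# A minimal check of the Harrod-Domar arithmetic: with Y = c*K and dK/dt = s*Y - z*K,
# output grows at rate s*c - z no matter where the capital stock starts.
# All parameter values are purely illustrative.
s, c, z = 0.2, 0.4, 0.05      # savings rate, output-capital ratio, depreciation rate
K0, years = 100.0, 50
K = K0
for _ in range(years):
    Y = c * K
    K = K + s * Y - z * K
annual_growth = (c * K / (c * K0)) ** (1 / years) - 1
print(f"simulated growth rate: {annual_growth:.4f}, s*c - z = {s * c - z:.4f}")
```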

Solow-type models push further. They let the economy be a function of “technology” A(t), the capital stock K(t), and labor L(t), where output Y(t)=K^a*(A(t)L(t))^(1-a) – that is, that production is constant returns to scale in capital and labor. Solow assumes capital depends on savings and depreciation as in Harrod-Domar, that labor grows at a constant rate n, and that “technology” grows at constant rate g. Defining k=K/(A(t)L(t)) and y=Y/(A(t)L(t)) as capital and output per effective worker, solving this model gets you that capital per effective worker evolves as dk/dt=sy-(n+z+g)k, and that along the balanced growth path output is proportional to capital. You can therefore take the model to the data: we observe the amount of labor and capital, and Solow shows that there is not enough growth in those factors to explain U.S. growth. Instead, growth seems to be largely driven by change in A(t), what Abramovitz called “the measure of our ignorance” but which we often call “technology” or “total factor productivity”.
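
A similarly small simulation (again with illustrative parameters, not a calibration) shows the key Solow property: whatever the savings rate, long-run growth in output per worker is pinned down by the growth rate of A(t):

```python
import numpy as np

# An illustrative Solow simulation: capital per effective worker settles down, so the
# long-run growth rate of output per worker equals g, the growth of A(t), rather than
# anything the savings rate controls. Parameters are made up for illustration.
alpha, s, z, n, g = 0.33, 0.25, 0.05, 0.01, 0.02
A, L, K = 1.0, 1.0, 1.0
output_per_worker = []
for t in range(300):
    Y = K**alpha * (A * L)**(1 - alpha)
    output_per_worker.append(Y / L)
    K = (1 - z) * K + s * Y
    L *= 1 + n
    A *= 1 + g
late_growth = np.diff(np.log(output_per_worker))[-50:].mean()
print(f"long-run growth of output per worker: {late_growth:.4f} vs log(1+g) = {np.log(1 + g):.4f}")
```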

Well, who can see that fact, as well as the massive corporate R&D facilities of the post-war era throwing out inventions like the transistor, and not think: surely the factors that drive A(t) are endogenous, meaning “from within”, to the profit-maximizing choices of firms? If firms produce technology, what stops other firms from replicating these ideas, a classic positive externality which would lead the rate of technology in a free market to be too low? And who can see the low level of convergence of poor country incomes to rich, and not think: there must be some barrier to the spread of A(t) around the world, since otherwise the return to capital must be extraordinary in places with access to great technology, really cheap labor, and little existing capital to combine with it. And another question: if technology – productivity itself! – is endogenous, then ought we consider not just the positive externality that spills over to other firms, but also the negative externality of pollution, especially climate change, that new technologies both induce and help fix? Finally, if we know how to incentivize new technology, and how growth harms the environment, what is the best way to mitigate the great environmental problem of our day, climate change, without stopping the wondrous increase in living standards growth keeps providing? It is precisely for helping answer these questions that Romer and Nordhaus won the Nobel.

Romer and Endogenous Growth

Let us start with Paul Romer. You know you have knocked your Ph.D. thesis out of the park when the great economics journalist David Warsh writes an entire book hailing your work as solving the oldest puzzle in economics. The two early Romer papers, published in 1986 and 1990, have each been cited more than 25,000 times, which is an absolutely extraordinary number by the standards of economics.

Romer’s achievement was writing a model where inventors spend money to produce inventions with increasing returns to scale, other firms use those inventions to produce goods, and a competitive Arrow-Debreu equilibrium still exists. If we had such a model, we could investigate what policies a government might wish to pursue if it wanted to induce firms to produce growth-enhancing inventions.

Let’s be more specific. First, innovation is increasing returns to scale because ideas are nonrival. If I double the amount of labor and capital, holding technology fixed, I double output, but if I double technology, labor, and capital, I more than double output. That is, give one person a hammer, and they can build, say, one staircase a day. Give two people two hammers, and they can build two staircases by just performing exactly the same tasks. But give two people two hammers, and teach them a more efficient way to combine nail and wood, and they will be able to build more than two staircases. Second, if capital and labor are constant returns to scale and are paid their marginal product in a competitive equilibrium, then there is no output left to pay inventors anything for their ideas. That is, it is not tough to model in partial equilibrium the idea of nonrival ideas, and indeed the realization that a single invention improves productivity for all is also an old one: as Thomas Jefferson wrote in 1813, “[h]e who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me.” The difficulty is figuring out how to get these positive spillovers yet still have “prices” or some sort of rent for the invention. Otherwise, why would anyone pursue costly invention?

We also need to ensure that growth is not too fast. There is a stock of existing technology in the world. I use that technology to create new innovations which grow the economy. With more people over time and more innovations over time, you may expect the growth rate to be higher in bigger and more technologically advanced societies. It is in part, as Michael Kremer points out in his One Million B.C. paper. Nonetheless, the rate of growth is not asymptotically increasing by any stretch (see, e.g., Ben Jones on this point). Indeed, growth is nearly constant, abstracting from the business cycle, in the United States, despite large growth in population and the stock of existing technology.

Romer’s first attempt at endogenous growth was based on his thesis and published in the JPE in 1986. Here, he adds “learning by doing” to Solow: technology is a function of the capital stock A(t)=bK(t). As each firm uses capital, they generate learning which spills over to other firms. Even if population is constant, with appropriate assumptions on production functions and capital depreciation, capital, output, and technology grow over time. There is a problem here, however, and one that is common to any model based on learning-by-doing which partially spills over to other firms. As Dasgupta and Stiglitz point out, if there is learning-by-doing which only partially spills over, the industry is a natural monopoly. And even if it starts competitively, as I learn more than you, dynamically I can produce more efficiently, lower my prices, and take market share from you. A decentralized competitive equilibrium with endogenous technological growth is unsustainable!
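
To see why the learning-by-doing assumption delivers endogenous growth, here is a toy simulation (parameters invented) of the Romer (1986) setup: with A(t)=bK(t), output is linear in the capital stock, so growth never peters out even with a constant population – and the growth rate rises with the size of that population, a “scale effect” that will come up again below:

```python
# A toy version of the Romer (1986) learning-by-doing economy: A(t) = b*K(t) and
# Y = K^a * (A*L)^(1-a), so output is linear in K and growth continues forever even with
# a constant population. Note the "scale effect": bigger L means faster growth.
# All parameters are invented for illustration.
a, b, s, z = 0.4, 1.0, 0.2, 0.05

def growth_rate(L, periods=100):
    K, Y_prev, rate = 1.0, None, None
    for _ in range(periods):
        Y = K**a * (b * K * L)**(1 - a)   # equals b^(1-a) * L^(1-a) * K
        if Y_prev is not None:
            rate = Y / Y_prev - 1         # constant from period to period
        Y_prev = Y
        K = K + s * Y - z * K
    return rate

for L in (1.0, 2.0, 4.0):
    print(f"L = {L}: growth rate = {growth_rate(L):.3f}")
```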

Back to the drawing board, then. We want firms to intentionally produce technology in a competitive market as they would other goods. We want technology to be nonrival. And we want technology production to lead to growth. Learning-by-doing allows technology to spill over, but would simply lead to a monopoly producer. Pure constant-returns-to-scale competitive production, where technology is just an input like capital produced with a “nonconvexity” – only the initial inventor pays the fixed cost of invention – means that there is no output left to pay for invention once other factors get their marginal product. A natural idea, well known to Arrow 1962 and others, emerges: we need some source of market power for inventors.

Romer’s insight is that inventions are nonrival, yes, but they are also partially excludable, via secrecy, patents, or other means. In his blockbuster 1990 JPE Endogenous Technological Change, he lets inventions be given an infinite patent, but also be partially substitutable by other inventions, constraining price (this is just a Spence-style monopolistic competition model). The more inventions there are, the more efficiently final goods can be made. Future researchers can use present technology as an input to their invention for free. Invention is thus partially excludable in the sense that my exact invention is “protected” from competition, but also spills over to other researchers by making it easier for them to invent other things. Inventions are therefore neither public nor private goods, and also not “club goods” (nonrival but excludable) since inventors cannot exclude future inventors from using their good idea to motivate more invention. Since there is free entry into invention, the infinite stream of monopoly rents from inventions is exactly equal to their opportunity cost.

From the perspective of final goods producers, there are just technologies I can license as inputs, which I then use in a constant returns to scale way to produce goods, as in Solow. Every factor is paid its marginal product, but inventions are sold for more than their marginal cost due to monopolistic excludability from secrecy or patents. The model is general equilibrium, and gives a ton of insight about policy: for instance, if you subsidize capital goods, do you get more or less growth? In Romer (1986), where all growth is learning-by-doing, cheaper capital means more learning means more growth. In Romer (1990), capital subsidies can be counterproductive!

There are some issues to be worked out: the Romer models still have “scale effects”, meaning growth rates rise with the size of the population, whereas growth has been roughly constant in the modern world despite big changes in population and the stock of technology (see Chad Jones’ 1995 and 1999 papers). The neo-Schumpeterian models of Aghion-Howitt and Grossman-Helpman add the important idea that new inventions don’t just add to the stock of knowledge, but also make old inventions less valuable. And really critically, the idea that institutions and not just economic fundamentals affect growth – meaning laws, culture, and so on – is a massive field of research at present. But it was Romer who first cracked the nut of how to model invention in general equilibrium, and I am unaware of any later model which solves this problem in a more satisfying way.

Nordhaus and the Economic Solution to Pollution

So we have, with Romer, a general equilibrium model for thinking about why people produce new technology. The connection with Nordhaus comes in a problem that is both caused by, and potentially solved by, growth. In 2018, even an ignoramus knows the terms “climate change” and “global warming”. This was not at all the case when William Nordhaus began thinking about how the economy and the environment interrelate in the early 1970s.

Growth was fairly unobjectionable as a policy goal in 1960: indeed, a greater capability of making goods, and of making war, seemed a necessity for both the Free and Soviet worlds. But by the early 1970s, environmental concerns arose. The Club of Rome warned that we were going to run out of resources if we continued to use them so unsustainably: resources are of course finite, and there are therefore “limits to growth”. Beyond just running out of resources, growth could also be harmful because of negative externalities on the environment, particularly the newfangled idea of global warming an MIT report warned about in 1970.

Nordhaus treated those ideas both seriously and skeptically. In a 1974 AER P&P, he notes that technological progress or adequate factor substitution allow us to avoid “limits to growth”. To put it simply, whales are limited in supply, and hence whale oil is as well, yet we light many more rooms than we did in 1870 due to new technologies and substitutes for whale oil. Despite this skepticism, Nordhaus does show concern for the externalities of growth on global warming, giving a back-of-the-envelope calculation that along a projected Solow-type growth path, the amount of carbon in the atmosphere will reach a dangerous 487ppm by 2030, surprisingly close to our current estimates. In a contemporaneous essay with Tobin, and in a review of an environmentalist’s “system dynamics” predictions of future economic collapse, Nordhaus reaches a similar conclusion: substitutable factors mean that running out of resources is not a huge concern, but rather the exact opposite, that we will have access to and use too many polluting resources, should worry us. That is tremendous foresight for someone writing in 1974!

Before turning back to climate change, can we celebrate again the success of economics against the Club of Rome ridiculousness? There were widespread predictions, from very serious people, that growth would not just slow but reverse by the end of the 1980s due to “unsustainable” resource use. Instead, GDP per capita has nearly doubled since 1990, with the most critical change coming for the very poorest. There would have been no greater disaster for the twentieth century than had we attempted to slow the progress and diffusion of technology, in agriculture, manufacturing and services alike, in order to follow the nonsense economics being promulgated by prominent biologists and environmental scientists.

Now, being wrong once is no guarantee of being wrong again, and the environmentalists appear quite right about climate change. So it is again a feather in the cap of Nordhaus to both be skeptical of economic nonsense, and also sound the alarm about true environmental problems where economics has something to contribute. As Nordhaus writes, “to dismiss today’s ecological concerns out of hand would be reckless. Because boys have mistakenly cried ‘wolf’ in the past does not mean that the woods are safe.”

Just as we can refute Club of Rome worries with serious economics, so too can we study climate change. The economy affects the climate, and the climate affects the economy. What we need is an integrated model to assess how economic activity, including growth, affects CO2 production and therefore climate change, allowing us to back out the appropriate Pigouvian carbon tax. This is precisely what Nordhaus did with his two celebrated “Integrated Assessment Models”, which built on his earlier simplified models (e.g., 1975’s Can We Control Carbon Dioxide?). These models have Solow-type endogenous savings, and make precise the tradeoffs of lower economic growth against lower climate change, as well as making clear the critical importance of the social discount rate and the micro-estimates of the cost of adjustment to climate change.
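
For intuition about what an IAM does, here is a deliberately toy loop in the spirit of Nordhaus’ DICE – every number below is a placeholder I made up, not his calibration – with Solow-style capital accumulation, emissions proportional to output, a carbon stock that maps into warming, quadratic damages, and a convex abatement cost:

```python
import numpy as np

# A deliberately tiny "integrated assessment" loop in the spirit of DICE. Every parameter
# is an invented placeholder, not Nordhaus' calibration; the point is only the structure
# of the growth-versus-climate tradeoff.
alpha, s, depr = 0.3, 0.22, 0.4        # capital share, savings rate, depreciation per decade
sigma, phi = 0.5, 0.1                  # emissions per unit output; warming per unit of carbon stock
theta, abate_cost = 0.02, 0.05         # damage coefficient; abatement cost coefficient

def run(mu, decades=20):
    """Net output path under a constant abatement rate mu (0 = no policy, 1 = full abatement)."""
    K, M = 1.0, 0.0                    # capital stock; carbon above preindustrial
    path = []
    for _ in range(decades):
        gross = K**alpha               # technology and labor normalized to one
        damage = theta * (phi * M)**2  # quadratic in the temperature phi*M
        net = gross * (1 - damage - abate_cost * mu**2)
        M += sigma * (1 - mu) * gross  # unabated emissions add to the carbon stock
        K = (1 - depr) * K + s * net
        path.append(net)
    return np.array(path)

for mu in (0.0, 0.5, 1.0):
    path = run(mu)
    print(f"abatement {mu:.0%}: net output in decade 1 = {path[0]:.3f}, decade 20 = {path[-1]:.3f}")
```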

The latter goes well beyond the natural science of climate change, which holds human adaptation constant: the Netherlands, in a climate sense, should be underwater, but they use dikes to restrain the ocean. Likewise, the cost of adjusting to an increase in temperature is something to be estimated empirically. Nordhaus takes climate change very seriously, but he is much less concerned about the need for immediate action than the famous Stern report, which takes fairly extreme positions about the discount rate (1000 generations in the future are weighed the same as us, in Stern) and the costs of adjustment.

Consider the following “optimal path” for carbon from Nordhaus’ most recent run of the model, where the blue line is his optimum.

Note that he permits much more carbon than Stern or a policy which mandates temperatures stay below a 2.5 C rise forever. The reason is that the costs to growth in the short term are high: the world is still very poor in many places! There was a vitriolic debate following the Stern report about who was correct: whether the appropriate social discount rate is zero or something higher is a quasi-philosophical debate going back to Ramsey (1928). But you can see here how important the calibration is.
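
To see just how much the discount rate matters, a tiny calculation suffices (the damage stream here is purely hypothetical; only the sensitivity to the rate is the point):

```python
# Present value of a hypothetical constant damage of 1 unit per year from year 50 through
# year 250, under a near-zero Stern-style rate versus higher Nordhaus-style rates.
def present_value(rate, start=50, end=250):
    return sum(1.0 / (1 + rate) ** t for t in range(start, end + 1))

for rate in (0.001, 0.015, 0.03, 0.055):
    print(f"discount rate {rate:.1%}: PV of the damage stream = {present_value(rate):10.1f}")
```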

There are other minor points of disagreement between Nordhaus and Stern, and my sense is that there has been some, though not full, convergence in their beliefs about optimal policy. But there is no disagreement whatsoever between the economic and environmental community that the appropriate way to estimate the optimal response to climate change is via an explicit model incorporating some sort of endogeneity of economic reaction to climate policy. The power of the model is that we can be extremely clear about what points of disagreement remain, and we can examine the sensitivity of optimal policy to factors like climate “tipping points”.

There is one other issue: in Nordhaus’ IAMs, and in Stern, you limit climate change by imposing cap and trade or carbon taxes. But carbon harms cross borders. How do you stop free riding? Nordhaus, in a 2015 AER, shows theoretically that there is no way to generate optimal climate abatement without sanctions for non-participants, but that relatively small trade penalties work quite well. This is precisely what Emmanuel Macron is currently proposing!

Let’s wrap up by linking Nordhaus even more tightly back to Romer. It should be noted that Nordhaus was very interested in the idea of pure endogenous growth, as distinct from any environmental concerns, from the very start of his career. His thesis was on the topic (leading to a proto-endogenous growth paper in the AER P&P in 1969), and he wrote a skeptical piece in the QJE in 1973 about the then-leading theories of what factors induce certain types of innovation (objections which I think have been fixed by Acemoglu 2002). Like Romer, Nordhaus has long worried that inventors do not receive enough of the return to their invention, and that we measure innovation poorly – see his classic NBER chapter on inventions in lighting, and his attempt to estimate how much of society’s output goes to innovators.

The connection between the very frontier of endogenous growth models, and environmental IAMs, has not gone unnoticed by other scholars. Nordhaus’ IAMs tend to have limited incorporation of endogenous innovation in dirty or clean sectors. But a fantastic paper by Acemoglu, Aghion, Bursztyn, and Hemous combines endogenous technical change with Nordhaus-type climate modeling to suggest a middle ground between Stern and Nordhaus: use subsidies to get green energy close to the technological frontier, then use taxes once their distortion is relatively limited because a good green substitute exists. Indeed, since this paper first started floating around 8 or so years ago, massive subsidies to green energy sources like solar by many countries have indeed made the “cost” of stopping climate change much lower than if we’d relied solely on taxes, since now production of very low cost solar, and mass market electric cars, is in fact economically viable.

It may indeed be possible to solve climate change – what Stern called “the greatest market failure” man has ever seen – by changing the incentives for green innovation, rather than just by making economic growth more expensive by taxing carbon. Going beyond just solving the problem of climate change, to solving it in a way that minimizes economic harm, is a hell of an accomplishment, and more than worthy of the Nobel prizes Romer and Nordhaus won for showing us this path!

Some Further Reading

In my PhD class on innovation, the handout I give on the very first day introduces Romer’s work and why non-mathematical models of endogenous innovation mislead. Paul Romer himself has a nice essay on climate optimism, and the extent to which endogenous invention matters for how we stop global warming. On why anyone signs climate change abatement agreements, instead of just free riding, see the clever incomplete contracts insight of Battaglini and Harstad. Romer has also been greatly interested in the policy of “high-growth” places, pushing the idea of Charter Cities. Charter Cities involve Hong Kong-like exclaves of a developing country where the institutions and legal systems are farmed out to a more stable nation. Totally reasonable, but in fact quite controversial: a charter city proposal in Madagascar led to a coup, and I can easily imagine that the Charter City controversy delayed Romer’s well-deserved Nobel laurel. The New York Times points out that Nordhaus’ brother helped write the Clean Air Act of 1970. Finally, as is always true with the Nobel, the official scientific summary is lucid and deep in its exploration of the two winners’ work.

The 2018 Fields Medal and its Surprising Connection to Economics!

The Fields Medal and Nevanlinna Prizes were given out today. They represent the highest honor possible for young mathematicians and theoretical computer scientists, and are granted only once every four years. The mathematics involved is often very challenging for outsiders. Indeed, the most prominent of this year’s winners, the German Peter Scholze, is best known for his work on “perfectoid spaces”, and I honestly have no idea how to begin explaining them aside from saying that they are useful in a number of problems in algebraic geometry (the lovely field mapping results in algebra – what numbers solve y=2x – and geometry – noting that those solutions to y=2x form a line). Two of this year’s prizes, however, the Fields given to Alessio Figalli and the Nevanlinna to Constantinos Daskalakis, have a very tight connection to an utterly core question in economics. Indeed, both of those men have published work in economics journals!

The problem of interest concerns how best to sell an object. If you are a monopolist hoping to sell one item to one consumer, where the consumer’s valuation of the object is only known to the consumer but commonly known to come from a distribution F, the mechanism that maximizes revenue is of course the Myerson auction from his 1981 paper in Math OR. The solution is simple: make a take it or leave it offer at a minimum price (or “reserve price”) which is a simple function of F. If you are selling one good and there are many buyers, then revenue is maximized by running a second-price auction with the exact same reserve price. In both cases, no potential buyer has any incentive to lie about their true valuation (the auction is “dominant strategy incentive compatible”). And further, seller revenue and expected payments for all players are identical to the Myerson auction in any other mechanism which allocates goods the same way in expectation, with minor caveats. This result is called “revenue equivalence”.
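
As a concrete illustration of the Myerson logic (not his general derivation), take values drawn uniformly on [0,1] and a seller with zero cost: the optimal reserve solves r = (1-F(r))/f(r), i.e. r = 1/2, and a quick Monte Carlo shows the revenue gain over a plain second-price auction:

```python
import numpy as np

# Monte Carlo illustration of the Myerson reserve price for values uniform on [0,1] and a
# zero-cost seller: the optimal reserve is 1/2. With two bidders, a second-price auction
# earns about 1/3 without a reserve and about 5/12 with it.
rng = np.random.default_rng(0)
n_bidders, n_sims, reserve = 2, 200_000, 0.5
vals = np.sort(rng.uniform(size=(n_sims, n_bidders)), axis=1)
high, second = vals[:, -1], vals[:, -2]

rev_plain = second.mean()                                                   # no reserve
rev_reserve = np.where(high >= reserve, np.maximum(second, reserve), 0.0).mean()
print(f"no reserve: {rev_plain:.3f}   reserve at 0.5: {rev_reserve:.3f}")
```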

The Myerson paper is an absolute blockbuster. The revelation principle, the revenue equivalence theorem, and a solution to the optimal selling mechanism problem all in the same paper? I would argue it’s the most important result in economics since Arrow-Debreu-McKenzie, with the caveat that many of these ideas were “in the air” in the 1970s with the early ideas of mechanism design and Bayesian game theory. The Myerson result is also really worrying if you are concerned with general economic efficiency. Note that the reserve price means that the seller is best off sometimes not selling the good to anyone, in case all potential buyers have private values below the reserve price. But this is economically inefficient! We know that there exists an allocation mechanism which is socially efficient even when people have private information about their willingness to pay: the Vickrey-Clarke-Groves mechanism. This means that market power plus asymmetric information necessarily destroys social surplus. You may be thinking we know this already: an optimal monopoly price in classic price theory generates deadweight loss. But recall that a perfectly-price-discriminating monopolist sells to everyone whose willingness-to-pay exceeds the seller’s marginal cost of production, hence the only reason monopoly generates deadweight loss in a world with perfect information is that we constrain them to a “mechanism” called a fixed price. Myerson’s result is much worse: letting a monopolist use any mechanism, and price discriminate however they like, asymmetric information necessarily destroys surplus!

Despite this great result, there remain two enormous open problems. First, how should we sell a good when we will interact with the same buyer(s) in the future? Recall the Myerson auction involves bidders truthfully revealing their willingness to pay. Imagine that tomorrow, the seller will sell the same object. Will I reveal my willingness to pay truthfully today? Of course not! If I did, tomorrow the seller would charge the bidder with the highest willingness-to-pay exactly that amount. Ergo, today bidders will shade down their bids. This is called the “ratchet effect”, and despite a lot of progress in dynamic mechanism design, we have still not fully solved for the optimal dynamic mechanism in all cases.

The other challenging problem is one seller selling many goods, where willingness to pay for one good is related to willingness to pay for the others. Consider, for example, selling cable TV. Do you bundle the channels together? Do you offer a menu of possible bundles? This problem is often called “multidimensional screening”, because you are attempting to “screen” buyers such that those with high willingness to pay for a particular good actually pay a high price for that good. The optimal multidimensional screen is a devil of a problem. And it is here that we return to the Fields and Nevanlinna prizes, because they turn out to speak precisely to this problem!

What could possibly be the connection between high-level pure math and this particular pricing problem? The answer comes from the 18th century mathematician Gaspard Monge, founder of the Ecole Polytechnique. He asked the following question: what is the cheapest way to move mass from X to Y, such as moving apples from a bunch of distribution centers to a bunch of supermarkets? It turns out that without convexity or linearity assumptions, this problem is very hard, and it was not solved until the late 20th century. Leonid Kantorovich, the 1975 Nobel winner in economics, paved the way for this result by showing that there is a “dual” problem where instead of looking for the map from X to Y, you look for the probability that a given mass in Y comes from X. This dual turns out to be useful in that there exists an object called a “potential” which helps characterize the optimal transport problem solution in a much more tractable way than searching across any possible map.
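
For a concrete feel for the Monge-Kantorovich problem, here is the discrete version written as a linear program, with supplies, demands, and costs invented purely for illustration (the dual variables on the constraints are the “potentials” mentioned above):

```python
import numpy as np
from scipy.optimize import linprog

# The discrete Monge-Kantorovich problem as a linear program: ship "apples" from two
# distribution centers to three supermarkets at minimum cost. All numbers are invented.
supply = np.array([30.0, 70.0])
demand = np.array([20.0, 50.0, 30.0])
cost = np.array([[8.0, 6.0, 10.0],
                 [9.0, 5.0, 4.0]])               # cost[i, j]: center i -> market j
m, n = cost.shape

A_eq, b_eq = [], []
for i in range(m):                                # each center ships out exactly its supply
    row = np.zeros(m * n); row[i * n:(i + 1) * n] = 1.0
    A_eq.append(row); b_eq.append(supply[i])
for j in range(n):                                # each market receives exactly its demand
    row = np.zeros(m * n); row[j::n] = 1.0
    A_eq.append(row); b_eq.append(demand[j])

res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=b_eq, bounds=(0, None), method="highs")
print("minimum transport cost:", res.fun)
print("optimal shipping plan:\n", res.x.reshape(m, n))
```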

Note the link between this problem and our optimal auction problem above, though! Instead of moving mass most cheaply from X to Y, we are looking to maximize revenue by assigning objects Y to people with willingness-to-pay drawn from X. So no surprise, the solution to the optimal transport problem when X has a particular structure and the solution to the revenue maximizing mechanism problem are tightly linked. And luckily for us economists, many of the world’s best mathematicians, including 2010 Fields winner Cedric Villani, and this year’s winner Alessio Figalli, have spent a great deal of effort working on exactly this problem. Ivar Ekeland has a nice series of notes explaining the link between the two problems in more detail.

In a 2017 Econometrica, this year’s Nevanlinna winner Daskalakis and his coauthors Alan Deckelbaum and Christos Tzamos show precisely how to use strong duality in the optimal transport problem to solve the general optimal mechanism problem when selling multiple goods. The paper is very challenging, requiring some knowledge of measure theory, duality theory, and convex analysis. That said, the conditions they give to check an optimal solution, and the method to find the optimal solution, involve a reasonably straightforward series of inequalities. In particular, the optimal mechanism involves dividing the hypercube of potential types into (perhaps infinitely many) regions whose types get assigned the same prices and goods (for example, “you get good A and good B together with probability p at price X”, or “if you are unwilling to pay p1 for A, p2 for B, or p for both together, you get nothing”).

This optimal mechanism has some unusual properties. Remember that the Myerson auction for one buyer is “simple”: make a take it or leave it offer at the reserve price. You may think that if you are selling many items to one buyer, you would likewise choose a reserve price for the whole bundle, particularly when the number of goods with independently distributed values becomes large. For instance, if there are 1000 cable channels, and a buyer has value distributed uniformly between 0 and 10 cents for each channel, then by a limit theorem type argument it’s clear that the willingness to pay for the whole bundle is quite close to 50 bucks. So you may think, just price at a bit lower than 50. However, Daskalakis et al show that when there are sufficiently many goods with i.i.d. uniformly-distributed values, it is never optimal to just set a price for the whole bundle! It is also possible to show that the best mechanism often involves randomization, where buyers who report that they are willing to pay X for item a and Y for item b will only get the items with probability less than 1 at a specified price. This is quite contrary to my intuition, which is that in most mechanism problems, we can restrict focus to deterministic assignment. It was well-known that multidimensional screening has weird properties; for example, Hart and Reny show that an increase in buyer valuations can cause seller revenue from the optimal mechanism to fall. The techniques Daskalakis and coauthors develop allow us to state exactly what we ought to do in situations like these, which were previously beyond the literature’s reach, such as when we know we need mechanisms more complicated than “sell the whole bundle at price p”.
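
The concentration argument in that cable-channel example is easy to check numerically; the sketch below illustrates the limit intuition (not the exactly optimal mechanism) by comparing a single near-$50 bundle price to pricing each channel separately:

```python
import numpy as np

# With 1000 channels valued i.i.d. uniform on [0, $0.10], the bundle's value piles up
# near $50, so one bundle price just under $50 captures nearly all the surplus, while
# separate per-channel posted prices capture only about half. This illustrates the limit
# intuition only; Daskalakis et al. show the exactly optimal mechanism is more complicated.
rng = np.random.default_rng(1)
n_channels, n_buyers = 1000, 10_000
values = rng.uniform(0, 0.10, size=(n_buyers, n_channels))
bundle_values = values.sum(axis=1)

bundle_price = 48.0
rev_bundle = bundle_price * (bundle_values >= bundle_price).mean()

item_price = 0.05                               # revenue-maximizing posted price for U[0, 0.10]
rev_separate = n_channels * item_price * (values >= item_price).mean()

print(f"bundle value: mean {bundle_values.mean():.2f}, sd {bundle_values.std():.2f}")
print(f"revenue per buyer -- bundle at $48: {rev_bundle:.2f}; separate prices: {rev_separate:.2f}")
```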

The history of economics has been a long series of taking tools from the frontier of mathematics, from the physics-based analogues of the “marginalists” in the 1870s, to the fixed point theorems of the early game theorists, the linear programming tricks used to analyze competitive equilibrium in the 1950s, and the tropical geometry recently introduced to auction theory by Elizabeth Baldwin and Paul Klemperer. We are now making progress on pricing issues that have stumped some of the great theoretical minds in the history of the field. Multidimensional screening is an incredibly broad topic: how ought we regulate a monopoly with private fixed and marginal costs, how ought we tax agents who have private costs of effort and opportunities, how ought a firm choose wages and benefits, and so on. Knowing the optimum is essential when it comes to understanding when we can use simple, nearly-correct mechanisms. Just in the context of pricing, using tricks related to those of Daskalakis, Gabriel Carroll showed in a recent Econometrica that bundling should be avoided when the principal has limited knowledge about the correlation structure of types, and my old grad school friend Nima Haghpanah has shown, in a paper with Jason Hartline, that firms should only offer high-quality and low-quality versions of their products if consumers’ values for the high-quality good and their relative value for the low versus high quality good are positively correlated. Neither of these results is trivial to prove. Nonetheless, a hearty cheers to our friends in pure mathematics who continue to provide us with the tools we need to answer questions at the very core of economic life!

“Eliminating Uncertainty in Market Access: The Impact of New Bridges in Rural Nicaragua,” W. Brooks & K. Donovan (2018)

It’s NBER Summer Institute season, when every bar and restaurant in East Cambridge, from Helmand to Lord Hobo, is filled with our tribe. The air hums with discussions of Lagrangians and HANKs and robust estimators. And the number of great papers presented, discussed, or otherwise floating around inspires.

The paper we’re discussing today, by Wyatt Brooks at Notre Dame and Kevin Donovan at Yale SOM, uses a great combination of dynamic general equilibrium theory and a totally insane quasi-randomized experiment to help answer an old question: how beneficial is it for villages to be connected to the broader economy? The fundamental insight requires two ideas that are second nature for economists, but are incredibly controversial outside our profession.

First, going back to Nobel winner Arthur Lewis if not much earlier, economists have argued that “structural transformation”, the shift out of low-productivity agriculture to urban areas and non-ag sectors, is fundamental to economic growth. Recent work by Hicks et al is a bit more measured – the individuals who benefit from leaving agriculture generally already have, so Lenin-type forced industrialization is a bad idea! – but nonetheless barriers to that movement are still harmful to growth, even when those barriers are largely cultural as in the forthcoming JPE by Melanie Morten and the well-named Gharad Bryan. What’s so bad about the ag sector? In the developing world, it tends to be small-plot, quite-inefficient, staple-crop production, unlike the growth-generating positive-externality-filled, increasing-returns-type sectors (on this point, Romer 1990). There are zero examples of countries becoming rich without their labor force shifting dramatically out of agriculture. The intuition of many in the public, that Gandhi was right about the village economy and that structural transformation just means dreadful slums, is the intuition of people who lack respect for individual agency. The slums may be bad, but look how they fill up everywhere they exist! Ergo, how bad must the alternative be?

The second related misunderstanding of the public is that credit is unimportant. For folks near subsistence, the danger of economic shocks pushing you near that dangerous cutpoint is so fundamental that it leads to all sorts of otherwise odd behavior. Consider the response of my ancestors (and presumably the ancestors of one of today’s authors, given the name Donovan) when potato blight hit. Potatoes are an input to growing more potatoes tomorrow, but near subsistence, you have no choice but to eat your “savings” away after bad shocks. This obviously causes problems in the future, prolonging the famine. But even worse, to avoid getting in a situation where you eat all your savings, you save more and invest less than you otherwise would. Empirically, Karlan et al QJE 2014 show large demand for savings instruments in Ghana, and Cynthia Kinnan shows why insurance markets in the developing world are incomplete despite large welfare gains. Indeed, many countries, including India, make it illegal to insure oneself against certain types of negative shocks, as Mobarak and Rosenzweig show. The need to save for low probability, really negative, shocks may even lead people to invest in assets with highly negative annual returns; on this, see the wonderfully-titled Continued Existence of Cows Disproves Central Tenets of Capitalism? This is all to say: the rise of credit and insurance markets unlocks much more productive activity, especially in the developing world, and these markets are not merely dens of exploitative lenders.

Ok, so insurance against bad shocks matters, and getting out of low-productivity agriculture may matter as well. Let’s imagine you live in a tiny village which is often separated from bigger towns, geographically. What would happen if you somehow lowered the cost of reaching those towns? Well, we’d expect goods-trade to radically change – see the earlier post on Dave Donaldson’s work, or the nice paper on Brazilian roads by Morten and Oliveira. But the benefits of reducing isolation go well beyond just getting better prices for goods.

Why? In the developing world, most people have multiple jobs. They farm during the season, work in the market on occasion, do construction, work as a migrant, and so on. Imagine that in the village, most jobs are just farmwork, and outside, there is always the chance for day work at a fixed wage. In autarky, I just work on the farm, perhaps my own. I need to keep a bunch of savings because sometimes farms get a bunch of bad shocks: a fire burns my crops, or an elephant stomps on them. Running out of savings risks death, and there is no crop insurance, so I save precautionarily. Saving means I don’t have as much to spend on fertilizer or pesticide, so my yields are lower.

If I can access the outside world, then when my farm gets bad shocks and my savings runs low, I leave the village and take day work to build them back up. Since I know I will have that option, I don’t need to save as much, and hence I can buy more fertilizer. Now, the wage for farmers in the village (including the implicit wage that would keep me on my own farm) needs to be higher since some of these ex-farmers will go work in town, shifting village labor supply left. This higher wage pushes the amount of fertilizer I will buy down, since high wages reduce the marginal productivity of farm improvements. Whether fertilizer use goes up or down is therefore an empirical question, but at least we can say that those who use more fertilizer, those who react more to bad shocks by working outside the village, and those whose savings drops the most should be the same farmers. Either way, the village winds up richer both for the direct reason of having an outside option, and for the indirect reason of being able to reduce precautionary savings. That is, the harm is coming not just from the first moment, the average shock to agricultural productivity, but also from the second moment, its variance.
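
To make the precautionary-savings channel concrete, here is a minimal buffer-stock sketch – emphatically not Brooks and Donovan’s model, and with parameters I invented – in which the “bridge” simply puts a floor under bad-harvest income, and you can compare the savings the farmer chooses with and without it:

```python
import numpy as np

# A minimal buffer-stock sketch (not the authors' model): a farmer with i.i.d. harvest
# shocks chooses savings by value function iteration. The "bridge" case puts a floor under
# bad-harvest income (outside day work). All parameters are invented for illustration.
beta, R, crra = 0.9, 1.03, 2.0
incomes, probs = np.array([0.4, 1.2]), np.array([0.3, 0.7])

def savings_at(cash_on_hand, income_floor, iters=400):
    y = np.maximum(incomes, income_floor)
    x_grid = np.linspace(0.05, 6.0, 200)        # cash on hand
    a_grid = np.linspace(0.0, 5.0, 200)         # savings choices
    V = np.zeros_like(x_grid)
    for _ in range(iters):
        EV = sum(p * np.interp(R * a_grid + yi, x_grid, V) for p, yi in zip(probs, y))
        c = x_grid[:, None] - a_grid[None, :]   # consumption for each (cash, savings) pair
        util = np.where(c > 0, np.maximum(c, 1e-9)**(1 - crra) / (1 - crra), -np.inf)
        vals = util + beta * EV[None, :]
        V = vals.max(axis=1)
    policy = a_grid[vals.argmax(axis=1)]
    return float(np.interp(cash_on_hand, x_grid, policy))

print("chosen savings, no bridge:", round(savings_at(2.0, 0.0), 2))
print("chosen savings, bridge   :", round(savings_at(2.0, 0.9), 2))
```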

How much does this matter in practice? Brooks and Donovan worked with an NGO that physically builds bridges in remote areas. In Nicaragua, floods during the harvest season are common, isolating villages for days at a time when the riverbed along the path to market turns into a raging torrent. In this area, bridges are unnecessary when the riverbed is dry: the land is fairly flat, and the bridge barely reduces travel time when the riverbed isn’t flooded. These floods generally occur exactly during the growing season, after fertilizer is bought, but before crops are harvested, so the goods market in both inputs and outputs is essentially unaffected. And there is nice quasirandom variation: of 15 villages which the NGO selected as needing a bridge, 9 were ruled out after a visit by a technical advisor found the soil and topography unsuitable for the NGO’s relatively inexpensive bridge.

The authors survey villages the year before and the two years after the bridges are built, as well as surveying a subset of villagers with cell phones every two weeks in a particular year. Although N=15 seems worrying for power, the within-village differences in labor market behavior are sufficient that properly bootstrapped estimates can still infer interesting effects. And what do you find? Villages with bridges have many men shift from working in the village to outside in a given week, the percentage of women working outside nearly doubles with most of the women entering the labor force in order to work, the wages inside the village rise while wages outside the village do not, the use of fertilizer rises, village farm profits rise 76%, and the effect of all this is most pronounced on poorer households physically close to the bridge.

All this is exactly in line with the dynamic general equilibrium model sketched out above. If you assumed that bridges were just about market access for goods, you would have missed all of this. If you assumed the only benefit was additional wages outside the village, you would miss a full 1/3 of the benefit: the general equilibrium effect of shifting out workers who are particularly capable working outside the village causes wages to rise for the farm workers who remain at home. These particular bridges show an internal rate of return of nearly 20% even though they do nothing to improve market access for either inputs or outputs! And there are, of course, further utility benefits from reducing risk, even when that risk reduction does not show up in income through the channel of increased investment.

November 2017 working paper, currently R&R at Econometrica (RePEc IDEAS version). Both authors have a number of other really interesting drafts, of which I’ll mention two. Brooks, in a working paper with Joseph Kaboski and Yao Li, identifies a really interesting harm of industrial clusters, but one that Adam Smith would have surely identified: they make collusion easier. Put all the firms in an industry in the same place, and establish regular opportunities for their managers to meet, and you wind up with much less variance in markups among the firms induced to locate in these clusters! Donovan, in a recent RED with my friend Chris Herrington, calibrates a model to explain why both college attendance and the relative cognitive ability of college grads rose during the 20th century. It’s not as simple as you might think: a decrease in costs, through student loans or otherwise, only affects marginal students, who are cognitively worse than the average existing college student. It turns out you also need a rising college premium and more precise signals of high schoolers’ academic abilities to get both patterns. Models doing work to extract insight from data – as always, this is the fundamental reason why economics is the queen of the social sciences.

“The First Patent Litigation Explosion,” C. Beauchamp (2016)

There has been a tremendous rise in patent litigation in the past decade (Bessen and Meurer 2005). Many of these lawsuits have come from “non-practicing entities” – also known as patent trolls – who use their patents to sue even as they produce no products themselves. These lawsuits are often targeted at end-users rather than directly infringing manufacturers, supposedly on the grounds that end-users are less able to defend themselves (see my coauthor Erik Hovenkamp on this point). For those who feel the patent system provides too many rights to patent-holders, to the detriment of societal welfare, problems like these are a case in point.

But are these worries novel? The economics of innovation and entrepreneurship is, like much of economics, one where history proves illuminating. Nearly everything that we think is new has happened before. Fights over the use of IP to collude by incumbents? See Lampe and Moser 2010 JEH. The importance of venture capital to local ecosystems? In the late 19th century, this was true in the boomtown of Cleveland, as Lamoreaux and her coauthors showed in a 2006 C&S (as to why Cleveland declined as an innovative center, they have a nice paper on that topic as well). The role of patent brokers and other intermediaries? These existed in the 19th century! Open source invention in the early days of a new industry? Tales from the rise of the porter style of beer in the 18th century are not terribly different from the Homebrew Computer Club that led to the personal computer industry. Tradeoffs between secrecy, patenting, and alternative forms of protection? My colleague Alberto Galasso shows that this goes back to Renaissance Italy!

Given these examples, it should not be surprising that the recent boom in patent litigation is a historical rerun. Christopher Beauchamp of Brooklyn Law School, in a 2016 article in the Yale Law Journal, shows that all of the problems with patent litigation mentioned above are not new: indeed, the true heyday of patent litigation was not the 2010s, but the late 1800s! Knowing the number of lawsuits filed, not just the number litigated to decision, requires painstaking archival research. Having dug up these old archives, Beauchamp begins with a striking fact: the Southern District of New York alone had as many total patent lawsuits filed in 1880 as any district in 2010, and on a per patent basis had an order of magnitude more lawsuits. These legal battles were often virulent. For instance, Charles Goodyear’s brother held patents for the use of rubber in dentistry, using attractive young women to find dentists using the technique without a license. The aggressive legal strategy ended only when the Vulcanite Company’s hard-charging treasurer was murdered in San Francisco by a desperate dentist!

These lawsuits were not merely battles between the Apples and Samsungs of the day, but often involved lawsuits demanding small license payments from legally unsophisticated users. As Iowa Senator Samuel Kirkwood put it, patentholders “say to each [farmer], ‘Sir, pay me so much a mile or so much a rod for the wire…or you must go to Des Moines…and defend a suit to be brought against you, the cost of which and the fees in which will in themselves be more than I demand of you…[O]ur people are paying day by day $10, $15, $20, when they do not know a particle more whether they owe the man a dollar or a cent…but paying the money just because it is cheaper to do it than to defend a suit.” Some of these lawsuits were legitimate, but many were making claims far beyond the scope of what a court would consider infringement, just as in the case of patent troll lawsuits today. Also like today, farmers and industry associations formed joint litigation pools to challenge what they considered weak patents.

In an echo of complaints about abuse of the legal system and differential costs of filing lawsuits compared to defending oneself, consider Minnesota Senator William Windom’s comments: “[B]y the authority of the United States you may go to the capital of a State and for a claim of $5 each you may send the United States marshal to a thousand men, or ten thousand…and compel them to travel hundreds of miles to defend against your claim, or, as more frequently occurs, to pay an unjust demand as the cheapest way of meeting it.” Precisely the same complaint applies to modern patent battles.

A question of great relevance to our modern patent litigation debate therefore is immediate: Why did these scattershot individual lawsuits eventually fade away in the late 1800s? Beauchamp is equivocal here, but notes that judicial hostility toward the approach may have decreased win rates, and hence the incentive to file against small, weak defendants. Further, the rise of the modern corporation (see Alfred Chandler’s Scale and Scope) in the late 19th century reduced the need to sublicense inventions to local litigating attorneys rather than just suing large infringing manufacturers directly.

Of course, not everything historic is a mirror of the present. A major source of patent litigation in the mid-1800s involved patent reissues. Essentially, a patent would be granted with weak scope. An industry would rise up using related non-infringing technology. A sophisticated corporation would buy the initial patent, then file for a “reissue” which expanded the scope of the patent to cover many technologies then in use. Just as “submarine” patents, held secretly in application while an industry grows, have been a major problem in recent years, patent reissues led to frequent 19th century complaints, until changes in jurisprudence in the late 1800s led to greatly decreased deference to the reissued patent.

What does this history tell us about modern innovation policy? As Beauchamp discusses, “[t]o a modern observer, the content of the earlier legal and regulatory reactions can seem strikingly familiar. Many of the measures now proposed or attempted as solutions for the ills of modern patent litigation were proposed or attempted in the nineteenth century as well.” To the extent we are worried about how to stop “patent trolls” from enforcing weak patents against unsophisticated end-users, we ought to look at how our 19th century forebears handled the weak barbed wire and well patents filed against small-town farmers. With the (often economically-illiterate) rise of the “Hipster Antitrust” ideas of Lina Khan and her compatriots, will the intersection of patent and antitrust law move from today’s “technocratic air” – Beauchamp’s phrase – to the more political battleground of the 19th century? And indeed, for patent skeptics like myself, how are we to reconcile the litigious era of patenting of 1850-1880 with the undisputed fact that this period was dead in the heart of the Second Industrial Revolution, the incredible rise of electricity and modern chemical inventions that made the modern world?

Full article is in Yale Law Journal, Feb. 2016.

The 2018 John Bates Clark: Parag Pathak

The most prestigious award a young researcher can receive in economics, the John Bates Clark medal, has been awarded to the incredibly productive Parag Pathak. His CV is one that could only belong to a true rocket of a career: he was a full professor at MIT 11 years after he started his PhD, finishing the degree in four years before going the Paul Samuelson route through the Harvard Society of Fellows to become the top young theorist in Cambridge.

Pathak is best known for his work on the nature and effects of how students are allocated to schools. This is, of course, an area where theorists have had incredible influence on public policy, notably via Pathak’s PhD Advisor, the Nobel prize winner Al Roth, the group of Turkish researchers including Atila Abdulkadiroglu, Utku Unver, and Tayfun Sonmez, as well as the work of 2015 Clark medal winner Roland Fryer. Indeed, this group’s work on how to best allocate students to schools in an incentive-compatible way – that is, in a way where parents need only truthfully state which schools they like best – was adopted by the city of Boston, to my knowledge the first time this theoretically-optimal mechanism was used by an actual school district. As someone born in Boston’s contentious Dorchester neighborhood, it is quite striking how much more successful this reform was than the busing policies of the 1970s which led to incredible amounts of bigoted pushback.

Consider the old “Boston mechanism”. Parents list their preferred schools in order. Everyone would be allocated their first choice if possible. If a school is oversubscribed, some random percentage get their second choice, and if still oversubscribed, their third, and so on. This mechanism gives clear reason for strategic manipulation: you certainly don’t want to list a very popular school as your second choice, since there is almost no chance that it won’t fill up with first choices. The Boston mechanism was replaced by the Gale-Shapley mechanism, which has the property that it is never optimal for a parent to misrepresent preferences. In theory, not only is this more efficient but it is also fairer: parents who do not have access to sophisticated neighbors helping them game the system are, you might reason, most likely to lose out. And this is precisely what Pathak and Sonmez show in a theoretical model: sophisticated parents may prefer the old Boston mechanism because it makes them better off at the expense of the less sophisticated! The latter concern is a tough one for traditional mechanism design to handle, as we generally assume that agents act in their self-interest, including taking advantage of the potential for strategic manipulation. There remains some debate about what it means for a mechanism to be “better” when some agents are unsophisticated or when they do not have strict preferences over all options.
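
Both mechanisms are easy to code up. In the toy example below – with preferences and priorities I invented purely to make the point – a student gains under the Boston mechanism by not ranking a popular school first, while under student-proposing deferred acceptance truthful reporting is safe:

```python
# Toy school choice example with invented preferences and priorities. Under the Boston
# ("immediate acceptance") mechanism, student s1 does better by NOT ranking the popular
# school A first; under student-proposing deferred acceptance, truth-telling is safe.

def boston(prefs, priority, capacity):
    """Immediate acceptance: seats filled in an earlier round are gone for good."""
    assignment, seats = {}, dict(capacity)
    rounds = max(len(p) for p in prefs.values())
    for k in range(rounds):
        applicants = {}
        for student, ranking in prefs.items():
            if student not in assignment and k < len(ranking):
                applicants.setdefault(ranking[k], []).append(student)
        for school, apps in applicants.items():
            apps.sort(key=lambda s: priority[school].index(s))
            admitted = apps[:seats[school]]
            for s in admitted:
                assignment[s] = school
            seats[school] -= len(admitted)
    return assignment

def deferred_acceptance(prefs, priority, capacity):
    """Student-proposing Gale-Shapley: schools hold applicants only tentatively."""
    next_choice = {s: 0 for s in prefs}
    held = {school: [] for school in capacity}
    unmatched = set(prefs)
    while unmatched:
        s = unmatched.pop()
        if next_choice[s] >= len(prefs[s]):
            continue                              # student has run out of schools to try
        school = prefs[s][next_choice[s]]
        next_choice[s] += 1
        held[school].append(s)
        held[school].sort(key=lambda x: priority[school].index(x))
        if len(held[school]) > capacity[school]:
            unmatched.add(held[school].pop())     # lowest-priority applicant is bumped
    return {s: school for school, studs in held.items() for s in studs}

priority = {"A": ["s2", "s1", "s3"], "B": ["s1", "s2", "s3"], "C": ["s1", "s2", "s3"]}
capacity = {"A": 1, "B": 1, "C": 1}
truthful = {"s1": ["A", "B", "C"], "s2": ["A", "B", "C"], "s3": ["B", "A", "C"]}
s1_lies = {**truthful, "s1": ["B", "A", "C"]}

print("Boston, all truthful:  ", boston(truthful, priority, capacity))    # s1 ends up at C
print("Boston, s1 misreports: ", boston(s1_lies, priority, capacity))     # s1 grabs B instead
print("Deferred acceptance:   ", deferred_acceptance(truthful, priority, capacity))
```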

Competition for students may also affect the quality of underlying schools, either because charter schools and for-profits compete on a profit-maximizing basis, or because public schools somehow respond to the incentive to get good students. Where Pathak’s work is particularly rigorous is that he notes how critical both the competitive environment and the exact mechanism for assigning students are for the responsiveness of schools. It is not that “charters are good” or “charters are bad” or “test schools produce X outcome”, but rather the conditionality of these statements on how the choice and assignment mechanism works. A purely empirical study found charters in Boston performed much better for lottery-selected students than public schools, and that attending an elite test school in Boston or New York doesn’t really affect student outcomes. Parents appear unable to evaluate school quality except in terms of peer effects, which can be particularly problematic when good peers enroll in otherwise bad schools.

Pathak’s methodological approach is refreshing. He has a theorist’s toolkit and an empiricist’s interest in policy. For instance, imagine we want to know how well charter schools perform. The obvious worry is that charter students are better than the average student. Many studies take advantage of charter lotteries, where oversubscribed schools assign students from the application class by lottery. This does not identify the full effect of charters, however: whether I enter a lottery at all depends on how I ranked schools, and hence participants in a lottery for School A versus School B are not identical. Pathak, Angrist and Walters show how to combine the data from specific school choice mechanisms with the lottery in a way that ensures we are comparing like-to-like when evaluating lotteries at charters versus non-charters. In particular, they find that Denver’s charter schools do in fact perform better.

Indeed, reading through Pathak’s papers this afternoon, I find myself struck by how empirical his approach has become over time: if a given question requires the full arsenal of structural choice modeling, he deploys it, and if it is unnecessary, he estimates things in a completely reduced form way. Going beyond the reduced form often produces striking results. For instance, what would be lost if affirmative action based on race were banned in school choice? Chicago’s tier-based plan, where many seats at test schools are reserved for students from low-SES tiers, works dreadfully: not only does it not actually select low-SES students (high-SES students in low-income neighborhoods are selected), but it would require massively dropping entrance score criteria to get a pre-established number of black and Hispanic students to attend. This is particularly true for test schools on the north side of the city. Answering the question of what racial representation looks like in a counterfactual world where Chicago doesn’t have to use SES-based criteria to indirectly choose students to get a given racial makeup, and the question of whether the problem is Chicago’s particular mechanism or whether it is fundamental to any location-based selection mechanism, requires theory, and Pathak deploys it wonderfully. Peng Shi and Pathak also do back-testing on their theory-based discrete-choice predictions of the impact of a change in Boston’s mechanism meant to reduce travel times, showing that to the extent the model missed, it was because the student characteristics were unexpected, not because there were no underlying structural preferences. If we are going to deploy serious theoretical methods to applied questions, rather than just to thought experiments, this type of rigorous combination of theory, design-based empirics, and back-testing is essential.

In addition to his education research, Pathak has also contributed, alongside Fuhito Kojima, to the literature on large matching markets. The basic idea is the following. In two-sided matching, where both sides have preferences as in a marriage market, there is no stable matching mechanism under which both sides want to report truthfully. For example, under the mechanism currently in place in Boston where students rank schools, schools themselves have an incentive to manipulate the outcome by changing how many slots they offer. Pathak and Kojima show that when the market is large, it is (in a particular sense) an equilibrium for both sides to act truthfully: roughly, even if I can screw one student I don’t want out of a slot, in a thick market it is unlikely that I wind up with a more-preferred student to replace them. There has more recently been a growing literature on what really matters in matching markets: the stability of the mechanism, the thickness of the market, the timing, and so on.
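
For concreteness, the canonical stable mechanism in this literature is student-proposing deferred acceptance (Gale-Shapley). Here is a minimal sketch, with made-up preferences and capacities, of how the algorithm iterates proposals and tentative rejections until it settles on a stable assignment:

```python
# A compact sketch of student-proposing deferred acceptance (Gale-Shapley);
# the preferences and capacities below are purely hypothetical.

def deferred_acceptance(student_prefs, school_prefs, capacity):
    """student_prefs: {student: [schools, best first]};
       school_prefs: {school: [students, best first]}; capacity: {school: seats}."""
    rank = {s: {st: i for i, st in enumerate(prefs)} for s, prefs in school_prefs.items()}
    next_choice = {st: 0 for st in student_prefs}       # next school to propose to
    held = {s: [] for s in school_prefs}                # tentatively held students
    free = list(student_prefs)
    while free:
        st = free.pop()
        if next_choice[st] >= len(student_prefs[st]):   # exhausted their list
            continue
        school = student_prefs[st][next_choice[st]]
        next_choice[st] += 1
        held[school].append(st)
        held[school].sort(key=lambda x: rank[school][x])   # keep the school's favorites
        if len(held[school]) > capacity[school]:
            free.append(held[school].pop())                # reject the least preferred
    return held

students = {"ann": ["east", "west"], "bob": ["east", "west"], "cam": ["west", "east"]}
schools = {"east": ["bob", "ann", "cam"], "west": ["ann", "cam", "bob"]}
print(deferred_acceptance(students, schools, {"east": 1, "west": 2}))
# {'east': ['bob'], 'west': ['ann', 'cam']}
```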

This award strikes me as the last remaining award, at least in the near term, from the matching/market design boom of the past 20 years. As Becker took economics out of pure market transactions and into a wider world of rational choice under constraints, the work of Al Roth and his descendants, including Parag Pathak, has greatly expanded our ability to take advantage of choice and local knowledge in situations like education and health where, for many reasons, we do not use the price mechanism. That said, there remains quite a bit to do on understanding how to get the benefits of decentralization without price – I am deeply interested in this question when it comes to innovation policy – and I don’t doubt that two decades from now, continued inquiry along these lines will have fruitfully exploited the methods and careful technique that Parag Pathak embodies.

One final note, and this in no way takes away from how deserving Pathak and other recent winners have been. Yet: I would be remiss if I didn’t point out, again, how unusually “micro” the Clark medal has been of late. There literally has not been a winner who is a pure macroeconomist or econometrician – two of the three “core” fields of economics – since 1999, and only Donaldson and Acemoglu are even arguably close. Though the last three winners have had at least a partly theoretical bent, the prize is still not reflecting our field as a whole. Nothing for Emi Nakamura, or Victor Chernozhukov, or Emmanuel Farhi, or Ivan Werning, or Amir Sufi, or Chad Syverson, or Marc Melitz? These folks are incredibly influential, and the Clark medal is failing to reflect the totality of what economists actually do.

The 2017 Nobel: Richard Thaler

A true surprise this morning: the behavioral economist Richard Thaler from the University of Chicago has won the Nobel Prize in economics. It is not a surprise because it is undeserved; rather, it is a surprise because only four years ago, Thaler’s natural co-laureate Bob Shiller won while Thaler was left the bridesmaid. But Thaler’s influence on the profession, and the world, is unquestionable. There are few developed-country governments that do not have a “nudge” unit of some sort trying to take advantage of behavioral insights to push people a touch in one way or another, including here in Ontario via my colleagues at BEAR. I will admit, perhaps under the undue influence of too many dead economists, that I am skeptical of nudging and behavioral finance on both positive and normative grounds, so this review will be one of friendly challenge rather than hagiography. I trust that there will be no shortage of wonderful positive reflections on Thaler’s contribution to policy, particularly because he is the rare economist whose work is totally accessible to laymen and, more importantly, journalists.

Much of my skepticism is similar to how Fama thinks about behavioral finance: “I’ve always said they are very good at describing how individual behavior departs from rationality. That branch of it has been incredibly useful. It’s the leap from there to what it implies about market pricing where the claims are not so well-documented in terms of empirical evidence.” In other words, surely most people are not that informed and not that rational much of the time, but repeated experience, market selection, and other aggregative factors mean that this irrationality may not matter much for the economy at large. It is very easy to claim that since economists model “agents” as “rational”, we would, for example, “not expect a gift on the day of the year in which she happened to get married, or be born” and indeed “would be perplexed by the idea of gifts at all” (Thaler 2015). This type of economist caricature is both widespread and absurd, I’m afraid. In order to understand the value of Thaler’s work, we ought first look at situations where behavioral factors matter in real world, equilibrium decisions of consequence, then figure out how common those situations are, and why.

The canonical example of Thaler’s useful behavioral nudges is his “Save More Tomorrow” pension plan, with Benartzi. Many individuals in defined contribution plans save too little, both because they are not good at calculating how much they need to save and because they are biased toward present consumption. You can, of course, force people to save a la Singapore, but we dislike these plans because individuals vary in their need and desire for saving, and because we find the reliance on government coercion to save heavy-handed. Alternatively, you can default defined-contribution plans to involve some savings rate, but it turns out people do not vary their behavior from the default throughout their career, and hence save too little solely because they didn’t want too much removed from their first paycheck. Thaler and Benartzi have companies offer plans where you agree now to having your savings rate increased when you get raises – for instance, if your salary goes up 2%, you will have half of that set into a savings plan tomorrow, until you reach a savings rate that is sufficiently high. In this way, no one takes a nominal post-savings paycut. People can, of course, leave this plan whenever they want. In their field experiments, savings rates did in fact soar (with takeup varying hugely depending on how information about the plan was presented), and attrition in the future from the plan was low.
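
A toy version of the escalator logic, with hypothetical parameters (the actual plan terms varied by employer), shows the point in a few lines: the savings rate ratchets up at each raise while nominal take-home pay never falls.

```python
# A toy Save More Tomorrow escalator (all parameters hypothetical).
salary, savings_rate = 50_000.0, 0.03
raise_pct, share_of_raise_saved, rate_cap = 0.02, 0.5, 0.10

for year in range(1, 9):
    take_home_before = salary * (1 - savings_rate)
    salary *= 1 + raise_pct                      # annual raise
    # Divert part of each raise into the savings rate, up to a pre-agreed cap.
    savings_rate = min(rate_cap, savings_rate + share_of_raise_saved * raise_pct)
    take_home_after = salary * (1 - savings_rate)
    assert take_home_after >= take_home_before   # no nominal post-savings pay cut
    print(f"year {year}: savings rate {savings_rate:.1%}, take-home ${take_home_after:,.0f}")
```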

This policy is what Thaler and Sunstein call “libertarian paternalism”. It is paternalistic because, yes, we think that you may make bad decisions from your own perspective because you are not that bright, or because you are lazy, or because you have many things which require your attention. It is libertarian because there is no compulsion, in that anyone can opt out at their leisure. Results similar to Thaler and Benartzi’s have been found by Ashraf et al in a field experiment in the Philippines, and by Karlan et al in three countries, where simply sending reminder messages that make savings goals more salient modestly increases savings.

So far, so good. We have three issues to unpack, however. First, when is this nudge acceptable on ethical grounds? Second, why does nudging generate such large effects here, and if the effects are large, why doesn’t the market simply provide them? Third, is the 401k savings case idiosyncratic or representative? The idea that homo economicus, the rational calculator, misses important features of human behavior and could do with some insights from psychology is not new, of course. Thaler’s prize is, at minimum, the fifth Nobel to go to someone pushing this general idea, since Herb Simon, Maurice Allais, Daniel Kahneman, and the aforementioned Bob Shiller have all already won. Copious empirical evidence, and indeed simple human observation, implies that people have behavioral biases, that they are not perfectly rational – as Thaler has noted, we see what looks like irrationality even in the composition of $100 million baseball rosters. The more militant behavioralists insist that ignoring these psychological factors is unscientific! And yet, and yet: the vast majority of economists, all of whom are by now familiar with these illustrious laureates and their work, still use fairly standard expected-utility-maximizing agents in nearly all of our papers. Unpacking the three issues above will clarify how that could possibly be so.

Let’s discuss ethics first. Simply arguing that organizations “must” make a choice (as Thaler and Sunstein do) is insufficient; we would not say a firm that defaults consumers into an autorenewal for a product they rarely renew when making an active choice is acting “neutrally”. Nudges can be used for “good” or “evil”. Worse, whether a nudge is good or evil depends on the planner’s evaluation of the agent’s “inner rational self”, as Infante and Sugden, among others, have noted many times. That is, claiming paternalism is “only a nudge” does not excuse the paternalist from the usual moral philosophic critiques! Indeed, as Chetty and friends have argued, the more you believe behavioral biases exist and are “nudgeable”, the more careful you need to be as a policymaker about inadvertently reducing welfare. There is, I think, less controversy when we use nudges rather than coercion to reach some policy goal. For instance, if a policymaker wants to reduce energy usage, and is worried about distortionary taxation, nudges may (depending on how you think about social welfare with non-rational preferences!) be a better way to achieve the desired outcomes. But this goal is very different from the common justification that nudges somehow are pushing people toward policies they actually like in their heart of hearts. Carroll et al have a very nice theoretical paper trying to untangle exactly what “better” means for behavioral agents, and exactly when the imprecision of nudges or defaults, given our imperfect knowledge of individuals’ heterogeneous preferences, makes attempts at libertarian paternalism worse than laissez faire.

What of the practical effects of nudges? How can they be so large, and in what contexts? Thaler has very convincingly shown that behavioral biases can affect real world behavior, and that understanding those biases means two policies which are identical from the perspective of a homo economicus model can have very different effects. But many economic situations involve players doing things repeatedly with feedback – where heuristics well approximated by rationality evolve – or involve players who “perform poorly” being selected out of the game. For example, I can think of many simple nudges to get you or me to play better basketball. But when it comes to Michael Jordan, the first-order effects are surely how well he takes care of his health, the teammates he has around him, and so on. I can think of many heuristics useful for understanding how simple physics will operate, but I don’t think I can find many that would improve Einstein’s understanding of how the world works. The 401k situation is unusual because it is a decision with limited short-run feedback, taken by unsophisticated agents who will learn little even with experience. The natural alternative, of course, is to have agents outsource the difficult parts of the decision, to investment managers or the like. And these managers will make money by improving people’s earnings. No surprise that robo-advisors, index funds, and personal banking have all become more important as defined contribution plans have become more common! If we worry about behavioral biases, we ought worry especially about market imperfections that prevent the existence of designated agents who handle the difficult decisions for us.

The fact that agents can exist is one reason that irrationality in the lab may not translate into irrationality in the market. But even without agents, we might reasonably be suspicious of some claims of widespread irrationality. Consider Thaler’s famous endowment effect: how much you are willing to pay for, say, a coffee mug or a pen is much less than how much you would accept to have the coffee mug taken away from you. Indeed, it is not unusual in a study to find a ratio of three times or greater between the willingness-to-accept and willingness-to-pay amounts. But, of course, if these were “preferences”, you could be money pumped (see Yaari, applying a theorem of de Finetti, on the mathematics of the pump). Say you value the mug at ten bucks when you own it and five bucks when you don’t. Do we really think I can regularly get you to pay twice as much by loaning you the mug for free for a month? Do we see car companies letting you take a month-long test drive of a $20,000 car and then letting you keep the car only if you pay $40,000, with some consumers accepting? Surely not. Now the reason why is partly what Laibson and Yariv argue, that money pumps do not exist in competitive economies since market pressure will compete away rents: someone else will offer you the car at $20,000 and you will just buy from them. But even if the car company were a monopolist, surely we find the magnitude of the money pump implied here to be on its face ridiculous.
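
To put numbers on the car example, purely as a hypothetical back-of-the-envelope calculation: if the two-to-one WTA/WTP gap from the lab were a stable preference, a free loan alone would let the seller roughly double the price a consumer accepts.

```python
# Hypothetical numbers only: taking a 2:1 WTA/WTP gap literally as a preference.
wtp_before_loan = 20_000      # what you'd pay for the car before "owning" it
wta_ratio = 2.0               # lab-style willingness-to-accept / willingness-to-pay
price_once_endowed = wtp_before_loan * wta_ratio

print(f"price the endowed consumer would accept: ${price_once_endowed:,.0f}")
print(f"revenue manufactured by the free loan:   ${price_once_endowed - wtp_before_loan:,.0f}")
```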

Even worse are the dictator games introduced in Thaler’s 1986 fairness paper. Students were asked, upon being given $20, whether they wanted to give an anonymous student half of their endowment or 10%. Many of the students gave half! This experiment has been repeated many, many times, with similar effects. Does this mean economists are naive to neglect the social preferences of humans? Of course not! People are endowed with money and gifts all the time. They essentially never give any of it to random strangers – I feel confident assuming you, the reader, have never been handed some bills on the sidewalk by an office worker who just got a big bonus! Worse, the context of the experiment matters a ton (see John List on this point). Indeed, despite hundreds of lab experiments on dictator games, I feel far more confident predicting real world behavior following windfalls if we use a parsimonious homo economicus model than if we use the results of dictator games. Does this mean the games are useless? Of course not – studying what factors affect other-regarding preferences is interesting, and important. But how odd to have a branch of our field filled with people who see armchair theorizing about homo economicus as “unscientific”, yet take lab experiments so literally even when they are so clearly contrary to data!

To take one final example, consider Thaler’s famous model of “mental accounting”. In many experiments, he shows people have “budgets” set aside for various tasks. I have my “gas budget” and adjust my driving when gas prices change. I only sell stocks when I am up overall on that stock since I want my “mental account” of that particular transaction to be positive. But how important is this in the aggregate? Take the Engel curve. Budget shares devoted to food fall with income. This is widely established historically and in the cross section. Where is the mental account? Farber (2008 AER) even challenges the canonical account of taxi drivers working just enough hours to make their targeted income. As in the dictator game and the endowment effect, there is a gap between what is real, psychologically, and what is consequential enough to be first-order in our economic understanding of the world.

Let’s sum up. Thaler’s work is brilliant – it is a rare case of an economist taking psychology seriously and actually coming up with policy-relevant consequences like the 401k policy. But Thaler’s work is also dangerous to young economists who see biases everywhere. Experts in a field, and markets with agents and mechanisms and all the other tricks they develop, are very, very good at ferreting out irrationality, and economists’ core skill lies in not missing those tricks.

Some remaining bagatelles: 1) Thaler and his PhD advisor, Sherwin Rosen, have one of the first papers on measuring the “statistical” value of a life, a technique now widely employed in health economics and policy. 2) Beyond his academic work, Thaler has won a modicum of fame as a popular writer (Nudge, written with Cass Sunstein, is canonical here) and for his brief turn as an actor alongside Selena Gomez in “The Big Short”. 3) Dick has a large literature on “fairness” in pricing, a topic which goes back to Thomas Aquinas, if not earlier. Many of the experiments Thaler performs, like the thought experiments of Aquinas, come down to the fact that many perceive market power to be unfair. Sure, I agree, but I’m not sure there’s much more that can be learned than this uncontroversial fact. 4) Law and econ has been massively influenced by Thaler. As a simple example, if endowment effects are real, then the assignment of property rights matters even when there are no transaction costs. Jolls et al 1998 go into more depth on this issue. 5) Thaler’s precise results in so-called behavioral finance are beyond my area of expertise, so I defer to John Cochrane’s comments following the 2013 Nobel. Eugene Fama is, I think, correct when he suggests that market efficiency generated by rational traders with risk aversion is the best model we have of financial behavior, where best is measured by “is this model useful for explaining the world.” The number of behavioral anomalies at the level of the market which persist and are relevant in the aggregate does not strike me as large, while the number of investors and policymakers who make dreadful decisions because they believe markets are driven by behavioral sentiments is large indeed!

“Resetting the Urban Network,” G. Michaels & F. Rauch (2017)

Cities have two important properties: they are enormously consequential for people’s economic prosperity, and they are very sticky. That stickiness is twofold: cities do not change their shape rapidly in response to changing economic or technological opportunities (consider, e.g., Hornbeck and Keniston on the positive effects of the Great Fire of Boston), and people are hesitant to leave their existing non-economic social networks (Deryugina et al show that Katrina victims, a third of whom never returned to New Orleans, are materially better off as soon as three years after the hurricane, earning more and living in less expensive cities; Shoag and Carollo find that Japanese-Americans randomly placed in internment camps in poor areas during World War 2 saw lower incomes and worse educational outcomes for their children even many years later).

A lot of recent work in urban economics suggests that the stickiness of cities is getting worse, locking path dependent effects in with even more vigor. A tour-de-force by Shoag and Ganong documents that income convergence across cities in the US has slowed since the 1970s, that this only happened in cities with restrictive zoning rules, and that the primary effect has been that as land use restrictions make housing prices elastic to income, working class folks no longer move from poor to rich cities because the cost of housing makes such a move undesirable. Indeed, they suggest a substantial part of growing income inequality, in line with work by Matt Rognlie and others, is due to the fact that owners of land have used political means to capitalize productivity gains into their existing, tax-advantaged asset.

Now, one part of urban stickiness over time may simply be reflecting that certain locations are very productive, that they have a large and valuable installed base of tangible and intangible assets that make their city run well, and hence we shouldn’t be surprised to see cities retain their prominence and nature over time. So today, let’s discuss a new paper by Michaels and Rauch which uses a fantastic historical case to investigate this debate: the rise and fall of the Roman Empire.

The Romans famously conquered Gaul – today’s France – under Caesar, and Britain in stages up through Hadrian (and yes, Mary Beard’s SPQR is worthwhile summer reading; the fact that she and Nassim Taleb do not get along makes it even more self-recommending!). Roman cities popped up across these regions, until the 5th-century invasions wiped out Roman control. In Britain, for all practical purposes the entire economic network faded away: cities hollowed out, trade came to a stop, and imports from outside Britain and Roman coinage are nearly nonexistent in the archaeological record for the next century and a half. In France, the network was not so cleanly broken, with Christian bishoprics rising in many of the old Roman towns.

Here is the amazing fact: today, 16 of France’s 20 largest cities are located on or near a Roman town, while only 2 of Britain’s 20 largest are. This difference existed even back in the Middle Ages. So who cares? Well, Britain’s cities in the middle ages are two and a half times more likely to have coastal access than France’s cities, so that in 1700, when sea trade was hugely important, 56% of urban French lived in towns with sea access while 87% of urban Brits did. This is even though, in both countries, cities with sea access grew faster and huge sums of money were put into building artificial canals. Even at a very local level, the France/Britain distinction holds: when Roman cities were within 25km of the ocean or a navigable river, they tended not to move in France, while in Britain they tended to reappear nearer to the water. The fundamental factor for the shift in both places was that developments in shipbuilding in the early middle ages made the sea much more suitable for trade and military transport than the famous Roman Roads which previously played that role.

Now the question, of course, is what drove the path dependence: why didn’t the French simply move to better locations? We know, as in Ganong and Shoag’s paper above, that in the absence of legal restrictions, people move toward more productive places. Indeed, there is a lot of hostility to the idea of path dependence more generally. Consider, for example, the case of the typewriter, which “famously” has its QWERTY form because of an idiosyncrasy in the very early days of the typewriter. QWERTY is said to be much less efficient than alternative key layouts like Dvorak. Liebowitz and Margolis put this myth to bed: not only is QWERTY fairly efficient (you can think much faster than you can type for any reasonable key layout), but typewriter companies spent huge amounts of money on training schools and other mechanisms to get secretaries to switch toward the companies’ preferred keyboards. That is, while it can be true that what happened in the past matters, it is also true that there are many ways to coordinate people to shift to a more efficient path if a suitably large productivity improvement exists.

With cities, coordinating on the new productive location is harder. In France, Michaels and Rauch suggest that bishops and the church began playing the role of a provider of public goods, and that the continued provision of public goods in certain formerly-Roman cities led them to grow faster than they otherwise would have. Indeed, Roman cities in France with no bishop show a very similar pattern to Roman cities in Britain: general decline. That sunk costs and non-economic institutional persistence can lead to multiple steady states in urban geography, some of which are strictly worse, has been suggested in smaller scale studies (e.g., Redding et al RESTAT 2011 on Germany’s shift from Berlin to Frankfurt, or the historical work of Engerman and Sokoloff).

I loved this case study, and appreciate the deep dive into history that collecting data on urban locations over this period required. But the implications of this literature broadly are very worrying. Much of the developed world has, over the past forty years, pursued development policies that are very favorable to existing landowners. This has led to stickiness which makes path dependence more important, and reallocation toward more productive uses less likely, both because cities cannot shift their geographic nature and because people can’t move to cities that become more productive. We ought not artificially wind up like Dijon and Chartres in the middle ages, locking our population into locations better suited for the economy of the distant past.

2016 working paper (RePEc IDEAS). Article is forthcoming in Economic Journal. With incredible timing, Michaels and Rauch, alongside two other coauthors, have another working paper called Flooded Cities. Essentially, looking across the globe, there are frequent very damaging floods, occurring every 20 years or so in low-lying areas of cities. And yet, as long as those areas are long settled, people and economic activity simply return to those areas after a flood. Note this is true even in countries without US-style flood insurance programs. The implication is that the stickiness of urban networks, amenities, and so on tends to be very strong, and if anything encouraged by development agencies and governments, yet this stickiness means that we wind up with many urban neighborhoods, and many cities, located in places that are quite dangerous for their residents without any countervailing economic benefit. You will see their paper in action over the next few years: despite some neighborhoods flooding three times in three years, one can bet with confidence that population and economic activity will remain on the floodplains of Houston’s bayou. (And in the meanwhile, ignoring our worries about future economic efficiency, I wish only the best for a safe and quick recovery to friends and colleagues down in Houston!)

Two New Papers on Militarized Police

The so-called militarization of police has become a major issue both in libertarian policy circles and in the civil rights community. Radley Balko has done yeoman’s work showing the harms, including outrageous civil liberty violations, generated by the use of military-grade armor and weapons, the rise of the SWAT team, and the intimidating clothing preferred by many modern police. The photos of tanks on the streets of Ferguson were particularly galling. I am a literal card-carrying member of the ACLU, so you can imagine my own opinion about this trend.

That said, the new issue of AEJ: Policy has two side-by-side papers – one from a group at the University of Tennessee, and one by researchers at Warwick and NHH – that give quite shocking evidence about the effects of militarized police. They both use the “1033 Program”, under which surplus military equipment is transferred to police departments, to investigate how military equipment affects crime, citizen complaints, violence by officers, and violence against police. Essentially, when the military has a surplus, such as when it changed a standard gun in 2006, the decommissioned supplies are sent to centers located across the country, which then distribute them to police departments within a few weeks. The application forms are short and straightforward, and the process is not terribly competitive. About 30 percent of the distributions are things like vests, clothing, and first aid kits, while the rest is more tactical: guns, drones, vehicles, and so on.

Causal identification is, of course, a worry here: places that ask for military equipment are obviously unusual. The two papers use rather different identification strategies. The Tennessee paper uses the distance to a distribution center as an instrument, since the military wants to reduce the cost of decommissioning and hence prefers closer departments. The first stage therefore predicts whether a sheriff’s department gets new military items from the total materiel decommissioned interacted with the department’s distance to decommissioning centers. The Warwick-NHH paper uses the fact that some locations apply frequently for items, and others only infrequently. When military spending is high, there is a lot more excess to decommission. Therefore, an instrument combining overall military spending with previous local requests for “1033” items can serve as a first stage for predicted surplus items received.
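
For readers who want to see the mechanics, here is a minimal two-stage least squares sketch on simulated data – in the spirit of these designs, not a replication, and with all numbers invented – where distance to a decommissioning center shifts the equipment a department receives but is assumed to affect crime only through that channel:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
distance = rng.uniform(0, 500, n)        # miles to the nearest decommissioning center
unobserved = rng.normal(0, 1, n)         # local conditions the econometrician can't see

# Equipment received is endogenous: it depends on distance (the instrument) and
# on the same unobserved local conditions that drive crime.
equipment = 2.0 - 0.003 * distance + 0.8 * unobserved + rng.normal(0, 1, n)
crime = 10.0 - 0.5 * equipment + 1.5 * unobserved + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), equipment])
Z = np.column_stack([np.ones(n), distance])

ols = np.linalg.lstsq(X, crime, rcond=None)[0]
# 2SLS: regress equipment on the instrument, then crime on predicted equipment.
first_stage = np.linalg.lstsq(Z, equipment, rcond=None)[0]
X_hat = np.column_stack([np.ones(n), Z @ first_stage])
iv = np.linalg.lstsq(X_hat, crime, rcond=None)[0]

print(f"OLS:  {ols[1]:+.2f}  (wrong sign: troubled places request and get more gear)")
print(f"2SLS: {iv[1]:+.2f}  (close to the true effect of -0.50)")
```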

Despite the different local margins these two instruments imply, the findings in both papers are nearly identical. In places that get more military equipment, crime falls, particularly for crime that is easy to deter like carjacking or low-level drug crime. Citizen complaints, if anything, go down. Violence against police falls. And there is no increase in officer-caused deaths. In terms of magnitudes, the fall in crime is substantial given the cost: the Warwick-NHH paper finds the value of reduced crime, using standard metrics, is roughly 20 times the cost of the military equipment. Interestingly, places that get this equipment also hire fewer cops, suggesting some sort of substitutability between labor and capital in policing. The one negative finding, in the Tennessee paper, is that arrests for petty crimes appear to rise in a minor way.

Both papers are very clear that these results don’t mean we should militarize all police departments, and both are clear that in places with poor community-police relations, militarization can surely inflame things further. But the pure empirical estimates, that militarization reduces crime without any objectively measured cost in terms of civic unhappiness, are quite mind-blowing in terms of changing my own priors. It is similar to the Doleac-Hansen result that “Ban the Box” leads to worse outcomes for black folks, for reasons that make perfect game theoretic sense; I couldn’t have imagined Ban the Box was a bad policy, but the evidence these serious researchers present is too compelling to ignore.

So how are we to square these results with the well-known problems of police violence, and poor police-citizen relations, in the United States? Consider Roland Fryer’s recent paper on police violence and race, where essentially the big predictor of police violence is interacting with police, not individual characteristics. A unique feature of the US compared to other developed countries is that there really is more violent crime; hence police are rationally more worried about it, and hence people who interact with police are more worried about violence from police. Policies that reduce the extent to which police and civilians interact in potentially dangerous settings reduce this cycle. You might argue – I certainly would – that policing is no more dangerous than, say, professional ocean fishing or taxicab driving, and you wouldn’t be wrong. But as long as the perception of a possibility of violence remains, things like military-grade vests or vehicles may help break the violence cycle. We shall see.

The two AEJ: Policy papers are “Policeman on the Frontline or a Soldier?” (V. Bove & E. Gavrilova) and “Peacekeeping Force: Effects of Providing Tactical Equipment to Local Law Enforcement” (M. C. Harris, J. S. Park, D. J. Bruce and M. N. Murray). I am glad to see that the former paper, particularly, cites heavily from the criminology literature. Economics has a reputation in the social sciences both for producing unbiased research (as these two papers, and the Fryer paper, demonstrate) and for refusing to acknowledge quality work done in the sister social sciences, so I am particularly glad to see the latter problem avoided in this case!

“The Development Effects of the Extractive Colonial Economy,” M. Dell & B. Olken (2017)

A good rule of thumb is that you will want to read any working paper Melissa Dell puts out. Her main interest is the long-run, path-dependent effect of historical institutions, with rigorous quantitative investigation of the subtle conditionality of the past. For instance, in her earlier work on Peru (Econometrica, 2010), mine slavery in the colonial era led to fewer hacienda-style plantations at the end of that era, which led to less political power (absent those large landholders) in the early democratic era, which led to fewer public goods throughout the 20th century, which led to less education and income today in areas that used to have mine slavery. One way to read this is that local inequality in the past may, through political institutions, be a good thing today! History is not as simple as “inequality in the past causes bad outcomes today” or “extractive institutions in the past cause bad outcomes today” or “colonial economic distortions cause bad outcomes today”. But, contra the branch of historians who don’t like to assign causality to any single factor in any given situation, we don’t need to entirely punt on the effects of specific policies in specific places if we apply careful statistical and theoretical analysis.

Dell’s new paper looks at the cultuurstelsel, a policy the Dutch imposed on Java in the mid-19th century. Essentially, the Netherlands was broke and Java was suitable for sugar, so the Dutch required villages in certain regions to use huge portions of their arable land, and labor effort, to produce sugar for export. They built roads and some rail, as well as sugar factories (now generally long gone), as part of this effort, and the land used for sugar production generally became public village land controlled at the behest of local leaders. This was back in the mid-1800s, so surely it shouldn’t affect anything of substance today?

But it did! Take a look at villages near the old sugar plantations, or that were forced to plant sugar, and you’ll find higher incomes, higher education levels, high school attendance rates even back in the late colonial era, higher population densities, and more workers today in retail and manufacturing. Dell and Olken did some wild data matching using a great database of geographic names collected by the US government to match the historic villages where these sugar plants, and these labor requirements, were located with modern village and town locations. They then constructed “placebo” factories – locations along coastal rivers in sugar growing regions with appropriate topography where a plant could have been located but wasn’t. In particular, as in the famous Salop circle, you won’t locate a factory too close to an existing one, but there are many counterfactual equilibria where we just shift all the factories one way or the other. By comparing the predicted effect of distance from the real factory on outcomes today with the predicted effect of distance from the huge number of hypothetical factories, you can isolate the historic local influence of the real factory from other local features which can’t be controlled for.
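
Here is a stylized sketch of that placebo logic – invented data and invented “suitable spots”, not the paper’s estimator – in which being near any plausible factory site is good for income (the confound), and the real factory’s extra footprint is recovered by comparing its distance gradient with the gradients around equally suitable spots that never got a factory:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
position = rng.uniform(0, 200, n)                  # villages strung along a river

# Hypothetical "suitable spots" (confluences, good topography): being near any
# of them is good for income regardless of factories -- that is the confound.
suitable = np.linspace(10, 190, 10)
nearest = np.min(np.abs(position[:, None] - suitable[None, :]), axis=1)
amenity = 0.6 * np.exp(-nearest / 5)

# Exactly one suitable spot actually received a factory, which adds its own boost.
factory = suitable[3]
factory_boost = 1.0 * np.exp(-np.abs(position - factory) / 5)
income = amenity + factory_boost + rng.normal(0, 0.3, n)

def local_gradient(site, window=10.0):
    """Slope of income in distance to `site`, among villages within `window` km."""
    d = np.abs(position - site)
    keep = d < window
    X = np.column_stack([np.ones(keep.sum()), d[keep]])
    return np.linalg.lstsq(X, income[keep], rcond=None)[0][1]

real = local_gradient(factory)
placebo = np.array([local_gradient(s) for s in suitable if s != factory])
print(f"income gradient near the real factory:     {real:.3f}")
print(f"mean gradient near equally suitable spots: {placebo.mean():.3f}")
print(f"implied factory footprint (difference):    {real - placebo.mean():.3f}")
```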

Consumption right next to old, long-destroyed factories is 14% higher than even five kilometers away, education is 1.25 years longer on average, electrification, road, and rail density are all substantially higher, and industrial production upstream and downstream from sugar (e.g., farm machinery upstream, and processed foods downstream) are also much more likely to be located in villages with historic factories even if there is no sugar production anymore in that region!

It’s not just the factory and Dutch investments that matter, however. Consider the villages, up to 10 kilometers away, which were forced to grow the raw cane. Their elites took private land for this purpose, land which generally became public village land, and land inequality remains higher in villages that were forced to grow cane compared to villages right next door that were outside the Dutch-imposed boundary. But this public land permitted surplus extraction in an agricultural society which could be used for public goods, like schooling, which would later become important! These villages were much more likely to have schools, especially before the 1970s when public schooling in Indonesia was limited, and today are denser, richer, more educated, and less agricultural than villages nearby which weren’t forced to grow cane. This all has shades of the long debate on “forward linkages” in agricultural societies, where it is hypothesized that agricultural surplus benefits industrialization by providing the surplus necessary for education and capital to be purchased; see this nice paper by Sam Marden showing linkages of this sort in post-Mao China.

Are you surprised by these results? They fascinate me, honestly. Think through the logic: forced labor (in the surrounding villages) and extractive capital (rail and factories built solely to export a crop in little use domestically) both have positive long-run local effects! They do so by affecting institutions – whether villages have the ability to produce public goods like education – and by affecting incentives – the production of capital used up- and downstream. One can easily imagine cases where forced labor and extractive capital have negative long-run effects, and we have great papers by Daron Acemoglu, Nathan Nunn, Sara Lowes and others on precisely this point. But it is also very easy for societies to get trapped in bad path-dependent equilibria, for which outside interventions, even ethically shameful ones, can (perhaps inadvertently) cause useful shifts in incentives and institutions! I recall a visit to Babeldaob, the main island in Palau. During the Japanese colonial period, the island was heavily industrialized as part of Japan’s war machine. These factories were destroyed by the Allies in World War 2. Yet despite their extractive history, a local told me many on the island believe that the industrial development of the region was permanently harmed when those factories were damaged. It seems a bit crazy to mourn the loss of polluting, extractive plants whose whole purpose was to serve a colonial master, but the Palauan may have had some wisdom after all!

2017 Working Paper is here (no RePEc IDEAS version). For more on sugar and institutions, I highly recommend Christian Dippel, Avner Greif and Dan Trefler’s recent paper on Caribbean sugar. The price of sugar fell enormously in the late 19th century, yet wages on islands which lost the ability to productively export sugar rose. Why? Planters in places like Barbados had so much money from their sugar exports that they could manipulate local governance and the police, while planters in places like the Virgin Islands became too poor to do the same. This decreased labor coercion, permitting workers on sugar plantations to work small plots or move to other industries, raising wages in the end. I continue to await Suresh Naidu’s book on labor coercion – it is astounding the extent to which labor markets were distorted historically (see, e.g., Eric Foner on Reconstruction), and in some cases still today, by legal and extralegal restrictions on how workers could move on up.

William Baumol: Truly Productive Entrepreneurship

It seems this weblog has become an obituary page rather than a simple research digest of late. I am not even done writing on the legacy of Ken Arrow (don’t worry – it will come!) when news arrives that yet another product of the World War 2 era in New York City, and of the CCNY system, has passed away: the great scholar of entrepreneurship and one of my absolute favorite economists, William Baumol.

But we oughtn’t draw the line on his research simply at entrepreneurship, though I will walk you through his best piece in the area, a staple of my own PhD syllabus, on “productive, unproductive, and destructive” entrepreneurship. Baumol was also a great scholar of the economics of the arts, performing and otherwise, which were the motivation for his famous cost disease argument. He was a very skilled micro theorist, a talented economic historian, and a deep reader of the history of economic thought, a nice example of which is his 2000 QJE on what we have learned since Marshall. In all of these areas, his papers are a pleasure to read, clear, with elegant turns of phrase and the casual yet erudite style of an American who’d read for his PhD in London under Robbins and Viner. That he has passed without winning his Nobel Prize is a shame – how great would it have been had he shared a prize with Nate Rosenberg before it was too late for them both?

Baumol is often naively seen as a Schumpeter-esque defender of the capitalist economy and the heroic entrepreneur, and that is only half right. Personally, his politics were liberal, and as he argued in a recent interview, “I am well aware of all the very serious problems, such as inequality, unemployment, environmental damage, that beset capitalist societies. My thesis is that capitalism is a special mechanism that is uniquely effective in accomplishing one thing: creating innovations, applying those innovations and using them to stimulate growth.” That is, you can find in Baumol’s work many discussions of environmental externalities, of the role of government in funding research, and of the nature of optimal taxation. You can find many quotes where Baumol expresses interest in the policy goals of the left (though often solved with the mechanism of the market, and hence the right). Yet the core running through much of Baumol’s work is a rigorous defense, historically and theoretically grounded, of the importance of getting incentives correct for socially useful innovation.

Baumol differs from many other prominent economists of innovation because he is at his core a neoclassical theorist. He is not an Austrian like Kirzner or an evolutionary economist like Sid Winter. Baumol’s work stresses that entrepreneurs and the innovations they produce are fundamental to understanding the capitalist economy and its performance relative to other economic systems, but that the best way to understand the entrepreneur methodologically was to formalize her within the context of neoclassical equilibria, with innovation rather than price alone being “the weapon of choice” for rational, competitive firms. I’ve always thought of Baumol as being the lineal descendant of Schumpeter, the original great thinker on entrepreneurship and one who, nearing the end of his life and seeing the work of his student Samuelson, was convinced that his ideas should be translated into formal neoclassical theory.

A 1968 essay in the AER P&P laid out Baumol’s basic idea that economics without the entrepreneur is, in a line he would repeat often, like Hamlet without the Prince of Denmark. He clearly understood that we did not have a suitable theory for oligopoly and entry into new markets, or for the supply of entrepreneurs, but that any general economic theory needed to be able to explain why growth is different in different countries. Solow’s famous essay convinced much of the profession that the residual, interpreted then primarily as technological improvement, was the fundamental variable explaining growth, and Baumol, like many, believed those technological improvements came mainly from entrepreneurial activity.

But what precisely should the theory look like? Ironically, Baumol made his most productive step in a beautiful 1990 paper in the JPE which contains not a single formal theorem nor statistical estimate of any kind. Let’s define an entrepreneur as “persons who are ingenious or creative in finding ways to add to their wealth, power, or prestige”. These people may introduce new goods, or new methods of production, or new markets, as Schumpeter supposed in his own definition. But are these ingenious and creative types necessarily going to do something useful for social welfare? Of course not – the norms, institutions, and incentives in a given society may be such that the entrepreneurs perform socially unproductive tasks, such as hunting for new tax loopholes, or socially destructive tasks, such as channeling their energy into ever-escalating forms of warfare.

With the distinction between productive, unproductive, and destructive entrepreneurship in mind, we might imagine that the difference in technological progress across societies may have less to do with the innate drive of the society’s members, and more to do with the incentives for different types of entrepreneurship. Consider Rome, famously wealthy yet with very little in the way of useful technological diffusion: certainly the Romans appear less innovative than either the Greeks or Europe of the Middle Ages. How can a society both invent a primitive steam engine – via Hero of Alexandria – and yet see it used for nothing other than toys and religious ceremonies? The answer, Baumol notes, is that status in Roman society required one to get rich via land ownership, usury, or war; commerce was a task primarily for slaves and former slaves! And likewise in Song dynasty China, where the imperial examinations were the source both of status and of the power to expropriate any useful inventions or businesses that happened to appear. In the European Middle Ages, incentives for the clever shift from developing implements of war, to diffusing technology like the water-mill under the Cistercians, and back to weapons. These examples were expanded to every society from Ancient Mesopotamia to the Dutch Republic to the modern United States by a group of economically-minded historians in a wonderful collection of essays called “The Invention of Enterprise”, edited by Baumol alongside Joel Mokyr and David Landes.

Now we are approaching a sort of economic theory of entrepreneurship – no need to rely on the whims of character, but instead focus on relative incentives. But we are still far from Baumol’s 1968 goal: incorporating the entrepreneur into neoclassical theory. The closest Baumol comes is in his work in the early 1980s on contestable markets, summarized in the 1981 AEA Presidential Address. The basic idea is this. Assume industries have scale economies, so oligopoly is their natural state. How worried should we be? Well, if there are no sunk costs and no entry barriers for entrants, and if entrants can siphon off customers quicker than incumbents can respond, then Baumol and his coauthors claimed that the market was contestable: the threat of entry is sufficient to keep the incumbent from exerting their market power. On the one hand, fine, we all agree with Baumol now that industry structure is endogenous to firm behavior, and the threat of entry clearly can restrain market power. But on the other hand, is this “ultra-free entry” model the most sensible way to incorporate entry and exit into a competitive model? Why, as Dixit argued, is it quicker to enter a market than to change price? Why, as Spence argued, does the unrealized threat of entry change equilibrium behavior if the threat is truly unrealized along the equilibrium path?

It seems that what Baumol was hoping this model would lead to was a generalized theory of perfect competition that permitted competition for the market rather than just in the market, since competition for the market is naturally the domain of the entrepreneur. Contestable markets are too flawed to get us there. But the basic idea – game-theoretic endogenous market structure, rather than the old-fashioned notion that industry structure affects conduct affects performance – is clearly here to stay: antitrust is essentially applied game theory today. And once you have the idea of competition for the market, the natural theoretical model is one where firms compete to innovate in order to push out incumbents, incumbents innovate to keep potential entrants at bay, and profits depend on the equilibrium time until the dominant firm is displaced: I speak, of course, of the neo-Schumpeterian models of Aghion and Howitt. These models, still a very active area of research, are finally allowing us to rigorously investigate the endogenous rewards to innovation in a completely neoclassical model of market structure and pricing.

I am not sure why Baumol did not find these neo-Schumpeterian models to be the Holy Grail he’d been looking for; in his final book, he credits them for being “very powerful” but in the end holding different “central concerns”. He may have been mistaken in this interpretation. It proved quite interesting to give a careful second read of Baumol’s corpus on entrepreneurship, and I have to say it disappoints in part: the questions he asked were right, the theoretical acumen he possessed was up to the task, the understanding of history and qualitative intuition was second to none, but in the end, he appears to have been just as stymied by the idea of endogenous neoclassical entrepreneurship as the many other doyens of our field who took a crack at modeling this problem without, in the end, generating the model they’d hoped they could write.

Where Baumol has more success, and again it is unusual for a theorist that his most well-known contribution is largely qualitative, is in the idea of cost disease. The concept comes from Baumol’s work with William Bowen (see also this extension with a complete model) on the economic problems of the performing arts. It is a simple idea: imagine productivity in industry rises 4% per year, but “the output per man-hour of a violinist playing a Schubert quartet in a standard concert hall” remains fixed. In order to attract workers into music rather than industry, wages must rise in music at something like the rate they rise in industry. But then costs are increasing while productivity is not, and the arts look “inefficient”. The same, of course, is said for education, and health care, and other necessarily labor-intensive industries. Baumol’s point is that rising costs in unproductive sectors reflect necessary shifts in equilibrium wages rather than, say, growing wastefulness.
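
The arithmetic is worth spelling out, with purely hypothetical numbers: if wages everywhere track 4% annual productivity growth while output per musician-hour stays flat, the relative cost of a concert compounds dramatically even though nothing about the concert has become more wasteful.

```python
# Back-of-the-envelope cost disease arithmetic (hypothetical numbers).
wage_growth = 0.04       # wages track economy-wide productivity growth
years = 50
musician_hours = 4 * 2   # a string quartet playing a two-hour concert

relative_cost = (1 + wage_growth) ** years
print(f"After {years} years, the same {musician_hours} musician-hours cost about "
      f"{relative_cost:.1f}x as much relative to manufactured goods.")
# ~7.1x: nothing became more wasteful; the concert simply uses labor whose
# opportunity cost rose with productivity everywhere else.
```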

How much can cost disease explain? The concept is so widely known by now that it is, in fact, used to excuse stagnant industries. Teaching, for example, requires some labor, but does anybody believe that it is impossible for R&D and complementary inventions (like the internet, for example) to produce massive productivity improvements? Is it not true that movie theaters now show opera live from the world’s great halls on a regular basis? Is it not true that my Google Home can, activated by voice, call up two seconds from now essentially any piece of recorded music I desire, for free? Separating industries that are necessarily labor-intensive (and hence grow slowly) from those capable of rapid technological progress is a very difficult game, and one we ought hesitate to play. But equally, we oughtn’t forget Baumol’s lesson: in some cases, in some industries, what appears to be fixable slack is in fact simply cost disease. We may ask: how was it that Ancient Greece, with its tiny population, put on so many plays, while today we hustle ourselves to small ballrooms in New York and London? Baumol’s answer, rigorously shown: cost disease. The “opportunity cost” of recruiting a big chorus was low, as those singers would otherwise have been idle or working unproductive fields gathering olives. The difference between Athens and our era is not simply that they were “more supportive of the arts”!

Baumol was incredibly prolific, so these suggestions for further reading are but a taste: An interview by Alan Krueger is well worth the read for anecdotes alone, like the fact that apparently one used to do one’s PhD oral defense “over whiskies and sodas at the Reform Club”. I also love his defense of theory, where if he is very lucky, his initial intuition “turn[s] out to be totally wrong. Because when I turn out to be totally wrong, that’s when the best ideas come out. Because if my intuition was right, it’s almost always going to be simple and straightforward. When my intuition turns out to be wrong, then there is something less obvious to explain.” Every theorist knows this: formalization has this nasty habit of refining our intuition and convincing us our initial thoughts actually contain logical fallacies or rely on special cases! Though known as an applied micro theorist, Baumol also wrote a canonical paper, with Bradford, on optimal taxation: essentially, if you need to raise $x in tax, how should you optimally deviate from marginal cost pricing? The history of thought is nicely diagrammed, and of course this 1970 paper was very quickly followed by the classic work of Diamond and Mirrlees. Baumol wrote extensively on environmental economics, drawing in many of his papers on the role nonconvexities in the social production possibilities frontier play when they are generated by externalities – a simple example of this effect, and the limitations it imposes on Pigouvian taxation, is in the link. More recently, Baumol had been writing on international trade with Ralph Gomory (the legendary mathematician behind a critical theorem in integer programming, and later head of the Sloan Foundation); their main theorems are not terribly shocking to those used to thinking in terms of economies of scale, but the core example in the linked paper is again a nice demonstration of how nonconvexities can overturn a lot of our intuition, in this case regarding comparative advantage. Finally, beyond his writing on the economics of the arts, Baumol proved that there is no area in which he personally had stagnant productivity: an art major in college, he was also a fantastic artist in his own right, picking up computer-generated art while in his 80s and teaching for many years a course on woodworking at Princeton!
