The 2017 Nobel: Richard Thaler

A true surprise this morning: the behavioral economist Richard Thaler from the University of Chicago has won the Nobel Prize in economics. It is not a surprise because it is undeserved; rather, it is a surprise because only four years ago, Thaler’s natural co-laureate Bob Shiller won while Thaler was left the bridesmaid. But Thaler’s influence on the profession, and the world, is unquestionable. There are few developed governments that do not have a “nudge” unit of some sort trying to take advantage of behavioral insights to push people a touch in one way or another, including here in Ontario via my colleagues at BEAR. I will admit, perhaps under the undue influence of too many dead economists, that I am skeptical of nudging and behavioral finance on both positive and normative grounds, so this review will be one of friendly challenge rather than hagiography. I trust that there will be no shortage of wonderful positive reflections on Thaler’s contribution to policy, particularly because he is the rare economist whose work is totally accessible to laymen and, more importantly, journalists.

Much of my skepticism is similar to how Fama thinks about behavioral finance: “I’ve always said they are very good at describing how individual behavior departs from rationality. That branch of it has been incredibly useful. It’s the leap from there to what it implies about market pricing where the claims are not so well-documented in terms of empirical evidence.” In other words, surely most people are not that informed and not that rational much of the time, but repeated experience, market selection, and other aggregative factors mean that this irrationality may not matter much for the economy at large. It is very easy to claim that since economists model “agents” as “rational”, we would, for example, “not expect a gift on the day of the year in which she happened to get married, or be born” and indeed “would be perplexed by the idea of gifts at all” (Thaler 2015). This type of economist caricature is both widespread and absurd, I’m afraid. In order to understand the value of Thaler’s work, we ought first look at situations where behavioral factors matter in real-world, equilibrium decisions of consequence, then figure out how common those situations are, and why.

The canonical example of Thaler’s useful behavioral nudges is his “Save More Tomorrow” pension plan, with Benartzi. Many individuals in defined contribution plans save too little, both because they are not good at calculating how much they need to save and because they are biased toward present consumption. You can, of course, force people to save a la Singapore, but we dislike these plans because individuals vary in their need and desire for saving, and because we find the reliance on government coercion to save heavy-handed. Alternatively, you can default defined-contribution plans to involve some savings rate, but it turns out people do not vary their behavior from the default throughout their career, and hence save too little solely because they didn’t want too much removed from their first paycheck. Thaler and Benartzi have companies offer plans where you agree now to having your savings rate increased when you get raises – for instance, if your salary goes up 2%, you will have half of that raise set into a savings plan tomorrow, until you reach a savings rate that is sufficiently high. In this way, no one takes a nominal post-savings pay cut. People can, of course, leave this plan whenever they want. In their field experiments, savings rates did in fact soar (with take-up varying hugely depending on how information about the plan was presented), and attrition from the plan in the future was low.
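To make the escalator arithmetic concrete, here is a minimal sketch of the mechanism (the parameters are my own toy numbers, not Thaler and Benartzi’s actual plan terms): each raise bumps the savings rate by less than the raise itself, so the rate ratchets up to a cap while nominal take-home pay never falls.

```python
def save_more_tomorrow(salary, rate, raise_pct=0.02, escalation=0.01, cap=0.10, years=8):
    """Simulate a Save More Tomorrow escalator: each year the worker gets a
    raise, and the savings rate steps up by `escalation` percentage points
    until it hits `cap`. Because the step-up is smaller than the raise, the
    nominal take-home paycheck rises every year.
    Returns a list of (salary, savings_rate, take_home_pay) tuples."""
    path = []
    for _ in range(years):
        path.append((round(salary, 2), rate, round(salary * (1 - rate), 2)))
        salary *= 1 + raise_pct                        # the annual raise
        rate = round(min(cap, rate + escalation), 4)   # the pre-committed step-up
    return path

schedule = save_more_tomorrow(50000.0, 0.03)
```

Running this, the savings rate climbs from 3% to the 10% cap while take-home pay rises monotonically, which is the behavioral point: the plan never feels like a cut.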

This policy is what Thaler and Sunstein call “libertarian paternalism”. It is paternalistic because, yes, we think that you may make bad decisions from your own perspective because you are not that bright, or because you are lazy, or because you have many things which require your attention. It is libertarian because there is no compulsion, in that anyone can opt out at their leisure. Results similar to Thaler and Benartzi’s have been found by Ashraf et al in a field experiment in the Philippines, and by Karlan et al in three countries, where just sending reminder messages which make savings goals more salient modestly increases savings.

So far, so good. We have three issues to unpack, however. First, when is this nudge acceptable on ethical grounds? Second, why does nudging generate such large effects here, and if the effects are large, why doesn’t the market simply provide them? Third, is the 401k savings case idiosyncratic or representative? The idea that the homo economicus, rational calculator, misses important features of human behavior, and could do with some insights from psychology, is not new, of course. Thaler’s prize is, at minimum, the fifth Nobel to go to someone pushing this general idea, since Herb Simon, Maurice Allais, Daniel Kahneman, and the aforementioned Bob Shiller have all already won. Copious empirical evidence, and indeed simple human observation, implies that people have behavioral biases, that they are not perfectly rational – as Thaler has noted, we see what looks like irrationality even in the composition of 100-million-dollar baseball rosters. The more militant behavioralists insist that ignoring these psychological factors is unscientific! And yet, and yet: the vast majority of economists, all of whom are by now familiar with these illustrious laureates and their work, still use fairly standard expected utility maximizing agents in nearly all of our papers. Unpacking the three issues above will clarify how that could possibly be so.

Let’s discuss ethics first. Simply arguing that organizations “must” make a choice (as Thaler and Sunstein do) is insufficient; we would not say a firm that defaults consumers into an autorenewal for a product they rarely renew when making an active choice is acting “neutrally”. Nudges can be used for “good” or “evil”. Worse, whether a nudge is good or evil depends on the planner’s evaluation of the agent’s “inner rational self”, as Infante and Sugden, among others, have noted many times. That is, claiming paternalism is “only a nudge” does not excuse the paternalist from the usual moral philosophic critiques! Indeed, as Chetty and friends have argued, the more you believe behavioral biases exist and are “nudgeable”, the more careful you need to be as a policymaker about inadvertently reducing welfare. There is, I think, less controversy when we use nudges rather than coercion to reach some policy goal. For instance, if a policymaker wants to reduce energy usage, and is worried about distortionary taxation, nudges may (depending on how you think about social welfare with non-rational preferences!) be a better way to achieve the desired outcomes. But this goal is very different from the common justification that nudges somehow are pushing people toward policies they actually like in their heart of hearts. Carroll et al have a very nice theoretical paper trying to untangle exactly what “better” means for behavioral agents, and exactly when the imprecision of nudges or defaults, given our imperfect knowledge of individuals’ heterogeneous preferences, makes attempts at libertarian paternalism worse than laissez faire.

What of the practical effects of nudges? How can they be so large, and in what contexts? Thaler has very convincingly shown that behavioral biases can affect real world behavior, and that understanding those biases means two policies which are identical from the perspective of a homo economicus model can have very different effects. But many economic situations involve players doing things repeatedly with feedback – where heuristics approximated by rationality evolve – or involve players who “perform poorly” being selected out of the game. For example, I can think of many simple nudges to get you or me to play better basketball. But when it comes to Michael Jordan, the first order effects are surely how well he takes care of his health, the teammates he has around him, and so on. I can think of many heuristics useful for understanding how simple physics will operate, but I don’t think I can find many that would improve Einstein’s understanding of how the world works. The 401k situation is unusual because it is a decision with limited short-run feedback, taken by unsophisticated agents who will learn little even with experience. The natural alternative, of course, is to have agents outsource the difficult parts of the decision, to investment managers or the like. And these managers will make money by improving people’s earnings. No surprise that robo-advisors, index funds, and personal banking have all become more important as defined contribution plans have become more common! If we worry about behavioral biases, we ought worry especially about market imperfections that prevent the existence of designated agents who handle the difficult decisions for us.

The fact that agents can exist is one reason that irrationality in the lab may not translate into irrationality in the market. But even without agents, we might reasonably be suspicious of some claims of widespread irrationality. Consider Thaler’s famous endowment effect: how much you are willing to pay for, say, a coffee mug or a pen is much less than how much you would accept to have the coffee mug taken away from you. Indeed, it is not unusual in a study to find a ratio of three times or greater between the willingness-to-accept and willingness-to-pay amounts. But, of course, if these were “preferences”, you could be money pumped (see Yaari, applying a theorem of de Finetti, on the mathematics of the pump). Say you value the mug at ten bucks when you own it and five bucks when you don’t. Do we really think I can regularly get you to pay twice as much by loaning you the mug for free for a month? Do we see car companies letting you take a month-long test drive of a $20,000 car then letting you keep the car only if you pay $40,000, with some consumers accepting? Surely not. Now the reason why is partly what Laibson and Yariv argue, that money pumps do not exist in competitive economies since market pressure will compete away rents: someone else will offer you the car at $20,000 and you will just buy from them. But even if the car company is a monopolist, surely we find the magnitude of the money pump implied here to be on its face ridiculous.
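The implied arbitrage is easy to price out. A toy calculation (entirely my own illustration, using the mug numbers from the paragraph above):

```python
def money_pump_profit(wtp, wta, cycles):
    """Toy endowment-effect money pump: buy the good at the agent's
    willingness to pay (her value before she owns it), loan it to her so
    she becomes endowed, then sell it to her at her willingness to accept.
    If the WTP/WTA gap were a stable preference, each cycle nets the gap."""
    return cycles * (wta - wtp)

# Mug valued at $5 un-owned and $10 once owned: a monthly loan-and-resale
# cycle would extract $60 a year from a single mug owner.
annual_take = money_pump_profit(wtp=5.0, wta=10.0, cycles=12)
```

The absurdity of the conclusion is, of course, the point: nobody runs this trade, which suggests the lab-measured gap does not survive as a market-relevant preference.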

Even worse are the dictator games introduced in Thaler’s 1986 fairness paper. Students were asked, upon being given $20, whether they wanted to give an anonymous student half of their endowment or 10%. Many of the students gave half! This experiment has been repeated many, many times, with similar effects. Does this mean economists are naive to neglect the social preferences of humans? Of course not! People are endowed with money and gifts all the time. They essentially never give any of it to random strangers – I feel confident assuming you, the reader, have never been handed some bills on the sidewalk by an officeworker who just got a big bonus! Worse, the context of the experiment matters a ton (see John List on this point). Indeed, despite hundreds of lab experiments on dictator games, I feel far more confident predicting real world behavior following windfalls if we use a parsimonious homo economicus model than if we use the results of dictator games. Does this mean the games are useless? Of course not – studying what factors affect other-regarding preferences is interesting, and important. But how odd to have a branch of our field filled with people who see armchair theorizing about homo economicus as “unscientific”, yet take lab experiments so literally even when they are so clearly contrary to the data!

To take one final example, consider Thaler’s famous model of “mental accounting”. In many experiments, he shows people have “budgets” set aside for various tasks. I have my “gas budget” and adjust my driving when gas prices change. I only sell stocks when I am up overall on that stock since I want my “mental account” of that particular transaction to be positive. But how important is this in the aggregate? Take the Engel curve. Budget shares devoted to food fall with income. This is widely established historically and in the cross section. Where is the mental account? Farber (2008 AER) even challenges the canonical account of taxi drivers working just enough hours to make their targeted income. As in the dictator game and the endowment effect, there is a gap between what is real, psychologically, and what is consequential enough to be first-order in our economic understanding of the world.

Let’s sum up. Thaler’s work is brilliant – it is a rare case of an economist taking psychology seriously and actually coming up with policy-relevant consequences like the 401k policy. But Thaler’s work is also dangerous to young economists who see biases everywhere. Experts in a field, and markets with agents and mechanisms and all the other tricks they develop, are very, very good at ferreting out irrationality, and economists’ core skill lies in not missing those tricks.

Some remaining bagatelles: 1) Thaler and his PhD advisor, Sherwin Rosen, have one of the first papers on measuring the “statistical” value of a life, a technique now widely employed in health economics and policy. 2) Beyond his academic work, Thaler has won a modicum of fame as a popular writer (Nudge, written with Cass Sunstein, is canonical here) and for his brief turn as an actor alongside Selena Gomez in “The Big Short”. 3) Dick has a large literature on “fairness” in pricing, a topic which goes back to Thomas Aquinas, if not earlier. Many of the experiments Thaler performs, like the thought experiments of Aquinas, come down to the fact that many perceive market power to be unfair. Sure, I agree, but I’m not sure there’s much more that can be learned than this uncontroversial fact. 4) Law and econ has been massively influenced by Thaler. As a simple example, if endowment effects are real, then the assignment of property rights matters even when there are no transaction costs. Jolls et al 1998 go into more depth on this issue. 5) Thaler’s precise results in so-called behavioral finance are beyond my area of expertise, so I defer to John Cochrane’s comments following the 2013 Nobel. Eugene Fama is, I think, correct when he suggests that market efficiency generated by rational traders with risk aversion is the best model we have of financial behavior, where best is measured by “is this model useful for explaining the world.” The number of behavioral anomalies at the level of the market which persist and are relevant in the aggregate does not strike me as large, while the number of investors and policymakers who make dreadful decisions because they believe markets are driven by behavioral sentiments is large indeed!


“Resetting the Urban Network,” G. Michaels & F. Rauch (2017)

Cities have two important properties: they are enormously consequential for people’s economic prosperity, and they are very sticky. That stickiness is twofold: cities do not change their shape rapidly in response to changing economic or technological opportunities (consider, e.g., Hornbeck and Keniston on the positive effects of the Great Fire of Boston), and people are hesitant to leave their existing non-economic social network (Deryugina et al show that Katrina victims, a third of whom never return to New Orleans, are materially better off as soon as three years after the hurricane, earning more and living in less expensive cities; Shoag and Carollo find that Japanese-Americans randomly placed in internment camps in poor areas during World War 2 see lower incomes and children’s educational outcomes even many years later).

A lot of recent work in urban economics suggests that the stickiness of cities is getting worse, locking path dependent effects in with even more vigor. A tour-de-force by Shoag and Ganong documents that income convergence across cities in the US has slowed since the 1970s, that this only happened in cities with restrictive zoning rules, and that the primary effect has been that as land use restrictions make housing prices elastic to income, working class folks no longer move from poor to rich cities because the cost of housing makes such a move undesirable. Indeed, they suggest a substantial part of growing income inequality, in line with work by Matt Rognlie and others, is due to the fact that owners of land have used political means to capitalize productivity gains into their existing, tax-advantaged asset.

Now, one part of urban stickiness over time may simply be reflecting that certain locations are very productive, that they have a large and valuable installed base of tangible and intangible assets that make their city run well, and hence we shouldn’t be surprised to see cities retain their prominence and nature over time. So today, let’s discuss a new paper by Michaels and Rauch which uses a fantastic historical case to investigate this debate: the rise and fall of the Roman Empire.

The Romans famously conquered Gaul – today’s France – under Caesar, and Britain in stages up through Hadrian (and yes, Mary Beard’s SPQR is worthwhile summer reading; the fact that she and Nassim Taleb do not get along makes it even more self-recommending!). Roman cities popped up across these regions, until the 5th century invasions wiped out Roman control. In Britain, for all practical purposes the entire economic network faded away: cities hollowed out, trade came to a stop, and imports from outside Britain and Roman coin are near nonexistent in the archaeological record for the next century and a half. In France, the network was not so cleanly broken, with Christian bishoprics rising in many of the old Roman towns.

Here is the amazing fact: today, 16 of France’s 20 largest cities are located on or near a Roman town, while only 2 of Britain’s 20 largest are. This difference existed even back in the Middle Ages. So who cares? Well, Britain’s cities in the Middle Ages were two and a half times more likely to have coastal access than France’s cities, so that in 1700, when sea trade was hugely important, 56% of urban French lived in towns with sea access while 87% of urban Brits did. This is even though, in both countries, cities with sea access grew faster and huge sums of money were put into building artificial canals. Even at a very local level, the France/Britain distinction holds: when Roman cities were within 25km of the ocean or a navigable river, they tended not to move in France, while in Britain they tended to reappear nearer to the water. The fundamental factor for the shift in both places was that developments in shipbuilding in the early Middle Ages made the sea much more suitable for trade and military transport than the famous Roman roads which previously played that role.

Now the question, of course, is what drove the path dependence: why didn’t the French simply move to better locations? We know, as in Ganong and Shoag’s paper above, that in the absence of legal restrictions, people move toward more productive places. Indeed, there is a lot of hostility to the idea of path dependence more generally. Consider, for example, the case of the typewriter, which “famously” has its QWERTY form because of an idiosyncrasy in the very early days of the typewriter. QWERTY is said to be much less efficient than alternative key layouts like Dvorak. Liebowitz and Margolis put this myth to bed: not only is QWERTY fairly efficient (you can think much faster than you can type for any reasonable key layout), but typewriting companies spent huge amounts of money on training schools and other mechanisms to get secretaries to switch toward the companies’ preferred keyboards. That is, while it can be true that what happened in the past matters, it is also true that there are many ways to coordinate people to shift to a more efficient path if a suitably large productivity improvement exists.

With cities, coordinating on the new productive location is harder. In France, Michaels and Rauch suggest that bishops and the church began playing the role of a provider of public goods, and that the continued provision of public goods in certain formerly-Roman cities led them to grow faster than they otherwise would have. Indeed, Roman cities in France with no bishop show a very similar pattern to Roman cities in Britain: general decline. That sunk costs and non-economic institutional persistence can lead to multiple steady states in urban geography, some of which are strictly worse, has been suggested in smaller scale studies (e.g., Redding et al RESTAT 2011 on Germany’s shift from Berlin to Frankfurt, or the historical work of Engerman and Sokoloff).

I loved this case study, and appreciate the deep dive into history that collecting data on urban locations over this period required. But the implications of this literature broadly are very worrying. Much of the developed world has, over the past forty years, pursued development policies that are very favorable to existing landowners. This has led to stickiness which makes path dependence more important, and reallocation toward more productive uses less likely, both because cities cannot shift their geographic nature and because people can’t move to cities that become more productive. We ought not artificially wind up like Dijon and Chartres in the middle ages, locking our population into locations better suited for the economy of the distant past.

2016 working paper (RePEc IDEAS). Article is forthcoming in Economic Journal. With incredible timing, Michaels and Rauch, alongside two other coauthors, have another working paper called “Flooded Cities”. Essentially, looking across the globe, there are frequent, very damaging floods, occurring every 20 years or so in low-lying areas of cities. And yet, as long as those areas are long settled, people and economic activity simply return to those areas after a flood. Note this is true even in countries without US-style flood insurance programs. The implication is that the stickiness of urban networks, amenities, and so on tends to be very strong, and if anything encouraged by development agencies and governments, yet this stickiness means that we wind up with many urban neighborhoods, and many cities, located in places that are quite dangerous for their residents without any countervailing economic benefit. You will see their paper in action over the next few years: despite some neighborhoods flooding three times in three years, one can bet with confidence that population and economic activity will remain on the floodplains of Houston’s bayou. (And in the meanwhile, ignoring our worries about future economic efficiency, I wish only the best for a safe and quick recovery to friends and colleagues down in Houston!)

Two New Papers on Militarized Police

The so-called militarization of police has become a major issue both in libertarian policy circles and in the civil rights community. Radley Balko has done yeoman’s work showing the harms, including outrageous civil liberty violations, generated by the use of military-grade armor and weapons, the rise of the SWAT team, and the intimidating clothing preferred by many modern police. The photos of tanks on the streets of Ferguson were particularly galling. I am a literal card-carrying member of the ACLU, so you can imagine my own opinion about this trend.

That said, the new issue of AEJ: Policy has two side-by-side papers – one from a group at the University of Tennessee, and one by researchers at Warwick and NHH – that give quite shocking evidence about the effects of militarized police. They both use the “1033 Program”, where surplus military equipment was transferred to police departments, to investigate how military equipment affects crime, citizen complaints, violence by officers, and violence against police. Essentially, when the military has a surplus, such as when it changed a standard gun in 2006, the decommissioned supplies are given to centers located across the country which then send those out to police departments within a few weeks. The application forms are short and straightforward, and are not terribly competitive. About 30 percent of the distributions are things like vests, clothing and first aid kits, while the rest is more tactical: guns, drones, vehicles, and so on.

Causal identification is, of course, a worry here: places that ask for military equipment are obviously unusual. The two papers use rather different identification strategies. The Tennessee paper uses the distance to a distribution center as an instrument, since the military wants to reduce the cost of decommissioning and hence prefers closer departments. Therefore, a first-stage IV will predict whether a sheriff gets new military items on the joint basis of total material decommissioned combined with their distance to decommissioning centers. The Warwick-NHH paper uses the fact that some locations apply frequently for items, and others only infrequently. When military spending is high, there is a lot more excess to decommission. Therefore, an instrument combining overall military spending with previous local requests for “1033” items can serve as a first stage for predicted surplus items received.
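Neither paper’s data nor exact specification is reproduced here, but the logic of the Tennessee-style instrument can be sketched on synthetic data (every variable name and number below is my own invention): distance to a decommissioning center shifts equipment received and, by the exclusion restriction, affects crime only through equipment, so two-stage least squares recovers the effect that a naive regression gets wrong.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic departments: closer ones receive more surplus equipment,
# and an unobserved "taste for gear" confounds the naive regression.
distance = rng.uniform(10.0, 500.0, n)   # km to nearest decommissioning center
surplus = rng.uniform(0.0, 1.0, n)       # national surplus available that year
taste = rng.normal(0.0, 1.0, n)          # unobserved local taste for equipment
equipment = 2.0 - 0.003 * distance + surplus + 0.5 * taste + rng.normal(0.0, 0.3, n)
crime = 5.0 - 0.4 * equipment + 0.8 * taste + rng.normal(0.0, 0.5, n)

def two_sls(y, x, instruments):
    """Two-stage least squares with a constant: fit x on the instruments,
    then regress y on the fitted values. Returns [intercept, slope]."""
    Z = np.column_stack([np.ones_like(y)] + instruments)
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # first stage
    X_hat = np.column_stack([np.ones_like(y), x_hat])
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]    # second stage

beta_iv = two_sls(crime, equipment, [distance, surplus])[1]
```

Here `beta_iv` lands near the true effect of -0.4 even though equipment is correlated with the confounder; the papers’ actual instruments play the roles of `distance` and `surplus`, with past local requests interacted with military spending standing in on the Warwick-NHH side.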

Despite the different local margins these two instruments imply, the findings in both papers are nearly identical. In places that get more military equipment, crime falls, particularly for crime that is easy to deter like carjacking or low-level drug crime. Citizen complaints, if anything, go down. Violence against police falls. And there is no increase in officer-caused deaths. In terms of magnitudes, the fall in crime is substantial given the cost: the Warwick-NHH paper finds the value of reduced crime, using standard metrics, is roughly 20 times the cost of the military equipment. Interestingly, places that get this equipment also hire fewer cops, suggesting some sort of substitutability between labor and capital in policing. The one negative finding, in the Tennessee paper, is that arrests for petty crimes appear to rise in a minor way.

Both papers are very clear that these results don’t mean we should militarize all police departments, and both are clear that in places with poor community-police relations, militarization can surely inflame things further. But the pure empirical estimates, that militarization reduces crime without any objectively measured cost in terms of civic unhappiness, are quite mind-blowing in terms of changing my own priors. It is similar to the Doleac-Hansen result that “Ban the Box” leads to worse outcomes for black folks, for reasons that make perfect game theoretic sense; I couldn’t have imagined Ban the Box was a bad policy, but the evidence these serious researchers present is too compelling to ignore.

So how are we to square these results with the well-known problems of police violence, and poor police-citizen relations, in the United States? Consider Roland Fryer’s recent paper on police violence and race, where essentially the big predictor of police violence is interacting with police, not individual characteristics. A unique feature of the US compared to other developed countries is that there really is more violent crime, hence police are rationally more worried about it, and hence civilians who interact with police are in turn worried about violence from police. Policies that reduce the extent to which police and civilians interact in potentially dangerous settings reduce this cycle. You might argue – I certainly would – that policing is no more dangerous than, say, professional ocean fishing or taxicab driving, and you wouldn’t be wrong. But as long as the perception of a possibility of violence remains, things like military-grade vests or vehicles may help break the violence cycle. We shall see.

The two AEJ: Policy papers are “Policeman on the Frontline or a Soldier?” (V. Bove & E. Gavrilova) and “Peacekeeping Force: Effects of Providing Tactical Equipment to Local Law Enforcement” (M. C. Harris, J. S. Park, D. J. Bruce and M. N. Murray). I am glad to see that the former paper, particularly, cites heavily from the criminology literature. Economics has a reputation in the social sciences both for producing unbiased research (as these two papers, and the Fryer paper, demonstrate) and for refusing to acknowledge quality work done in the sister social sciences, so I am particularly glad to see the latter problem avoided in this case!

“The Development Effects of the Extractive Colonial Economy,” M. Dell & B. Olken (2017)

A good rule of thumb is that you will want to read any working paper Melissa Dell puts out. Her main interest is the long-run path-dependent effect of historical institutions, with rigorous quantitative investigation of the subtle conditionality of the past. For instance, in her earlier work on Peru (Econometrica, 2010), mine slavery in the colonial era led to fewer hacienda-style plantations at the end of the era, which led to less political power without those large landholders in the early democratic era, which led to fewer public goods throughout the 20th century, which led to less education and income today in areas that used to have mine slavery. One way to read this is that local inequality in the past may, through political institutions, be a good thing today! History is not as simple as “inequality in the past causes bad outcomes today” or “extractive institutions in the past cause bad outcomes today” or “colonial economic distortions cause bad outcomes today”. But, contra the branch of historians who don’t like to assign causality to any single factor in any given situation, we don’t need to entirely punt on the effects of specific policies in specific places if we apply careful statistical and theoretical analysis.

Dell’s new paper looks at the cultuurstelsel, a policy the Dutch imposed on Java in the mid-19th century. Essentially, the Netherlands was broke and Java was suitable for sugar, so the Dutch required villages in certain regions to use huge portions of their arable land, and labor effort, to produce sugar for export. They built roads and some rail, as well as sugar factories (now generally long gone), as part of this effort, and the land used for sugar production generally became public village land controlled at the behest of local leaders. This was back in the mid-1800s, so surely it shouldn’t affect anything of substance today?

But it did! Take a look at villages near the old sugar plantations, or that were forced to plant sugar, and you’ll find higher incomes, higher education levels, high school attendance rates even back in the late colonial era, higher population densities, and more workers today in retail and manufacturing. Dell and Olken did some wild data matching using a great database of geographic names collected by the US government to match the historic villages where these sugar plants, and these labor requirements, were located with modern village and town locations. They then constructed “placebo” factories – locations along coastal rivers in sugar growing regions with appropriate topography where a plant could have been located but wasn’t. In particular, as in the famous Salop circle, you won’t locate a factory too close to an existing one, but there are many counterfactual equilibria where we just shift all the factories one way or the other. By comparing the predicted effect of distance from the real factory on outcomes today with the predicted effect of distance from the huge number of hypothetical factories, you can isolate the historic local influence of the real factory from other local features which can’t be controlled for.
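The placebo-factory comparison amounts to a simple statistical idea, sketched below on made-up data (a stylized illustration of mine, not Dell and Olken’s specification): the distance gradient from the real factory should stand far outside the distribution of gradients computed from counterfactual sites.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400

def distance_gradient(outcome, dist_km):
    """OLS slope of outcome on distance (with a constant)."""
    X = np.column_stack([np.ones_like(dist_km), dist_km])
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1]

# Stylized villages: consumption really does decline with distance to the
# one real factory; placebo sites pick up only region-wide noise.
dist_real = rng.uniform(0.0, 20.0, n)
consumption = 10.0 - 0.07 * dist_real + rng.normal(0.0, 0.5, n)

real_slope = distance_gradient(consumption, dist_real)
placebo_slopes = [distance_gradient(consumption, rng.uniform(0.0, 20.0, n))
                  for _ in range(200)]
excess = real_slope - np.mean(placebo_slopes)  # the factory-specific gradient
```

`excess` is the gradient purged of region-wide confounders; in the actual paper the placebo sites are drawn from plausible riverside locations with suitable topography rather than at random, which is what lets the comparison absorb local geographic features.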

Consumption right next to old, long-destroyed factories is 14% higher than it is even five kilometers away, education is 1.25 years longer on average, electrification, road, and rail density are all substantially higher, and industrial production upstream and downstream from sugar (e.g., farm machinery upstream, and processed foods downstream) is also much more likely to be located in villages with historic factories even if there is no sugar production anymore in that region!

It’s not just the factory and Dutch investments that matter, however. Consider the villages, up to 10 kilometers away, which were forced to grow the raw cane. Their elites took private land for this purpose, converting it into public village land, and land inequality remains higher in villages that were forced to grow cane compared to villages right next door that were outside the Dutch-imposed boundary. But this public land permitted surplus extraction in an agricultural society, a surplus which could be used for public goods, like schooling, which would later become important! These villages were much more likely to have schools, especially before the 1970s when public schooling in Indonesia was limited, and today are higher density, richer, more educated, and less agricultural than villages nearby which weren’t forced to grow cane. This all has shades of the long debate on “forward linkages” in agricultural societies, where it is hypothesized that agricultural surplus benefits industrialization by providing the surplus necessary for education and capital to be purchased; see this nice paper by Sam Marden showing linkages of this sort in post-Mao China.

Are you surprised by these results? They fascinate me, honestly. Think through the logic: forced labor (in the surrounding villages) and extractive capital (rail and factories built solely to export a crop in little use domestically) both have positive long-run local effects! They do so by affecting institutions – whether villages have the ability to produce public goods like education – and by affecting incentives – the production of capital used up- and downstream. One can easily imagine cases where forced labor and extractive capital have negative long-run effects, and we have great papers by Daron Acemoglu, Nathan Nunn, Sara Lowes and others on precisely this point. But it is also very easy for societies to get trapped in bad path dependent equilibria, for which outside interventions, even ethically shameful ones, can (perhaps inadvertently) cause useful shifts in incentives and institutions! I recall a visit to Babeldaob, the main island in Palau. During the Japanese colonial period, the island was heavily industrialized as part of Japan’s war machine. These factories were destroyed by the Allies in World War 2. Yet despite their extractive history, a local told me many on the island believe that the industrial development of the region was permanently harmed when those factories were damaged. It seems a bit crazy to mourn the loss of polluting, extractive plants whose whole purpose was to serve a colonial master, but the Palauan may have had some wisdom after all!

2017 Working Paper is here (no RePEc IDEAS version). For more on sugar and institutions, I highly recommend Christian Dippel, Avner Greif and Dan Trefler’s recent paper on Caribbean sugar. The price of sugar fell enormously in the late 19th century, yet wages on islands which lost the ability to productively export sugar rose. Why? Planters in places like Barbados had so much money from their sugar exports that they could manipulate local governance and the police, while planters in places like the Virgin Islands became too poor to do the same. This decreased labor coercion, permitting workers on sugar plantations to work small plots or move to other industries, raising wages in the end. I continue to await Suresh Naidu’s book on labor coercion – it is astounding the extent to which labor markets were distorted historically (see, e.g., Eric Foner on Reconstruction), and in some cases still today, by legal and extralegal restrictions on how workers could move on up.

William Baumol: Truly Productive Entrepreneurship

It seems this weblog has become an obituary page rather than a simple research digest of late. I am not even done writing on the legacy of Ken Arrow (don’t worry – it will come!) when news arrives that yet another product of the World War 2 era in New York City, an alumnus of the CCNY system, has passed away: the great scholar of entrepreneurship and one of my absolute favorite economists, William Baumol.

But we oughtn’t draw the line on his research simply at entrepreneurship, though I will walk you through his best piece in the area, a staple of my own PhD syllabus, on “productive, unproductive, and destructive” entrepreneurship. Baumol was also a great scholar of the economics of the arts, performing and otherwise, which were the motivation for his famous cost disease argument. He was a very skilled micro theorist, a talented economic historian, and a deep reader of the history of economic thought, a nice example of which is his 2000 QJE on what we have learned since Marshall. In all of these areas, his papers are a pleasure to read, clear, with elegant turns of phrase and the casual yet erudite style of an American who’d read his PhD in London under Robbins and Viner. That he has passed without winning his Nobel Prize is a shame – how great would it have been had he shared a prize with Nate Rosenberg before it was too late for them both?

Baumol is often naively seen as a Schumpeter-esque defender of the capitalist economy and the heroic entrepreneur, but that is only half right. His politics were liberal, and as he argued in a recent interview, “I am well aware of all the very serious problems, such as inequality, unemployment, environmental damage, that beset capitalist societies. My thesis is that capitalism is a special mechanism that is uniquely effective in accomplishing one thing: creating innovations, applying those innovations and using them to stimulate growth.” That is, you can find in Baumol’s work many discussions of environmental externalities, of the role of government in funding research, of the nature of optimal taxation. You can find many quotes where Baumol expresses interest in the policy goals of the left (though often solved with the mechanism of the market, and hence the right). Yet the core running through much of Baumol’s work is a rigorous defense, historically and theoretically grounded, of the importance of getting incentives correct for socially useful innovation.

Baumol differs from many other prominent economists of innovation because he is at his core a neoclassical theorist. He is not an Austrian like Kirzner or an evolutionary economist like Sid Winter. Baumol’s work stresses that entrepreneurs and the innovations they produce are fundamental to understanding the capitalist economy and its performance relative to other economic systems, but that the best way to understand the entrepreneur methodologically was to formalize her within the context of neoclassical equilibria, with innovation rather than price alone being “the weapon of choice” for rational, competitive firms. I’ve always thought of Baumol as being the lineal descendant of Schumpeter, the original great thinker on entrepreneurship and one who, nearing the end of his life and seeing the work of his student Samuelson, was convinced that his ideas should be translated into formal neoclassical theory.

A 1968 essay in the AER P&P laid out Baumol’s basic idea that economics without the entrepreneur is, in a line he would repeat often, like Hamlet without the Prince of Denmark. He clearly understood that we did not have a suitable theory for oligopoly and entry into new markets, or for the supply of entrepreneurs, but that any general economic theory needed to be able to explain why growth is different in different countries. Solow’s famous essay convinced much of the profession that the residual, interpreted then primarily as technological improvement, was the fundamental variable explaining growth, and Baumol, like many, believed those technological improvements came mainly from entrepreneurial activity.

But what precisely should the theory look like? Ironically, Baumol made his most productive step in a beautiful 1990 paper in the JPE which contains not a single formal theorem nor statistical estimate of any kind. Let’s define an entrepreneur as “persons who are ingenious or creative in finding ways to add to their wealth, power, or prestige”. These people may introduce new goods, or new methods of production, or new markets, as Schumpeter supposed in his own definition. But are these ingenious and creative types necessarily going to do something useful for social welfare? Of course not – the norms, institutions, and incentives in a given society may be such that the entrepreneurs perform socially unproductive tasks, such as hunting for new tax loopholes, or socially destructive tasks, such as channeling their energy into ever-escalating forms of warfare.

With the distinction between productive, unproductive, and destructive entrepreneurship in mind, we might imagine that the difference in technological progress across societies may have less to do with the innate drive of the society’s members, and more to do with the incentives for different types of entrepreneurship. Consider Rome, famously wealthy yet with very little in the way of useful technological diffusion: certainly the Romans appear less innovative than either the Greeks or Europe of the Middle Ages. How can a society both invent a primitive steam engine – via Hero of Alexandria – and yet see it used for nothing other than toys and religious ceremonies? The answer, Baumol notes, is that status in Roman society required one to get rich via land ownership, usury, or war; commerce was a task primarily for slaves and former slaves! And likewise in Song dynasty China, where imperial examinations were the source both of status and of the ability to expropriate any useful inventions or businesses that happened to appear. In the European Middle Ages, incentives for the clever shifted from developing implements of war, to diffusing technology like the water mill under the Cistercians, and then back to weapons. These examples were expanded to every society from Ancient Mesopotamia to the Dutch Republic to the modern United States by a series of economically-minded historians in a wonderful collection of essays called “The Invention of Enterprise”, edited by Baumol alongside Joel Mokyr and David Landes.

Now we are approaching a sort of economic theory of entrepreneurship – no need to rely on the whims of character, but instead focus on relative incentives. But we are still far from Baumol’s 1968 goal: incorporating the entrepreneur into neoclassical theory. The closest Baumol comes is in his work in the early 1980s on contestable markets, summarized in the 1981 AEA Presidential Address. The basic idea is this. Assume industries have scale economies, so oligopoly is their natural state. How worried should we be? Well, if there are no sunk costs and no entry barriers for entrants, and if entrants can siphon off customers quicker than incumbents can respond, then Baumol and his coauthors claimed that the market was contestable: the threat of entry is sufficient to keep the incumbent from exerting their market power. On the one hand, fine, we all agree with Baumol now that industry structure is endogenous to firm behavior, and the threat of entry clearly can restrain market power. But on the other hand, is this “ultra-free entry” model the most sensible way to incorporate entry and exit into a competitive model? Why, as Dixit argued, is it quicker to enter a market than to change price? Why, as Spence argued, does the unrealized threat of entry change equilibrium behavior if the threat is truly unrealized along the equilibrium path?
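The contestability claim is easy to see in a toy example; the demand curve and cost numbers below are my own, chosen only to make the arithmetic clean, not anything from Baumol's own papers.

```python
import numpy as np

# Linear demand q = a - b*p and scale economies via a fixed cost:
# C(q) = F + c*q, so average cost falls with output.
a, b, F, c = 100.0, 1.0, 400.0, 10.0

# A protected monopolist maximizes (p - c)(a - b*p) - F.
p_monopoly = (a + b * c) / (2 * b)

# In a perfectly contestable market, any price above average cost invites
# hit-and-run entry, so the incumbent is pushed to zero profit:
# (p - c)(a - b*p) = F, a quadratic in p; the lower root is the
# average-cost price consistent with serving all demand.
p_contestable = min(np.roots([-b, a + b * c, -(a * c + F)]).real)

print(f"monopoly price: {p_monopoly:.2f}, contestable price: {p_contestable:.2f}")
```

Same technology and the same single incumbent, yet the unrealized threat of entry moves the price from 55 down to about 14.7 – which is precisely the feature of the model that Dixit and Spence found hard to swallow.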

It seems that what Baumol was hoping this model would lead to was a generalized theory of perfect competition that permitted competition for the market rather than just in the market, since the competition for the market is naturally the domain of the entrepreneur. Contestable markets are too flawed to get us there. But the basic idea, endogenous game-theoretic market structure rather than the old-fashioned chain running from industry structure to conduct to performance, is clearly here to stay: antitrust is essentially applied game theory today. And once you have the idea of competition for the market, the natural theoretical model is one where firms compete to innovate in order to push out incumbents, incumbents innovate to keep away from potential entrants, and profits depend on the equilibrium time until the dominant firm shifts: I speak, of course, about the neo-Schumpeterian models of Aghion and Howitt. These models, still a very active area of research, are finally allowing us to rigorously investigate the endogenous rewards to innovation via a completely neoclassical model of market structure and pricing.

I am not sure why Baumol did not find these neo-Schumpeterian models to be the Holy Grail he’d been looking for; in his final book, he credits them for being “very powerful” but in the end holding different “central concerns”. He may have been mistaken in this interpretation. It proved quite interesting to give a careful second read of Baumol’s corpus on entrepreneurship, and I have to say it disappoints in part: the questions he asked were right, the theoretical acumen he possessed was up to the task, the understanding of history and qualitative intuition was second to none, but in the end, he appears to have been just as stymied by the idea of endogenous neoclassical entrepreneurship as the many other doyens of our field who took a crack at modeling this problem without, in the end, generating the model they’d hoped they could write.

Where Baumol has more success, and again it is unusual for a theorist that his most well-known contribution is largely qualitative, is in the idea of cost disease. The concept comes from Baumol’s work with William Bowen (see also this extension with a complete model) on the economic problems of the performing arts. It is a simple idea: imagine productivity in industry rises 4% per year, but “the output per man-hour of a violinist playing a Schubert quartet in a standard concert hall” remains fixed. In order to attract workers into music rather than industry, wages must rise in music at something like the rate they rise in industry. But then costs are increasing while productivity is not, and the arts look “inefficient”. The same, of course, is said for education, and health care, and other necessarily labor-intensive industries. Baumol’s point is that rising costs in unproductive sectors reflect necessary shifts in equilibrium wages rather than, say, growing wastefulness.
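The arithmetic of cost disease is worth making explicit; a minimal sketch, using the 4% figure from the quote and normalizing everything else to one:

```python
# Manufacturing productivity grows 4% a year; a string quartet's does not.
# Competition for workers ties the musicians' wage to the manufacturing wage.
years, g = 50, 0.04

wage = 1.0                     # economy-wide wage, set in the dynamic sector
factory_output_per_hour = 1.0
quartet_output_per_hour = 1.0  # the quartet takes the same player-hours forever

for _ in range(years):
    factory_output_per_hour *= 1 + g
    wage *= 1 + g              # wages track productivity where it grows

unit_cost_factory = wage / factory_output_per_hour  # stays at 1.0
unit_cost_quartet = wage / quartet_output_per_hour  # compounds at 4% a year

print(f"after {years} years a concert costs {unit_cost_quartet:.1f}x its "
      f"original price relative to factory goods")
```

Nothing about the concert hall got more wasteful: the relative price of the quartet rises roughly sevenfold over fifty years purely because its wages must follow the dynamic sector's.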

How much can cost disease explain? I worry that the concept is so widely known by now that it is, in fact, used to excuse stagnant industries. Teaching, for example, requires some labor, but does anybody believe that it is impossible for R&D and complementary inventions (like the internet, for example) to produce massive productivity improvements? Is it not true that movie theaters now show opera live from the world’s great halls on a regular basis? Is it not true that my Google Home can, activated by voice, call up two seconds from now essentially any piece of recorded music I desire, for free? Separating industries that are necessarily labor-intensive (and hence grow slowly) from those with rapid technological progress is a very difficult game, and one we ought hesitate to play. But equally, we oughtn’t forget Baumol’s lesson: in some cases, in some industries, what appears to be fixable slack is in fact simply cost disease. We may ask, how was it that Ancient Greece, with its tiny population, put on so many plays, while today we hustle ourselves to small ballrooms in New York and London? Baumol’s answer, rigorously shown: cost disease. The “opportunity cost” of recruiting a big chorus was low, as those singers would otherwise have been idle or working unproductive fields gathering olives. The difference between Athens and our era is not simply that they were “more supportive of the arts”!

Baumol was incredibly prolific, so these suggestions for further reading are but a taste: An interview by Alan Krueger is well worth the read for anecdotes alone, like the fact that apparently one used to do one’s PhD oral defense “over whiskies and sodas at the Reform Club”. I also love his defense of theory, where if he is very lucky, his initial intuition “turn[s] out to be totally wrong. Because when I turn out to be totally wrong, that’s when the best ideas come out. Because if my intuition was right, it’s almost always going to be simple and straightforward. When my intuition turns out to be wrong, then there is something less obvious to explain.” Every theorist knows this: formalization has this nasty habit of refining our intuition and convincing us our initial thoughts actually contain logical fallacies or rely on special cases! Though known as an applied micro theorist, Baumol also wrote a canonical paper, with Bradford, on optimal taxation: essentially, if you need to raise $x in tax, how should you optimally deviate from marginal cost pricing? The history of thought is nicely diagrammed, and of course this 1970 paper was very quickly followed by the classic work of Diamond and Mirrlees. Baumol wrote extensively on environmental economics, drawing in many of his papers on the role nonconvexities in the social production possibilities frontier play when they are generated by externalities – a simple example of this effect, and the limitations it imposes on Pigouvian taxation, is in the link. More recently, Baumol has been writing on international trade with Ralph Gomory (the legendary mathematician behind a critical theorem in integer programming, and later head of the Sloan Foundation); their main theorems are not terribly shocking to those used to thinking in terms of economies of scale, but the core example in the linked paper is again a great example of how nonconvexities can overturn a lot of our intuition, in the case on comparative advantage. 
Finally, beyond his writing on the economics of the arts, Baumol proved that there is no area in which he personally had stagnant productivity: an art major in college, he was also a fantastic artist in his own right, picking up computer-generated art while in his 80s and teaching for many years a course on woodworking at Princeton!

A John Bates Clark Prize for Economic History!

A great announcement last week, as Dave Donaldson, an economic historian and trade economist, has won the 2017 John Bates Clark medal! This is an absolutely fantastic prize: it is hard to think of any young economist whose work is as serious as Donaldson’s. What I mean by that is that in nearly all of Donaldson’s papers, there is a very specific and important question, a deep collection of data, and a rigorous application of theory to help identify the precise parameters we are most concerned with. It is the modern economic method at its absolute best, and frankly is a style of research available to very few researchers, as the specific combination of theory knowledge and empirical agility required to employ this technique is very rare.

A canonical example of Donaldson’s method is his most famous paper, written back when he was a graduate student: “The Railroads of the Raj”. The World Bank today spends more on infrastructure than on health, education, and social services combined. Understanding the link between infrastructure and economic outcomes is not easy, and indeed has been a problem that has been at the center of economic debates since Fogel’s famous accounting on the railroad. Further, it is not obvious either theoretically or empirically that infrastructure is good for a region. In the Indian context, no less a sage than the proponent of traditional village life Mahatma Gandhi felt the British railroads, rather than help village welfare, “promote[d] evil”, and we have many trade models where falling trade costs plus increasing returns to scale can decrease output and increase income volatility.

Donaldson looks at the setting of British India, where 67,000 kilometers of rail were built, largely for military purposes. India during the British Raj is particularly compelling as a setting due to its heterogeneous nature. Certain seaports – think modern Calcutta – were built up by the British as entrepots. Many internal regions nominally controlled by the British were left to rot via, at best, benign neglect. Other internal regions were quasi-independent, with wildly varying standards of governance. The most important point, though, is that much of the interior was desperately poor and in a de facto state of autarky: without proper roads or rail until the late 1800s, goods were transported over rough dirt paths, leading to tiny local “marketing regions” similar to what Skinner found in his great studies of China. British India is also useful since data on goods shipped, local weather conditions, and agricultural prices were rigorously collected by the colonial authorities. Nearly all that local economic data is in dusty tomes in regional offices across the modern subcontinent, but it is at least in principle available.

Let’s think about how a competent empirical microeconomist would go about investigating the effects of the British rail system. It would be a lot of grunt work, but many economists would spend the time collecting data from those dusty old colonial offices. They would then worry that railroads are endogenous to economic opportunity, so would hunt for reasonable instruments or placebos, such as railroads that were planned yet unbuilt, or railroad segments that skipped certain areas because of temporary random events. They would make some assumptions on how to map agricultural output into welfare, probably just restricting the dependent variable in their regressions to some aggregate measure of agricultural output normalized by price. All that would be left to do is run some regressions and claim that the arrival of the railroad on average raised agricultural income by X percent. And look, this wouldn’t be a bad paper. The setting is important, the data effort heroic, the causal factors plausibly exogenous: a paper of this form would have a good shot at a top journal.

When I say that Donaldson does “serious” work, what I mean is that he didn’t stop with those regressions. Not even close! Consider what we really want to know. It’s not “What is the average effect of a new railroad on incomes?” but rather, “How much did the railroad reduce shipping costs, in each region?”, “Why did railroads increase local incomes?”, “Are there alternative cheaper policies that could have generated the same income benefit?” and so on. That is, there are precise questions, often involving counterfactuals, which we would like to answer, and these questions and counterfactuals necessarily involve some sort of model mapping the observed data into hypotheticals.

Donaldson leverages both reduced-form, well-identified evidence, and that broader model we suggested was necessary, and does so with a paper which is beautifully organized. First, he writes down an Eaton-Kortum style model of trade (Happy 200th Birthday to the theory of comparative advantage!) where districts get productivity draws across goods then trade subject to shipping costs. Consider this intuition: if a new rail line connects Gujarat to Bihar, then the existence of this line will change Gujarat’s trade patterns with every other state, causing those other states to change their own trade patterns, causing a whole sequence of shifts in relative prices that depend on initial differences in trade patterns, the relative size of states, and so on. What Donaldson notes is that if you care about welfare in Gujarat, all of those changes only affect Gujaratis if they affect what Gujaratis end up consuming, or equivalently if it affects the real income they earn from their production. Intuitively, if pre-railroad Gujarat’s local consumption was 90% locally produced, and after the railroad was 60% locally produced, then declining trade costs allowed the magic of comparative advantage to generate additional specialization and hence additional Ricardian rents. This is what is sometimes called a sufficient statistics approach: the model suggests that the entire effect of declining trade costs on welfare can be summarized by knowing agricultural productivity for each crop in each area, the local consumption share which is imported, and a few elasticity parameters. Note that the sufficient statistic is a result, not an assumption: the Eaton-Kortum model permits taste for variety, for instance, so we are not assuming away any of that. Now of course the model can be wrong, but that’s something we can actually investigate directly.
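For a flavor of the sufficient-statistic logic, here is the closely related Arkolakis-Costinot-Rodriguez-Clare gains-from-trade formula, which holds in Eaton-Kortum style models: the change in real income is (λ'/λ)^(-1/θ), where λ is the locally produced share of local consumption and θ the trade elasticity. The consumption shares below are the 90% and 60% from the Gujarat example; the elasticity is an assumption of mine, not Donaldson's estimate.

```python
# ACR-style welfare calculation for the Gujarat example in the text.
theta = 5.0        # assumed trade elasticity (illustrative)
lam_before = 0.90  # share of consumption locally produced, pre-rail
lam_after = 0.60   # share after trade costs fall

welfare_gain = (lam_after / lam_before) ** (-1.0 / theta)
print(f"real income rises by {100 * (welfare_gain - 1):.1f}%")
# → real income rises by 8.4%
```

Note what the formula does not require: no bilateral trade matrices, no knowledge of which other states re-routed their shipments, just the local trade share and one elasticity.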

So here’s what we’ll do: first, simply regress real agricultural production in a region on time and region dummies plus a dummy for whether rail has arrived in that region. This regression suggests a rail line increases incomes by 16%, whereas placebo regressions for rail lines that were proposed but canceled see no increase at all. 16% is no joke, as real incomes in India over the period only rose 22% in total! All well and good. But what drives that 16%? Is it really Ricardian trade? To answer that question, we need to estimate the parameters in that sufficient statistics approach to the trade model – in particular, we need the relative agricultural productivity of each crop in each region, elasticities of trade flows to trade costs (and hence the trade costs themselves), and the share of local consumption which is locally produced (the “trade share”). We’ll then note that in the model, real income in a region is entirely determined by an appropriately weighted combination of local agricultural productivity and changes in the weighted trade share, hence if you regress real income minus the weighted local agricultural productivity shock on a dummy for the arrival of a railroad and the trade share, you should find a zero coefficient on the rail dummy if, in fact, the Ricardian model is capturing why railroads affect local incomes. And even more importantly, if we find that zero, then we understand that efficient infrastructure benefits a region through the sufficient statistic of the trade share, and we can compare the cost-benefit ratio of the railroad to other hypothetical infrastructure projects on the basis of a few well-known elasticities.

So that’s the basic plot. All that remains is to estimate the model parameters, a nontrivial task. First, to get trade costs, one could simply use published freight rates for boats, overland travel, and rail, but this wouldn’t be terribly compelling; bandits, and spoilage, and all the rest of Samuelson’s famous “icebergs” like linguistic differences raise trade costs as well. Donaldson instead looks at the differences in origin and destination prices for goods produced in only one place – particular types of salt – before and after the arrival of a railroad. He then uses a combination of graph theory and statistical inference to estimate the decline in trade costs between all region pairs. Given massive heterogeneity in trade costs by distance – crossing the Western Ghats is very different from shipping a boat down the Ganges! – this technique is far superior to simply assuming trade costs linear in distance for rail, road, or boat.
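The network step can be sketched with a standard shortest-path computation; the geography, distances, and per-mode costs below are entirely made up. In the paper, the relative mode costs are parameters estimated so that predicted cheapest-route costs match the observed salt price gaps; here they are simply assumed.

```python
import heapq

# Assumed per-km iceberg cost by transport mode (illustrative only).
mode_cost = {"rail": 1.0, "river": 2.0, "road": 5.0}

# A toy directed transport network: node -> [(neighbor, mode, km), ...]
edges = {
    "saltworks": [("A", "road", 50), ("B", "river", 100)],
    "A": [("B", "rail", 80), ("C", "road", 60)],
    "B": [("C", "rail", 120)],
    "C": [],
}

def cheapest_route_cost(source, target):
    """Dijkstra over the transport network with per-mode per-km costs."""
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, node = heapq.heappop(pq)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, mode, km in edges.get(node, []):
            nd = d + mode_cost[mode] * km
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(pq, (nd, nxt))
    return float("inf")

# Salt is produced only at the saltworks, so the origin-destination price
# gap at C identifies the cheapest-route cost to C.
print(cheapest_route_cost("saltworks", "C"))  # → 320.0 (river to B, then rail)
```

Inverting this map, from observed price gaps for a good produced in one place back to the mode-cost parameters, is what lets Donaldson recover trade costs for every region pair, including pairs with no salt trade at all.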

Second, he checks whether lowered trade costs actually increased trade volume, and at what elasticity, using local rainfall as a proxy for local productivity shocks. The use of rainfall data is wild: for each district, he gathers rainfall deviations for the sowing to harvest times individually for each crop. This identifies the agricultural productivity distribution parameters by region, and therefore, in the Eaton-Kortum type model, lets us calculate the elasticity of trade volume to trade shocks. Salt shipments plus crop-by-region specific rain shocks give us all of the model parameters which aren’t otherwise available in the British data. Throwing these parameters into the model regression, we do in fact find that once agricultural productivity shocks and the weighted trade share are accounted for, the effect of railroads on local incomes is not much different from zero. The model works, and note that real income changes based on the timing of the railroad were at no point used to estimate any of the model parameters! That is, if you told me that Bihar had positive rain shocks which increased output on their crops by 10% in the last ten years, and that the share of local production which is eaten locally went from 60 to 80%, I could tell you with quite high confidence the change in local real incomes without even needing to know when the railroad arrived – this is the sense in which those parameters are a “sufficient statistic” for the full general equilibrium trade effects induced by the railroad.
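Here is a stylized, simulated version of that elasticity step, nothing like Donaldson's actual specification: log bilateral flows fall with log trade costs at rate θ and rise with origin productivity, for which crop-by-season rainfall is the observable proxy, and OLS on the simulated data recovers the elasticity.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000  # simulated origin-destination-crop observations

theta_true = 4.0  # the trade elasticity we want to recover
log_cost = rng.uniform(0.1, 1.0, n)  # trade costs from the salt/network step
log_rain = rng.normal(0.0, 0.3, n)   # crop-by-region growing-season rain shock
log_flow = 2.0 + 1.5 * log_rain - theta_true * log_cost + rng.normal(0, 0.2, n)

# Gravity-style OLS: flows on the rainfall productivity proxy and trade costs.
X = np.column_stack([np.ones(n), log_rain, log_cost])
beta, *_ = np.linalg.lstsq(X, log_flow, rcond=None)
theta_hat = -beta[2]
print(f"estimated trade elasticity: {theta_hat:.2f}")
```

With the trade costs already pinned down by the salt step, rainfall variation traces out how strongly flows respond to productivity, which is what identifies the elasticity in the Eaton-Kortum structure.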

Now this doesn’t mean the model has no further use: indeed, that the model appears to work gives us confidence to take it more seriously when looking at counterfactuals like, what if Britain had spent money developing more effective seaports instead? Or building a railroad network to maximize local economic output rather than on the basis of military transit? Would a noncolonial government with half the resources, but whose incentives were aligned with improving the domestic economy, have been able to build a transport network that improved incomes more even given their limited resources? These are first order questions about economic history which Donaldson can in principle answer, but which are fundamentally unavailable to economists who do not push theory and data as far as he was willing to push them.

The Railroads of the Raj paper is canonical, but far from Donaldson’s only great work. He applies a similar Eaton-Kortum approach to investigate how rail affected the variability of incomes in India, and hence the death rate. Up to 35 million people perished in famines in India in the second half of the 19th century, as the railroad was being built, and these famines appeared to end (1943 being an exception) afterwards. Theory is ambiguous about whether openness increases or decreases the variance of your welfare. On the one hand, in an open economy, the price of potatoes is determined by the world market and hence the price you pay for potatoes won’t swing wildly up and down depending on the rain in a given year in your region. On the other hand, if you grow potatoes and there is a bad harvest, the price of potatoes won’t go up and hence your real income can be very low during a drought. Empirically, less variance in prices in the market after the railroad arrives tends to be more important for real consumption, and hence for mortality, than the lower prices you can get for your own farm goods when there is a drought. And as in the Railroads of the Raj paper, sufficient statistics from a trade model can fully explain the changes in mortality: the railroad decreased the effect of bad weather on mortality completely through Ricardian trade.

Leaving India, Donaldson and Richard Hornbeck took Fogel’s intuition that the importance of the railroad to the US depends on trade that is worthwhile when the railroad exists versus trade that is worthwhile when only alternatives like better canals or roads exist. That is, if it costs $9 to ship a wagonful of corn by canal, and $8 to do the same by rail, then even if all corn is shipped by rail once the railroad is built, we oughtn’t ascribe all of that trade to the rail. Fogel assumed relationships between land prices and the value of the transportation network. Hornbeck and Donaldson alternatively estimate that relationship, again deriving a sufficient statistic for the value of market access. The intuition is that adding a rail link from St. Louis to Kansas City will also affect the relative prices, and hence agricultural production, in every other region of the country, and these spatial spillovers can be quite important. Adding the rail line to Kansas City affects market access costs in Kansas City as well as relative prices, but clever application of theory can still permit a Fogel-style estimate of the value of rail to be made.
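The wagon arithmetic in miniature, using the $9 and $8 figures from the text (the shipment volume is an arbitrary assumption for scale):

```python
# Fogel's social-saving logic: credit the railroad only with the cost
# difference over the next-best alternative, not the whole freight bill.
canal_cost, rail_cost = 9.0, 8.0   # dollars per wagonload
wagonloads = 1_000_000             # assumed annual volume, purely illustrative

freight_bill = rail_cost * wagonloads                  # what shippers pay rail
social_saving = (canal_cost - rail_cost) * wagonloads  # rail's true contribution

print(f"social saving is {social_saving / freight_bill:.1%} of the rail freight bill")
# → social saving is 12.5% of the rail freight bill
```

The Hornbeck-Donaldson contribution is that the counterfactual cost is not fixed: removing the rail link changes prices and production everywhere else, so the "next-best alternative" must itself come out of an equilibrium model of market access.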

Moving beyond railroads, Donaldson’s trade work has also been seminal. With Costinot and Komunjer, he showed how to rigorously estimate the empirical importance of Ricardian trade for overall gains from trade. Spoiler: it isn’t that important, even if you adjust for how trade affects market power, a result seen in a lot of modern empirical trade research, which suggests that aspects like variety differences are more important than Ricardian productivity differences for gains from international trade. There are some benefits to Ricardian trade across countries being relatively unimportant: Costinot, Donaldson and Smith show that changes to what crops are grown in each region can massively limit the welfare harms of climate change, whereas allowing trade patterns to change barely matters. The intuition is that there is enough heterogeneity in what can be grown within each country as the climate changes to make international trade relatively unimportant for mitigating these climate shifts. Donaldson has also rigorously studied, in a paper with Atkin, the importance of internal rather than international trade costs, and has shown in a paper with Costinot that economic integration has been nearly as important as productivity improvements in increasing the value created by American agriculture over the past century.

Donaldson’s CV is a testament to how difficult this style of work is. He spent eight years at LSE before getting his PhD, and published only one paper in a peer-reviewed journal in the 13 years following the start of his graduate work. “Railroads of the Raj” has been forthcoming at the AER for literally half a decade, despite the fact that this work is the core of what got Donaldson a junior position at MIT and a tenured position at Stanford. Is it any wonder that so few young economists want to pursue a style of research that is so challenging and so difficult to publish? Let us hope that Donaldson’s award encourages more of us to fully exploit both the incredible data we all now have access to and the beautiful body of theory that induces deep insights from that data.

Kenneth Arrow Part II: The Theory of General Equilibrium

The first post in this series discussed Ken Arrow’s work in the broad sense, with particular focus on social choice. In this post, we will dive into his most famous accomplishment, the theory of general equilibrium (1954, Econometrica). I beg the reader to offer some sympathy for the approximations and simplifications that will appear below: the history of general equilibrium is, by this point, well-trodden ground for historians of thought, and the interpretation of history and theory in this area is quite contentious.

My read of the literature on GE following Arrow is as follows. First, the theory of general equilibrium is an incredible proof that markets can, in theory and in certain cases, work as efficiently as an all-powerful planner. That said, the three other hopes of general equilibrium theory since the days of Walras are, in fact, disproven by the work of Arrow and its followers. Market forces will not necessarily lead us toward these socially optimal equilibrium prices. Walrasian demand does not have empirical content derived from basic ordinal utility maximization. We cannot rigorously perform comparative statics on general equilibrium economic statistics without assumptions that go beyond simple utility maximization. From my read of Walras and the early general equilibrium theorists, all three of those results would be a real shock.

Let’s start at the beginning. There is an idea going back to Adam Smith and the invisible hand, an idea that individual action will, via the price system, lead to an increase or even maximization of economic welfare (as an aside, Smith’s own use of the “invisible hand” trope is overstated, as William Grampp among others has convincingly argued). The kind of people who denigrate modern economics – the neo-Marxists, the back-of-the-room scribblers, the wannabe-contrarian dilettantes – see Arrow’s work, and the idea of using general equilibrium theory to “prove that markets work”, as a barbarism. We know, and have known well before Arrow, that externalities exist. We know, and have known well before Arrow, that the distribution of income depends on the distribution of endowments. What Arrow was interested in was examining not merely whether the invisible hand argument “is true, but whether it could be true”. That is, if we are to claim markets are uniquely powerful at organizing economic activity, we ought formally show that the market could work in such a manner, and understand the precise conditions under which it won’t generate these claimed benefits. How ought we do this? Prove the precise conditions under which there exists a price vector where markets clear, show the outcome satisfies some welfare criterion that is desirable, and note exactly why each of the conditions is necessary for such an outcome.

The question is, how difficult is it to prove these prices exist? The term “general equilibrium” has had many meanings in economics. Today, it is often used to mean “as opposed to partial equilibrium”, meaning that we consider economic effects allowing all agents to adjust to a change in the environment. For instance, a small randomized trial of guaranteed incomes has, as its primary effect, an impact on the incomes of the recipients; the general equilibrium effects of making such a policy widespread on the labor market will be difficult to discern. In the 19th and early 20th century, however, the term was much more concerned with the idea of the economy as a self-regulating system. Arrow put it very nicely in an encyclopedia chapter he wrote in 1966: general equilibrium is both “the simple notion of determinateness, that the relations which describe the economic system must form a system sufficiently complete to determine the values of its variables and…the more specific notion that each relation represents a balance of forces.”

If you were a classical, a Smith or a Marx or a Ricardo, the problem of what price will obtain in a market is simple to solve: ignore demand. Prices are implied by costs and a zero profit condition, essentially free entry. And we more or less think like this now in some markets. With free entry and every firm producing at the identical minimum efficient scale, price is entirely determined by the supply side, and only quantity is determined by demand. With one factor (labor, where the Malthusian condition plays the role of free entry), or with labor and land as in the Ricardian system, this classical model of value is well-defined. How to handle capital and differentiated labor is a problem to be assumed away, or handled informally; Samuelson has many papers where he is incensed by Marx’s handling of capital as embodied labor.

The French mathematical economist Leon Walras finally cracked the nut by introducing demand and price-taking. There are households who produce and consume. Equilibrium involves supply and demand equating in each market, hence price is where margins along the supply and demand curves equate. Walras famously (and informally) proposed a method by which prices might actually reach equilibrium: the tatonnement. An auctioneer calls out a price vector: in some markets there is excess demand and in some excess supply. Prices are then adjusted one at a time. Of course each price change will affect excess demand and supply in other markets, but you might imagine things can “converge” if you adjust prices just right. Not bad for the 1870s – there is a reason Schumpeter calls this the “Magna Carta” of economic theory in his History of Economic Analysis. But Walras was mistaken on two counts: first, knowing whether there even exists an equilibrium that clears every market simultaneously is, it turns out, equivalent to a problem in Poincare’s analysis situs beyond the reach of mathematics in the 19th century, and second, the conditions under which tatonnement actually converges are a devilish problem.
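The tatonnement can be sketched numerically. Below is a minimal simulation with hypothetical Cobb-Douglas consumers and endowments of my own invention; Cobb-Douglas demands satisfy gross substitutes, one of the known sufficient conditions under which the auctioneer’s adjustment does converge:

```python
import numpy as np

# A minimal tatonnement sketch for a two-good Cobb-Douglas exchange economy
# (all numbers are hypothetical illustrations, not from any real data).
alpha = np.array([[0.3, 0.7],   # consumer 1's expenditure shares on goods 1, 2
                  [0.6, 0.4]])  # consumer 2's expenditure shares
endow = np.array([[1.0, 0.0],   # consumer 1 owns one unit of good 1
                  [0.0, 1.0]])  # consumer 2 owns one unit of good 2

def excess_demand(p):
    wealth = endow @ p                      # value of each consumer's endowment
    demand = (alpha * wealth[:, None]) / p  # Cobb-Douglas: x_hi = a_hi * w_h / p_i
    return demand.sum(axis=0) - endow.sum(axis=0)

p = np.array([0.5, 0.5])
for _ in range(5000):
    p = p + 0.01 * excess_demand(p)  # auctioneer raises prices where demand > supply
    p = np.clip(p, 1e-6, None)
    p = p / p.sum()                  # normalize: only relative prices matter

print(p, excess_demand(p))           # excess demand is ~0 at the rest point
```

In this economy the clearing relative price can be computed by hand (p1/p2 = 6/7), and the simulated path does settle there; Scarf’s later examples show this pleasant behavior is far from guaranteed.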

The equilibrium existence problem is easy to understand. Take the simplest case, with all j goods made up of a linear combination of k factors. Demand equals supply just says that Aq=e, where q is the quantity of each good produced, e is the endowment of each factor, and A is the input-output matrix whereby product j is made up of some combination of factors k. Zero profit in every market then implies A’p(k)=p(j), where p(k) are the factor prices, p(j) the good prices, and the transpose A’ appears because the unit cost of good j adds up the prices of the factors that go into it. It was soon pointed out that even in this simple system where everything is linear, it is not at all trivial to ensure that prices and quantities are not negative. It would not be until Abraham Wald in the mid-1930s – later Arrow’s professor at Columbia and a fellow Romanian, links that are surely not a coincidence! – that formal conditions were shown giving existence of general equilibrium in a simple system like this one, though Wald’s proof greatly simplified the general problem by imposing implausible restrictions on aggregate demand.
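The nonnegativity problem in this linear Walras-Cassel style system is easy to see in a toy example (the input-output matrix and endowments below are made up for illustration):

```python
import numpy as np

# A toy linear market-clearing system A q = e (hypothetical numbers).
# A[k, j] = amount of factor k needed per unit of good j.
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
e = np.array([10.0, 8.0])       # endowments of the two factors

q = np.linalg.solve(A, e)       # quantities clearing both factor markets
print(q)                        # both quantities positive: economically sensible

# But counting equations is not enough: tweak the endowments and the unique
# solution of the same linear system calls for a *negative* quantity, which
# is economic nonsense -- exactly the problem Wald's conditions address.
e_bad = np.array([10.0, 1.0])
print(np.linalg.solve(A, e_bad))  # contains a negative entry
```

Equal numbers of equations and unknowns, as Hicks and Walras both leaned on, do nothing to rule out the second case.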

Mathematicians like Wald, trained in the Vienna tradition, were aghast at the state of mathematical reasoning in economics at the time. Oskar Morgenstern absolutely hammered the great economist John Hicks in a 1941 review of Hicks’ Value and Capital, particularly over the crazy assertion (similar to Walras!) that the number of unknowns and equations being identical in a general equilibrium system sufficed for a solution to exist (if this isn’t clear to you in a nonlinear system, a trivial example with two equations and two unknowns is here). Von Neumann apparently said (p. 85) to Oskar, in reference to Hicks and those of his school, “if those books are unearthed a hundred years hence, people will not believe they were written in our time. Rather they will think they are about contemporary with Newton, so primitive is the mathematics.” And Hicks was quite technically advanced compared to his contemporary economists, bringing Keynesian macroeconomics and the microeconomics of indifference curves and demand analysis together masterfully. Arrow and Hahn even credit their initial interest in the problems of general equilibrium to the serendipity of coming across Hicks’ book.

Mathematics had advanced since Walras, however, and those trained at the mathematical frontier finally had the tools to tackle Walras’ problem seriously. Let D(p) be a vector of demand for all goods given price p, and e be initial endowments of each good. Then we simply need D(p)=e or D(p)-e=0 in each market. To make things a bit harder, we can introduce intermediate and factor goods with some form of production function, but the basic problem is the same: find whether there exists a vector p such that a nonlinear equation is equal to zero. This is the mathematics of fixed points, and Brouwer had, in 1912, given a nice theorem: every continuous function from a compact convex subset to itself has a fixed point. Von Neumann used this in the 1930s to prove a similar result to Wald. A mathematician named Shizuo Kakutani, inspired by von Neumann, extended the Brouwer result to set-valued mappings called correspondences, and John Nash in 1950 used that result to show, in a trivial proof, the existence of mixed equilibria in noncooperative games. The math had arrived: we had the tools to formally state when non-trivial non-linear demand and supply systems had a fixed point, and hence a price that cleared all markets. We further had techniques for handling “corner solutions” where demand for a given good was zero at some price, surely a common outcome in the world: the idea of the linear program and complementary slackness, and its origin in convex set theory as applied to the dual, provided just the mathematics Arrow and his contemporaries would need.

So here we stood in the early 1950s. The mathematical conditions necessary to prove that a set-valued function has an equilibrium have been worked out. Hicks, in Value and Capital, has given Arrow the idea that relating the future to today is simple: just put a date on every commodity and enlarge the commodity space. Indeed, adding state-contingency is easy: put an index for state in addition to date on every commodity. So we need not only zero excess demand in apples, or in apples delivered in May 1955, but in apples delivered in May 1955 if Eisenhower loses his reelection bid. Complex, it seems, but no matter: the conditions for the existence of a fixed point will be the same in this enlarged commodity space.
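Computationally, the Hicks/Arrow trick of enlarging the commodity space is nothing more than indexing by tuples. A toy sketch (the goods, dates, and states are illustrative labels echoing the example above):

```python
from itertools import product

# Enlarging the commodity space: index every physical good by a date and a
# state of the world (all labels here are hypothetical illustrations).
goods  = ["apples", "wheat"]
dates  = ["May 1955", "June 1955"]
states = ["Eisenhower wins", "Eisenhower loses"]

commodities = list(product(goods, dates, states))
print(len(commodities))  # 2 goods x 2 dates x 2 states = 8 distinct commodities

# "Apples delivered in May 1955 if Eisenhower loses" is just one more
# commodity, with its own market and its own price:
prices = {c: 1.0 for c in commodities}
print(prices[("apples", "May 1955", "Eisenhower loses")])
```

The fixed-point machinery never notices the difference: it sees only a longer list of markets to clear.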

With these tools in mind, Arrow and Debreu can begin their proof. They first define a generalization of an n-person game where the feasible set of actions for each player depends on the actions of every other player; think of the feasible set as “what can I afford given the prices that will result for the commodities I am endowed with?” The set of actions is an n-tuple where n is the number of date- and state-indexed commodities a player could buy. Debreu showed in a 1952 PNAS paper that these generalized games have an equilibrium as long as each payoff function varies continuously with other players’ actions, the feasible set of choices is convex and varies continuously in other players’ actions, and the set of actions which improve a player’s payoff is convex for every action profile. Arrow and Debreu then show that the usual restrictions on individual demand are sufficient to aggregate up to the conditions Debreu’s earlier paper requires. This method is much, much different from what is done by McKenzie or other early general equilibrium theorists: excess demand is never taken as a primitive. This allows the Arrow-Debreu proof to provide substantial economic intuition, as Duffie and Sonnenschein point out in a 1989 JEL. For instance, showing that the Arrow-Debreu equilibrium exists even with taxation is trivial using their method but much less so in methods that begin with excess demand functions.

This is already quite an accomplishment: Arrow and Debreu have shown that there exists a price vector that clears all markets simultaneously. The nature of their proof, as later theorists would point out, relies less on the convexity of preferences and production sets than on the fact that every agent is “small” relative to the market (convexity is used to get continuity in the Debreu game, and you can get this equally well by making all consumers infinitesimal and then randomizing allocations to smooth things out; see Duffie and Sonnenschein above for an example). At this point, it’s the mid-1950s, heyday of the neoclassical synthesis: surely we want to be able to answer questions like, when there is a negative demand shock, how will the economy best reach a Pareto-optimal equilibrium again? How do different speeds of adjustment due to sticky prices or other frictions affect the rate at which the optimum is regained? Those types of questions implicitly assume that the equilibrium is unique (at least locally) so that we actually can “return” to where we were before the shock. And of course we know some of the assumptions needed for the Arrow-Debreu proof are unrealistic – e.g., no fixed costs in production – but we would at least like to work out how to manipulate the economy in the “simple” case before figuring out how to deal with those issues.

Here is where things didn’t work out as hoped. Uzawa (RESTUD, 1960) proved that not only could Brouwer’s theorem be used to prove the existence of general equilibrium, but that the opposite was true as well: the existence of general equilibrium is logically equivalent to Brouwer. A result like this certainly makes one worry about how much one could say about prices in general equilibrium. The 1970s brought us the Sonnenschein-Mantel-Debreu “Anything Goes” theorem: aggregate excess demand functions do not inherit all the properties of individual excess demand functions because of wealth effects (when relative prices change, the value of one’s endowment changes as well). For any aggregate excess demand function satisfying a couple of minor restrictions, there exists an economy with individual preferences generating that function; aggregation, in other words, imposes far fewer restrictions than individual preference maximization places on individual excess demand. This tells us, importantly, that there is no generic reason for equilibria to be unique in an economy.

Multiplicity of equilibria is a problem: if the goal of GE was to take underlying primitives like tastes and technology, calculate “the” prices that clear the market, and then examine how those prices change (“comparative statics”), we essentially lose the ability to do all but local comparative statics, since large changes in the environment may cause the economy to jump to a different equilibrium (luckily, Debreu (1970, Econometrica) at least generically gives us a finite number of equilibria, so we may at least be able to say something about local comparative statics for very small shocks). Indeed, these analyses are tough without an equilibrium selection mechanism, which we don’t really have even now. Some would say this is no big deal: of course the same technology and tastes can generate many equilibria, just as cars may wind up all driving on either the left or the right in equilibrium. And true, all of the Arrow-Debreu equilibria are Pareto optimal. But it is still far afield from what might have been hoped for in the 1930s when this quest for a modern GE theory began.

Worse yet is stability, as Arrow and his collaborators (1958, Ecta; 1959, Ecta) would help discover. Even if we have a unique equilibrium, Herbert Scarf (IER, 1960) showed, via many simple examples, how Walrasian tatonnement can lead to cycles which never converge. Despite a great deal of intellectual effort in the 1960s and 1970s, we do not have a good model of price adjustment even now. I should think we are unlikely ever to have such a theory: as many theorists have pointed out, if we are in a period of price adjustment and not in an equilibrium, then the zero profit condition ought not apply, ergo why should there be “one” price rather than ten or a hundred or a thousand?
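Scarf’s counterexample is simple enough to simulate. Below is a discretized sketch of one version of his economy (three Leontief consumers, each endowed with one unit of one good and demanding two goods in fixed proportions); the starting prices, step size, and horizon are my own choices, and the tatonnement path orbits the equilibrium rather than approaching it:

```python
import numpy as np

# A discretized sketch of Scarf's (1960) cycling example: three goods and
# three Leontief consumers, where consumer h owns one unit of good h and
# wants goods h and h+1 (mod 3) in fixed 1:1 proportions. The unique
# (normalized) equilibrium is p = (1/3, 1/3, 1/3), but tatonnement prices
# famously circle it forever instead of converging.
def excess_demand(p):
    z = np.empty(3)
    for i in range(3):
        nxt, prv = (i + 1) % 3, (i - 1) % 3
        # good i is demanded by consumer i and by consumer i-1
        z[i] = p[i] / (p[i] + p[nxt]) + p[prv] / (p[prv] + p[i]) - 1.0
    return z

p = np.array([0.5, 0.3, 0.2])
eq = np.full(3, 1/3)
for _ in range(20000):
    p = p + 0.001 * excess_demand(p)  # Euler-discretized tatonnement

# The price path stays bounded away from the equilibrium: no convergence.
print(np.linalg.norm(p - eq))
```

Scarf showed the continuous-time dynamics here conserve quantities that pin the path to a closed orbit around the equilibrium, so no amount of patience helps the auctioneer.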

The problem of multiplicity and instability for comparative static analysis ought be clear, but it should also be noted how problematic they are for welfare analysis. Consider the Second Welfare Theorem: under the Arrow-Debreu system, for every Pareto optimal allocation, there exists an initial endowment of resources such that that allocation is an equilibrium. This is literally the main justification for the benefits of the market: if we reallocate endowments, free exchange can get us to any Pareto optimal point, ergo can get us to any reasonable socially optimal point no matter what social welfare function you happen to hold. How valid is this justification? Call x* the allocation that maximizes some social welfare function. Let e* be an initial endowment for which x* is an equilibrium outcome – such an endowment must exist via Arrow-Debreu’s proof. Does endowing agents with e* guarantee we reach that social welfare maximum? No: x* may not be unique. Even if it is unique, will we reach it? No: if it is not a stable equilibrium, it is only by dint of luck that our price adjustment process will ever reach it.

So let’s sum up. In the 1870s, Walras showed us that demand and supply, with agents as price takers, can generate supremely useful insights into the economy. Since demand matters, changes in demand in one market will affect other markets as well. If the price of apples rises, demand for pears will rise, as will their price, and this secondary effect should be accounted for in the market for apples. By the 1930s we have the beginnings of a nice model of individual choice based on constrained preference maximization. Taking prices as given, individual demands have well-defined forms, and excess demand in the economy can be computed by a simple summing up. So we now want to know: is there in fact a price that clears the market? Yes, Arrow and Debreu show, there is, and we needn’t assume anything strange about individual demand to generate this. These equilibrium prices always give Pareto optimal allocations, as had long been known, but there also always exist endowments such that every Pareto optimal allocation is an equilibrium. It is a beautiful and important result, and a triumph for the intuition of the invisible hand in its most formal sense.

Alas, it is there we reach a dead end. Individual preferences alone do not suffice to tell us what equilibria we are at, nor that any equilibria will be stable, nor that any equilibria will be reached by an economically sensible adjustment process. To say anything meaningful about aggregate economic outcomes, or about comparative statics after modest shocks, or about how technological changes change price, we need to make assumptions that go beyond individual rationality and profit maximization. This is, it seems to me, a shock for the economists of the middle of the century, and still a shock for many today. I do not think this means “general equilibrium is dead” or that the mathematical exploration in the field was a waste. We learned a great deal about precisely when markets could even in principle achieve the first best, and that education was critical for the work Arrow would later do on health care, innovation, and the environment, which I will discuss in the next two posts. And we needn’t throw out general equilibrium analysis because of uniqueness or stability problems, any more than we would throw out game theoretic analysis because of the same problems. But it does mean that individual rationality as the sole paradigm of economic analysis is dead: it is mathematically proven that postulates of individual rationality will not allow us to say anything of consequence about economic aggregates or game theoretic outcomes in the frequent scenarios where we do not have a unique equilibrium with a well-defined way to get there (via learning in games, or a tatonnement process in GE, or something of a similar nature). Arrow himself (1986, J. Business) accepts this: “In the aggregate, the hypothesis of rational behavior has in general no implications.” This is an opportunity for economists, not a burden, and we still await the next Arrow who can guide us on how to proceed.

Some notes on the literature: For those interested in the theoretical development of general equilibrium, I recommend General Equilibrium Analysis by Roy Weintraub, a reformed theorist who now works in the history of thought. Wade Hands has a nice review of the neoclassical synthesis and the ways in which Keynesianism and GE analysis were interrelated. On the battle for McKenzie to be credited alongside Arrow and Debreu, and the potentially scandalous way Debreu may have secretly been responsible for the Arrow and Debreu paper being published first, see the fine book Finding Equilibrium by Weintraub and Duppe; both Debreu and McKenzie have particularly wild histories. Till Duppe, a scholar of Debreu, also has a nice paper in the JHET on precisely how Arrow and Debreu came to work together, and what the contribution of each to their famous ’54 paper was.

The Greatest Living Economist Has Passed Away: Notes on Kenneth Arrow Part I

It is amazing how quickly the titans of the middle of the century have passed. Paul Samuelson and his mathematization, Ronald Coase and his connection of law to economics, Gary Becker and his incorporation of choice into the full sphere of human behavior, John Nash and his formalization of strategic interaction, Milton Friedman and his defense of the market in the precarious post-war period, Robert Fogel and his cliometric revolution: the remaining titan was Kenneth Arrow, the only living economist who could have won a second Nobel Prize without a whit of complaint from the gallery. These figures ruled as economics grew from a minor branch of moral philosophy into the most influential, most prominent, and most advanced of the social sciences. It is hard to imagine our field will ever again have such a collection of scholars rise in one generation, and with the tragic news that Ken has now passed away as well, we have, with great sadness and great rapidity, lost the full set.

Though he was 95 years old, Arrow was still hard at work; his paper with Kamran Bilir and Alan Sorensen was making its way around the conference circuit just last year. And beyond incredible productivity, Arrow had a legendary openness with young scholars. A few years ago, a colleague and I were debating a minor point in the history of economic thought, one that Arrow had played some role in; with the debate deadlocked, it was suggested that I simply email the protagonist to learn the truth. No reply came; perhaps no surprise, given how busy he was and how unknown I was. Imagine my surprise when, two months later, a large manila envelope showed up in my mailbox at Northwestern, with a four-page letter Ken had written inside! Going beyond a simple answer, he patiently walked me through his perspective on the entire history of mathematical economics, the relative centrality of folks like Wicksteed and Edgeworth to the broader economic community, the work he did under Hotelling and the Cowles Commission, and the nature of formal logic versus price theory. Mind you, this was his response to a complete stranger.

This kindness extended beyond budding economists: Arrow was a notorious generator of petitions on all kinds of social causes, and remained so late in life, signing the Economists Against Trump letter that many of us supported last year. You will be hard-pressed to find an open letter or amicus curiae, on any issue from copyright term extension to the use of nuclear weapons, which Arrow was unaware of. The Duke Library holds the papers of both Arrow and Paul Samuelson – famously they became brothers-in-law – and the frequency with which their correspondence involves this petition or that, with Arrow in general the instigator and Samuelson the deflector, is unmistakable. I recall a great series of letters where Arrow queried Samuelson as to who had most deserved the Nobel but had died too early to receive it. Arrow at one point proposed Joan Robinson, which sent Samuelson into convulsions. “But she was a communist! And besides, her theory of imperfect competition was subpar.” You get the feeling in these letters of Arrow making gentle comments and rejoinders while Samuelson exercises his fists in the way he often did when battling everyone from Friedman to the Marxists at Cambridge to (worst of all, for Samuelson) those who were ignorant of their history of economic thought. Their conversation goes way back: you can find in one of the Samuelson boxes his recommendation that the University of Michigan bring in this bright young fellow named Arrow, a missed chance the poor Wolverines must still regret!

Arrow is so influential, in so many areas of economics, that it is simply impossible to discuss his contributions in a single post. For this reason, I will break the post into four parts, with one posted each day this week. We’ll look at Arrow’s work in choice theory today, his work on general equilibrium tomorrow, his work on innovation on Thursday, and some selected topics where he made seminal contributions (the economics of the environment, the principal-agent problem, and the economics of health care, in particular) on Friday. I do not lightly say that Arrow was the greatest living economist, and in my reckoning second only to Samuelson for the title of greatest economist of all time. Arrow wrote the foundational paper of general equilibrium analysis, the foundational paper of social choice and voting, the foundational paper justifying government intervention in innovation, and the foundational paper in the economics of health care. His legacy is the greatest legacy possible for the mathematical approach pushed by the Cowles Commission, the Econometric Society, Irving Fisher, and the mathematician-cum-economist Harold Hotelling. And so it is there that we must begin.

Arrow was born in New York City, a CCNY graduate like many children of the Great Depression, who went on to study mathematics in graduate school at Columbia. Economics in the United States in the 1930s was not a particularly mathematical science. The formalism of von Neumann, the late-life theoretical conversion of Schumpeter, Samuelson’s Foundations, and the soft nests at Cowles and the Econometric Society were in their infancy.

The usual story is that Arrow’s work on social choice came out of his visit to RAND in 1948. But this misstates the intellectual history: Arrow’s actual inspiration came from his engagement with a new form of mathematics, the expansion of formal logic beginning with people like Peirce and Boole. While a high school student, Arrow read Bertrand Russell’s text on mathematical logic, and was enthused with the way that set theory permitted logic to go well beyond the syllogisms of the Greeks. What a powerful tool for the generation of knowledge! In his senior year at CCNY, Arrow took the advanced course on relational logic taught by Alfred Tarski, where the eminent philosopher took pains to reintroduce the ideas of Charles Sanders Peirce, the greatest yet most neglected American philosopher. The idea of a relation is familiar to economists: give some links between elements of a set (e.g., xRy and yRz) and some properties to the relation (e.g., that it is well-ordered), and you can then perform logical operations on the relation to derive further properties. Every trained economist sees an example of this when first learning about choice and utility, but of course things like “greater than” and “less than” are relations as well. In 1940, one would have had to be extraordinarily lucky to encounter this theory: Tarski’s own books had not even been translated.
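The relational toolkit Arrow picked up from Tarski is easy to make concrete: a relation is just a set of ordered pairs, and properties like transitivity and completeness are checkable predicates. A small illustrative sketch (the goods and the particular relation are my own invention):

```python
# Relations in the Tarski/Peirce sense, as sets of ordered pairs:
# (x, y) in R reads "xRy", here "x is at least as good as y".
X = {"apple", "pear", "orange"}
R = {("apple", "pear"), ("pear", "orange"), ("apple", "orange"),
     ("apple", "apple"), ("pear", "pear"), ("orange", "orange")}

def is_transitive(R):
    # xRy and yRz must imply xRz
    return all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)

def is_complete(R, X):
    # any two elements must be ranked one way or the other
    return all((x, y) in R or (y, x) in R for x in X for y in X)

print(is_transitive(R), is_complete(R, X))  # True True: R is a preference ordering
```

This is exactly the object a student meets when first learning choice theory; Arrow’s insight was that social rankings are relations of the same kind, and so can be interrogated with the same logic.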

But what great training this would be! For Arrow joined a graduate program in mathematical statistics at Columbia, where one of the courses was taught by Hotelling from the economics department. Hotelling was an ordinalist, rare in those days, and taught his students demand theory from a rigorous basis in ordinal preferences. But what are these? Simply relations with certain properties! Combined with a statistician’s innate ability to write proofs using inequalities, Arrow greatly impressed Hotelling, and switched to a PhD in economics, drawing inspiration from the then-new subfield of mathematical economics that Hotelling, Samuelson, and Hicks were helping to expand.

After his wartime service doing operations research related to weather and flight planning, and a two-year detour into capital theory with little to show for it, Arrow took a visiting position at the Cowles Commission, a center of research in mathematical economics then at the University of Chicago. In 1948, Arrow spent the summer at RAND, still yet to complete his dissertation, or even to strike on a worthwhile idea. RAND in Santa Monica was the world center for applied game theory: philosophers, economists, and mathematicians prowled the halls working through the technical basics of zero-sum games, but also the application of strategic decision theory to problems of serious global importance. Arrow had been thinking about voting a bit, and had written a draft of a paper, similar to Duncan Black’s 1948 JPE paper, essentially suggesting that majority voting “works” when preferences are single-peaked; that is, if everyone can rank options from “left to right”, and simply differ on which point is their “peak” of preference, then majority voting reflects individual preferences in a formal sense. At RAND, the philosopher Olaf Helmer pointed out that a similar concern mattered in international relations: how are we to say that the Soviet Union or the United States has preferences? They are collections of individuals, not individuals themselves.

Right, Arrow agreed. But economists had thought about collective welfare, from Pareto to Bergson-Samuelson. The Bergson-Samuelson idea is simple. Let all individuals in society have preferences over states of the world. If we all prefer state A to state B, then the Pareto criterion suggests society should as well. Of course, tradeoffs are inevitable, so what are we to do? We could assume cardinal utility (e.g., “how much money are you willing to be paid to accept A if you prefer B to A and society goes toward A?”) as in the Kaldor-Hicks criterion (though the technically minded will know that Kaldor-Hicks does not define an order on states of the world, so isn’t really great for social choice). But suppose all each person has is their own ordinal utility, their own rank-order of states, an order that is naturally hard to compare across people. Let’s assume for some pairs we have Pareto dominance: we all prefer A to C, and Q to L, and Z to X, but for other pairs there is no such dominance. A great theorem due to the Polish mathematician Szpilrajn, and I believe popularized among economists by Blackwell, says that if you have a transitive quasiorder R, then there exists an order R’ which completes it. In simple terms, if you can rank some pairs, and the pairs you do rank do not have any intransitivity, then you can generate a complete ranking of all pairs which respects the original incomplete ordering. Since individuals have transitive preferences, Pareto rankings are transitive, and hence we know there exist social welfare functions which “extend” Pareto. The implications of this are subtle: for instance, as I discuss in the link earlier in this paragraph, it implies that pure monetary egalitarianism can never be socially optimal even if the only requirement is to respect Pareto dominance.
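For a finite set of states, Szpilrajn’s extension can even be computed: the incomplete Pareto rankings form a directed acyclic graph, and any topological sort of that graph is a complete order respecting them. A sketch using the Pareto pairs from the text (the construction via Python’s standard graphlib is my own illustration, not Szpilrajn’s proof):

```python
from graphlib import TopologicalSorter

# Szpilrajn's extension theorem, constructively for the finite case.
# Start from an incomplete but transitive set of "society prefers x to y"
# judgments: the Pareto dominance pairs from the text.
pareto = {("A", "C"), ("Q", "L"), ("Z", "X")}
states = {s for pair in pareto for s in pair}

# graphlib wants each node mapped to its predecessors: everything that
# Pareto-dominates a state must come earlier in the final ranking.
dominators = {s: {x for (x, y) in pareto if y == s} for s in states}
ranking = list(TopologicalSorter(dominators).static_order())
print(ranking)  # one complete ordering of all six states, best first

# The completion respects every original Pareto judgment:
print(all(ranking.index(x) < ranking.index(y) for (x, y) in pareto))
```

States the Pareto pairs never compare (here A versus Q, say) get ranked arbitrarily, which is exactly the freedom the theorem promises.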

So aren’t we done? We know what it means, via Bergson-Samuelson, for the Soviet Union to “prefer” X to Y. But alas, Arrow was clever and attacked the problem from a different angle: rather than taking preference orderings of individuals as given and constructing a social ordering, he asked whether there is any mechanism for constructing a social ordering from arbitrary individual preferences that satisfies certain criteria. For instance, you may want to rule out a rule that says “whatever Kevin prefers most is what society prefers, no matter what other preferences are” (non-dictatorship). You may want to require Pareto dominance to be respected so that if everyone likes A more than B, A must be chosen (Pareto criterion). You may want to ensure that “irrelevant options” do not matter, so that if giving an option to choose “orange” in addition to “apple” and “pear” does not affect any individual’s ranking of apples and pears, then the orange option also oughtn’t affect society’s rankings of apples and pears (IIA). Arrow famously proved that if we do not restrict what types of preferences individuals may have over social outcomes, there is no system that can rank outcomes socially and still satisfy those three criteria. Majority voting has been known to suffer a problem of this sort since Condorcet in the 18th century, but the general impossibility was an incredible breakthrough, and a straightforward one once Arrow was equipped with the ideas of relational logic.
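
The Condorcet problem takes only a few lines to exhibit. This is the standard three-voter example (not Arrow's own notation): each voter has perfectly transitive preferences, yet the majority relation cycles.

```python
# The Condorcet paradox (standard example, my own code): three voters
# with transitive individual rankings produce an intransitive majority
# relation.

ballots = [["A", "B", "C"],   # voter 1: A > B > C
           ["B", "C", "A"],   # voter 2: B > C > A
           ["C", "A", "B"]]   # voter 3: C > A > B

def majority_prefers(x, y):
    """True if a strict majority of voters rank x above y."""
    return sum(b.index(x) < b.index(y) for b in ballots) > len(ballots) / 2

assert majority_prefers("A", "B")   # 2 of 3 rank A above B
assert majority_prefers("B", "C")   # 2 of 3 rank B above C
assert majority_prefers("C", "A")   # ...yet 2 of 3 rank C above A: a cycle
```

With a cycle like this, “what the majority prefers” simply fails to define an ordering, which is the seed of the general impossibility.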

It was with this result, in the 1951 book-length version of the idea, that social choice as a field distinct from welfare economics really took off. It is a startling result in two ways. First, in pure political theory, it rather simply killed off two centuries of blather about what the “best” voting system was: majority rule, Borda counts, rank-order voting, or whatever you like, every system must violate one of the Arrow axioms. And indeed, subsequent work has shown that the axioms can be relaxed and still generate impossibility. In the end, we do need to make social choices, so what should we go with? If you’re Amartya Sen, drop the Pareto condition. Others have quibbled with IIA. The point is that there is no right answer. The second startling implication is that welfare economics may be on pretty rough footing. Kaldor-Hicks conditions, which in practice motivate all sorts of regulatory decisions in our society, both rely on the assumption of cardinal or interpersonally-comparable utility, and do not generate an order over social options. Any Bergson-Samuelson social welfare function, a really broad class, must violate some pretty natural conditions on how they treat “equivalent” people (see, e.g., Kemp and Ng 1976). One questions whether we are back in the pre-Samuelson state where, beyond Pareto dominance, we can’t say much with any rigor about whether something is “good” or “bad” for society without dictatorially imposing our ethical standard, individual preferences be damned. Arrow’s theorem is a remarkable achievement for a man as young as he was when he conceived it, one of those rare philosophical ideas that will enter the canon alongside the categorical imperative or Hume on induction, a rare idea that will without question be read and considered decades and centuries hence.

Some notes to wrap things up:

1) Most call the result “Arrow’s Impossibility Theorem”. After all, he did prove the impossibility of a certain form of social choice. But Tjalling Koopmans actually convinced Arrow to call the theorem a “Possibility Theorem” out of pure optimism. Proof that the author rarely gets to pick the eventual name!

2) The confusion between Arrow’s theorem and the existence of social welfare functions in Samuelson has a long and interesting history: see this recent paper by Herrada Igersheim. Essentially, as I’ve tried to make clear in this post, Arrow’s result does not prove that Bergson-Samuelson social welfare functions do not exist, but rather implicitly imposes conditions on the indifference curves which underlie the B-S function. Much more detail in the linked paper.

3) So what is society to do in practice given Arrow? How are we to decide? There is much to recommend in Posner and Weyl’s quadratic voting when preferences can be assumed to have some sort of interpersonally comparable cardinal structure, yet are unknown. When interpersonal comparisons are impossible and we do not know people’s preferences, the famous Gibbard-Satterthwaite Theorem says that, with three or more alternatives, no non-dictatorial voting system can avoid giving people an incentive to sometimes vote strategically. We might then ask: ok, fine, what voting or social choice system works “the best” (e.g., satisfies some desiderata) over the broadest possible sets of individual preferences? Partha Dasgupta and Eric Maskin recently proved that, in fact, good old-fashioned majority voting works best! But the true answer as to the “best” voting system depends on the distribution of underlying preferences you expect to see – it is a far less simple question than it appears.

4) The conditions I gave above for Arrow’s Theorem are actually different from the 5 conditions in the original 1950 paper. The reason is that Arrow’s original proof is actually incorrect, as shown by Julian Blau in a 1957 Econometrica. The basic insight of the proof is of course salvageable.

5) Among the more beautiful simplifications of Arrow’s proof is Phil Reny’s “side by side” proof of Arrow and Gibbard-Satterthwaite, where he shows just how related the underlying logic of the two concepts is.
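
The strategic voting in note 3 is easy to make concrete. A sketch of my own (the voter profile is hypothetical, constructed purely for illustration): under the Borda count, a single voter can profitably misreport by “burying” the frontrunner.

```python
# A concrete Borda-count manipulation (my own hypothetical profile):
# voter 1 changes the winner from B to their favorite A by misreporting.

def borda_winner(ballots):
    """Borda count: with k candidates, rank i (0-indexed) earns k-1-i points."""
    scores = {}
    for b in ballots:
        for i, c in enumerate(b):
            scores[c] = scores.get(c, 0) + len(b) - 1 - i
    return max(scores, key=scores.get)

truthful = [["A", "B", "C", "D"],   # voter 1 truly prefers A first
            ["B", "A", "C", "D"],
            ["B", "A", "C", "D"]]
assert borda_winner(truthful) == "B"        # B scores 8, A scores 7

# Voter 1 buries B at the bottom of their reported ballot...
manipulated = [["A", "C", "D", "B"]] + truthful[1:]
assert borda_winner(manipulated) == "A"     # now A scores 7, B only 6
```

Voter 1 strictly prefers the manipulated outcome (A) to the truthful one (B), which is exactly the kind of profitable misreport Gibbard-Satterthwaite says no reasonable voting rule can fully escape.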

We turn to general equilibrium theory tomorrow. And if it seems excessive to need four days to cover the work of one man – even in part! – that is only because even four days understates the breadth of his contributions. Like Samuelson’s obscure knowledge of Finnish ministers which I recounted earlier this year, Arrow’s breadth of knowledge was also notorious. There is a story Eric Maskin has claimed to be true, where some of Arrow’s junior colleagues wanted to finally stump the seemingly all-knowing Arrow. They all studied the mating habits of whales for days, and then, when Arrow was coming down the hall, faked a vigorous discussion on the topic. Arrow stopped and turned, remaining silent at first. The colleagues had found a topic he didn’t fully know! Finally, Arrow interrupted: “But I thought Turner’s theory was discredited by Spenser, who showed that the supposed homing mechanism couldn’t possibly work”! And even this intellectual feat hardly matches Arrow’s well-known habit of sleeping through the first half of seminars, waking up to make the most salient point of the whole lecture, then falling back asleep again (as averred by, among others, my colleague Joshua Gans, a former student of Ken’s).

A Note on the Trump Immigration Policy

This site is seven years old, during which time I have not written a single post which is not explicitly about economics research. The posts have collectively reached well over a half million readers in this time, and I have been incredibly encouraged to see how many folks, even outside of academia, are interested in how economics, and economic theory in particular, can help explain the social world.

I hope you’ll permit me to take one post where I break the “economic research only” rule. The executive order issued yesterday banning entry into the United States for citizens of seven nations is an abomination, and directly contrary to both the words of Lazarus’ poem on the Statue of Liberty and the 1965 immigration reform which banned discrimination on the basis of national origin. It is an absolute disgrace, particularly to me as an American who, like the majority of my countrymen, sees the immigrant experience as the greatest source of pride the country has to offer. Every academic, including myself, has friends and colleagues and coauthors from the countries included on this ban.

I understand that there are citizens of the affected countries worried about how their studies will be able to continue given these immigration restrictions. While my hope is that the courts will overturn this un-American executive order, I want our friends from these countries to know that there are currently plans in the works to assist you. If you are an economics or strategy student affected by this order, or have students in those fields who may need temporary academic accommodation elsewhere, please email me at . This is of particular importance for students from the affected countries who are unable to return to the United States from present foreign travel. I can’t make any promises, but I have been in contact with a number of universities who may be able to help. If you are a PhD program director who may be able to help, I’d ask you to also contact me and I can keep you informed as to how things are progressing and how you can assist.

There is a troubling, nativist, anti-liberal (in the sense of Hume and Smith and Mill) streak in the world at the moment. The progress of knowledge depends on an open, free, and international system of cooperation. We in academia must stand up for this system, and for our friends who are being shut out of it.

Nobel Prize 2016 Part II: Oliver Hart

The Nobel Prize in Economics was given yesterday to two wonderful theorists, Bengt Holmstrom and Oliver Hart. I wrote a day ago about Holmstrom’s contributions, many of which are simply foundational to modern mechanism design and its applications. Oliver Hart’s contribution is more subtle and hence more of a challenge to describe to a nonspecialist; I am sure of this because no concept gives my undergraduate students more headaches than Hart’s “residual control right” theory of the firm. Even stranger, much of Hart’s recent work repudiates the importance of his most famous articles, a point that appears to have been entirely lost on every newspaper discussion of Hart that I’ve seen (including otherwise very nice discussions like Applebaum’s in the New York Times). A major reason he has changed his beliefs, and his research agenda, so radically is not simply the whims of age or the pressures of politics, but rather the impact of a devastatingly clever, and devastatingly esoteric, argument made by the Nobel winners Eric Maskin and Jean Tirole. To see exactly what’s going on in Hart’s work, and why there remain many very important unsolved questions in this area, let’s quickly survey what economists mean by “theory of the firm”.

The fundamental strangeness of firms goes back to Coase. Markets are amazing. We have wonderful theorems going back to Hurwicz about how competitive market prices coordinate activity efficiently even when individuals only have very limited information about how various things can be produced by an economy. A pencil somehow involves graphite being mined, forests being explored and exploited, rubber being harvested and produced, the raw materials brought to a factory where a machine puts the pencil together, ships and trains bringing the pencil to retail stores, and yet this decentralized activity produces a pencil costing ten cents. This is the case even though not a single individual anywhere in the world knows how all of those processes up the supply chain operate! Yet, as Coase pointed out, a huge amount of economic activity (including the majority of international trade) is not coordinated via the market, but rather through top-down Communist-style bureaucracies called firms. Why on Earth do these persistent organizations exist at all? When should firms merge and when should they divest themselves of their parts? These questions make up the theory of the firm.

Coase’s early answer is that something called transaction costs exist, and that they are particularly high outside the firm. That is, market transactions are not free. Firm size is determined at the point where the problems of bureaucracy within the firm overwhelm the benefits of reducing transaction costs from regular transactions. There are two major problems here. First, who knows what a “transaction cost” or a “bureaucratic cost” is, and why they differ across organizational forms: the explanation borders on tautology. Second, as the wonderful paper by Alchian and Demsetz in 1972 points out, there is no reason we should assume firms have some special ability to direct or punish their workers. If your supplier does something you don’t like, you can keep them on, or fire them, or renegotiate. If your in-house department does something you don’t like, you can keep them on, or fire them, or renegotiate. The problem of providing suitable incentives – the contracting problem – does not simply disappear because some activity is brought within the boundary of the firm.

Oliver Williamson, a recent Nobel winner joint with Elinor Ostrom, has a more formal transaction cost theory: some relationships generate joint rents higher than could be generated if we split ways, unforeseen things occur that make us want to renegotiate our contract, and the cost of that renegotiation may be lower if workers or suppliers are internal to a firm. “Unforeseen things” may include anything which cannot be measured ex-post by a court or other mediator, since that is ultimately who would enforce any contract. It is not that everyday activities have different transaction costs, but that the negotiations which produce contracts themselves are easier to handle in a more persistent relationship. As in Coase, the question of why firms do not simply grow to an enormous size is largely dealt with by off-hand references to “bureaucratic costs” whose nature was left vague. Though informal, the idea that something like transaction costs might matter seemed intuitive and had some empirical support – firms are larger in the developing world because weaker legal systems mean more “unforeseen things” will occur outside the scope of a contract, hence the differential costs of holdup or renegotiation inside and outside the firm are first order when deciding on firm size. That said, the Alchian-Demsetz critique, and the question of what a “bureaucratic cost” is, are worrying. And as Eric van den Steen points out in a 2010 AER, can anyone who has tried to order paper through their procurement office versus just popping in to Staples really believe that the reason firms exist is to lessen the cost of intrafirm activities?

Grossman and Hart (1986) argue that the distinction that really makes a firm a firm is that it owns assets. They retain the idea that contracts may be incomplete – at some point, I will disagree with my suppliers, or my workers, or my branch manager, about what should be done, either because a state of the world has arrived not covered by our contract, or because it is in our first-best mutual interest to renegotiate that contract. They retain the idea that there are relationship-specific rents, so I care about maintaining this particular relationship. But rather than rely on transaction costs, they simply point out that the owner of the asset is in a much better bargaining position when this disagreement occurs. Therefore, the owner of the asset will get a bigger percentage of rents after renegotiation. Hence the person who owns an asset should be the one whose incentive to improve the value of the asset is most sensitive to that future split of rents.
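
The investment-incentive logic can be put in numbers. A stylized sketch of my own (the functional forms and the 50-50 bargaining split are assumptions for illustration, not the Grossman-Hart model itself): one party invests e at private cost e²/2, raising joint surplus by e; ex post, the investor keeps a bargaining share s of the surplus, with s = 1 if they own the asset and s = 1/2 if bargaining power is shared.

```python
# Stylized hold-up calculation (my own toy numbers, not the exact
# Grossman-Hart formulation). The investor chooses e to maximize
# s*e - e^2/2, so e* = s: shared bargaining power (s = 1/2) causes
# underinvestment relative to ownership (s = 1).

def chosen_investment(s, grid=None):
    """Investment maximizing the investor's payoff s*e - e^2/2,
    confirmed by grid search (analytically, e* = s)."""
    grid = grid or [i / 1000 for i in range(2001)]   # e in [0, 2]
    return max(grid, key=lambda e: s * e - e ** 2 / 2)

def joint_surplus(e):
    """Total pie created by the relationship: value e minus cost e^2/2."""
    return e - e ** 2 / 2

e_owner = chosen_investment(1.0)    # investor owns the asset
e_shared = chosen_investment(0.5)   # surplus split 50-50 ex post

assert abs(e_owner - 1.0) < 1e-9    # first-best investment
assert abs(e_shared - 0.5) < 1e-9   # hold-up: half the efficient level
assert joint_surplus(e_owner) > joint_surplus(e_shared)
```

This is the sense in which the party whose investment is most sensitive to the ex-post split should own the asset: ownership raises their share s, which raises their investment toward the efficient level.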

Baker and Hubbard (2004) provide a nice empirical example: when on-board computers to monitor how long-haul trucks were driven began to diffuse, ownership of those trucks shifted from owner-operators to trucking firms. Before the computer, if the trucking firm owns the truck, it is hard to contract on how hard the truck will be driven or how poorly it will be treated by the driver. If the driver owns the truck, it is hard to contract on how much effort the trucking firm dispatcher will exert ensuring the truck isn’t sitting empty for days, or following a particularly efficient route. The computer solves the first problem, meaning that only the trucking firm is taking actions relevant to the joint relationship which are highly likely to be affected by whether they own the truck or not. In Grossman and Hart’s “residual control rights” theory, then, the introduction of the computer should mean the truck ought be owned by the trucking firm. If these residual control rights are unimportant – there is no relationship-specific rent and no incompleteness in contracting – then the ability to shop around for the best relationship is more valuable than the control rights asset ownership provides. Hart and Moore (1990) extends this basic model to the case where there are many assets and many firms, suggesting critically that sole ownership of assets which are highly complementary in production is optimal. Asset ownership affects outside options when the contract is incomplete by changing bargaining power, and splitting ownership of complementary assets gives multiple agents weak bargaining power and hence little incentive to invest in maintaining the quality of, or improving, the assets. Hart, Shleifer and Vishny (1997) provide a great example of residual control rights applied to the question of why governments should run prisons but not garbage collection. (A brief aside: note the role that bargaining power plays in all of Hart’s theories. 
We do not have a “perfect” – in a sense that can be made formal – model of bargaining, and Hart tends to use bargaining solutions from cooperative game theory like the Shapley value. After Shapley’s prize alongside Roth a few years ago, this makes multiple prizes heavily influenced by cooperative games applied to unexpected problems. Perhaps the theory of cooperative games ought still be taught with vigor in PhD programs!)

There are, of course, many other theories of the firm. The idea that firms in some industries are big because there are large fixed costs to enter at the minimum efficient scale goes back to Marshall. The agency theory of the firm going back at least to Jensen and Meckling focuses on the problem of providing incentives for workers within a firm to actually profit maximize; as I noted yesterday, Holmstrom and Milgrom’s multitasking is a great example of this, with tasks being split across firms so as to allow some types of workers to be given high powered incentives and others flat salaries. More recent work by Bob Gibbons, Rebecca Henderson, Jon Levin and others on relational contracting discusses how the nexus of self-enforcing beliefs about how hard work today translates into rewards tomorrow can substitute for formal contracts, and how the credibility of these “relational contracts” can vary across firms and depend on their history.

Here’s the kicker, though. A striking blow was dealt to all theories which rely on the incompleteness or nonverifiability of contracts by a brilliant paper of Maskin and Tirole (1999) in the Review of Economic Studies. Theories relying on incomplete contracts generally just hand-waved that there are always events which are unforeseeable ex-ante or impossible to verify in court ex-post, and hence there will always be scope for disagreement about what to do when those events occur. But, as Maskin and Tirole correctly point out, agents don’t care about anything in these unforeseeable/unverifiable states except for what the states imply about our mutual valuations from carrying on with a relationship. Therefore, every “incomplete contract” should just involve the parties deciding in advance that if a state of the world arrives where you value keeping our relationship in that state at 12 and I value it at 10, then we should split that joint value of 22 at whatever level induces optimal actions today. Do this same ex-ante contracting for all future profit levels, and we are done. Of course, there is still the problem of ensuring incentive compatibility – why would the agents tell the truth about their valuations when that unforeseen event occurs? I will omit the details here, but you should read the original paper where Maskin and Tirole show a (somewhat convoluted but still working) mechanism that induces truthful revelation of private value by each agent. Taking the model’s insight seriously but the exact mechanism less seriously, the paper basically suggests that incomplete contracts don’t matter if we can truthfully figure out ex-post who values our relationship at what amount, and there are many real-world institutions like mediators who do precisely that. If, as Maskin and Tirole prove (and Maskin described more simply in a short note), incomplete contracts aren’t a real problem, we are back to square one – why have persistent organizations called firms?

What should we do? Some theorists have tried to fight off Maskin and Tirole by suggesting that their precise mechanism is not terribly robust to, for instance, assumptions about higher-order beliefs (e.g., Aghion et al (2012) in the QJE). But these quibbles do not contradict the far more basic insight of Maskin and Tirole, that situations we think of empirically as “hard to describe” or “unlikely to occur or be foreseen”, are not sufficient to justify the relevance of incomplete contracts unless we also have some reason to think that all mechanisms which split rent on the basis of future profit, like a mediator, are unavailable. Note that real world contracts regularly include provisions that ex-ante describe how contractual disagreement ex-post should be handled.

Hart’s response, and this is both clear from his CV and from his recent papers and presentations, is to ditch incompleteness as the fundamental reason firms exist. Hart and Moore’s 2007 AER P&P and 2006 QJE are very clear:

Although the incomplete contracts literature has generated some useful insights about firm boundaries, it has some shortcomings. Three that seem particularly important to us are the following. First, the emphasis on noncontractible ex ante investments seems overplayed: although such investments are surely important, it is hard to believe that they are the sole drivers of organizational form. Second, and related, the approach is ill suited to studying the internal organization of firms, a topic of great interest and importance. The reason is that the Coasian renegotiation perspective suggests that the relevant parties will sit down together ex post and bargain to an efficient outcome using side payments: given this, it is hard to see why authority, hierarchy, delegation, or indeed anything apart from asset ownership matters. Finally, the approach has some foundational weaknesses [pointed out by Maskin and Tirole (1999)].

To my knowledge, Oliver Hart has written zero papers since Maskin-Tirole was published which attempt to explain any policy or empirical fact on the basis of residual control rights and their necessary incomplete contracts. Instead, he has been primarily working on theories which depend on reference points, a behavioral idea that when disagreements occur between parties, the ex-ante contracts are useful because they suggest “fair” divisions of rent, and induce shading and other destructive actions when those divisions are not given. These behavioral agents may very well disagree about what the ex-ante contract means for “fairness” ex-post. The primary result is that flexible contracts (e.g., contracts which deliberately leave lots of incompleteness) can adjust easily to changes in the world but will induce spiteful shading by at least one agent, while rigid contracts do not permit this shading but do cause parties to pursue suboptimal actions in some states of the world. This perspective has been applied by Hart to many questions over the past decade, such as why it can be credible to delegate decision making authority to agents; if you try to seize it back, the agent will feel aggrieved and will shade effort. These responses are hard, or perhaps impossible, to justify when agents are perfectly rational, and of course the Maskin-Tirole critique would apply if agents were purely rational.

So where does all this leave us concerning the initial problem of why firms exist in a sea of decentralized markets? In my view, we have many clever ideas, but still do not have the perfect theory. A perfect theory of the firm would need to be able to explain why firms are the size they are, why they own what they do, why they are organized as they are, why they persist over time, and why interfirm incentives look the way they do. It almost certainly would need its mechanisms to work if we assumed all agents were highly, or perfectly, rational. Since patterns of asset ownership are fundamental, it needs to go well beyond the type of hand-waving that makes up many “resource” type theories. (Firms exist because they create a corporate culture! Firms exist because some firms just are better at doing X and can’t be replicated! These are outcomes, not explanations.) I believe that there are reasons why the costs of maintaining relationships – transaction costs – endogenously differ within and outside firms, and that Hart is correct in focusing our attention on how asset ownership and decision making authority affects incentives to invest, but these theories even in their most endogenous form cannot do everything we wanted a theory of the firm to accomplish. I think that somehow reputation – and hence relational contracts – must play a fundamental role, and that the nexus of conflicting incentives among agents within an organization, as described by Holmstrom, must as well. But we still lack the precise insight to clear up this muddle, and give us a straightforward explanation for why we seem to need “little Communist bureaucracies” to assist our otherwise decentralized and almost magical market system.
