“The First Patent Litigation Explosion,” C. Beauchamp (2016)

There has been a tremendous rise in patent litigation in the past decade (Bessen and Meurer 2005). Many of these lawsuits have come from “non-practicing entities” – also known as patent trolls – who use their patents to sue even though they produce no products themselves. These lawsuits often target end-users rather than the directly infringing manufacturers, ostensibly because end-users are less able to defend themselves (see my coauthor Erik Hovenkamp on this point). For those who feel the patent system grants too many rights to patent-holders, to the detriment of societal welfare, problems like these are a case in point.

But are these worries novel? The economics of innovation and entrepreneurship is, like much of economics, a field where history proves illuminating. Nearly everything we think is new has happened before. Fights over incumbents using IP to collude? See Lampe and Moser 2010 JEH. The importance of venture capital to local ecosystems? In the late 19th century, this was true in the boomtown of Cleveland, as Lamoreaux and her coauthors showed in a 2006 C&S (and as to why Cleveland declined as an innovative center, they have a nice paper on that topic as well). The role of patent brokers and other intermediaries? These existed in the 19th century! Open-source invention in the early days of a new industry? Tales from the rise of the porter style of beer in the 18th century are not terribly different from those of the Homebrew Computer Club that led to the personal computer industry. Tradeoffs between secrecy, patenting, and alternative forms of protection? My colleague Alberto Galasso shows that this goes back to Renaissance Italy!

Given these examples, it should not be surprising that the recent boom in patent litigation is a historical rerun. Christopher Beauchamp of Brooklyn Law School, in a 2016 article in the Yale Law Journal, shows that the problems with patent litigation mentioned above are not new: indeed, the true heyday of patent litigation was not the 2010s, but the late 1800s! Knowing the number of lawsuits filed, not just the number litigated to decision, requires painstaking archival research. Having dug up these old records, Beauchamp begins with a striking fact: the Southern District of New York alone had as many patent lawsuits filed in 1880 as any district in 2010, and on a per-patent basis had an order of magnitude more. These legal battles were often vicious. For instance, Charles Goodyear’s brother held patents on the use of rubber in dentistry and employed attractive young women to find dentists using the technique without a license. The aggressive legal strategy ended only when the Vulcanite Company’s hard-charging treasurer was murdered in San Francisco by a desperate dentist!

These lawsuits were not merely battles between the Apples and Samsungs of the day; they often involved demands for small license payments from legally unsophisticated users. As Iowa Senator Samuel Kirkwood put it, patentholders “say to each [farmer], ‘Sir, pay me so much a mile or so much a rod for the wire…or you must go to Des Moines…and defend a suit to be brought against you, the cost of which and the fees in which will in themselves be more than I demand of you…[O]ur people are paying day by day $10, $15, $20, when they do not know a particle more whether they owe the man a dollar or a cent…but paying the money just because it is cheaper to do it than to defend a suit.” Some of these lawsuits were legitimate, but many made claims far beyond the scope of what a court would consider infringement, just as in the case of patent troll lawsuits today. Also like today, farmers and industry associations formed joint litigation pools to challenge what they considered weak patents.

In an echo of complaints about abuse of the legal system and differential costs of filing lawsuits compared to defending oneself, consider Minnesota Senator William Windom’s comments: “[B]y the authority of the United States you may go to the capital of a State and for a claim of $5 each you may send the United States marshal to a thousand men, or ten thousand…and compel them to travel hundreds of miles to defend against your claim, or, as more frequently occurs, to pay an unjust demand as the cheapest way of meeting it.” Precisely the same complaint applies to modern patent battles.

A question of great relevance to our modern patent litigation debate is therefore immediate: why did these scattershot individual lawsuits eventually fade away in the late 1800s? Beauchamp is equivocal here, but notes that judicial hostility toward the approach may have decreased win rates, and hence the incentive to file against small, weak defendants. Further, the rise of the modern corporation (see Alfred Chandler’s Scale and Scope) in the late 19th century reduced the need to sublicense inventions to local litigating attorneys; patent-holders could instead simply sue large infringing manufacturers directly.

Of course, not everything historic is a mirror of the present. A major source of patent litigation in the mid-1800s involved patent reissues. Essentially, a patent would be granted with narrow scope. An industry would rise up using related non-infringing technology. A sophisticated corporation would buy the initial patent, then file for a “reissue” which expanded its scope to cover many technologies then in use. Just as “submarine” patents, held secretly in application while an industry grows, have been a major problem in recent decades, patent reissues led to frequent 19th century complaints, until changes in jurisprudence in the late 1800s greatly decreased deference to the reissued patent.

What does this history tell us about modern innovation policy? As Beauchamp discusses, “[t]o a modern observer, the content of the earlier legal and regulatory reactions can seem strikingly familiar. Many of the measures now proposed or attempted as solutions for the ills of modern patent litigation were proposed or attempted in the nineteenth century as well.” To the extent we are worried about how to stop “patent trolls” from enforcing weak patents against unsophisticated end-users, we ought to look at how our 19th century forebears handled the weak barbed wire and well patents enforced against small-town farmers. With the (often economically illiterate) rise of the “Hipster Antitrust” ideas of Lina Khan and her compatriots, will the intersection of patent and antitrust law move from today’s “technocratic air” – Beauchamp’s phrase – to the more political battleground of the 19th century? And indeed, for patent skeptics like myself, how are we to reconcile the litigious patent era of 1850-1880 with the undisputed fact that this period sat squarely in the heart of the Second Industrial Revolution, the incredible rise of electricity and modern chemistry that made the modern world?

The full article is in the Yale Law Journal, Feb. 2016.


The 2018 John Bates Clark: Parag Pathak

The most prestigious award a young researcher can receive in economics, the John Bates Clark medal, has been awarded to the incredibly productive Parag Pathak. His CV is one that could only belong to a true rocket of a career: he was a full professor at MIT 11 years after he started his PhD, finishing the degree in four years before going the Paul Samuelson route through the Harvard Society of Fellows to become the top young theorist in Cambridge.

Pathak is best known for his work on the nature and effects of how students are allocated to schools. This is, of course, an area where theorists have had incredible influence on public policy, notably via Pathak’s PhD advisor, the Nobel prize winner Al Roth, the group of Turkish researchers including Atila Abdulkadiroglu, Utku Unver, and Tayfun Sonmez, and the work of 2015 Clark medal winner Roland Fryer. Indeed, this group’s work on how best to allocate students to schools in an incentive-compatible way – that is, in a way where parents need only truthfully state which schools they like best – was adopted by the city of Boston, to my knowledge the first time this theoretically optimal mechanism was used by an actual school district. As someone born in Boston’s contentious Dorchester neighborhood, I find it quite striking how much more successful this reform was than the busing policies of the 1970s, which provoked incredible amounts of bigoted pushback.

Consider the old “Boston mechanism”. Parents list their preferred schools in order. Everyone is allocated their first choice if possible. If a school is oversubscribed, a random subset of applicants get in, and the rest move on to their second choice, then their third, and so on, competing for whatever seats remain. This mechanism gives a clear reason for strategic manipulation: you certainly don’t want to list a very popular school as your second choice, since there is almost no chance it won’t fill up with first choices. The Boston mechanism was replaced by the Gale-Shapley mechanism, which has the property that it is never optimal for a parent to misrepresent their preferences. In theory, not only is this more efficient, it is also fairer: under the old mechanism, parents without sophisticated neighbors to help them game the system are, you might reason, the most likely to lose out. And this is precisely what Pathak and Sonmez show in a theoretical model: sophisticated parents may prefer the old Boston mechanism because it makes them better off at the expense of the less sophisticated! The latter concern is a tough one for traditional mechanism design to handle, as we generally assume that agents act in their self-interest, including taking advantage of the potential for strategic manipulation. There remains some debate about what it means for a mechanism to be “better” when some agents are unsophisticated or when they do not have strict preferences over all options.
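
For readers who want to see the mechanics, here is a minimal sketch of student-proposing deferred acceptance, the algorithm behind the Gale-Shapley mechanism; the preference lists and capacities below are made up for illustration and are not taken from any of the papers discussed.

```python
# Minimal sketch of student-proposing deferred acceptance (Gale-Shapley).
# Preferences and capacities below are made-up illustrations.

def deferred_acceptance(student_prefs, school_prefs, capacities):
    """student_prefs: dict student -> ordered list of schools.
       school_prefs: dict school -> ordered list of students (priority order).
       capacities: dict school -> number of seats."""
    rank = {s: {stu: i for i, stu in enumerate(prefs)} for s, prefs in school_prefs.items()}
    next_choice = {stu: 0 for stu in student_prefs}   # next school index each student proposes to
    held = {s: [] for s in school_prefs}              # students tentatively held by each school
    unmatched = set(student_prefs)

    while unmatched:
        stu = unmatched.pop()
        if next_choice[stu] >= len(student_prefs[stu]):
            continue                                  # exhausted list, stays unmatched
        school = student_prefs[stu][next_choice[stu]]
        next_choice[stu] += 1
        held[school].append(stu)
        # keep only the highest-priority students up to capacity; reject the rest
        held[school].sort(key=lambda x: rank[school][x])
        while len(held[school]) > capacities[school]:
            unmatched.add(held[school].pop())
    return held

# Hypothetical example: the popular school fills up, but listing it first costs you nothing
# under deferred acceptance, unlike under the old Boston mechanism.
students = {"ann": ["popular", "nearby"], "bob": ["popular", "nearby"], "cat": ["nearby", "popular"]}
schools = {"popular": ["ann", "bob", "cat"], "nearby": ["cat", "bob", "ann"]}
caps = {"popular": 1, "nearby": 2}
print(deferred_acceptance(students, schools, caps))
```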

Competition for students may also affect the quality of the underlying schools, either because charter schools and for-profits compete on a profit-maximizing basis, or because public schools somehow respond to the incentive to attract good students. Where Pathak’s work is particularly rigorous is in noting how critical both the competitive environment and the exact mechanism for assigning students are for the responsiveness of schools. It is not that “charters are good” or “charters are bad” or “test schools produce X outcome”, but rather that these statements are conditional on how the choice and assignment mechanism works. Purely empirical studies find that charters in Boston performed much better for lottery-selected students than public schools did, and that attending an elite test school in Boston or New York doesn’t really affect student outcomes. Parents appear unable to evaluate school quality except in terms of peer effects, which can be particularly problematic when good peers enroll in otherwise bad schools.

Pathak’s methodological approach is refreshing. He has a theorist’s toolkit and an empiricist’s interest in policy. For instance, imagine we want to know how well charter schools perform. The obvious worry is that charter students are better than the average student. Many studies take advantage of charter lotteries, where oversubscribed schools assign students from the applicant pool by lottery. This does not identify the full effect of charters, however: whether I enter a lottery at all depends on how I ranked schools, and hence participants in a lottery for School A versus School B are not identical. Pathak, Angrist, and Walters show how to combine data on the specific school choice mechanism with the lottery in a way that ensures we are comparing like with like when evaluating lotteries at charters versus non-charters. In particular, they find that Denver’s charter schools do in fact perform better.
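
To give a flavor of the idea (and only a flavor – this is not the authors' estimator), one can simulate the random element of a centralized assignment mechanism to recover each applicant's probability of receiving a charter offer, and then compare offered and non-offered students only within groups facing the same odds. Everything below – the toy mechanism, the preference lists, the capacities – is invented.

```python
# Toy illustration: simulate a mechanism's lottery to get offer propensities,
# then compare outcomes within propensity strata. All data are hypothetical.
import random
from collections import defaultdict

random.seed(0)

SCHOOLS = {"charter_A": 1, "charter_B": 1, "public_C": 3}   # hypothetical capacities
CHARTERS = {"charter_A", "charter_B"}
PREFS = {                                                   # hypothetical rank-order lists
    "s1": ["charter_A", "public_C"],
    "s2": ["charter_A", "charter_B", "public_C"],
    "s3": ["charter_B", "public_C"],
    "s4": ["public_C", "charter_B"],
    "s5": ["charter_A", "charter_B"],
}

def assign_once(prefs, capacities):
    """One draw of a random-serial-dictatorship assignment (a stand-in mechanism)."""
    seats, placement = dict(capacities), {}
    for stu in random.sample(list(prefs), len(prefs)):
        for school in prefs[stu]:
            if seats[school] > 0:
                seats[school] -= 1
                placement[stu] = school
                break
    return placement

def charter_offer_propensity(prefs, capacities, draws=20_000):
    offers = defaultdict(int)
    for _ in range(draws):
        for stu, school in assign_once(prefs, capacities).items():
            if school in CHARTERS:
                offers[stu] += 1
    return {stu: offers[stu] / draws for stu in prefs}

print(charter_offer_propensity(PREFS, SCHOOLS))
# Students face different charter-offer odds depending on how they ranked schools.
# In the real design, offered vs. non-offered students are compared *within*
# strata of these propensity scores, rather than pooling all lotteries naively.
```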

Indeed, reading through Pathak’s papers this afternoon, I find myself struck by how empirical his approach has become over time: if a given question requires the full arsenal of structural modeling, he deploys it, and if it is unnecessary, he estimates things in a completely reduced-form way. Going beyond the reduced form often produces striking results. For instance, what would be lost if affirmative action based on race were banned in school choice? Chicago’s tier-based plan, where many seats at test schools were reserved for students from low-SES tiers, works dreadfully: not only does it fail to actually select low-SES students (high-SES students in low-income neighborhoods are selected instead), but it would require massively lowering entrance score criteria to get a pre-established number of black and Hispanic students to attend. This is particularly true for test schools on the north side of the city. Answering the question of what racial representation looks like in a counterfactual world where Chicago does not have to use SES-based criteria to indirectly select students for a given racial makeup, and the question of whether the problem is Chicago’s particular mechanism or whether it is fundamental to any location-based selection mechanism, requires theory, and Pathak deploys it wonderfully. Peng Shi and Pathak also back-test their theory-based discrete-choice predictions of the impact of a change in Boston’s mechanism meant to reduce travel times, showing that to the extent the model missed, it was because student characteristics were unexpected, not because there were no underlying structural preferences. If we are going to deploy serious theoretical methods to applied questions, rather than just to thought experiments, this type of rigorous combination of theory, design-based empirics, and back-testing is essential.

In addition to his education research, Pathak has also contributed, alongside Fuhito Kojima, to the literature on large matching markets. The basic idea is the following. In two-sided matching, where both sides have preferences as in a marriage market, there is no stable matching mechanism under which both sides want to report truthfully. For example, if you use the mechanism currently in place in Boston, where students rank schools, the schools themselves have an incentive to manipulate the outcome by changing how many slots they offer. Pathak and Kojima show that when the market is large, it is (in a particular sense) an equilibrium for both sides to act truthfully; roughly, even if I screw one student I don’t want out of a slot, in a thick market it is unlikely I wind up with a more-preferred student to replace them. There has more recently been a growing literature on what really matters in matching markets: the stability of the mechanism, the thickness of the market, the timing, and so on.

This award strikes me as the last remaining award, at least in the near term, from the matching/market design boom of the past 20 years. Just as Becker took economics out of pure market transactions and into a wider world of rational choice under constraints, so the work of Al Roth and his descendants, including Parag Pathak, has greatly expanded our ability to take advantage of choice and local knowledge in situations like education and health where, for many reasons, we do not use the price mechanism. That said, there remains quite a bit to do on understanding how to get the benefits of decentralization without price – I am deeply interested in this question when it comes to innovation policy – and I don’t doubt that two decades from now, continued inquiry along these lines will have fruitfully exploited the methods and careful technique that Parag Pathak embodies.

One final note, and this in no way takes away from how deserving Pathak and other recent winners have been. Yet I would be remiss if I didn’t point out, again, how unusually “micro” the Clark medal has been of late. There literally has not been a winner who is a pure macroeconomist or econometrician – two of the three “core” fields of economics – since 1999, and only Donaldson and Acemoglu are even arguably close. Though the prize has gone to three straight winners with an at least partly theoretical bent, it is still not reflecting our field as a whole. Nothing for Emi Nakamura, or Victor Chernozhukov, or Emmanuel Farhi, or Ivan Werning, or Amir Sufi, or Chad Syverson, or Marc Melitz? These folks are incredibly influential, and the Clark medal is failing to reflect the totality of what economists actually do.

The 2017 Nobel: Richard Thaler

A true surprise this morning: the behavioral economist Richard Thaler of the University of Chicago has won the Nobel Prize in economics. It is a surprise not because the award is undeserved; rather, it is a surprise because only four years ago Thaler’s natural co-laureate Bob Shiller won while Thaler was left the bridesmaid. But Thaler’s influence on the profession, and the world, is unquestionable. There are few developed-country governments without a “nudge” unit of some sort trying to take advantage of behavioral insights to push people a touch in one way or another, including here in Ontario via my colleagues at BEAR. I will admit, perhaps under the undue influence of too many dead economists, that I am skeptical of nudging and behavioral finance on both positive and normative grounds, so this review will be one of friendly challenge rather than hagiography. I trust there will be no shortage of wonderfully positive reflections on Thaler’s contribution to policy, particularly because he is the rare economist whose work is totally accessible to laymen and, more importantly, journalists.

Much of my skepticism is similar to how Fama thinks about behavioral finance: “I’ve always said they are very good at describing how individual behavior departs from rationality. That branch of it has been incredibly useful. It’s the leap from there to what it implies about market pricing where the claims are not so well-documented in terms of empirical evidence.” In other words, surely most people are not that informed and not that rational much of the time, but repeated experience, market selection, and other aggregative forces mean that this irrationality may not matter much for the economy at large. It is very easy to claim that since economists model “agents” as “rational”, we would, for example, “not expect a gift on the day of the year in which she happened to get married, or be born” and indeed “would be perplexed by the idea of gifts at all” (Thaler 2015). This caricature of the economist is both widespread and absurd, I’m afraid. To understand the value of Thaler’s work, we ought first look at situations where behavioral factors matter in real-world, equilibrium decisions of consequence, then figure out how common those situations are, and why.

The canonical example of Thaler’s useful behavioral nudges is his “Save More Tomorrow” pension plan, designed with Benartzi. Many individuals in defined-contribution plans save too little, both because they are not good at calculating how much they need to save and because they are biased toward present consumption. You can, of course, force people to save a la Singapore, but we dislike these plans because individuals vary in their need and desire for saving, and because we find the reliance on government coercion to save heavy-handed. Alternatively, you can default defined-contribution plans to some savings rate, but it turns out people do not vary their behavior from the default throughout their careers, and hence save too little solely because they didn’t want too much removed from their first paycheck. Thaler and Benartzi instead have companies offer plans in which you agree now to have your savings rate increased when you get raises – for instance, if your salary goes up 2%, half of that raise is put into your savings plan, until you reach a savings rate that is sufficiently high. In this way, no one ever takes a nominal post-savings pay cut. People can, of course, leave the plan whenever they want. In their field experiments, savings rates did in fact soar (with takeup varying hugely depending on how information about the plan was presented), and subsequent attrition from the plan was low.
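
A back-of-the-envelope sketch of the escalator logic, with made-up salary and raise numbers: committing roughly half of each future raise to savings pushes the contribution rate up steadily while nominal take-home pay never falls.

```python
# Back-of-the-envelope sketch of a Save More Tomorrow style escalator.
# The salary path, raise size, escalator share, and cap are hypothetical.
salary, save_rate = 50_000.0, 0.03            # starting salary and contribution rate
raise_pct, escalator, cap = 0.02, 0.5, 0.12   # annual raise, share of raise saved, max rate

prev_take_home = salary * (1 - save_rate)
for year in range(1, 8):
    salary *= 1 + raise_pct
    if save_rate < cap:
        # bump the contribution rate so that roughly half of each raise is saved
        save_rate = min(cap, save_rate + escalator * raise_pct)
    take_home = salary * (1 - save_rate)
    assert take_home >= prev_take_home        # nominal take-home pay never falls
    prev_take_home = take_home
    print(f"year {year}: salary {salary:,.0f}, saving {save_rate:.1%}, take-home {take_home:,.0f}")
```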

This policy is what Thaler and Sunstein call “libertarian paternalism”. It is paternalistic because, yes, we think you may make decisions that are bad from your own perspective because you are not that bright, or because you are lazy, or because you have many things competing for your attention. It is libertarian because there is no compulsion: anyone can opt out at their leisure. Results similar to Thaler and Benartzi’s have been found by Ashraf et al in a field experiment in the Philippines, and by Karlan et al in three countries where simply sending reminder messages that make savings goals more salient modestly increases savings.

So far, so good. We have three issues to unpack, however. First, when is this nudge acceptable on ethical grounds? Second, why does nudging generate such large effects here, and if the effects are so large, why doesn’t the market simply provide them? Third, is the 401k savings case idiosyncratic or representative? The idea that homo economicus, the rational calculator, misses important features of human behavior and could do with some insights from psychology is not new, of course. Thaler’s prize is, at minimum, the fifth Nobel to go to someone pushing this general idea: Herb Simon, Maurice Allais, Daniel Kahneman, and the aforementioned Bob Shiller have all already won. Copious empirical evidence, and indeed simple human observation, implies that people have behavioral biases, that they are not perfectly rational – as Thaler has noted, we see what looks like irrationality even in the composition of $100 million baseball rosters. The more militant behavioralists insist that ignoring these psychological factors is unscientific! And yet, and yet: the vast majority of economists, all of whom are by now familiar with these illustrious laureates and their work, still use fairly standard expected-utility-maximizing agents in nearly all of our papers. Unpacking the three issues above will clarify how that could possibly be so.

Let’s discuss ethics first. Simply arguing that organizations “must” make a choice (as Thaler and Sunstein do) is insufficient; we would not say a firm is acting “neutrally” when it defaults consumers into auto-renewal of a product they rarely renew when making an active choice. Nudges can be used for “good” or “evil”. Worse, whether a nudge is good or evil depends on the planner’s evaluation of the agent’s “inner rational self”, as Infante and Sugden, among others, have noted many times. That is, claiming paternalism is “only a nudge” does not excuse the paternalist from the usual moral philosophic critiques! Indeed, as Chetty and friends have argued, the more you believe behavioral biases exist and are “nudgeable”, the more careful you need to be as a policymaker about inadvertently reducing welfare. There is, I think, less controversy when we use nudges rather than coercion to reach some policy goal. For instance, if a policymaker wants to reduce energy usage and is worried about distortionary taxation, nudges may (depending on how you think about social welfare with non-rational preferences!) be a better way to achieve the desired outcome. But this goal is very different from the common justification that nudges somehow push people toward policies they actually like in their heart of hearts. Carroll et al have a very nice theoretical paper trying to untangle exactly what “better” means for behavioral agents, and exactly when the imprecision of nudges or defaults, given our imperfect knowledge of individuals’ heterogeneous preferences, makes attempts at libertarian paternalism worse than laissez faire.

What of the practical effects of nudges? How can they be so large, and in what contexts? Thaler has very convincingly shown that behavioral biases can affect real-world behavior, and that understanding those biases means two policies which are identical from the perspective of a homo economicus model can have very different effects. But many economic situations involve players doing things repeatedly with feedback – where heuristics that approximate rationality evolve – or involve players who “perform poorly” being selected out of the game. For example, I can think of many simple nudges to get you or me to play better basketball. But when it comes to Michael Jordan, the first-order effects are surely how well he takes care of his health, the teammates he has around him, and so on. I can think of many heuristics useful for understanding how simple physics operates, but I don’t think I can find many that would improve Einstein’s understanding of how the world works. The 401k situation is unusual because it is a decision with limited short-run feedback, taken by unsophisticated agents who will learn little even with experience. The natural alternative, of course, is to have agents outsource the difficult parts of the decision, to investment managers or the like. And these managers will make money by improving people’s earnings. No surprise that robo-advisors, index funds, and personal banking have all become more important as defined-contribution plans have become more common! If we worry about behavioral biases, we ought worry especially about market imperfections that prevent the existence of designated agents who handle the difficult decisions for us.

The fact that such agents can exist is one reason that irrationality in the lab may not translate into irrationality in the market. But even without agents, we might reasonably be skeptical of some claims of widespread irrationality. Consider Thaler’s famous endowment effect: how much you are willing to pay for, say, a coffee mug or a pen is much less than how much you would accept to have the coffee mug taken away from you. Indeed, it is not unusual in a study to find a ratio of three or more between the willingness-to-accept and willingness-to-pay amounts. But, of course, if these were “preferences”, you could be money pumped (see Yaari, applying a theorem of de Finetti, on the mathematics of the pump). Say you value the mug at ten bucks when you own it and five bucks when you don’t. Do we really think I can regularly get you to pay twice as much by loaning you the mug for free for a month? Do we see car companies letting you take a month-long test drive of a $20,000 car, then letting you keep the car only if you pay $40,000, with some consumers accepting? Surely not. Part of the reason why is what Laibson and Yariv argue: money pumps do not exist in competitive economies, since market pressure will compete away the rents – someone else will offer you the car at $20,000 and you will just buy from them. But even if the car company were a monopolist, surely we find the magnitude of the money pump implied here to be on its face ridiculous.
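
Taken literally, the arithmetic in the mug example implies something like the following, which is precisely the sort of scheme we never observe in the wild:

```python
# A literal-minded numeric sketch of the endowment-effect arithmetic in the text.
# The valuations are the ones quoted above ($5 not owning, $10 owning); the
# "loan first, then charge" scheme is the hypothetical extraction device being mocked.
wtp_not_owning = 5.0    # willingness to pay for the mug before ever holding it
wta_owning = 10.0       # willingness to accept to give the mug up once endowed

sell_outright = wtp_not_owning    # max price from a buyer who never held the mug
loan_then_charge = wta_owning     # max price after a free one-month loan

print(f"price without loan:      ${sell_outright:.2f}")
print(f"price after a free loan: ${loan_then_charge:.2f}")
print(f"price ratio implied by a costless loan: {loan_then_charge / sell_outright:.1f}x")
# Taken at face value, any seller of durable goods could double revenue with free
# trials that fully endow the buyer; as the text notes, we see nothing like this.
```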

Even worse are the dictator games introduced in Thaler’s 1986 fairness paper. Students were asked, upon being given $20, whether they wanted to give an anonymous student half of their endowment or 10%. Many of the students gave half! This experiment has been repeated many, many times, with similar results. Does this mean economists are naive to neglect the social preferences of humans? Of course not! People are endowed with money and gifts all the time. They essentially never give any of it to random strangers – I feel confident assuming you, the reader, have never been handed a few bills on the sidewalk by an office worker who just got a big bonus! Worse, the context of the experiment matters a ton (see John List on this point). Indeed, despite hundreds of lab experiments on dictator games, I feel far more confident predicting real-world behavior following windfalls using a parsimonious homo economicus model than using the results of dictator games. Does this mean the games are useless? Of course not – studying what factors affect other-regarding preferences is interesting and important. But how odd to have a branch of our field filled with people who see armchair theorizing about homo economicus as “unscientific”, yet take lab experiments so literally even when they are so clearly contrary to real-world data!

To take one final example, consider Thaler’s famous model of “mental accounting”. In many experiments, he shows people have “budgets” set aside for various tasks. I have my “gas budget” and adjust my driving when gas prices change. I only sell stocks when I am up overall on that stock since I want my “mental account” of that particular transaction to be positive. But how important is this in the aggregate? Take the Engel curve. Budget shares devoted to food fall with income. This is widely established historically and in the cross section. Where is the mental account? Farber (2008 AER) even challenges the canonical account of taxi drivers working just enough hours to make their targeted income. As in the dictator game and the endowment effect, there is a gap between what is real, psychologically, and what is consequential enough to be first-order in our economic understanding of the world.

Let’s sum up. Thaler’s work is brilliant – it is a rare case of an economist taking psychology seriously and actually coming up with policy-relevant consequences like the 401k policy. But Thaler’s work is also dangerous to young economists who see biases everywhere. Experts in a field, and markets with agents and mechanisms and all the other tricks they develop, are very, very good at ferreting out irrationality, and economists’ core skill lies in not missing those tricks.

Some remaining bagatelles: 1) Thaler and his PhD advisor, Sherwin Rosen, have one of the first papers on measuring the “statistical” value of a life, a technique now widely employed in health economics and policy. 2) Beyond his academic work, Thaler has won a modicum of fame as a popular writer (Nudge, written with Cass Sunstein, is canonical here) and for his brief turn as an actor alongside Selena Gomez in “The Big Short”. 3) Dick has a large literature on “fairness” in pricing, a topic which goes back to Thomas Aquinas, if not earlier. Many of the experiments Thaler performs, like the thought experiments of Aquinas, come down to the fact that many people perceive market power to be unfair. Sure, I agree, but I’m not sure there’s much more to be learned than this uncontroversial fact. 4) Law and economics has been massively influenced by Thaler. As a simple example, if endowment effects are real, then the assignment of property rights matters even when there are no transaction costs. Jolls et al 1998 go into more depth on this issue. 5) Thaler’s precise results in so-called behavioral finance are beyond my area of expertise, so I defer to John Cochrane’s comments following the 2013 Nobel. Eugene Fama is, I think, correct when he suggests that market efficiency generated by rational traders with risk aversion is the best model we have of financial behavior, where “best” is measured by “is this model useful for explaining the world.” The number of behavioral anomalies at the level of the market which persist and matter in the aggregate does not strike me as large, while the number of investors and policymakers who make dreadful decisions because they believe markets are driven by behavioral sentiments is large indeed!

“Resetting the Urban Network,” G. Michaels & F. Rauch (2017)

Cities have two important properties: they are enormously consequential for people’s economic prosperity, and they are very sticky. That stickiness is twofold: cities do not change their shape rapidly in response to changing economic or technological opportunities (consider, e.g., Hornbeck and Keniston on the positive effects of the Great Fire of Boston), and people are hesitant to leave their existing non-economic social networks (Deryugina et al show that Katrina victims, a third of whom never returned to New Orleans, are materially better off as soon as three years after the hurricane, earning more and living in less expensive cities; Shoag and Carollo find that Japanese-Americans randomly placed in internment camps in poor areas during World War 2 saw lower incomes and worse educational outcomes for their children even many years later).

A lot of recent work in urban economics suggests that the stickiness of cities is getting worse, locking in path-dependent effects with even more vigor. A tour de force by Shoag and Ganong documents that income convergence across US cities has slowed since the 1970s, that this happened only in cities with restrictive zoning rules, and that the primary channel is that, as land use restrictions make housing prices highly responsive to income, working-class folks no longer move from poor to rich cities because the cost of housing makes such a move undesirable. Indeed, they suggest a substantial part of growing income inequality, in line with work by Matt Rognlie and others, is due to the fact that owners of land have used political means to capitalize productivity gains into their existing, tax-advantaged asset.

Now, part of urban stickiness over time may simply reflect that certain locations are very productive and have a large and valuable installed base of tangible and intangible assets that make their city run well, in which case we shouldn’t be surprised to see cities retain their prominence and character over time. So today, let’s discuss a new paper by Michaels and Rauch which uses a fantastic historical case to investigate this debate: the rise and fall of the Roman Empire.

The Romans famously conquered Gaul – today’s France – under Caesar, and Britain in stages up through Hadrian (and yes, Mary Beard’s SPQR is worthwhile summer reading; the fact that she and Nassim Taleb do not get along makes it even more self-recommending!). Roman cities popped up across these regions, until the 5th century invasions wiped out Roman control. In Britain, for all practical purposes the entire economic network faded away: cities hollowed out, trade came to a stop, and imports from outside Britain and Roman coinage are nearly nonexistent in the archaeological record for the next century and a half. In France, the network was not so cleanly broken, with Christian bishoprics rising in many of the old Roman towns.

Here is the amazing fact: today, 16 of France’s 20 largest cities are located on or near a Roman town, while only 2 of Britain’s 20 largest are. This difference existed even back in the Middle Ages. So who cares? Well, Britain’s medieval cities were two and a half times more likely to have coastal access than France’s, so that in 1700, when sea trade was hugely important, 56% of urban French lived in towns with sea access while 87% of urban Brits did. This is despite the fact that, in both countries, cities with sea access grew faster and huge sums of money were poured into building artificial canals. Even at a very local level, the France/Britain distinction holds: when Roman cities were within 25km of the ocean or a navigable river, they tended not to move in France, while in Britain they tended to reappear nearer to the water. The fundamental driver of the shift in both places was that developments in shipbuilding in the early Middle Ages made the sea much more suitable for trade and military transport than the famous Roman roads which had previously played that role.

Now the question, of course, is what drove the path dependence: why didn’t the French simply move to better locations? We know, as in Ganong and Shoag’s paper above, that in the absence of legal restrictions, people move toward more productive places. Indeed, there is a lot of hostility to the idea of path dependence more generally. Consider, for example, the case of the typewriter, which “famously” has its QWERTY layout because of an idiosyncrasy in the very early days of the machine. QWERTY is said to be much less efficient than alternative key layouts like Dvorak. Liebowitz and Margolis put this myth to bed: not only is QWERTY fairly efficient (you can think much faster than you can type on any reasonable key layout), but typewriter companies spent huge amounts of money on training schools and other mechanisms to get secretaries to switch to their preferred keyboards. That is, while it can be true that what happened in the past matters, it is also true that there are many ways to coordinate people onto a more efficient path if a suitably large productivity improvement exists.

With cities, coordinating on the new productive location is harder. In France, Michaels and Rauch suggest that bishops and the church began playing the role of a provider of public goods, and that the continued provision of public goods in certain formerly-Roman cities led them to grow faster than they otherwise would have. Indeed, Roman cities in France with no bishop show a very similar pattern to Roman cities in Britain: general decline. That sunk costs and non-economic institutional persistence can lead to multiple steady states in urban geography, some of which are strictly worse, has been suggested in smaller scale studies (e.g., Redding et al RESTAT 2011 on Germany’s shift from Berlin to Frankfurt, or the historical work of Engerman and Sokoloff).

I loved this case study, and appreciate the deep dive into history that collecting data on urban locations over this period required. But the implications of this literature broadly are very worrying. Much of the developed world has, over the past forty years, pursued development policies that are very favorable to existing landowners. This has led to stickiness which makes path dependence more important, and reallocation toward more productive uses less likely, both because cities cannot shift their geographic nature and because people can’t move to cities that become more productive. We ought not artificially wind up like Dijon and Chartres in the Middle Ages, locking our population into locations better suited to the economy of the distant past.

2016 working paper (RePEc IDEAS). The article is forthcoming in the Economic Journal. With incredible timing, Michaels and Rauch, alongside two other coauthors, have another working paper called Flooded Cities. Essentially, looking across the globe, very damaging floods occur every 20 years or so in low-lying areas of cities. And yet, as long as those areas are long settled, people and economic activity simply return to them after a flood. Note this is true even in countries without US-style flood insurance programs. The implication is that the stickiness of urban networks, amenities, and so on tends to be very strong, and is if anything encouraged by development agencies and governments; yet this stickiness means we wind up with many urban neighborhoods, and many cities, located in places that are quite dangerous for their residents without any countervailing economic benefit. You will see their paper in action over the next few years: despite some neighborhoods flooding three times in three years, one can bet with confidence that population and economic activity will remain on the floodplains of Houston’s bayou. (And in the meantime, setting aside our worries about future economic efficiency, I wish only the best for a safe and quick recovery to friends and colleagues down in Houston!)

Two New Papers on Militarized Police

The so-called militarization of police has become a major issue both in libertarian policy circles and in the civil rights community. Radley Balko has done yeoman’s work showing the harms, including outrageous civil liberties violations, generated by the use of military-grade armor and weapons, the rise of the SWAT team, and the intimidating clothing preferred by many modern police. The photos of tanks on the streets of Ferguson were particularly galling. As a literal card-carrying member of the ACLU, I will let you imagine my own opinion about this trend.

That said, the new issue of AEJ: Policy has two side-by-side papers – one from a group at the University of Tennessee, and one by researchers at Warwick and NHH – that give quite shocking evidence about the effects of militarized police. Both use the “1033 Program”, under which surplus military equipment is transferred to police departments, to investigate how military equipment affects crime, citizen complaints, violence by officers, and violence against police. Essentially, when the military has a surplus, such as when it changed a standard firearm in 2006, the decommissioned supplies are sent to centers located across the country, which then distribute them to police departments within a few weeks. The application forms are short and straightforward, and the process is not terribly competitive. About 30 percent of the distributions are things like vests, clothing, and first aid kits, while the rest is more tactical: guns, drones, vehicles, and so on.

Causal identification is, of course, a worry here: places that ask for military equipment are obviously unusual. The two papers use rather different identification strategies. The Tennessee paper uses distance to a distribution center as an instrument, since the military wants to reduce the cost of decommissioning and hence favors closer departments; the first stage therefore predicts whether a sheriff gets new military items from total material decommissioned combined with distance to the decommissioning centers. The Warwick-NHH paper uses the fact that some locations apply frequently for items while others apply only infrequently. When military spending is high, there is a lot more excess to decommission, so an instrument combining overall military spending with previous local requests for 1033 items can serve as a first stage for predicted surplus items received.
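
For readers who want the mechanics of the first strategy, here is a minimal two-stage least squares sketch on simulated data; the variable names and data-generating process are invented for illustration and are not the specification in either paper.

```python
# Minimal 2SLS sketch of the distance-instrument idea, on simulated data.
# Variable names and the data-generating process are invented; they are not
# the specifications estimated in either AEJ: Policy paper.
import numpy as np

rng = np.random.default_rng(0)
n = 5_000                                          # "departments"

distance = rng.uniform(10, 500, n)                 # miles to nearest decommissioning center
surplus = rng.gamma(2.0, 1.0, n)                   # surplus available when a department applies
aggressiveness = rng.normal(0, 1, n)               # unobserved confounder: proactive departments
                                                   # both request more gear and face different crime

# Instrument: more surplus matters more for nearby departments (shipping is cheaper)
z = surplus / np.log(distance)

# Equipment received depends on the instrument and on the confounder (endogeneity)
equipment = 0.8 * z + 0.5 * aggressiveness + rng.normal(0, 1, n)

# True causal effect of equipment on crime is set to -0.3 in this simulation
crime = 10 - 0.3 * equipment + 0.6 * aggressiveness + rng.normal(0, 1, n)

X_ols = np.column_stack([np.ones(n), equipment])
print("OLS slope (biased toward zero):", np.linalg.lstsq(X_ols, crime, rcond=None)[0][1])

# Stage 1: predict equipment from the instrument; Stage 2: regress crime on the prediction.
Z = np.column_stack([np.ones(n), z])
equipment_hat = Z @ np.linalg.lstsq(Z, equipment, rcond=None)[0]
X_iv = np.column_stack([np.ones(n), equipment_hat])
print("2SLS slope (recovers roughly -0.3):", np.linalg.lstsq(X_iv, crime, rcond=None)[0][1])
# (Standard errors from this two-step shortcut are wrong; real work uses a proper IV routine.)
```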

Despite the different local margins these two instruments imply, the findings in both papers are nearly identical. In places that get more military equipment, crime falls, particularly for crime that is easy to deter like carjacking or low-level drug crime. Citizen complaints, if anything, go down. Violence against police falls. And there is no increase in officer-caused deaths. In terms of magnitudes, the fall in crime is substantial given the cost: the Warwick-NHH paper finds the value of reduced crime, using standard metrics, is roughly 20 times the cost of the military equipment. Interestingly, places that get this equipment also hire fewer cops, suggesting some sort of substitutability between labor and capital in policing. The one negative finding, in the Tennessee paper, is that arrests for petty crimes appear to rise in a minor way.

Both papers are very clear that these results don’t mean we should militarize all police departments, and both are clear that in places with poor community-police relations, militarization can surely inflame things further. But the pure empirical estimates – that militarization reduces crime without any objectively measured cost in terms of civic unhappiness – are quite mind-blowing, and have substantially shifted my own priors. It is similar to the Doleac-Hansen result that “Ban the Box” leads to worse outcomes for black folks, for reasons that make perfect game-theoretic sense; I couldn’t have imagined Ban the Box was a bad policy, but the evidence these serious researchers present is too compelling to ignore.

So how are we to square these results with the well-known problems of police violence, and poor police-citizen relations, in the United States? Consider Roland Fryer’s recent paper on police violence and race, where essentially the big predictor of police violence is interacting with police at all, not individual characteristics. A unique feature of the US compared to other developed countries is that there really is more violent crime, hence police are rationally more worried about it, and hence people who interact with police are more worried about violence from police. Policies that reduce the extent to which police and civilians interact in potentially dangerous settings dampen this cycle. You might argue – I certainly would – that policing is no more dangerous than, say, professional ocean fishing or taxicab driving, and you wouldn’t be wrong. But as long as the perception of a possibility of violence remains, things like military-grade vests or vehicles may help break the violence cycle. We shall see.

The two AEJ: Policy papers are “Policeman on the Frontline or a Soldier?” (V. Bove & E. Gavrilova) and “Peacekeeping Force: Effects of Providing Tactical Equipment to Local Law Enforcement” (M. C. Harris, J. S. Park, D. J. Bruce and M. N. Murray). I am glad to see that the former paper, particularly, cites heavily from the criminology literature. Economics has a reputation in the social sciences both for producing unbiased research (as these two papers, and the Fryer paper, demonstrate) and for refusing to acknowledge quality work done in the sister social sciences, so I am particularly glad to see the latter problem avoided in this case!

“The Development Effects of the Extractive Colonial Economy,” M. Dell & B. Olken (2017)

A good rule of thumb is that you will want to read any working paper Melissa Dell puts out. Her main interest is the long-run, path-dependent effect of historical institutions, with rigorous quantitative investigation of the subtle conditionality of the past. For instance, in her earlier work on Peru (Econometrica, 2010), mine slavery in the colonial era led to fewer hacienda-style plantations at the end of that era, which, absent those large landholders, left the affected areas with less political power in the early democratic era, which led to fewer public goods throughout the 20th century, which leads to less education and income today in areas that once had mine slavery. One way to read this is that local inequality in the past may, through political institutions, be a good thing today! History is not as simple as “inequality in the past causes bad outcomes today” or “extractive institutions in the past cause bad outcomes today” or “colonial economic distortions cause bad outcomes today”. But, contra the branch of historians who don’t like to assign causality to any single factor in any given situation, we don’t need to entirely punt on the effects of specific policies in specific places if we apply careful statistical and theoretical analysis.

Dell and Olken’s new paper looks at the cultuurstelsel, a policy the Dutch imposed on Java in the mid-19th century. Essentially, the Netherlands was broke and Java was suitable for sugar, so the Dutch required villages in certain regions to devote huge portions of their arable land, and labor effort, to producing sugar for export. They built roads and some rail, as well as sugar factories (now generally long gone), as part of this effort, and the land used for sugar production generally became public village land controlled at the behest of local leaders. This was back in the mid-1800s, so surely it shouldn’t affect anything of substance today?

But it did! Take a look at villages near the old sugar plantations, or villages that were forced to plant sugar, and you’ll find higher incomes, higher education levels, higher school attendance rates even back in the late colonial era, higher population densities, and more workers today in retail and manufacturing. Dell and Olken did some wild data matching, using a great database of geographic names collected by the US government to match the historic villages where these sugar plants and labor requirements were located with modern village and town locations. They then constructed “placebo” factories – locations along coastal rivers in sugar-growing regions with appropriate topography where a plant could have been located but wasn’t. In particular, as in the famous Salop circle, you won’t locate a factory too close to an existing one, but there are many counterfactual equilibria in which all the factories are shifted one way or the other. By comparing the estimated effect of distance from the real factory on outcomes today with the estimated effect of distance from the many hypothetical factories, you can isolate the historic local influence of the real factory from other local features which can’t be controlled for.
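
The placebo logic can be illustrated in a few lines of simulated data: estimate the gradient of a modern outcome in distance to the real factory site, re-estimate it for each counterfactual site, and ask whether the real gradient stands out from the placebo distribution. Everything below – the village grid, the outcome, the placebo sites – is invented for illustration and is not the paper's data or estimator.

```python
# Toy illustration of the placebo-factory comparison, on a simulated 1-D "coast".
# All locations, outcomes, and effect sizes here are invented; this shows the logic only.
import numpy as np

rng = np.random.default_rng(1)
villages = np.linspace(0, 100, 400)                    # village positions along a river
real_factory = 62.0                                    # the historical factory site
# feasible-but-unused counterfactual sites, keeping a buffer around the real site
placebo_sites = [s for s in np.linspace(5, 95, 60) if abs(s - real_factory) > 15]

# Simulated modern outcome: log consumption falls with distance to the REAL factory,
# plus mild smooth regional variation and noise that the placebos help absorb.
dist_real = np.abs(villages - real_factory)
outcome = (5.0 - 0.01 * dist_real
           + 0.05 * np.sin(villages / 7)
           + rng.normal(0, 0.1, villages.size))

def distance_gradient(site):
    """Slope from regressing the outcome on distance to a given (real or placebo) site."""
    d = np.abs(villages - site)
    X = np.column_stack([np.ones_like(d), d])
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1]

real_coef = distance_gradient(real_factory)
placebo_coefs = np.array([distance_gradient(s) for s in placebo_sites])

print(f"distance gradient at the real site: {real_coef:.4f}")
print(f"mean gradient across placebo sites: {placebo_coefs.mean():.4f}")
print(f"share of placebos more negative than the real site: "
      f"{(placebo_coefs <= real_coef).mean():.2f}")
# The real site's gradient should sit in the far tail of the placebo distribution;
# the placebo gradients soak up whatever generic geography mimics a factory effect.
```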

Consumption right next to the old, long-destroyed factories is 14% higher than even five kilometers away, schooling is 1.25 years longer on average, electrification, road, and rail density are all substantially higher, and industries upstream and downstream from sugar (e.g., farm machinery upstream and processed foods downstream) are much more likely to be located in villages with historic factories, even though there is no longer any sugar production in the region!

It’s not just the factories and Dutch investments that matter, however. Consider the villages, up to 10 kilometers away, which were forced to grow the raw cane. Their elites seized private land for this purpose, and land inequality remains higher in villages that were forced to grow cane compared to villages right next door that fell outside the Dutch-imposed boundary. But this now-public land permitted surplus extraction in an agricultural society, surplus which could be used for public goods, like schooling, that would later become important! These villages were much more likely to have schools, especially before the 1970s, when public schooling in Indonesia was limited, and today are denser, richer, more educated, and less agricultural than villages nearby which weren’t forced to grow cane. This all has shades of the long debate on “forward linkages” in agricultural societies, where it is hypothesized that agricultural surplus benefits industrialization by providing the resources necessary for education and capital to be purchased; see the nice paper by Sam Marden showing linkages of this sort in post-Mao China.

Are you surprised by these results? They fascinate me, honestly. Think through the logic: forced labor (in the surrounding villages) and extractive capital (rail and factories built solely to export a crop in little use domestically) both have positive long-run local effects! They do so by affecting institutions – whether villages have the ability to produce public goods like education – and by affecting incentives – the production of capital used up- and downstream. One can easily imagine cases where forced labor and extractive capital have negative long-run effects, and we have great papers by Daron Acemoglu, Nathan Nunn, Sara Lowes and others on precisely this point. But it is also very easy for societies to get trapped in bad path-dependent equilibria, from which outside intervention, even an ethically shameful one, can (perhaps inadvertently) cause useful shifts in incentives and institutions! I recall a visit to Babeldaob, the main island in Palau. During the Japanese colonial period, the island was heavily industrialized as part of Japan’s war machine. These factories were destroyed by the Allies in World War 2. Yet despite that extractive history, a local told me that many on the island believe the region’s industrial development was permanently harmed when those factories were destroyed. It seems a bit crazy to mourn the loss of polluting, extractive plants whose whole purpose was to serve a colonial master, but the Palauan may have had some wisdom after all!

2017 Working Paper is here (no RePEc IDEAS version). For more on sugar and institutions, I highly recommend Christian Dippel, Avner Greif and Dan Trefler’s recent paper on Caribbean sugar. The price of sugar fell enormously in the late 19th century, yet wages on islands which lost the ability to productively export sugar rose. Why? Planters in places like Barbados had so much money from their sugar exports that they could manipulate local governance and the police, while planters in places like the Virgin Islands became too poor to do the same. This decreased labor coercion, permitting workers on sugar plantations to work small plots or move to other industries, raising wages in the end. I continue to await Suresh Naidu’s book on labor coercion – it is astounding the extent to which labor markets were distorted historically (see, e.g., Eric Foner on Reconstruction), and in some cases still today, by legal and extralegal restrictions on how workers could move on up.

William Baumol: Truly Productive Entrepreneurship

It seems this weblog has become an obituary page rather than a simple research digest of late. I am not even done writing on the legacy of Ken Arrow (don’t worry – it will come!) when news arrives that yet another product of the World War 2 era in New York City, and of the CCNY system, has passed away: the great scholar of entrepreneurship and one of my absolute favorite economists, William Baumol.

But we oughtn’t draw the line on his research simply at entrepreneurship, though I will walk you through his best piece in the area, a staple of my own PhD syllabus, on “productive, unproductive, and destructive” entrepreneurship. Baumol was also a great scholar of the economics of the arts, performing and otherwise, which were the motivation for his famous cost disease argument. He was a very skilled micro theorist, a talented economic historian, and a deep reader of the history of economic thought, a nice example of which is his 2000 QJE on what we have learned since Marshall. In all of these areas, his papers are a pleasure to read: clear, with elegant turns of phrase and the casual yet erudite style of an American who’d read his PhD in London under Robbins and Viner. That he passed without winning his Nobel Prize is a shame – how great would it have been had he shared a prize with Nate Rosenberg before it was too late for them both?

Baumol is often naively seen as a Schumpeter-esque defender of the capitalist economy and the heroic entrepreneur, and that is only half right. His personal politics were liberal, and as he argued in a recent interview, “I am well aware of all the very serious problems, such as inequality, unemployment, environmental damage, that beset capitalist societies. My thesis is that capitalism is a special mechanism that is uniquely effective in accomplishing one thing: creating innovations, applying those innovations and using them to stimulate growth.” That is, you can find in Baumol’s work many discussions of environmental externalities, of the role of government in funding research, and of the nature of optimal taxation. You can find many passages where Baumol expresses sympathy for the policy goals of the left (though often to be achieved with the mechanism of the market, and hence the right). Yet the core running through much of Baumol’s work is a rigorous, historically and theoretically grounded defense of the importance of getting incentives right for socially useful innovation.

Baumol differs from many other prominent economists of innovation because he is at his core a neoclassical theorist. He is not an Austrian like Kirzner or an evolutionary economist like Sid Winter. Baumol’s work stresses that entrepreneurs and the innovations they produce are fundamental to understanding the capitalist economy and its performance relative to other economic systems, but that the best way to understand the entrepreneur methodologically is to formalize her within the context of neoclassical equilibria, with innovation rather than price alone being “the weapon of choice” for rational, competitive firms. I’ve always thought of Baumol as the lineal descendant of Schumpeter, the original great thinker on entrepreneurship and one who, nearing the end of his life and seeing the work of his student Samuelson, was convinced that his ideas should be translated into formal neoclassical theory.

A 1968 essay in the AER P&P laid out Baumol’s basic idea that economics without the entrepreneur is, in a line he would repeat often, like Hamlet without the Prince of Denmark. He clearly understood that we did not have a suitable theory for oligopoly and entry into new markets, or for the supply of entrepreneurs, but that any general economic theory needed to be able to explain why growth is different in different countries. Solow’s famous essay convinced much of the profession that the residual, interpreted then primarily as technological improvement, was the fundamental variable explaining growth, and Baumol, like many, believed those technological improvements came mainly from entrepreneurial activity.

But what precisely should the theory look like? Ironically, Baumol made his most productive step in a beautiful 1990 paper in the JPE which contains not a single formal theorem nor statistical estimate of any kind. Let’s define entrepreneurs as “persons who are ingenious or creative in finding ways to add to their wealth, power, or prestige”. These people may introduce new goods, or new methods of production, or new markets, as Schumpeter supposed in his own definition. But are these ingenious and creative types necessarily going to do something useful for social welfare? Of course not – the norms, institutions, and incentives in a given society may be such that entrepreneurs perform socially unproductive tasks, such as hunting for new tax loopholes, or socially destructive tasks, such as channeling their energy into ever-escalating forms of warfare.

With the distinction between productive, unproductive, and destructive entrepreneurship in mind, we might imagine that the difference in technological progress across societies may have less to do with the innate drive of the society’s members, and more to do with the incentives for different types of entrepreneurship. Consider Rome, famously wealthy yet with very little in the way of useful technological diffusion: certainly the Romans appear less innovative than either the Greeks or Europe of the Middle Ages. How can a society both invent a primitive steam engine – via Hero of Alexandria – and yet see it used for nothing other than toys and religious ceremonies? The answer, Baumol notes, is that status in Roman society required one to get rich via land ownership, usury, or war; commerce was a task primarily for slaves and former slaves! And likewise in Song dynasty China, where the imperial examinations were the source of status, and where the state retained the power to expropriate any useful inventions or businesses that happened to appear. In the European Middle Ages, incentives for the clever shifted from developing implements of war, to diffusing technology like the water-mill under the Cistercians, and then back to weapons again. These examples were expanded to every society from Ancient Mesopotamia to the Dutch Republic to the modern United States by a series of economically-minded historians in a wonderful collection of essays called “The Invention of Enterprise”, which was edited by Baumol alongside Joel Mokyr and David Landes.

Now we are approaching a sort of economic theory of entrepreneurship – no need to rely on the whims of character, but instead focus on relative incentives. But we are still far from Baumol’s 1968 goal: incorporating the entrepreneur into neoclassical theory. The closest Baumol comes is in his work in the early 1980s on contestable markets, summarized in his 1981 AEA Presidential Address. The basic idea is this. Assume industries have scale economies, so oligopoly is their natural state. How worried should we be? Well, if there are no sunk costs and no entry barriers for entrants, and if entrants can siphon off customers quicker than incumbents can respond, then Baumol and his coauthors claimed that the market is contestable: the threat of entry is sufficient to keep the incumbent from exerting its market power (a minimal version of this logic is sketched below). On the one hand, fine, we all agree with Baumol now that industry structure is endogenous to firm behavior, and the threat of entry clearly can restrain market power. But on the other hand, is this “ultra-free entry” model the most sensible way to incorporate entry and exit into a competitive model? Why, as Dixit argued, should entering a market be quicker than changing a price? Why, as Spence argued, does the unrealized threat of entry change equilibrium behavior if the threat is truly unrealized along the equilibrium path?
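
To fix ideas, here is a minimal sketch of the hit-and-run logic, in my own notation rather than Baumol, Panzar, and Willig’s: suppose the incumbent and any potential entrant share the cost function $C(q) = F + cq$, where the fixed cost $F$ is not sunk, and let $D(p)$ be market demand. If the incumbent charges any price above average cost $c + F/D(p)$, an entrant can charge slightly less, serve the whole market at a profit, and exit costlessly before the incumbent responds. The only price sustainable against such hit-and-run entry is therefore

$$ p = c + \frac{F}{D(p)}, $$

that is, average-cost pricing and zero profit, even with a single firm actually operating in the market.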

It seems that what Baumol was hoping this model would lead to was a generalized theory of perfect competition that permitted competition for the market rather than just in the market, since competition for the market is naturally the domain of the entrepreneur. Contestable markets are too flawed to get us there. But the basic idea, that market structure is endogenous and determined game-theoretically rather than by the old-fashioned chain running from industry structure to conduct to performance, is clearly here to stay: antitrust is essentially applied game theory today. And once you have the idea of competition for the market, the natural theoretical model is one where firms compete to innovate in order to push out incumbents, incumbents innovate to keep potential entrants at bay, and profits depend on the equilibrium time until the dominant firm shifts: I speak, of course, of the neo-Schumpeterian models of Aghion and Howitt. These models, still a very active area of research, are finally allowing us to rigorously investigate the endogenous rewards to innovation via a completely neoclassical model of market structure and pricing.

I am not sure why Baumol did not find these neo-Schumpeterian models to be the Holy Grail he’d been looking for; in his final book, he credits them with being “very powerful” but in the end holding different “central concerns”. He may have been mistaken in this interpretation. It proved quite interesting to give Baumol’s corpus on entrepreneurship a careful second read, and I have to say it disappoints in part: the questions he asked were right, the theoretical acumen he possessed was up to the task, the understanding of history and qualitative intuition was second to none, but in the end, he appears to have been just as stymied by the idea of endogenous neoclassical entrepreneurship as the many other doyens of our field who took a crack at modeling this problem without, in the end, generating the model they’d hoped they could write.

Where Baumol has more success, and again it is unusual for a theorist that his most well-known contribution is largely qualitative, is in the idea of cost disease. The concept comes from Baumol’s work with William Bowen (see also this extension with a complete model) on the economic problems of the performing arts. It is a simple idea: imagine productivity in industry rises 4% per year, but “the output per man-hour of a violinist playing a Schubert quartet in a standard concert hall” remains fixed. In order to attract workers into music rather than industry, wages must rise in music at something like the rate they rise in industry. But then costs are increasing while productivity is not, and the arts look “inefficient”. The same, of course, is said for education, and health care, and other necessarily labor-intensive industries. Baumol’s point is that rising costs in technologically stagnant sectors reflect necessary shifts in equilibrium wages rather than, say, growing wastefulness.
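
The arithmetic behind this is easy to see in a toy simulation. A minimal sketch, with hypothetical numbers rather than anything from Baumol and Bowen: let productivity in “industry” grow 4% per year and productivity in live performance grow not at all, while wages in both sectors must track industrial productivity to retain workers.

```python
# Toy illustration of Baumol's cost disease (hypothetical numbers).
# Industry productivity grows 4% a year; string-quartet productivity grows 0%.
# Wages in both sectors must rise with economy-wide productivity to keep workers.
years = 50
g_industry, g_arts = 0.04, 0.00

wage = 1.0                            # common wage, normalized to 1 at the start
prod_industry, prod_arts = 1.0, 1.0

for _ in range(years):
    wage *= (1 + g_industry)          # wages track industrial productivity
    prod_industry *= (1 + g_industry)
    prod_arts *= (1 + g_arts)

# Unit labor cost = wage / output per hour
cost_industry = wage / prod_industry  # stays at 1.0: productivity offsets wage growth
cost_arts = wage / prod_arts          # rises at roughly 4% a year

print(f"Unit cost in industry after {years} years: {cost_industry:.2f}")
print(f"Unit cost in the arts after {years} years:  {cost_arts:.2f}")
# The arts look ~7x more 'expensive' even though nothing about how they are produced changed.
```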

How much can cost disease explain? The concept is so widely known by now that it is, in fact, used to excuse stagnant industries. Teaching, for example, requires some labor, but does anybody believe that it is impossible for R&D and complementary inventions (like the internet, for example) to produce massive productivity improvements? Is it not true that movie theaters now show opera live from the world’s great halls on a regular basis? Is it not true that my Google Home can, activated by voice, call up two seconds from now essentially any piece of recorded music I desire, for free? Distinguishing industries that are necessarily labor-intensive (and hence grow slowly) from those where rapid technological progress is possible is a very difficult game, and one we ought hesitate to play. But equally, we oughtn’t forget Baumol’s lesson: in some cases, in some industries, what appears to be fixable slack is in fact simply cost disease. We may ask, how was it that Ancient Greece, with its tiny population, put on so many plays, while today we hustle ourselves to small ballrooms in New York and London? Baumol’s answer, rigorously shown: cost disease. The “opportunity cost” of recruiting a big chorus was low, as those singers would otherwise have been idle or working unproductive fields gathering olives. The difference between Athens and our era is not simply that they were “more supportive of the arts”!

Baumol was incredibly prolific, so these suggestions for further reading are but a taste: an interview by Alan Krueger is well worth the read for anecdotes alone, like the fact that apparently one used to do one’s PhD oral defense “over whiskies and sodas at the Reform Club”. I also love his defense of theory, where if he is very lucky, his initial intuition “turn[s] out to be totally wrong. Because when I turn out to be totally wrong, that’s when the best ideas come out. Because if my intuition was right, it’s almost always going to be simple and straightforward. When my intuition turns out to be wrong, then there is something less obvious to explain.” Every theorist knows this: formalization has this nasty habit of refining our intuition and convincing us our initial thoughts actually contain logical fallacies or rely on special cases! Though known as an applied micro theorist, Baumol also wrote a canonical paper, with Bradford, on optimal taxation: essentially, if you need to raise $x in tax, how should you optimally deviate from marginal cost pricing? The history of thought is nicely diagrammed, and of course this 1970 paper was very quickly followed by the classic work of Diamond and Mirrlees. Baumol wrote extensively on environmental economics, drawing in many of his papers on the role nonconvexities in the social production possibilities frontier play when they are generated by externalities – a simple example of this effect, and the limitations it imposes on Pigouvian taxation, is in the link. More recently, Baumol had been writing on international trade with Ralph Gomory (the legendary mathematician behind a critical theorem in integer programming, and later head of the Sloan Foundation); their main theorems are not terribly shocking to those used to thinking in terms of economies of scale, but the core example in the linked paper is again a great example of how nonconvexities can overturn a lot of our intuition, in this case concerning comparative advantage. Finally, beyond his writing on the economics of the arts, Baumol proved that there is no area in which he personally had stagnant productivity: an art major in college, he was also a fantastic artist in his own right, picking up computer-generated art while in his 80s and teaching for many years a course on woodworking at Princeton!

A John Bates Clark Prize for Economic History!

A great announcement last week, as Dave Donaldson, an economic historian and trade economist, has won the 2017 John Bates Clark medal! This is an absolutely fantastic choice: it is hard to think of any young economist whose work is as serious as Donaldson’s. What I mean by that is that in nearly all of Donaldson’s papers, there is a very specific and important question, a deep collection of data, and a rigorous application of theory to help identify the precise parameters we are most concerned with. It is the modern economic method at its absolute best, and frankly it is a style of research available to very few researchers, as the specific combination of theory knowledge and empirical agility required to employ this technique is very rare.

A canonical example of Donaldson’s method is his most famous paper, written back when he was a graduate student: “The Railroads of the Raj”. The World Bank today spends more on infrastructure than on health, education, and social services combined. Understanding the link between infrastructure and economic outcomes is not easy, and indeed has been at the center of economic debates since Fogel’s famous accounting of the railroad. Further, it is not obvious either theoretically or empirically that infrastructure is good for a region. In the Indian context, no less a sage than the proponent of traditional village life Mahatma Gandhi felt the British railroads, rather than helping village welfare, “promote[d] evil”, and we have many trade models where falling trade costs plus increasing returns to scale can decrease output and increase income volatility.

Donaldson looks at the setting of British India, where 67,000 kilometers of rail were built, largely for military purposes. India during the British Raj is particularly compelling as a setting due to its heterogeneous nature. Certain seaports – think modern Calcutta – were built up by the British as entrepots. Many internal regions nominally controlled by the British were left to rot via, at best, benign neglect. Other internal regions were quasi-independent, with wildly varying standards of governance. The most important point, though, is that much of the interior was desperately poor and in a de facto state of autarky: without proper roads or rail until the late 1800s, goods were transported over rough dirt paths, leading to tiny local “marketing regions” similar to what Skinner found in his great studies of China. British India is also useful since data on goods shipped, local weather conditions, and agricultural prices were rigorously collected by the colonial authorities. Nearly all that local economic data is in dusty tomes in regional offices across the modern subcontinent, but it is at least in principle available.

Let’s think about how most competent empirical microeconomists would go about investigating the effects of the British rail system. It would be a lot of grunt work, but many economists would spend the time collecting data from those dusty old colonial offices. They would then worry that railroads are endogenous to economic opportunity, so they would hunt for reasonable instruments or placebos, such as railroads that were planned yet unbuilt, or railroad segments that skipped certain areas because of temporary random events. They would make some assumptions on how to map agricultural output into welfare, probably just restricting the dependent variable in their regressions to some aggregate measure of agricultural output normalized by price. All that would be left to do is run some regressions and claim that the arrival of the railroad on average raised agricultural income by X percent. And look, this wouldn’t be a bad paper. The setting is important, the data effort heroic, the causal factors plausibly exogenous: a paper of this form would have a good shot at a top journal.

When I say that Donaldson does “serious” work, what I mean is that he didn’t stop with those regressions. Not even close! Consider what we really want to know. It’s not “What is the average effect of a new railroad on incomes?” but rather, “How much did the railroad reduce shipping costs, in each region?”, “Why did railroads increase local incomes?”, “Are there alternative cheaper policies that could have generated the same income benefit?” and so on. That is, there are precise questions, often involving counterfactuals, which we would like to answer, and these questions and counterfactuals necessarily involve some sort of model mapping the observed data into hypotheticals.

Donaldson leverages both reduced-form, well-identified evidence and the broader model we suggested was necessary, and does so in a paper which is beautifully organized. First, he writes down an Eaton-Kortum style model of trade (Happy 200th Birthday to the theory of comparative advantage!) where districts get productivity draws across goods and then trade subject to shipping costs. Consider this intuition: if a new rail line connects Gujarat to Bihar, then the existence of this line will change Gujarat’s trade patterns with every other state, causing those other states to change their own trade patterns, causing a whole sequence of shifts in relative prices that depend on initial differences in trade patterns, the relative size of states, and so on. What Donaldson notes is that if you care about welfare in Gujarat, all of those changes only affect Gujaratis if they affect what Gujaratis end up consuming, or equivalently if they affect the real income Gujaratis earn from their production. Intuitively, if pre-railroad Gujarat’s local consumption was 90% locally produced, and after the railroad it was 60% locally produced, then declining trade costs allowed the magic of comparative advantage to generate additional specialization and hence additional Ricardian rents. This is what is sometimes called a sufficient statistics approach: the model suggests that the entire effect of declining trade costs on welfare can be summarized by knowing agricultural productivity for each crop in each area, the local consumption share which is imported, and a few elasticity parameters. Note that the sufficient statistic is a result, not an assumption: the Eaton-Kortum model permits taste for variety, for instance, so we are not assuming away any of that. Now of course the model can be wrong, but that’s something we can actually investigate directly.
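
To see what the sufficient statistic looks like, consider the simplest one-factor Eaton-Kortum stylization (Donaldson’s actual model adds land and many crops, so treat this as a sketch of the logic rather than his exact formula). If $A_o$ is district $o$’s productivity, $\theta$ the parameter governing the strength of comparative advantage, and $\pi_{oo}$ the share of district $o$’s consumption produced locally, then real income per person satisfies

$$ \ln\left(\frac{w_o}{P_o}\right) = \text{constant} + \frac{1}{\theta}\ln A_o - \frac{1}{\theta}\ln \pi_{oo}. $$

Everything the railroad does to trade costs anywhere in the network affects district $o$’s welfare only through the fall in $\pi_{oo}$: the more of its consumption a district imports after the railroad arrives, the larger its Ricardian gains.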

So here’s what we’ll do: first, simply regress real agricultural production in a region on time and region dummies plus a dummy for whether rail has arrived in that region. This regression suggests a rail line increases incomes by 16%, whereas placebo regressions for rail lines that were proposed but canceled show no increase at all. 16% is no joke, as real incomes in India over the period only rose 22% in total! All well and good. But what drives that 16%? Is it really Ricardian trade? To answer that question, we need to estimate the parameters in that sufficient statistics approach to the trade model – in particular, we need the relative agricultural productivity of each crop in each region, elasticities of trade flows to trade costs (and hence the trade costs themselves), and the share of local consumption which is locally produced (the “trade share”). We’ll then note that in the model, real income in a region is entirely determined by an appropriately weighted combination of local agricultural productivity and changes in the weighted trade share. Hence if you regress real income minus the weighted local agricultural productivity shock on a dummy for the arrival of a railroad and on the trade share, you should find a zero coefficient on the rail dummy if, in fact, the Ricardian model is capturing why railroads affect local incomes. And even more importantly, if we find that zero, then we understand that efficient infrastructure benefits a region through the sufficient statistic of the trade share, and we can compare the cost-benefit ratio of the railroad to other hypothetical infrastructure projects on the basis of a few well-known elasticities.
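
For concreteness, the reduced-form step looks something like the sketch below, on synthetic data with simplified variable names and the headline 16% effect built in (the real exercise has many districts over several decades, placebo lines, and properly clustered standard errors):

```python
# Sketch of the reduced-form panel regression (synthetic data, simplified names).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for d in range(50):                          # hypothetical districts
    rail_year = rng.integers(10, 35)         # hypothetical year the railroad arrives
    for t in range(40):                      # hypothetical years
        rail = int(t >= rail_year)
        # log real agricultural income: district trend + year trend + 16% rail effect + noise
        y = 0.02 * d + 0.01 * t + 0.16 * rail + rng.normal(0, 0.1)
        rows.append({"district": d, "year": t, "rail": rail, "log_real_income": y})
df = pd.DataFrame(rows)

# Regress log real income on the rail dummy plus district and year fixed effects.
fit = smf.ols("log_real_income ~ rail + C(district) + C(year)", data=df).fit()
print(round(fit.params["rail"], 3))          # recovers the 0.16 built into the fake data
```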

So that’s the basic plot. All that remains is to estimate the model parameters, a nontrivial task. First, to get trade costs, one could simply use published freight rates for boats, overland travel, and rail, but this wouldn’t be terribly compelling; bandits, and spoilage, and all the rest of Samuelson’s famous “icebergs” like linguistic differences raise trade costs as well. Donaldson instead looks at the differences in origin and destination prices for goods produced in only one place – particular types of salt – before and after the arrival of a railroad. He then uses a combination of graph theory and statistical inference to estimate the decline in trade costs between all region pairs. Given massive heterogeneity in trade costs by distance – crossing the Western Ghats is very different from shipping a boat down the Ganges! – this technique is far superior to simply assuming trade costs linear in distance for rail, road, or boat.
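
The route-cost step can be sketched as a lowest-cost-path problem over a transport network, something like the following (node names, distances, and the relative per-kilometer mode costs are all hypothetical; in the paper the relative mode costs are themselves estimated rather than assumed):

```python
# Sketch: freight costs as lowest-cost routes over a transport network.
# Node names, distances, and per-km mode costs below are hypothetical.
import networkx as nx

COST_PER_KM = {"road": 4.0, "river": 2.0, "rail": 1.0}  # relative, illustrative

def add_link(g, a, b, km, mode):
    g.add_edge(a, b, weight=km * COST_PER_KM[mode], mode=mode)

g = nx.Graph()
add_link(g, "Interior", "RiverPort", 200, "road")
add_link(g, "RiverPort", "Calcutta", 800, "river")

before = nx.shortest_path_length(g, "Interior", "Calcutta", weight="weight")

add_link(g, "Interior", "Calcutta", 1000, "rail")   # the new railroad
after = nx.shortest_path_length(g, "Interior", "Calcutta", weight="weight")

print(f"Lowest cost Interior -> Calcutta: {before:.0f} before rail, {after:.0f} after")
# In the paper, the per-mode costs are themselves estimated by matching predicted
# lowest-cost routes to observed origin-destination price gaps for salt.
```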

Second, he checks whether lowered trade costs actually increased trade volume, and at what elasticity, using local rainfall as a proxy for local productivity shocks. The use of rainfall data is wild: for each district, he gathers rainfall deviations over the sowing-to-harvest window individually for each crop. This identifies the agricultural productivity distribution parameters by region, and therefore, in the Eaton-Kortum type model, lets us calculate the elasticity of trade volume to trade costs. Salt shipments plus crop-by-region specific rain shocks give us all of the model parameters which aren’t otherwise available in the British data. Throwing these parameters into the model regression, we do in fact find that once agricultural productivity shocks and the weighted trade share are accounted for, the effect of railroads on local incomes is not much different from zero. The model works, and note that real income changes based on the timing of the railroad were at no point used to estimate any of the model parameters! That is, if you told me that Bihar had positive rain shocks which increased output of its crops by 10% in the last ten years, and that the share of local consumption which is produced locally went from 60 to 80%, I could tell you with quite high confidence the change in local real incomes without even needing to know when the railroad arrived – this is the sense in which those parameters are a “sufficient statistic” for the full general equilibrium trade effects induced by the railroad.

Now this doesn’t mean the model has no further use: indeed, that the model appears to work gives us confidence to take it more seriously when looking at counterfactuals like, what if Britain had spent money developing more effective seaports instead? Or building a railroad network to maximize local economic output rather than on the basis of military transit? Would a noncolonial government with half the resources, but whose incentives were aligned with improving the domestic economy, have been able to build a transport network that improved incomes more even given their limited resources? These are first order questions about economic history which Donaldson can in principle answer, but which are fundamentally unavailable to economists who do not push theory and data as far as he was willing to push them.

The Railroads of the Raj paper is canonical, but far from Donaldson’s only great work. He applies a similar Eaton-Kortum approach to investigate how rail affected the variability of incomes in India, and hence the death rate. Up to 35 million people perished in famines in India in the second half of the 19th century, as the railroad was being built, and these famines appeared to end (1943 being an exception) afterwards. Theory is ambiguous about whether openness increases or decreases the variance of your welfare. On the one hand, in an open economy, the price of potatoes is determined by the world market and hence the price you pay for potatoes won’t swing wildly up and down depending on the rain in a given year in your region. On the other hand, if you grow potatoes and there is a bad harvest, the price of potatoes won’t go up and hence your real income can be very low during a drought. Empirically, less variance in prices in the market after the railroad arrives tends to be more important for real consumption, and hence for mortality, than the lower prices you can get for your own farm goods when there is a drought. And as in the Railroads of the Raj paper, sufficient statistics from a trade model can fully explain the changes in mortality: the railroad decreased the effect of bad weather on mortality completely through Ricardian trade.

Leaving India, Donaldson and Richard Hornbeck took up Fogel’s intuition that the importance of the railroad to the US depends on comparing the trade that is worthwhile when the railroad exists to the trade that is worthwhile when only alternatives like better canals or roads exist. That is, if it costs $9 to ship a wagonful of corn by canal, and $8 to do the same by rail, then even if all corn is shipped by rail once the railroad is built, we oughtn’t ascribe all of that trade to the rail. Fogel assumed relationships between land prices and the value of the transportation network. Hornbeck and Donaldson instead estimate that relationship, again deriving a sufficient statistic for the value of market access. The intuition is that adding a rail link from St. Louis to Kansas City will also affect the relative prices, and hence agricultural production, in every other region of the country, and these spatial spillovers can be quite important. Adding the rail line to Kansas City affects market access in Kansas City as well as relative prices elsewhere, but clever application of theory can still permit a Fogel-style estimate of the value of rail to be made.
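
The flavor of the market access statistic, heavily stylized (the paper’s actual derivation has general-equilibrium feedback between counties, so the terms and exponents below are simplified): define county $o$’s market access as roughly

$$ MA_o \approx \sum_d \tau_{od}^{-\theta} L_d, $$

a trade-cost-discounted sum over the size $L_d$ of every other county $d$, with the model implying that log agricultural land values in county $o$ are approximately linear in $\ln MA_o$. A new rail link changes the entire matrix of lowest-cost routes $\tau_{od}$, so its value can be assessed by recomputing market access everywhere and adding up the implied changes in land values.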

Moving beyond railroads, Donaldson’s trade work has also been seminal. With Costinot and Komunjer, he showed how to rigorously estimate the empirical importance of Ricardian trade for overall gains from trade. Spoiler: it isn’t that important, even if you adjust for how trade affects market power, a result seen in a lot of modern empirical trade research which suggests that aspects like variety differences are more important than Ricardian productivity differences for gains from international trade. There are some benefits to Ricardian trade across countries being relatively unimportant: Costinot, Donaldson and Smith show that changes to what crops are grown in each region can massively limit the welfare harms of climate change, whereas allowing trade patterns to change barely matters. The intuition is that there is enough heterogeneity in what can be grown in each country when climate changes to make international trade relatively unimportant for mitigating these climate shifts. Donaldson has also rigorously studied in a paper with Atkin the importance of internal rather than international trade costs, and has shown in a paper with Costinot that economic integration has been nearly as important as productivity improvements in increasing the value created by American agriculture over the past century.

Donaldson’s CV is a testament to how difficult this style of work is. He spent eight years at LSE before getting his PhD, and published only one paper in a peer-reviewed journal in the 13 years following the start of his graduate work. “Railroads of the Raj” has been forthcoming at the AER for literally half a decade, despite the fact that this work is the core of what got Donaldson a junior position at MIT and a tenured position at Stanford. Is it any wonder that so few young economists want to pursue a style of research that is so challenging and so difficult to publish? Let us hope that Donaldson’s award encourages more of us to fully exploit both the incredible data we all now have access to and the beautiful body of theory that induces deep insights from that data.

Kenneth Arrow Part II: The Theory of General Equilibrium

The first post in this series discussed Ken Arrow’s work in the broad sense, with particular focus on social choice. In this post, we will dive into his most famous accomplishment, the theory of general equilibrium (1954, Econometrica). I beg the reader to offer some sympathy for the approximations and simplifications that will appear below: the history of general equilibrium is, by this point, well-trodden ground for historians of thought, and the interpretation of history and theory in this area is quite contentious.

My read of the literature on GE following Arrow is as follows. First, the theory of general equilibrium is an incredible proof that markets can, in theory and in certain cases, work as efficiently as an all-powerful planner. That said, the three other hopes of general equilibrium theory since the days of Walras are, in fact, disproven by the work of Arrow and those who followed him. Market forces will not necessarily lead us toward these socially optimal equilibrium prices. Walrasian demand does not have empirical content derived from basic ordinal utility maximization. We cannot rigorously perform comparative statics on general equilibrium economic statistics without assumptions that go beyond simple utility maximization. From my read of Walras and the early general equilibrium theorists, all three of those results would have come as a real shock.

Let’s start at the beginning. There is an idea going back to Adam Smith and the invisible hand, an idea that individual action will, via the price system, lead to an increase or even maximization of economic welfare (as an aside, Smith’s own use of the “invisible hand” trope is overstated, as William Grampp among others has convincingly argued). The kind of people who denigrate modern economics – the neo-Marxists, the back-of-the-room scribblers, the wannabe-contrarian-dilettantes – see Arrow’s work, and the idea of using general equilibrium theory to “prove that markets work”, as a barbarism. We know, and have known well before Arrow, that externalities exist. We know, and have known well before Arrow, that the distribution of income depends on the distribution of endowments. What Arrow was interested in was examining not whether the invisible hand argument “is true, but whether it could be true”. That is, if we are to claim markets are uniquely powerful at organizing economic activity, we ought formally show that the market could work in such a manner, and understand the precise conditions under which it won’t generate these claimed benefits. How ought we do this? Prove the precise conditions under which there exists a price vector where markets clear, show the outcome satisfies some welfare criterion that is desirable, and note exactly why each of the conditions is necessary for such an outcome.

The question is, how difficult is it to prove these prices exist? The term “general equilibrium” has had many meanings in economics. Today, it is often used to mean “as opposed to partial equilibrium”, meaning that we consider economic effects allowing all agents to adjust to a change in the environment. For instance, a small randomized trial of guaranteed incomes has, as its primary effect, an impact on the incomes of the recipients; the general equilibrium effects on the labor market of making such a policy widespread will be much more difficult to discern. In the 19th and early 20th century, however, the term was much more concerned with the idea of the economy as a self-regulating system. Arrow put it very nicely in an encyclopedia chapter he wrote in 1966: general equilibrium is both “the simple notion of determinateness, that the relations which describe the economic system must form a system sufficiently complete to determine the values of its variables and…the more specific notion that each relation represents a balance of forces.”

If you were a classical economist, a Smith or a Marx or a Ricardo, the problem of what price will obtain in a market is simple to solve: ignore demand. Prices are implied by costs and a zero profit condition, essentially free entry. And we more or less think like this now in some markets. With free entry and every firm producing at the identical minimum efficient scale, price is entirely determined by the supply side, and only quantity is determined by demand. With one factor (labor, where the Malthusian condition plays the role of free entry) or with labor and land as in the Ricardian system, this classical model of value is well-defined. How to handle capital and differentiated labor is a problem to be assumed away, or handled informally; Samuelson has many papers where he is incensed by Marx’s handling of capital as embodied labor.

The French mathematical economist Leon Walras finally cracked the nut by introducing demand and price-taking. There are households who produce and consume. Equilibrium involves supply and demand equating in each market, so prices are determined where the margins along the supply and demand curves meet. Walras famously (and informally) proposed a method by which prices might actually reach equilibrium: the tatonnement. An auctioneer calls out a price vector: in some markets there is excess demand and in some excess supply. Prices are then adjusted one at a time. Of course each price change will affect excess demand and supply in other markets, but you might imagine things can “converge” if you adjust prices just right. Not bad for the 1870s – there is a reason Schumpeter calls this the “Magna Carta” of economic theory in his History of Economic Analysis. But Walras was mistaken on two counts: first, knowing whether there even exists an equilibrium that clears every market simultaneously is, it turns out, equivalent to a problem in Poincare’s analysis situs that was beyond the reach of mathematics in the 19th century, and second, the conditions under which tatonnement actually converges are a devilish problem.
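
As an illustration of the procedure (and not of Walras’s own mathematics), here is a tatonnement sketch for a small exchange economy with Cobb-Douglas consumers, a deliberately well-behaved case in which the process does converge:

```python
# Tatonnement sketch: 3 goods, 3 Cobb-Douglas consumers (an easy, well-behaved case).
import numpy as np

alpha = np.array([[0.5, 0.3, 0.2],    # consumer i spends share alpha[i, j] on good j
                  [0.2, 0.5, 0.3],
                  [0.3, 0.2, 0.5]])
endow = np.eye(3)                      # consumer i is endowed with one unit of good i

def excess_demand(p):
    income = endow @ p                               # value of each consumer's endowment
    demand = (alpha * income[:, None]) / p[None, :]  # Cobb-Douglas demands
    return demand.sum(axis=0) - endow.sum(axis=0)

p = np.array([1.0, 5.0, 0.2])          # arbitrary starting prices
for _ in range(2000):
    p = p + 0.05 * excess_demand(p)    # raise prices where demand exceeds supply
    p = p / p[0]                       # normalize: only relative prices matter

print(np.round(p, 3), np.round(excess_demand(p), 4))  # excess demand is essentially zero
```

Cobb-Douglas preferences satisfy gross substitutes, one of the known sufficient conditions for tatonnement stability; the trouble, as discussed below, is that nothing in individual rationality guarantees anything of the sort in general.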

The equilibrium existence problem is easy to understand. Take the simplest case, with all j goods made up of linear combinations of k factors. Demand equals supply just says that Aq=e, where q is the quantity of each good produced, e is the endowment of each factor, and A is the input-output matrix whereby product j is made up of some combination of factors k. Also, zero profit in every market will imply A′p(k)=p(j), where A′ is the transpose of A, p(k) are the factor prices, and p(j) the good prices. It was pointed out that even in this simple system where everything is linear, it is not at all trivial to ensure that prices and quantities are not negative. It would not be until Abraham Wald in the mid-1930s – later Arrow’s professor at Columbia and a fellow Romanian, links that are surely not a coincidence! – that formal conditions were shown giving existence of general equilibrium in a simple system like this one, though Wald’s proof greatly simplified the general problem by imposing implausible restrictions on aggregate demand.
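
A two-good, two-factor example (numbers mine) shows why simply counting equations and unknowns is not enough:

$$ A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, \qquad e = \begin{pmatrix} 1 \\ 4 \end{pmatrix} \quad\Rightarrow\quad Aq = e \text{ has the unique solution } q = \left(\tfrac{7}{3},\, -\tfrac{2}{3}\right), $$

a “solution” requiring negative production of the second good. Conditions like Wald’s, and later the fixed-point approach, are precisely what rule out such economically meaningless answers while still guaranteeing that some equilibrium exists, possibly with certain goods not produced or priced at zero.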

Mathematicians like Wald, trained in the Vienna tradition, were aghast at the state of mathematical reasoning in economics at the time. Oskar Morgenstern absolutely hammered the great economist John Hicks in a 1941 review of Hicks’ Value and Capital, particularly over the crazy assertion (similar to Walras!) that the number of unknowns and equations being identical in a general equilibrium system sufficed for a solution to exist (if this isn’t clear to you in a nonlinear system, a trivial example with two equations and two unknowns is here). Von Neumann apparently said (p. 85) to Oskar, in reference to Hicks and those of his school, “if those books are unearthed a hundred years hence, people will not believe they were written in our time. Rather they will think they are about contemporary with Newton, so primitive is the mathematics.” And Hicks was quite technically advanced compared to his contemporary economists, bringing the Keynesian macroeconomics and the microeconomics of indifference curves and demand analysis together masterfully. Arrow and Hahn even credit their initial interest in the problems of general equilibrium to the serendipity of coming across Hicks’ book.

Mathematics had advanced since Walras, however, and those trained at the mathematical frontier finally had the tools to tackle Walras’ problem seriously. Let D(p) be a vector of demand for all goods given price p, and e be initial endowments of each good. Then we simply need D(p)=e or D(p)-e=0 in each market. To make things a bit harder, we can introduce intermediate and factor goods with some form of production function, but the basic problem is the same: find whether there exists a vector p such that a nonlinear equation is equal to zero. This is the mathematics of fixed points, and Brouwer had, in 1912, given a nice theorem: every continuous function from a compact convex subset to itself has a fixed point. Von Neumann used this in the 1930s to prove a similar result to Wald. A mathematician named Shizuo Kakutani, inspired by von Neumann, extended the Brouwer result to set-valued mappings called correspondences, and John Nash in 1950 used that result to show, in a trivial proof, the existence of mixed equilibria in noncooperative games. The math had arrived: we had the tools to formally state when non-trivial non-linear demand and supply systems had a fixed point, and hence a price that cleared all markets. We further had techniques for handling “corner solutions” where demand for a given good was zero at some price, surely a common outcome in the world: the idea of the linear program and complementary slackness, and its origin in convex set theory as applied to the dual, provided just the mathematics Arrow and his contemporaries would need.

So here we stood in the early 1950s. The mathematical conditions necessary to prove that a set-valued function has an equilibrium have been worked out. Hicks, in Value and Capital, has given Arrow the idea that relating the future to today is simple: just put a date on every commodity and enlarge the commodity space. Indeed, adding state-contingency is easy: put an index for state in addition to date on every commodity. So we need not only zero excess demand in apples, or in apples delivered in May 1955, but in apples delivered in May 1955 if Eisenhower loses his reelection bid. Complex, it seems, but no matter: the conditions for the existence of a fixed point will be the same in this enlarged commodity space.

With these tools in mind, Arrow and Debreu can begin their proof. They first define a generalization of an n-person game where the feasible set of actions for each player depends on the actions of every other player; think of the feasible set as “what can I afford given the prices that will result for the commodities I am endowed with?” Each player’s action is a vector specifying how much of each date- and state-indexed commodity to buy. Debreu showed in a 1952 PNAS paper that these generalized games have an equilibrium as long as each payoff function varies continuously with other players’ actions, the feasible set of choices is convex and varies continuously with other players’ actions, and the set of actions which improve a player’s payoff is convex for every action profile. Arrow and Debreu then show that the usual assumptions on individual demand are sufficient to aggregate up to the conditions Debreu’s earlier paper requires. This method is much, much different from what is done by McKenzie or other early general equilibrium theorists: excess demand is never taken as a primitive. This allows the Arrow-Debreu proof to provide substantial economic intuition, as Duffie and Sonnenschein point out in a 1989 JEL. For instance, showing that the Arrow-Debreu equilibrium exists even with taxation is trivial using their method but much less so in methods that begin with excess demand functions.

This is already quite an accomplishment: Arrow and Debreu have shown that there exists a price vector that clears all markets simultaneously. The nature of their proof, as later theorists would point out, relies less on convexity of preferences and production sets than on the fact that every agent is “small” relative to the market (convexity is used to get continuity in the Debreu game, and you can get this equally well by making all consumers infinitesimal and then randomizing allocations to smooth things out; see Duffie and Sonnenschein above for an example). At this point, it’s the mid-1950s, heyday of the Neoclassical synthesis: surely we want to be able to answer questions like, when there is a negative demand shock, how will the economy best reach a Pareto-optimal equilibrium again? How do different speeds of adjustment due to sticky prices or other frictions affect the rate at which the optimum is regained? Those types of question implicitly assume that the equilibrium is unique (at least locally) so that we actually can “return” to where we were before the shock. And of course we know some of the assumptions needed for the Arrow-Debreu proof are unrealistic – e.g., no fixed costs in production – but we would at least like to work out how to manipulate the economy in the “simple” case before figuring out how to deal with those issues.

Here is where things didn’t work out as hoped. Uzawa (RESTUD, 1960) proved that not only could Brouwer’s theorem be used to prove the existence of general equilibrium, but that the opposite was true as well: the existence of general equilibrium is logically equivalent to Brouwer’s theorem. A result like this certainly makes one worry about how much one could say about prices in general equilibrium. The 1970s brought us the Sonnenschein-Mantel-Debreu “Anything Goes” theorem: aggregate excess demand functions do not inherit all the properties of individual excess demand functions because of wealth effects (when relative prices change, the value of one’s endowment changes as well). For any aggregate excess demand function satisfying a couple of minor restrictions, there exists an economy with rational individual preferences generating that function; aggregate excess demand therefore satisfies far fewer restrictions than individual excess demand derived from individual preference maximization. This tells us, importantly, that there is no generic reason for equilibria to be unique in an economy.

Multiplicity of equilibria is a problem: if the goal of GE was to be able to take underlying primitives like tastes and technology, calculate “the” prices that clear the market, then examine how those prices change (“comparative statics”), we essentially lose the ability to do all but local comparative statics since large changes in the environment may cause the economy to jump to a different equilibrium (luckily, Debreu (1970, Econometrica) at least generically gives us a finite number of equilibria, so we may at least be able to say something about local comparative statics for very small shocks). Indeed, these analyses are tough without an equilibrium selection mechanism, which we don’t really have even now. Some would say this is no big deal: of course the same technology and tastes can generate many equilibria, just as cars may wind up all driving on either the left or the right in equilibrium. And true, all of the Arrow-Debreu equilibria are Pareto optimal. But it is still far afield from what might have been hoped for in the 1930s when this quest for a modern GE theory began.

Worse yet is stability, as Arrow and his collaborators (1958, Ecta; 1959, Ecta) would help discover. Even if we have a unique equilibrium, Herbert Scarf (IER, 1960) showed, via many simple examples, how Walrasian tatonnement can lead to cycles which never converge. Despite a great deal of intellectual effort in the 1960s and 1970s, we do not have a good model of price adjustment even now. I should think we are unlikely to ever have such a theory: as many theorists have pointed out, if we are in a period of price adjustment and not in an equilibrium, then the zero profit condition ought not apply, ergo why should there be “one” price rather than ten or a hundred or a thousand?
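
To see the problem concretely, here is a sketch in the spirit of Scarf’s examples, to the best of my recollection of their construction: three consumers with Leontief preferences, each endowed with one unit of a different good. The unique equilibrium has all prices equal, yet tatonnement circles around it rather than converging (contrast the well-behaved Cobb-Douglas case sketched earlier).

```python
# Sketch of a Scarf-type exchange economy where tatonnement fails to converge.
# Consumer i holds one unit of good i and wants goods i and i+1 (mod 3) in
# fixed 1:1 proportions (Leontief preferences).
import numpy as np

def excess_demand(p):
    z = np.zeros(3)
    for i in range(3):
        j = (i + 1) % 3
        x = p[i] / (p[i] + p[j])   # quantity of goods i and j demanded by consumer i
        z[i] += x - 1.0            # demands x of good i, supplies 1 unit of it
        z[j] += x
    return z

spread = lambda q: q.max() - q.min()   # distance from the equal-price equilibrium

p = np.array([1.2, 1.0, 0.8])          # start near the unique equilibrium (1, 1, 1)
print("initial price spread:", round(spread(p), 3))
for _ in range(30000):
    p = p + 0.01 * excess_demand(p)    # raise prices where demand exceeds supply
print("price spread after 30,000 steps:", round(spread(p), 3))
# The spread does not die out: prices orbit the equilibrium instead of settling on it.
```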

The problem that multiplicity and instability pose for comparative static analysis ought be clear, but it should also be noted how problematic they are for welfare analysis. Consider the Second Welfare Theorem: under the Arrow-Debreu assumptions, for every Pareto optimal allocation, there exists an initial endowment of resources such that that allocation is an equilibrium. This is literally the main justification for the benefits of the market: if we reallocate endowments, free exchange can get us to any Pareto optimal point, ergo can get us to any reasonable socially optimal point no matter what social welfare function you happen to hold. How valid is this justification? Call x* the allocation that maximizes some social welfare function. Let e* be an initial endowment for which x* is an equilibrium outcome – such an endowment must exist via the Arrow-Debreu proof. Does endowing agents with e* guarantee we reach that social welfare maximum? No: x* may not be the unique equilibrium arising from e*. Even if it is unique, will we reach it? No: if it is not a stable equilibrium, it is only by dint of luck that our price adjustment process will ever reach it.

So let’s sum up. In the 1870s, Walras showed us that demand and supply, with agents as price takers, can generate supremely useful insights into the economy. Since demand matters, changes in demand in one market will affect other markets as well. If the price of apples rises, demand for pears will rise, as will their price, and that secondary effect should be accounted for in the market for apples. By the 1930s we have the beginnings of a nice model of individual choice based on constrained preference maximization. Taking prices as given, individual demands have well-defined forms, and excess demand in the economy can be computed by a simple summing up. So we now want to know: is there in fact a price that clears the market? Yes, Arrow and Debreu show, there is, and we needn’t assume anything strange about individual demand to generate this. These equilibrium prices always give Pareto optimal allocations, as had long been known, but there also always exist endowments such that every Pareto optimal allocation is an equilibrium. It is a beautiful and important result, and a triumph for the intuition of the invisible hand in its most formal sense.

Alas, it is there we reach a dead end. Individual preferences alone do not suffice to tell us which equilibrium we are at, nor that any equilibrium will be stable, nor that any equilibrium will be reached by an economically sensible adjustment process. To say anything meaningful about aggregate economic outcomes, or about comparative statics after modest shocks, or about how technological changes change prices, we need to make assumptions that go beyond individual rationality and profit maximization. This was, it seems to me, a shock for the economists of the middle of the century, and it is still a shock for many today. I do not think this means “general equilibrium is dead” or that the mathematical exploration in the field was a waste. We learned a great deal about precisely when markets could even in principle achieve the first best, and that education was critical for the work Arrow would later do on health care, innovation, and the environment, which I will discuss in the next two posts. And we needn’t throw out general equilibrium analysis because of uniqueness or stability problems, any more than we would throw out game theoretic analysis because of the same problems. But it does mean that individual rationality as the sole paradigm of economic analysis is dead: it is mathematically proven that postulates of individual rationality will not allow us to say anything of consequence about economic aggregates or game theoretic outcomes in the frequent scenarios where we do not have a unique equilibrium with a well-defined way to get there (via learning in games, or a tatonnement process in GE, or something of a similar nature). Arrow himself (1986, J. Business) accepts this: “In the aggregate, the hypothesis of rational behavior has in general no implications.” This is an opportunity for economists, not a burden, and we still await the next Arrow who can guide us on how to proceed.

Some notes on the literature: For those interested in the theoretical development of general equilibrium, I recommend General Equilibrium Analysis by Roy Weintraub, a reformed theorist who now works in the history of thought. Wade Hands has a nice review of the neoclassical synthesis and the ways in which Keynesianism and GE analysis were interrelated. On the battle for McKenzie to be credited alongside Arrow and Debreu, and the potentially scandalous way Debreu may have secretly been responsible for the Arrow and Debreu paper being published first, see the fine book Finding Equilibrium by Weintraub and Duppe; both Debreu and McKenzie have particularly wild histories. Till Duppe, a scholar of Debreu, also has a nice paper in the JHET on precisely how Arrow and Debreu came to work together, and what the contribution of each to their famous ’54 paper was.

The Greatest Living Economist Has Passed Away: Notes on Kenneth Arrow Part I

It is amazing how quickly the titans of the middle of the century have passed. Paul Samuelson and his mathematization, Ronald Coase and his connection of law to economics, Gary Becker and his incorporation of choice into the full sphere of human behavior, John Nash and his formalization of strategic interaction, Milton Friedman and his defense of the market in the precarious post-war period, Robert Fogel and his cliometric revolution: the remaining titan was Kenneth Arrow, the only living economist who could have won a second Nobel Prize without a whit of complaint from the gallery. These figures ruled as economics grew from a minor branch of moral philosophy into the most influential, most prominent, and most advanced of the social sciences. It is hard to imagine our field will ever again have such a collection of scholars rise in one generation, and with the tragic news that Ken has now passed away as well, we have, with great sadness and great rapidity, lost the full set.

Though he was 95 years old, Arrow was still hard at work; his paper with Kamran Bilir and Alan Sorensen was making its way around the conference circuit just last year. And beyond incredible productivity, Arrow had a legendary openness with young scholars. A few years ago, a colleague and I were debating a minor point in the history of economic thought, one that Arrow had played some role in; with the debate deadlocked, it was suggested that I simply email the protagonist himself to learn the truth. No reply came; perhaps no surprise, given how busy he was and how unknown I was. Imagine my surprise when, two months later, a large manila envelope showed up in my mailbox at Northwestern, with a four page letter Ken had written inside! Going beyond a simple answer, he patiently walked me through his perspective on the entire history of mathematical economics, the relative centrality of folks like Wicksteed and Edgeworth to the broader economic community, the work he did under Hotelling and the Cowles Commission, and the nature of formal logic versus price theory. Mind you, this was his response to a complete stranger.

This kindness extended beyond budding economists: Arrow was a notorious generator of petitions on all kinds of social causes, and remained so late in life, signing the Economists Against Trump letter that many of us supported last year. You will be hard-pressed to find an open letter or amicus curiae brief, on any issue from copyright term extension to the use of nuclear weapons, which Arrow was unaware of. The Duke Library holds the papers of both Arrow and Paul Samuelson – famously they became brothers-in-law – and the frequency with which their correspondence involves this petition or that, with Arrow in general the instigator and Samuelson the deflector, is unmistakable. I recall a great series of letters where Arrow queried Samuelson as to who had most deserved the Nobel but had died too early to receive it. Arrow at one point proposed Joan Robinson, which sent Samuelson into convulsions. “But she was a communist! And besides, her theory of imperfect competition was subpar.” You get the feeling in these letters of Arrow making gentle comments and rejoinders while Samuelson exercises his fists in the way he often did when battling everyone from Friedman to the Marxists at Cambridge to (worst of all, for Samuelson) those who were ignorant of their history of economic thought. Their conversation goes way back: you can find in one of the Samuelson boxes his recommendation that the University of Michigan bring in this bright young fellow named Arrow, a missed chance the poor Wolverines must still regret!

Arrow is so influential, in so many areas of economics, that it is simply impossible to discuss his contributions in a single post. For this reason, I will break the post into four parts, with one posted each day this week. We’ll look at Arrow’s work in choice theory today, his work on general equilibrium tomorrow, his work on innovation on Thursday, and some selected topics where he made seminal contributions (the economics of the environment, the principal-agent problem, and the economics of health care, in particular) on Friday. I do not lightly say that Arrow was the greatest living economist, and in my reckoning second only to Samuelson for the title of greatest economist of all time. Arrow wrote the foundational paper of general equilibrium analysis, the foundational paper of social choice and voting, the foundational paper justifying government intervention in innovation, and the foundational paper in the economics of health care. His legacy is the greatest legacy possible for the mathematical approach pushed by the Cowles Commission, the Econometric Society, Irving Fisher, and the mathematician-cum-economist Harold Hotelling. And so it is there that we must begin.

Arrow was born in New York City, a CCNY graduate like many children of the Great Depression, who went on to study mathematics in graduate school at Columbia. Economics in the United States in the 1930s was not a particularly mathematical science. The formalism of von Neumann, the late-life theoretical conversion of Schumpeter, Samuelson’s Foundations, and the soft nests at Cowles and the Econometric Society were in their infancy.

The usual story is that Arrow’s work on social choice came out of his visit to RAND in 1948. But this misstates the intellectual history: Arrow’s actual inspiration came from his engagement with a new form of mathematics, the expansion of formal logic beginning with people like Peirce and Boole. While a high school student, Arrow read Bertrand Russell’s text on mathematical logic, and was enthused with the way that set theory permitted logic to go well beyond the syllogisms of the Greeks. What a powerful tool for the generation of knowledge! In his senior year at CCNY, Arrow took the advanced course on relational logic taught by Alfred Tarski, where the eminent philosopher took pains to reintroduce the ideas of Charles Sanders Peirce, the greatest yet most neglected American philosopher. The idea of a relation is familiar to economists: specify some links between elements of a set (e.g., xRy and yRz) and some properties of the relation (e.g., that it is well-ordered), and you can then perform logical operations on the relation to derive further properties. Every trained economist sees an example of this when first learning about choice and utility, but of course things like “greater than” and “less than” are relations as well. In 1940, one would have had to be extraordinarily lucky to encounter this theory: Tarski’s own books were not even translated.

But what great training this would be! For Arrow joined a graduate program in mathematical statistics at Columbia, where one of the courses was taught by Hotelling from the economics department. Hotelling was an ordinalist, rare in those days, and taught his students demand theory from a rigorous basis in ordinal preferences. But what are ordinal preferences? Simply relations with certain properties! Armed with this framework and a statistician’s knack for writing proofs using inequalities, Arrow greatly impressed Hotelling, and he switched to a PhD in economics, drawing inspiration from the then-new subfield of mathematical economics that Hotelling, Samuelson, and Hicks were helping to expand.

After his wartime service doing operations research related to weather and flight planning, and a two year detour into capital theory with little to show for it, Arrow took a visiting position at the Cowles Commission, a center of research in mathematical economics then housed at the University of Chicago. In 1948, Arrow spent the summer at RAND, still yet to complete his dissertation, or even to hit upon a worthwhile idea for one. RAND in Santa Monica was the world center for applied game theory: philosophers, economists, and mathematicians prowled the halls working through the technical basics of zero-sum games, but also the application of strategic decision theory to problems of serious global importance. Arrow had been thinking about voting a bit, and had written a draft of a paper, similar to Duncan Black’s 1948 JPE article, essentially suggesting that majority voting “works” when preferences are single-peaked; that is, if everyone can rank options from “left to right”, and voters simply differ on which point is their “peak” of preference, then majority voting reflects individual preferences in a formal sense (a brute-force check of this claim is sketched below). At RAND, the philosopher Olaf Helmer pointed out that a similar concern mattered in international relations: how are we to say that the Soviet Union or the United States has preferences? They are collections of individuals, not individuals themselves.
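
Black’s observation is easy to check by brute force. A small sketch, using distance from an ideal point as one simple way of generating single-peaked preferences over options arranged left to right:

```python
# Sketch: with single-peaked preferences over a left-right line, pairwise
# majority voting has a well-defined winner -- the median voter's peak.
import random

positions = range(5)                     # options arranged left to right
random.seed(1)
peaks = [random.choice(list(positions)) for _ in range(101)]  # 101 voters' ideal points

def prefers(peak, a, b):
    """Single-peaked: a voter prefers whichever option is closer to her peak."""
    return abs(a - peak) < abs(b - peak)

def majority_winner(options, peaks):
    """Return an option that beats every other option in a pairwise majority vote."""
    for a in options:
        if all(sum(prefers(pk, a, b) for pk in peaks) >
               sum(prefers(pk, b, a) for pk in peaks)
               for b in options if b != a):
            return a
    return None

median_peak = sorted(peaks)[len(peaks) // 2]
print(majority_winner(list(positions), peaks), median_peak)  # the two coincide
```

With an odd number of voters, the option at the median voter’s peak beats every alternative in a pairwise vote, which is why majority rule behaves like a coherent social preference on this restricted domain.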

Right, Arrow agreed. But economists had thought about collective welfare, from Pareto to Bergson-Samuelson. The Bergson-Samuelson idea is simple. Let all individuals in society have preferences over states of the world. If we all prefer state A to state B, then the Pareto criterion suggests society should as well. Of course, tradeoffs are inevitable, so what are we to do? We could assume cardinal utility (e.g., “how much money are you willing to be paid to accept A if you prefer B to A and society goes toward A?”) as in the Kaldor-Hicks criterion (though the technically minded will know that Kaldor-Hicks does not define an order on states of the world, so isn’t really great for social choice). But let’s assume all people have is their own ordinal utility, their own rank-order of states, an order that is naturally hard to compare across people. Let’s assume for some pairs we have Pareto dominance: we all prefer A to C, and Q to L, and Z to X, but for other pairs there is no such dominance. A great theorem due to the Polish mathematician Szpilrajn, and I believe popularized among economists by Blackwell, says that if you have a transitive quasiorder R, then there exists a complete order R’ which extends it. In simple terms, if you can rank some pairs, and the pairs you do rank do not have any intransitivity, then you can generate a complete ranking of all pairs which respects the original incomplete ordering. Since individuals have transitive preferences, Pareto ranks are transitive, and hence we know there exist social welfare functions which “extend” Pareto. The implications of this are subtle: for instance, as I discuss in the link earlier in this paragraph, it implies that pure monetary egalitarianism can never be socially optimal even if the only requirement is to respect Pareto dominance.
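
In the finite case, the content of Szpilrajn’s theorem is exactly what a topological sort delivers: take the (transitive, acyclic) Pareto dominance relation and extend it to a complete ranking. A sketch using the made-up alternatives from above:

```python
# Sketch: extending an incomplete but transitive ranking (Pareto dominance)
# to a complete social ranking -- the finite-case content of Szpilrajn's theorem.
from graphlib import TopologicalSorter

# Pairs the Pareto criterion ranks: A beats C, Q beats L, Z beats X.
# graph[x] lists the alternatives that must sit below x in any extension.
pareto_dominates = {"A": {"C"}, "Q": {"L"}, "Z": {"X"},
                    "C": set(), "L": set(), "X": set()}

# static_order() returns a complete ordering consistent with every ranked pair.
complete_ranking = list(TopologicalSorter(pareto_dominates).static_order())
print(" < ".join(complete_ranking))   # one of many extensions; dominated options come first
```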

So aren’t we done? We know what it means, via Bergson-Samuelson, for the Soviet Union to “prefer” X to Y. But alas, Arrow was clever and attacked the problem from a different angle. Rather than taking the preference orderings of individuals as given and constructing a social ordering, he asked whether there is any mechanism for constructing a social ordering from arbitrary individual preferences that satisfies certain criteria. For instance, you may want to rule out a rule that says “whatever Kevin prefers most is what society prefers, no matter what other preferences are” (non-dictatorship). You may want to require Pareto dominance to be respected, so that if everyone likes A more than B, A must be chosen (Pareto criterion). You may want to ensure that “irrelevant options” do not matter, so that if giving an option to choose “orange” in addition to “apple” and “pear” does not affect any individual’s ranking of apples versus pears, then the orange option also oughtn’t affect society’s ranking of apples versus pears (IIA). Arrow famously proved that if we do not restrict what types of preferences individuals may have over social outcomes, there is no system that can rank outcomes socially and still satisfy those three criteria. It had been known since Condorcet in the 18th century that majority voting suffers a problem of this sort, but the general impossibility was an incredible breakthrough, and a straightforward one once Arrow was equipped with the ideas of relational logic.
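The Condorcet problem is easy to see in a few lines of code. Here is a minimal sketch using the standard three-voter cyclic profile (coded up by me purely for illustration): every pairwise majority vote is decisive, yet the resulting majority relation cycles and therefore cannot be a transitive social ordering.

```python
# Condorcet's paradox in miniature: pairwise majority voting over three
# options yields an intransitive cycle for this classic profile.
voters = [
    ["A", "B", "C"],   # voter 1: A > B > C
    ["B", "C", "A"],   # voter 2: B > C > A
    ["C", "A", "B"],   # voter 3: C > A > B
]

def majority_prefers(x, y):
    """True if a majority of voters ranks x above y."""
    return sum(r.index(x) < r.index(y) for r in voters) > len(voters) / 2

# A beats B, B beats C, and yet C beats A: the majority relation cycles,
# so no transitive social ranking can be read off these pairwise votes.
print(majority_prefers("A", "B"))   # True
print(majority_prefers("B", "C"))   # True
print(majority_prefers("C", "A"))   # True
```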

It was with this result, in the 1951 book-length version of the idea, that social choice as a field distinct from welfare economics really took off. It is a startling result in two ways. First, in pure political theory, it rather simply killed off two centuries of blather about what the “best” voting system was: majority rule, Borda counts, rank-order voting, or whatever you like, every system must violate at least one of the Arrow axioms. And indeed, subsequent work has shown that the axioms can be relaxed and still generate impossibility. In the end, we do need to make social choices, so what should we go with? If you’re Amartya Sen, drop the Pareto condition. Others have quibbled with IIA. The point is that there is no right answer. The second startling implication is that welfare economics may be on pretty rough footing. Kaldor-Hicks conditions, which in practice motivate all sorts of regulatory decisions in our society, both rely on the assumption of cardinal or interpersonally comparable utility and fail to generate an order over social options. Any Bergson-Samuelson social welfare function, a really broad class, must violate some pretty natural conditions on how it treats “equivalent” people (see, e.g., Kemp and Ng 1976). One wonders whether we are back in the pre-Samuelson state where, beyond Pareto dominance, we can’t say much with any rigor about whether something is “good” or “bad” for society without dictatorially imposing our ethical standard, individual preferences be damned. Arrow’s theorem is a remarkable achievement for a man as young as he was when he conceived it, one of those rare philosophical ideas that will enter the canon alongside the categorical imperative or Hume on induction, an idea that will without question be read and considered decades and centuries hence.
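As one concrete instance of the “every system violates something” claim, here is a small sketch, with a two-voter profile I constructed purely for illustration, showing the Borda count violating IIA: each voter’s ranking of A versus B is identical across the two profiles, only the position of C changes, and yet society’s ranking of A versus B flips.

```python
# Borda count violating Arrow's IIA (a two-voter example of my own):
# each voter's A-versus-B comparison is the same in both profiles, only C
# moves, yet the social ranking of A versus B reverses.

def borda_scores(profile):
    """Score 2/1/0 for first/second/third place and sum across voters."""
    scores = {c: 0 for c in "ABC"}
    for ranking in profile:
        for points, candidate in zip((2, 1, 0), ranking):
            scores[candidate] += points
    return scores

profile_1 = [["A", "B", "C"], ["B", "C", "A"]]   # voter 1, voter 2
profile_2 = [["A", "C", "B"], ["C", "B", "A"]]   # same A-vs-B views, C moved

print(borda_scores(profile_1))   # {'A': 2, 'B': 3, 'C': 1} -> B above A
print(borda_scores(profile_2))   # {'A': 2, 'B': 1, 'C': 3} -> A above B
```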

Some notes to wrap things up:

1) Most call the result “Arrow’s Impossibility Theorem”. After all, he did prove the impossibility of a certain form of social choice. But Tjalling Koopmans actually convinced Arrow to call the theorem a “Possibility Theorem” out of pure optimism. Proof that the author rarely gets to pick the eventual name!

2) The confusion between Arrow’s theorem and the existence of social welfare functions in Samuelson has a long and interesting history: see this recent paper by Herrada Igersheim. Essentially, as I’ve tried to make clear in this post, Arrow’s result does not prove that Bergson-Samuelson social welfare functions do not exist, but rather implicitly imposes conditions on the indifference curves which underlie the B-S function. Much more detail in the linked paper.

3) So what is society to do in practice given Arrow? How are we to decide? There is much to recommend in Posner and Weyl’s quadratic voting when preferences can be assumed to have some sort of interpersonally comparable cardinal structure, yet are unknown. When interpersonal comparisons are impossible and we do not know people’s preferences, the famous Gibbard-Satterthwaite Theorem says that no non-dictatorial voting rule choosing among three or more options can prevent people from sometimes having an incentive to vote strategically (a concrete sketch follows after these notes). We might then ask: OK, fine, what voting or social choice system works “best” (e.g., satisfies some desiderata) over the broadest possible sets of individual preferences? Partha Dasgupta and Eric Maskin recently proved that, in fact, good old-fashioned majority voting works best! But the true answer as to the “best” voting system depends on the distribution of underlying preferences you expect to see; it is a far less simple question than it appears.

4) The conditions I gave above for Arrow’s Theorem are actually different from the 5 conditions in the original 1950 paper. The reason is that Arrow’s original proof is actually incorrect, as shown by Julian Blau in a 1957 Econometrica. The basic insight of the proof is of course salvageable.

5) Among the more beautiful simplifications of Arrow’s proof is Phil Reny’s “side by side” proof of Arrow and Gibbard-Satterthwaite, where he shows just how related the underlying logic of the two concepts is.
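Returning to note 3, here is a minimal sketch of the kind of strategic voting Gibbard-Satterthwaite guarantees must exist, using a four-candidate Borda count and a profile I constructed for illustration: voter 1 prefers A to B, B wins when everyone reports truthfully, and voter 1 can make A win by burying B at the bottom of their reported ballot.

```python
# A Gibbard-Satterthwaite-style manipulation under a four-candidate Borda
# count (my own illustrative profile): voter 1 strictly prefers A to B,
# B wins under truthful reporting, and voter 1 gains by misreporting.

def borda_winner(profile):
    """Borda count with points 3/2/1/0; returns (winner, score dict)."""
    scores = {c: 0 for c in "ABCD"}
    for ranking in profile:
        for points, candidate in zip((3, 2, 1, 0), ranking):
            scores[candidate] += points
    return max(scores, key=scores.get), scores

truthful = [
    list("ABCD"),   # voter 1's true ranking: A > B > C > D
    list("BCDA"),   # voter 2
    list("ABDC"),   # voter 3
]
manipulated = [list("ACDB")] + truthful[1:]   # voter 1 buries B last

print(borda_winner(truthful))      # B wins with 7 points
print(borda_winner(manipulated))   # A wins with 6 points: voter 1 is better off
```

The profile is cooked up to make the manipulation clean, but the theorem says some such profile must exist for any non-dictatorial rule over three or more options.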

We turn to general equilibrium theory tomorrow. And if it seems excessive to need four days to cover, even in part, the work of one man, that is only because even four days understates the breadth of his contributions. Like Samuelson’s obscure knowledge of Finnish ministers, which I recounted earlier this year, Arrow’s breadth of knowledge was also notorious. There is a story Eric Maskin has claimed to be true, where some of Arrow’s junior colleagues wanted to finally stump the seemingly all-knowing Arrow. They all studied the mating habits of whales for days, and then, when Arrow was coming down the hall, faked a vigorous discussion on the topic. Arrow stopped and turned, remaining silent at first. The colleagues had found a topic he didn’t fully know! Finally, Arrow interrupted: “But I thought Turner’s theory was discredited by Spenser, who showed that the supposed homing mechanism couldn’t possibly work!” And even this intellectual feat hardly matches Arrow’s well-known habit of sleeping through the first half of seminars, waking up to make the most salient point of the whole lecture, then falling back asleep again (as averred by, among others, my colleague Joshua Gans, a former student of Ken’s).
