Category Archives: Classics

Kenneth Arrow Part II: The Theory of General Equilibrium

The first post in this series discussed Ken Arrow’s work in the broad sense, with particular focus on social choice. In this post, we will dive into his most famous accomplishment, the theory of general equilibrium (1954, Econometrica). I beg the reader to offer some sympathy for the approximations and simplifications that will appear below: the history of general equilibrium is, by this point, well-trodden ground for historians of thought, and the interpretation of history and theory in this area is quite contentious.

My read of the literature on GE following Arrow is as follows. First, the theory of general equilibrium is an incredible proof that markets can, in theory and in certain cases, work as efficiently as an all-powerful planner. That said, the three other hopes of general equilibrium theory since the days of Walras are, in fact, disproven by the work of Arrow and his followers. Market forces will not necessarily lead us toward these socially optimal equilibrium prices. Walrasian demand does not have empirical content derived from basic ordinal utility maximization. We cannot rigorously perform comparative statics on general equilibrium economic statistics without assumptions that go beyond simple utility maximization. From my read of Walras and the early general equilibrium theorists, all three of those results would be a real shock.

Let’s start at the beginning. There is an idea going back to Adam Smith and the invisible hand, an idea that individual action will, via the price system, lead to an increase or even maximization of economic welfare (as an aside, Smith’s own use of the “invisible hand” trope is overstated, as William Grampp among others has convincingly argued). The kind of people who denigrate modern economics – the neo-Marxists, the back-of-the-room scribblers, the wannabe-contrarian-dilettantes – see Arrow’s work, and the idea of using general equilibrium theory to “prove that markets work”, as a barbarism. We know, and have known well before Arrow, that externalities exist. We know, and have known well before Arrow, that the distribution of income depends on the distribution of endowments. What Arrow was interested in was examining “not only whether the invisible hand argument is true, but whether it could be true”. That is, if we are to claim markets are uniquely powerful at organizing economic activity, we ought formally show that the market could work in such a manner, and understand the precise conditions under which it won’t generate these claimed benefits. How ought we do this? Prove the precise conditions under which there exists a price vector where markets clear, show the outcome satisfies some welfare criterion that is desirable, and note exactly why each of those conditions is necessary for such an outcome.

The question is, how difficult is it to prove these prices exist? The term “general equilibrium” has had many meanings in economics. Today, it is often used to mean “as opposed to partial equilibrium”, meaning that we consider economic effects allowing all agents to adjust to a change in the environment. For instance, a small random trial of guaranteed incomes has, as its primary effect, an impact on the incomes of the recipients; the general equilibrium effects of making such a policy widespread on the labor market will be difficult to discern. In the 19th and early 20th century, however, the term was much more concerned with the idea of the economy as a self-regulating system. Arrow put it very nicely in an encyclopedia chapter he wrote in 1966: general equilibrium is both “the simple notion of determinateness, that the relations which describe the economic system must form a system sufficiently complete to determine the values of its variables and…the more specific notion that each relation represents a balance of forces.”

If you were a classical, a Smith or a Marx or a Ricardo, the problem of what price will obtain in a market is simple to solve: ignore demand. Prices are implied by costs and a zero profit condition, essentially free entry. And we more or less think like this now in some markets. With free entry and every firm producing at the identical minimum efficient scale, price is entirely determined by the supply side, and only quantity is determined by demand. With one factor (labor, where the Malthusian condition plays the role of free entry), or with labor and land as in the Ricardian system, this classical model of value is well-defined. How to handle capital and differentiated labor is a problem to be assumed away, or handled informally; Samuelson has many papers where he is incensed by Marx’s handling of capital as embodied labor.

The French mathematical economist Leon Walras finally cracked the nut by introducing demand and price-taking. There are households who produce and consume. Equilibrium involves supply and demand equating in each market, hence price is where margins along the supply and demand curves equate. Walras famously (and informally) proposed a method by which prices might actually reach equilibrium: the tatonnement. An auctioneer calls out a price vector: in some markets there is excess demand and in some excess supply. Prices are then adjusted one at a time. Of course each price change will affect excess demand and supply in other markets, but you might imagine things can “converge” if you adjust prices just right. Not bad for the 1870s – there is a reason Schumpeter calls this the “Magna Carta” of economic theory in his History of Economic Analysis. But Walras was mistaken on two counts: first, knowing whether there even exists an equilibrium that clears every market simultaneously is, it turns out, equivalent to a problem in Poincare’s analysis situs beyond the reach of mathematics in the 19th century, and second, the conditions under which tatonnement actually converges are a devilish problem.

The equilibrium existence problem is easy to understand. Take the simplest case, with all j goods made up of linear combinations of k factors. Demand equals supply just says that Aq=e, where q is the quantity of each good produced, e is the endowment of each factor, and A is the input-output matrix whereby product j is made up of some combination of factors k. Also, zero profit in every market will imply A’p(k)=p(j), where A’ is the transpose of A, p(k) are the factor prices, and p(j) the good prices. It was pointed out that even in this simple system where everything is linear, it is not at all trivial to ensure that prices and quantities are not negative. It would not be until Abraham Wald in the mid-1930s – later Arrow’s professor at Columbia, and a Romanian like Arrow’s own parents, links that are surely not a coincidence! – that formal conditions were shown giving existence of general equilibrium in a simple system like this one, though Wald’s proof greatly simplified the general problem by imposing implausible restrictions on aggregate demand.
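The difficulty is easy to see numerically. Here is a minimal sketch (the input-output matrix and endowments are invented purely for illustration): a perfectly square linear system Aq=e, with exactly as many equations as unknowns, whose unique solution nonetheless demands a negative quantity of one good.

```python
# Two factors, two goods: A[i][j] = units of factor i needed per unit of good j.
# Counting equations and unknowns says nothing about economic sense: the
# unique solution of A q = e below has a negative quantity.
A = [[1.0, 2.0],
     [2.0, 1.0]]
e = [1.0, 4.0]  # endowments of the two factors

# Solve the 2x2 system by Cramer's rule.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
q1 = (e[0] * A[1][1] - A[0][1] * e[1]) / det
q2 = (A[0][0] * e[1] - e[0] * A[1][0]) / det

print(q1, q2)  # q1 > 0 but q2 < 0: "solving the equations" is not enough
```

Wald's contribution, in this framing, was to give conditions under which solutions like q2 cannot go negative.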

Mathematicians like Wald, trained in the Vienna tradition, were aghast at the state of mathematical reasoning in economics at the time. Oskar Morgenstern absolutely hammered the great economist John Hicks in a 1941 review of Hicks’ Value and Capital, particularly over the crazy assertion (similar to Walras!) that the number of unknowns and equations being identical in a general equilibrium system sufficed for a solution to exist (if this isn’t clear to you in a nonlinear system, consider the pair x^2+y^2=0 and x+y=1: two equations, two unknowns, and no real solution). Von Neumann apparently said (p. 85) to Oskar, in reference to Hicks and those of his school, “if those books are unearthed a hundred years hence, people will not believe they were written in our time. Rather they will think they are about contemporary with Newton, so primitive is the mathematics.” And Hicks was quite technically advanced compared to his contemporary economists, bringing Keynesian macroeconomics and the microeconomics of indifference curves and demand analysis together masterfully. Arrow and Hahn even credit their initial interest in the problems of general equilibrium to the serendipity of coming across Hicks’ book.

Mathematics had advanced since Walras, however, and those trained at the mathematical frontier finally had the tools to tackle Walras’ problem seriously. Let D(p) be a vector of demand for all goods given price p, and e be initial endowments of each good. Then we simply need D(p)=e or D(p)-e=0 in each market. To make things a bit harder, we can introduce intermediate and factor goods with some form of production function, but the basic problem is the same: find whether there exists a vector p such that a nonlinear equation is equal to zero. This is the mathematics of fixed points, and Brouwer had, in 1912, given a nice theorem: every continuous function from a compact convex subset to itself has a fixed point. Von Neumann used this in the 1930s to prove a similar result to Wald. A mathematician named Shizuo Kakutani, inspired by von Neumann, extended the Brouwer result to set-valued mappings called correspondences, and John Nash in 1950 used that result to show, in a trivial proof, the existence of mixed equilibria in noncooperative games. The math had arrived: we had the tools to formally state when non-trivial non-linear demand and supply systems had a fixed point, and hence a price that cleared all markets. We further had techniques for handling “corner solutions” where demand for a given good was zero at some price, surely a common outcome in the world: the idea of the linear program and complementary slackness, and its origin in convex set theory as applied to the dual, provided just the mathematics Arrow and his contemporaries would need.
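To make the modern problem statement concrete, here is a toy sketch of finding the market-clearing price in a two-good exchange economy (the Cobb-Douglas preferences and endowments are my own invented example, not anything from Arrow and Debreu). In this special case excess demand is continuous and monotone, so simple bisection finds the clearing price; the point of Brouwer and Kakutani was to guarantee such a price exists even when nothing this nice holds.

```python
# Two goods, two Cobb-Douglas consumers. Consumer i spends share a[i] of her
# income on good 1. Consumer 1 is endowed with one unit of good 1, consumer 2
# with one unit of good 2. Normalize the price of good 2 to 1.
a = [0.3, 0.4]

def excess_demand_good1(p1):
    income = [p1 * 1.0, 1.0]  # market value of each consumer's endowment
    demand = sum(a[i] * income[i] / p1 for i in range(2))
    return demand - 1.0       # aggregate demand minus total endowment of good 1

# Excess demand here is strictly decreasing in p1, so bisect for its zero.
lo, hi = 1e-6, 100.0
for _ in range(100):
    mid = (lo + hi) / 2
    if excess_demand_good1(mid) > 0:
        lo = mid
    else:
        hi = mid

p_star = (lo + hi) / 2
print(p_star)  # the clearing relative price; analytically 0.4/0.7 here
```

By Walras' law, once the market for good 1 clears at p_star, the market for good 2 clears automatically.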

So here we stood in the early 1950s. The mathematical conditions necessary to prove that a set-valued function has an equilibrium have been worked out. Hicks, in Value and Capital, has given Arrow the idea that relating the future to today is simple: just put a date on every commodity and enlarge the commodity space. Indeed, adding state-contingency is easy: put an index for state in addition to date on every commodity. So we need not only zero excess demand in apples, or in apples delivered in May 1955, but in apples delivered in May 1955 if Eisenhower loses his reelection bid. Complex, it seems, but no matter: the conditions for the existence of a fixed point will be the same in this enlarged commodity space.
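Mechanically, enlarging the commodity space is nothing more than re-indexing: a "commodity" becomes a (good, date, state) triple, each with its own price and its own market-clearing condition. A trivial sketch, with invented labels:

```python
from itertools import product

goods = ["apples", "wheat"]
dates = ["May 1955", "May 1956"]
states = ["Eisenhower wins", "Eisenhower loses"]

# Each (good, date, state) triple is a distinct commodity with its own price
# and its own zero-excess-demand condition.
commodities = list(product(goods, dates, states))
print(len(commodities))  # 2 * 2 * 2 = 8 markets instead of 2
```

The fixed-point machinery is indifferent to whether there are 2 markets or 8 or 8 million, which is exactly why the Hicksian trick is so powerful.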

With these tools in mind, Arrow and Debreu can begin their proof. They first define a generalization of an n-person game where the feasible set of actions for each player depends on the actions of every other player; think of the feasible set as “what can I afford given the prices that will result for the commodities I am endowed with?” The set of actions is an n-tuple where n is the number of date- and state-indexed commodities a player could buy. Debreu showed in a 1952 PNAS paper that these generalized games have an equilibrium as long as each payoff function varies continuously with other players’ actions, the feasible set of choices is convex and varies continuously in other players’ actions, and the set of actions which improve a player’s payoff is convex for every action profile. Arrow and Debreu then show that the usual implications on individual demand are sufficient to aggregate up to the conditions Debreu’s earlier paper requires. This method is much, much different from what is done by McKenzie or other early general equilibrium theorists: excess demand is never taken as a primitive. This allows the Arrow-Debreu proof to provide substantial economic intuition, as Duffie and Sonnenschein point out in a 1989 JEL. For instance, showing that the Arrow-Debreu equilibrium exists even with taxation is trivial using their method but much less so in methods that begin with excess demand functions.

This is already quite an accomplishment: Arrow and Debreu have shown that there exists a price vector that clears all markets simultaneously. The nature of their proof, as later theorists will point out, relies less on convexity of preferences and production sets than on the fact that every agent is “small” relative to the market (convexity is used to get continuity in the Debreu game, and you can get this equally well by making all consumers infinitesimal and then randomizing allocations to smooth things out; see Duffie and Sonnenschein above for an example). At this point, it’s the mid-1950s, heyday of the Neoclassical synthesis: surely we want to be able to answer questions like, when there is a negative demand shock, how will the economy best reach a Pareto-optimal equilibrium again? How do different speeds of adjustment due to sticky prices or other frictions affect the rate at which the optimum is regained? Those types of questions implicitly assume that the equilibrium is unique (at least locally) so that we actually can “return” to where we were before the shock. And of course we know some of the assumptions needed for the Arrow-Debreu proof are unrealistic – e.g., no fixed costs in production – but we would at least like to work out how to manipulate the economy in the “simple” case before figuring out how to deal with those issues.

Here is where things didn’t work out as hoped. Uzawa (RESTUD, 1960) proved that not only could Brouwer’s theorem be used to prove the existence of general equilibrium, but that the opposite was true as well: the existence of general equilibrium is logically equivalent to Brouwer’s theorem. A result like this certainly makes one worry about how much one could say about prices in general equilibrium. The 1970s brought us the Sonnenschein-Mantel-Debreu “Anything Goes” theorem: aggregate excess demand functions do not inherit all the properties of individual excess demand functions because of wealth effects (when relative prices change, the value of one’s endowment changes as well). For any aggregate excess demand function satisfying a couple of minor restrictions, there exists an economy with individual preferences generating that function; that is, aggregate excess demand obeys far fewer restrictions than individual excess demand derived from individual preference maximization. This tells us, importantly, that there is no generic reason for equilibria to be unique in an economy.

Multiplicity of equilibria is a problem: if the goal of GE was to be able to take underlying primitives like tastes and technology, calculate “the” prices that clear the market, and then examine how those prices change (“comparative statics”), we essentially lose the ability to do all but local comparative statics, since large changes in the environment may cause the economy to jump to a different equilibrium (luckily, Debreu (1970, Econometrica) at least generically gives us a finite number of equilibria, so we may at least be able to say something about local comparative statics for very small shocks). Indeed, these analyses are tough without an equilibrium selection mechanism, which we don’t really have even now. Some would say this is no big deal: of course the same technology and tastes can generate many equilibria, just as cars may wind up all driving on either the left or the right in equilibrium. And true, all of the Arrow-Debreu equilibria are Pareto optimal. But it is still far afield from what might have been hoped for in the 1930s when this quest for a modern GE theory began.

Worse yet is stability, as Arrow and his collaborators (1958, Ecta; 1959, Ecta) would help discover. Even if we have a unique equilibrium, Herbert Scarf (IER, 1960) showed, via many simple examples, how Walrasian tatonnement can lead to cycles which never converge. Despite a great deal of intellectual effort in the 1960s and 1970s, we do not have a good model of price adjustment even now. I should think we are unlikely to ever have such a theory: as many theorists have pointed out, if we are in a period of price adjustment and not in an equilibrium, then the zero profit condition ought not apply, ergo why should there be “one” price rather than ten or a hundred or a thousand?
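Scarf's instability is easy to see in simulation. The sketch below is my rendering of his symmetric example: three goods and three Leontief consumers, with consumer i endowed with one unit of good i and wanting goods i and i+1 in equal proportions. The unique equilibrium has all prices equal, yet discrete tatonnement (step size and starting prices are my own choices) orbits that equilibrium rather than approaching it.

```python
def excess_demand(p):
    # Consumer i, endowed with one unit of good i, demands goods i and i+1
    # in equal amounts: x = p[i] / (p[i] + p[i+1]) units of each.
    n = len(p)
    z = [-1.0] * n  # start from minus the unit endowment of each good
    for i in range(n):
        j = (i + 1) % n
        x = p[i] / (p[i] + p[j])
        z[i] += x
        z[j] += x
    return z

def gap(p):
    # Distance of normalized prices from the equilibrium ray p1 = p2 = p3.
    s = sum(p)
    return sum((pi / s - 1.0 / 3.0) ** 2 for pi in p) ** 0.5

p = [1.2, 0.9, 0.9]  # start near, but not at, the equilibrium (1, 1, 1)
start_gap = gap(p)

# Discrete tatonnement: raise each price in proportion to its excess demand.
for _ in range(20000):
    z = excess_demand(p)
    p = [pi + 0.01 * zi for pi, zi in zip(p, z)]

print(start_gap, gap(p))  # the gap from equilibrium does not shrink toward zero
```

Walras' law holds exactly along the whole path (the value of excess demand is zero at every price vector), yet the auctioneer never gets anywhere.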

The problem of multiplicity and instability for comparative static analysis ought be clear, but it should also be noted how problematic they are for welfare analysis. Consider the Second Welfare Theorem: under the Arrow-Debreu system, for every Pareto optimal allocation, there exists an initial endowment of resources such that that allocation is an equilibrium. This is literally the main justification for the benefits of the market: if we reallocate endowments, free exchange can get us to any Pareto optimal point, ergo can get us to any reasonable socially optimal point no matter what social welfare function you happen to hold. How valid is this justification? Call x* the allocation that maximizes some social welfare function. Let e* be an initial endowment for which x* is an equilibrium outcome – such an endowment must exist via Arrow-Debreu’s proof. Does endowing agents with e* guarantee we reach that social welfare maximum? No: x* may not be the unique equilibrium arising from e*. Even if it is unique, will we reach it? No: if it is not a stable equilibrium, it is only by dint of luck that our price adjustment process will ever reach it.

So let’s sum up. In the 1870s, Walras showed us that demand and supply, with agents as price takers, can generate supremely useful insights into the economy. Since demand matters, changes in demand in one market will affect other markets as well. If the price of apples rises, demand for pears will rise, as will their price, whose secondary effect should be accounted for in the market for apples. By the 1930s we have the beginnings of a nice model of individual choice based on constrained preference maximization. Taking prices as given, individual demands have well-defined forms, and excess demand in the economy can be computed by a simple summing up. So we now want to know: is there in fact a price that clears the market? Yes, Arrow and Debreu show, there is, and we needn’t assume anything strange about individual demand to generate this. These equilibrium prices always give Pareto optimal allocations, as had long been known, but, further, for every Pareto optimal allocation there exist endowments such that that allocation is an equilibrium. It is a beautiful and important result, and a triumph for the intuition of the invisible hand in its most formal sense.

Alas, it is there we reach a dead end. Individual preferences alone do not suffice to tell us what equilibria we are at, nor that any equilibria will be stable, nor that any equilibria will be reached by an economically sensible adjustment process. To say anything meaningful about aggregate economic outcomes, or about comparative statics after modest shocks, or about how technological changes change price, we need to make assumptions that go beyond individual rationality and profit maximization. This is, it seems to me, a shock for the economists of the middle of the century, and still a shock for many today. I do not think this means “general equilibrium is dead” or that the mathematical exploration in the field was a waste. We learned a great deal about precisely when markets could even in principle achieve the first best, and that education was critical for the work Arrow would later do on health care, innovation, and the environment, which I will discuss in the next two posts. And we needn’t throw out general equilibrium analysis because of uniqueness or stability problems, any more than we would throw out game theoretic analysis because of the same problems. But it does mean that individual rationality as the sole paradigm of economic analysis is dead: it is mathematically proven that postulates of individual rationality will not allow us to say anything of consequence about economic aggregates or game theoretic outcomes in the frequent scenarios where we do not have a unique equilibrium with a well-defined way to get there (via learning in games, or a tatonnement process in GE, or something of a similar nature). Arrow himself (1986, J. Business) accepts this: “In the aggregate, the hypothesis of rational behavior has in general no implications.” This is an opportunity for economists, not a burden, and we still await the next Arrow who can guide us on how to proceed.

Some notes on the literature: For those interested in the theoretical development of general equilibrium, I recommend General Equilibrium Analysis by Roy Weintraub, a reformed theorist who now works in the history of thought. Wade Hands has a nice review of the neoclassical synthesis and the ways in which Keynesianism and GE analysis were interrelated. On the battle for McKenzie to be credited alongside Arrow and Debreu, and the potentially scandalous way Debreu may have secretly been responsible for the Arrow and Debreu paper being published first, see the fine book Finding Equilibrium by Weintraub and Duppe; both Debreu and McKenzie have particularly wild histories. Till Duppe, a scholar of Debreu, also has a nice paper in the JHET on precisely how Arrow and Debreu came to work together, and what the contribution of each to their famous ’54 paper was.

The Greatest Living Economist Has Passed Away: Notes on Kenneth Arrow Part I

It is amazing how quickly the titans of the middle of the century have passed. Paul Samuelson and his mathematization, Ronald Coase and his connection of law to economics, Gary Becker and his incorporation of choice into the full sphere of human behavior, John Nash and his formalization of strategic interaction, Milton Friedman and his defense of the market in the precarious post-war period, Robert Fogel and his cliometric revolution: the remaining titan was Kenneth Arrow, the only living economist who could have won a second Nobel Prize without a whit of complaint from the gallery. These figures ruled as economics grew from a minor branch of moral philosophy into the most influential, most prominent, and most advanced of the social sciences. It is hard to imagine our field will ever again have such a collection of scholars rise in one generation, and with the tragic news that Ken has now passed away as well, we have, with great sadness and great rapidity, lost the full set.

Though he was 95 years old, Arrow was still hard at work; his paper with Kamran Bilir and Alan Sorensen was making its way around the conference circuit just last year. And beyond incredible productivity, Arrow had a legendary openness with young scholars. A few years ago, a colleague and I were debating a minor point in the history of economic thought, one that Arrow had played some role in; with the debate deadlocked, it was suggested that I simply email the protagonist to learn the truth. No reply came; perhaps no surprise, given how busy he was and how unknown I was. Imagine my surprise when, two months later, a large manila envelope showed up in my mailbox at Northwestern, with a four-page letter Ken had written inside! Going beyond a simple answer, he patiently walked me through his perspective on the entire history of mathematical economics, the relative centrality of folks like Wicksteed and Edgeworth to the broader economic community, the work he did under Hotelling and the Cowles Commission, and the nature of formal logic versus price theory. Mind you, this was his response to a complete stranger.

This kindness extended beyond budding economists: Arrow was a notorious generator of petitions on all kinds of social causes, and remained so late in life, signing the Economists Against Trump letter that many of us supported last year. You will be hard-pressed to find an open letter or amicus curiae, on any issue from copyright term extension to the use of nuclear weapons, of which Arrow was unaware. The Duke Library holds the papers of both Arrow and Paul Samuelson – famously they became brothers-in-law – and the frequency with which their correspondence involves this petition or that, with Arrow in general the instigator and Samuelson the deflector, is unmistakable. I recall a great series of letters where Arrow queried Samuelson as to who had most deserved the Nobel but had died too early to receive it. Arrow at one point proposed Joan Robinson, which sent Samuelson into convulsions. “But she was a communist! And besides, her theory of imperfect competition was subpar.” You get the feeling in these letters of Arrow making gentle comments and rejoinders while Samuelson exercises his fists in the way he often did when battling everyone from Friedman to the Marxists at Cambridge to (worst of all, for Samuelson) those who were ignorant of their history of economic thought. Their conversation goes way back: you can find in one of the Samuelson boxes his recommendation that the University of Michigan bring in this bright young fellow named Arrow, a missed chance the poor Wolverines must still regret!

Arrow is so influential, in so many areas of economics, that it is simply impossible to discuss his contributions in a single post. For this reason, I will break the post into four parts, with one posted each day this week. We’ll look at Arrow’s work in choice theory today, his work on general equilibrium tomorrow, his work on innovation on Thursday, and some selected topics where he made seminal contributions (the economics of the environment, the principal-agent problem, and the economics of health care, in particular) on Friday. I do not lightly say that Arrow was the greatest living economist, and in my reckoning second only to Samuelson for the title of greatest economist of all time. Arrow wrote the foundational paper of general equilibrium analysis, the foundational paper of social choice and voting, the foundational paper justifying government intervention in innovation, and the foundational paper in the economics of health care. His legacy is the greatest legacy possible for the mathematical approach pushed by the Cowles Commission, the Econometric Society, Irving Fisher, and the mathematician-cum-economist Harold Hotelling. And so it is there that we must begin.

Arrow was born in New York City, a CCNY graduate like many children of the Great Depression, who went on to study mathematics in graduate school at Columbia. Economics in the United States in the 1930s was not a particularly mathematical science. The formalism of von Neumann, the late-life theoretical conversion of Schumpeter, Samuelson’s Foundations, and the soft nests at Cowles and the Econometric Society were in their infancy.

The usual story is that Arrow’s work on social choice came out of his visit to RAND in 1948. But this misstates the intellectual history: Arrow’s actual encouragement came from his engagement with a new form of mathematics, the expansions of formal logic beginning with people like Peirce and Boole. While a high school student, Arrow read Bertrand Russell’s text on mathematical logic, and was enthused with the way that set theory permitted logic to go well beyond the syllogisms of the Greeks. What a powerful tool for the generation of knowledge! In his senior year at CCNY, Arrow took the advanced course on relational logic taught by Alfred Tarski, where the eminent philosopher took pains to reintroduce the ideas of Charles Sanders Peirce, the greatest yet most neglected American philosopher. The idea of relations is familiar to economists: give some links between elements of a set (e.g., xRy and yRz) and some properties to the relation (e.g., it is well-ordered), and you can then perform logical operations on the relation to derive further properties. Every trained economist sees an example of this when first learning about choice and utility, but of course things like “greater than” and “less than” are relations as well. In 1940, one would have had to be extraordinarily lucky to encounter this theory: Tarski’s own books were not even translated.

But what great training this would be! For Arrow joined a graduate program in mathematical statistics at Columbia, where one of the courses was taught by Hotelling from the economics department. Hotelling was an ordinalist, rare in those days, and taught his students demand theory from a rigorous basis in ordinal preferences. But what are these? Simply relations with certain properties! Combined with a statistician’s innate ability to write proofs using inequalities, Arrow greatly impressed Hotelling, and switched to a PhD in economics with inspiration in the then-new subfield on mathematical economics that Hotelling, Samuelson, and Hicks were helping to expand.

After his wartime service doing operations research related to weather and flight planning, and a two year detour into capital theory with little to show for it, Arrow took a visiting position at the Cowles Commission, a center of research in mathematical economics then at the University of Chicago. In 1948, Arrow spent the summer at RAND, still yet to complete his dissertation, or even to strike on a worthwhile idea. RAND in Santa Monica was the world center for applied game theory: philosophers, economists, and mathematicians prowled the halls working through the technical basics of zero-sum games, but also the application of strategic decision theory to problems of serious global importance. Arrow had been thinking about voting a bit, and had written a draft of a paper, similar to that of Duncan Black’s 1948 JPE, essentially suggesting that majority voting “works” when preferences are single-peaked; that is, if everyone can rank options from “left to right”, and simply differ on which point is their “peak” of preference, then majority voting reflects individual preferences in a formal sense. At RAND, the philosopher Olaf Helmer pointed out that a similar concern mattered in international relations: how are we to say that the Soviet Union or the United States have preferences? They are collections of individuals, not individuals themselves.
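The single-peakedness result is easy to verify computationally. In the sketch below (the voters' ideal points are invented for illustration), each voter prefers options closer to her peak on a left-right line; pairwise majority voting then produces a Condorcet winner, and it is the median voter's peak, as in Arrow's draft and Black's paper.

```python
peaks = [1, 4, 7, 7, 9]   # each voter's ideal point on a left-right line
options = range(11)

def majority_prefers(x, y):
    """True if a strict majority of voters likes option x more than option y."""
    return sum(abs(p - x) < abs(p - y) for p in peaks) * 2 > len(peaks)

# A Condorcet winner beats every other option in a pairwise majority vote.
winners = [x for x in options
           if all(x == y or majority_prefers(x, y) for y in options)]

print(winners)  # the unique winner is the median voter's peak, 7
```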

Right, Arrow agreed. But economists had thought about collective welfare, from Pareto to Bergson-Samuelson. The Bergson-Samuelson idea is simple. Let all individuals in society have preferences over states of the world. If we all prefer state A to state B, then the Pareto criterion suggests society should as well. Of course, tradeoffs are inevitable, so what are we to do? We could assume cardinal utility (e.g., “how much money are you willing to be paid to accept A if you prefer B to A and society goes toward A?”) as in the Kaldor-Hicks criterion (though the technically minded will know that Kaldor-Hicks does not define an order on states of the world, so isn’t really great for social choice). But let’s assume all people have is their own ordinal utility, their own rank-order of states, an order that is naturally hard to compare across people. Let’s assume for some pairs we have Pareto dominance: we all prefer A to C, and Q to L, and Z to X, but for other pairs there is no such dominance. A great theorem due to the Polish mathematician Szpilrajn, and I believe popularized among economists by Blackwell, says that if you have a quasiorder R that is transitive, then there exists an order R’ which completes it. In simple terms, if you can rank some pairs, and the pairs you do rank do not have any intransitivity, then you can generate a complete ranking of all pairs which respects the original incomplete ordering. Since individuals have transitive preferences, Pareto ranks are transitive, and hence we know there exist social welfare functions which “extend” Pareto. The implications of this are subtle: for instance, as I discuss in the link earlier in this paragraph, it implies that pure monetary egalitarianism can never be socially optimal even if the only requirement is to respect Pareto dominance.
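Szpilrajn's theorem has a familiar computational face: for a finite set of alternatives, extending an acyclic collection of ranked pairs to a complete order is exactly a topological sort. A sketch using the Pareto dominances from the text (the stdlib `graphlib` module does the work):

```python
from graphlib import TopologicalSorter

# Pareto dominances from the text: everyone prefers A to C, Q to L, and Z to X.
# graphlib wants a map from each node to the nodes that must come BEFORE it,
# so we list each option's Pareto-dominating options as its predecessors.
dominated_by = {"C": {"A"}, "L": {"Q"}, "X": {"Z"}}

# static_order() returns one complete ranking consistent with every Pareto
# comparison -- a (non-unique) Szpilrajn extension of the partial order.
social_ranking = list(TopologicalSorter(dominated_by).static_order())
print(social_ranking)
```

Any such ranking is a Bergson-Samuelson-style social ordering that respects Pareto; the theorem's content is that one always exists so long as the ranked pairs contain no cycle.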

So aren’t we done? We know what it means, via Bergson-Samuelson, for the Soviet Union to “prefer” X to Y. But alas, Arrow was clever and attacked the problem from a separate view. Rather than taking preference orderings of individuals as given and constructing a social ordering, he instead asked whether there is any mechanism for constructing a social ordering from arbitrary individual preferences that satisfies certain criteria. For instance, you may want to rule out a rule that says “whatever Kevin prefers most is what society prefers, no matter what other preferences are” (non-dictatorship). You may want to require Pareto dominance to be respected so that if everyone likes A more than B, A must be chosen (Pareto criterion). You may want to ensure that “irrelevant options” do not matter, so that if giving an option to choose “orange” in addition to “apple” and “pear” does not affect any individual’s ranking of apples and pears, then the orange option also oughtn’t affect society’s rankings of apples and pears (IIA). Arrow famously proved that if we do not restrict what types of preferences individuals may have over social outcomes, there is no system that can rank outcomes socially and still satisfy those three criteria. It had been known since Condorcet in the 18th century that majority voting suffers a problem of this sort, but the general impossibility was an incredible breakthrough, and a straightforward one once Arrow was equipped with the ideas of relational logic.
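Condorcet's problem, the seed of the general theorem, takes a few lines to exhibit. With three voters and the rankings below, pairwise majority voting is intransitive:

```python
# Each voter ranks three options from best to worst.
voters = [["A", "B", "C"],
          ["B", "C", "A"],
          ["C", "A", "B"]]

def majority_prefers(x, y):
    """True if a strict majority of voters ranks x above y."""
    return sum(v.index(x) < v.index(y) for v in voters) * 2 > len(voters)

# A beats B, B beats C, and yet C beats A: the "social preference" cycles,
# so majority rule produces no coherent social ordering here.
print(majority_prefers("A", "B"),
      majority_prefers("B", "C"),
      majority_prefers("C", "A"))  # True True True
```

Arrow's theorem says this kind of breakdown is not a quirk of majority rule but afflicts every aggregation rule satisfying his three criteria.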

It was with this result, in the 1951 book-length version of the idea, that social choice as a field distinct from welfare economics really took off. It is a startling result in two ways. First, in pure political theory, it rather simply killed off two centuries of blather about what the “best” voting system was: majority rule, Borda counts, rank-order voting, or whatever you like, every system must violate one of the Arrow axioms. And indeed, subsequent work has shown that the axioms can be relaxed and still generate impossibility. In the end, we do need to make social choices, so what should we go with? If you’re Amartya Sen, drop the Pareto condition. Others have quibbled with IIA. The point is that there is no right answer. The second startling implication is that welfare economics may be on pretty rough footing. Kaldor-Hicks conditions, which in practice motivate all sorts of regulatory decisions in our society, both rely on the assumption of cardinal or interpersonally-comparable utility, and do not generate an order over social options. Any Bergson-Samuelson social welfare function, a really broad class, must violate some pretty natural conditions on how it treats “equivalent” people (see, e.g., Kemp and Ng 1976). One questions whether we are back in the pre-Samuelson state where, beyond Pareto dominance, we can’t say much with any rigor about whether something is “good” or “bad” for society without dictatorially imposing our ethical standard, individual preferences be damned. Arrow’s theorem is a remarkable achievement for a man as young as he was when he conceived it, one of those rare philosophical ideas that will enter the canon alongside the categorical imperative or Hume on induction, a rare idea that will without question be read and considered decades and centuries hence.

Some notes to wrap things up:

1) Most call the result “Arrow’s Impossibility Theorem”. After all, he did prove the impossibility of a certain form of social choice. But Tjalling Koopmans actually convinced Arrow to call the theorem a “Possibility Theorem” out of pure optimism. Proof that the author rarely gets to pick the eventual name!

2) The confusion between Arrow’s theorem and the existence of social welfare functions in Samuelson has a long and interesting history: see this recent paper by Herrada Igersheim. Essentially, as I’ve tried to make clear in this post, Arrow’s result does not prove that Bergson-Samuelson social welfare functions do not exist, but rather implicitly imposes conditions on the indifference curves which underlie the B-S function. Much more detail in the linked paper.

3) So what is society to do in practice given Arrow? How are we to decide? There is much to recommend in Posner and Weyl’s quadratic voting when preferences can be assumed to have some sort of interpersonally comparable cardinal structure, yet are unknown. When interpersonal comparisons are impossible and we do not know people’s preferences, the famous Gibbard-Satterthwaite Theorem says that any non-dictatorial voting system over three or more options gives people an incentive to sometimes vote strategically. We might then ask, ok, fine, what voting or social choice system works “the best” (e.g., satisfies some desiderata) over the broadest possible sets of individual preferences? Partha Dasgupta and Eric Maskin recently proved that, in fact, good old fashioned majority voting works best! But the true answer as to the “best” voting system depends on the distribution of underlying preferences you expect to see – it is a far less simple question than it appears.
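To see why quadratic voting aggregates intensity rather than just direction, here is a minimal sketch. The valuations are invented for illustration; the quadratic cost of votes is the one substantive assumption:

```python
# Quadratic voting sketch (in the spirit of Posner & Weyl): buying v
# votes costs v**2, so the marginal vote costs roughly 2v.  A voter with
# cardinal intensity u maximizes u*v - v**2, giving v* = u/2: votes bought
# are proportional to how much the voter cares.  Names and numbers are
# hypothetical; positive u favors option A, negative u favors option B.
valuations = {"alice": 10.0, "bob": -3.0, "carol": -4.0}

def optimal_votes(u):
    # argmax over v of u*v - v**2; the sign of u picks the side
    return u / 2.0

votes = {name: optimal_votes(u) for name, u in valuations.items()}
total = sum(votes.values())
# A simple majority (bob and carol) opposes A, but alice's intensity
# carries the quadratic vote: total > 0, so A wins.
```

Two of three voters oppose A, yet the vote total is 5.0 - 1.5 - 2.0 = 1.5 in A's favor: the mechanism trades off numbers against intensity, which one-person-one-vote cannot do.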

4) The conditions I gave above for Arrow’s Theorem are actually different from the 5 conditions in the original 1950 paper. The reason is that Arrow’s original proof is actually incorrect, as shown by Julian Blau in a 1957 Econometrica. The basic insight of the proof is of course salvageable.

5) Among the more beautiful simplifications of Arrow’s proof is Phil Reny’s “side by side” proof of Arrow and Gibbard-Satterthwaite, where he shows just how related the underlying logic of the two concepts is.

We turn to general equilibrium theory tomorrow. And if it seems excessive to need four days to cover the work on one man – even in part! – that is only because I understate the breadth of his contributions. Like Samuelson’s obscure knowledge of Finnish ministers which I recounted earlier this year, Arrow’s breadth of knowledge was also notorious. There is a story Eric Maskin has claimed to be true, where some of Arrow’s junior colleagues wanted to finally stump the seemingly all-knowing Arrow. They all studied the mating habits of whales for days, and then, when Arrow was coming down the hall, faked a vigorous discussion on the topic. Arrow stopped and turned, remaining silent at first. The colleagues had found a topic he didn’t fully know! Finally, Arrow interrupted: “But I thought Turner’s theory was discredited by Spenser, who showed that the supposed homing mechanism couldn’t possibly work”! And even this intellectual feat hardly matches Arrow’s well-known habit of sleeping through the first half of seminars, waking up to make the most salient point of the whole lecture, then falling back asleep again (as averred by, among others, my colleague Joshua Gans, a former student of Ken’s).

Nobel Prize 2016 Part II: Oliver Hart

The Nobel Prize in Economics was given yesterday to two wonderful theorists, Bengt Holmstrom and Oliver Hart. I wrote a day ago about Holmstrom’s contributions, many of which are simply foundational to modern mechanism design and its applications. Oliver Hart’s contribution is more subtle and hence more of a challenge to describe to a nonspecialist; I am sure of this because no concept gives my undergraduate students more headaches than Hart’s “residual control right” theory of the firm. Even stranger, much of Hart’s recent work repudiates the importance of his most famous articles, a point that appears to have been entirely lost on every newspaper discussion of Hart that I’ve seen (including otherwise very nice discussions like Appelbaum’s in the New York Times). A major reason he has changed his beliefs, and his research agenda, so radically is not simply the whims of age or the pressures of politics, but rather the impact of a devastatingly clever, and devastatingly esoteric, argument made by the Nobel winners Eric Maskin and Jean Tirole. To see exactly what’s going on in Hart’s work, and why there remain many very important unsolved questions in this area, let’s quickly survey what economists mean by “theory of the firm”.

The fundamental strangeness of firms goes back to Coase. Markets are amazing. We have wonderful theorems going back to Hurwicz about how competitive market prices coordinate activity efficiently even when individuals only have very limited information about how various things can be produced by an economy. A pencil somehow involves graphite being mined, forests being explored and exploited, rubber being harvested and produced, the raw materials brought to a factory where a machine puts the pencil together, ships and trains bringing the pencil to retail stores, and yet this decentralized activity produces a pencil costing ten cents. This is the case even though not a single individual anywhere in the world knows how all of those processes up the supply chain operate! Yet, as Coase pointed out, a huge amount of economic activity (including the majority of international trade) is not coordinated via the market, but rather through top-down Communist-style bureaucracies called firms. Why on Earth do these persistent organizations exist at all? When should firms merge and when should they divest themselves of their parts? These questions make up the theory of the firm.

Coase’s early answer is that something called transaction costs exist, and that they are particularly high outside the firm. That is, market transactions are not free. Firm size is determined at the point where the problems of bureaucracy within the firm overwhelm the benefits of reducing transaction costs from regular transactions. There are two major problems here. First, who knows what a “transaction cost” or a “bureaucratic cost” is, and why they differ across organizational forms: the explanation borders on tautology. Second, as the wonderful paper by Alchian and Demsetz in 1972 points out, there is no reason we should assume firms have some special ability to direct or punish their workers. If your supplier does something you don’t like, you can keep them on, or fire them, or renegotiate. If your in-house department does something you don’t like, you can keep them on, or fire them, or renegotiate. The problem of providing suitable incentives – the contracting problem – does not simply disappear because some activity is brought within the boundary of the firm.

Oliver Williamson, a recent Nobel winner joint with Elinor Ostrom, has a more formal transaction cost theory: some relationships generate joint rents higher than could be generated if we split ways, unforeseen things occur that make us want to renegotiate our contract, and the cost of that renegotiation may be lower if workers or suppliers are internal to a firm. “Unforeseen things” may include anything which cannot be measured ex-post by a court or other mediator, since that is ultimately who would enforce any contract. It is not that everyday activities have different transaction costs, but that the negotiations which produce contracts themselves are easier to handle in a more persistent relationship. As in Coase, the question of why firms do not simply grow to an enormous size is largely dealt with by off-hand references to “bureaucratic costs” whose nature was left largely informal. Though informal, the idea that something like transaction costs might matter seemed intuitive and had some empirical support – firms are larger in the developing world because weaker legal systems mean more “unforeseen things” will occur outside the scope of a contract, hence the differential costs of holdup or renegotiation inside and outside the firm are first order when deciding on firm size. That said, the Alchian-Demsetz critique, and the question of what a “bureaucratic cost” is, are worrying. And as Eric van den Steen points out in a 2010 AER, can anyone who has tried to order paper through their procurement office versus just popping in to Staples really believe that the reason firms exist is to lessen the cost of intrafirm activities?

Grossman and Hart (1986) argue that the distinction that really makes a firm a firm is that it owns assets. They retain the idea that contracts may be incomplete – at some point, I will disagree with my suppliers, or my workers, or my branch manager, about what should be done, either because a state of the world has arrived not covered by our contract, or because it is in our first-best mutual interest to renegotiate that contract. They retain the idea that there are relationship-specific rents, so I care about maintaining this particular relationship. But rather than rely on transaction costs, they simply point out that the owner of the asset is in a much better bargaining position when this disagreement occurs. Therefore, the owner of the asset will get a bigger percentage of rents after renegotiation. Hence the person who owns an asset should be the one whose incentive to improve the value of the asset is most sensitive to that future split of rents.
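The underinvestment logic here can be seen in a toy holdup model. This is my own stylized sketch, not Grossman and Hart's actual model: an agent invests i at cost i**2/2, creating relationship value i, and ex-post bargaining gives the investor a share s of that value, where ownership of the asset raises s:

```python
# Toy holdup model (illustrative numbers, not Grossman-Hart's own
# specification): investment i costs i**2/2 and creates relationship
# value i.  Ex post, bargaining hands the investor a share s of the
# value, so the agent maximizes s*i - i**2/2, which gives i* = s.

def chosen_investment(share):
    # argmax over i of share*i - i**2/2
    return share

first_best = chosen_investment(1.0)  # investor keeps the full marginal return
non_owner = chosen_investment(0.5)   # weak bargaining position: half the surplus
# The non-owner underinvests relative to the first best.  Giving them
# the asset raises their ex-post share, and is most valuable exactly
# when their investment is most sensitive to the split of rents.
```

The gap between the two investment levels is the inefficiency that the allocation of residual control rights is meant to minimize.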

Baker and Hubbard (2004) provide a nice empirical example: when on-board computers to monitor how long-haul trucks were driven began to diffuse, ownership of those trucks shifted from owner-operators to trucking firms. Before the computer, if the trucking firm owns the truck, it is hard to contract on how hard the truck will be driven or how poorly it will be treated by the driver. If the driver owns the truck, it is hard to contract on how much effort the trucking firm dispatcher will exert ensuring the truck isn’t sitting empty for days, or following a particularly efficient route. The computer solves the first problem, meaning that only the trucking firm is taking actions relevant to the joint relationship which are highly likely to be affected by whether they own the truck or not. In Grossman and Hart’s “residual control rights” theory, then, the introduction of the computer should mean the truck ought, post-computer, be owned by the trucking firm. If these residual control rights are unimportant – there is no relationship-specific rent and no incompleteness in contracting – then the ability to shop around for the best relationship is more valuable than the control rights asset ownership provides. Hart and Moore (1990) extends this basic model to the case where there are many assets and many firms, suggesting critically that sole ownership of assets which are highly complementary in production is optimal. Asset ownership affects outside options when the contract is incomplete by changing bargaining power, and splitting ownership of complementary assets gives multiple agents weak bargaining power and hence little incentive to invest in maintaining the quality of, or improving, the assets. Hart, Shleifer and Vishny (1997) provide a great example of residual control rights applied to the question of why governments should run prisons but not garbage collection. (A brief aside: note the role that bargaining power plays in all of Hart’s theories. 
We do not have a “perfect” – in a sense that can be made formal – model of bargaining, and Hart tends to use bargaining solutions from cooperative game theory like the Shapley value. After Shapley’s prize alongside Roth a few years ago, this makes multiple prizes heavily influenced by cooperative games applied to unexpected problems. Perhaps the theory of cooperative games ought still be taught with vigor in PhD programs!)

There are, of course, many other theories of the firm. The idea that firms in some industries are big because there are large fixed costs to enter at the minimum efficient scale goes back to Marshall. The agency theory of the firm going back at least to Jensen and Meckling focuses on the problem of providing incentives for workers within a firm to actually profit maximize; as I noted yesterday, Holmstrom and Milgrom’s multitasking is a great example of this, with tasks being split across firms so as to allow some types of workers to be given high powered incentives and others flat salaries. More recent work by Bob Gibbons, Rebecca Henderson, Jon Levin and others on relational contracting discusses how the nexus of self-enforcing beliefs about how hard work today translates into rewards tomorrow can substitute for formal contracts, and how the credibility of these “relational contracts” can vary across firms and depend on their history.

Here’s the kicker, though. A striking blow was dealt to all theories which rely on the incompleteness or nonverifiability of contracts by a brilliant paper of Maskin and Tirole (1999) in the Review of Economic Studies. Theories relying on incomplete contracts generally just hand-waved that there are always events which are unforeseeable ex-ante or impossible to verify in court ex-post, and hence there will always be scope for disagreement about what to do when those events occur. But, as Maskin and Tirole correctly point out, agents don’t care about anything in these unforeseeable/unverifiable states except for what the states imply about our mutual valuations from carrying on with a relationship. Therefore, every “incomplete contract” should just involve the parties deciding in advance that if a state of the world arrives where you value keeping our relationship in that state at 12 and I value it at 10, then we should split that joint value of 22 at whatever level induces optimal actions today. Do this same ex-ante contracting for all future profit levels, and we are done. Of course, there is still the problem of ensuring incentive compatibility – why would the agents tell the truth about their valuations when that unforeseen event occurs? I will omit the details here, but you should read the original paper where Maskin and Tirole show a (somewhat convoluted but still working) mechanism that induces truthful revelation of private value by each agent. Taking the model’s insight seriously but the exact mechanism less seriously, the paper basically suggests that incomplete contracts don’t matter if we can truthfully figure out ex-post who values our relationship at what amount, and there are many real-world institutions like mediators who do precisely that. If, as Maskin and Tirole prove (and Maskin described more simply in a short note), incomplete contracts aren’t a real problem, we are back to square one – why have persistent organizations called firms?

What should we do? Some theorists have tried to fight off Maskin and Tirole by suggesting that their precise mechanism is not terribly robust to, for instance, assumptions about higher-order beliefs (e.g., Aghion et al (2012) in the QJE). But these quibbles do not contradict the far more basic insight of Maskin and Tirole, that situations we think of empirically as “hard to describe” or “unlikely to occur or be foreseen”, are not sufficient to justify the relevance of incomplete contracts unless we also have some reason to think that all mechanisms which split rent on the basis of future profit, like a mediator, are unavailable. Note that real world contracts regularly include provisions that ex-ante describe how contractual disagreement ex-post should be handled.

Hart’s response, and this is both clear from his CV and from his recent papers and presentations, is to ditch incompleteness as the fundamental reason firms exist. Hart and Moore’s 2007 AER P&P and 2006 QJE are very clear:

Although the incomplete contracts literature has generated some useful insights about firm boundaries, it has some shortcomings. Three that seem particularly important to us are the following. First, the emphasis on noncontractible ex ante investments seems overplayed: although such investments are surely important, it is hard to believe that they are the sole drivers of organizational form. Second, and related, the approach is ill suited to studying the internal organization of firms, a topic of great interest and importance. The reason is that the Coasian renegotiation perspective suggests that the relevant parties will sit down together ex post and bargain to an efficient outcome using side payments: given this, it is hard to see why authority, hierarchy, delegation, or indeed anything apart from asset ownership matters. Finally, the approach has some foundational weaknesses [pointed out by Maskin and Tirole (1999)].

To my knowledge, Oliver Hart has written zero papers since Maskin-Tirole was published which attempt to explain any policy or empirical fact on the basis of residual control rights and their necessary incomplete contracts. Instead, he has been primarily working on theories which depend on reference points, a behavioral idea that when disagreements occur between parties, the ex-ante contracts are useful because they suggest “fair” divisions of rent, and induce shading and other destructive actions when those divisions are not given. These behavioral agents may very well disagree about what the ex-ante contract means for “fairness” ex-post. The primary result is that flexible contracts (e.g., contracts which deliberately leave lots of incompleteness) can adjust easily to changes in the world but will induce spiteful shading by at least one agent, while rigid contracts do not permit this shading but do cause parties to pursue suboptimal actions in some states of the world. This perspective has been applied by Hart to many questions over the past decade, such as why it can be credible to delegate decision making authority to agents; if you try to seize it back, the agent will feel aggrieved and will shade effort. These responses are hard, or perhaps impossible, to justify when agents are perfectly rational, and of course the Maskin-Tirole critique would apply if agents were purely rational.

So where does all this leave us concerning the initial problem of why firms exist in a sea of decentralized markets? In my view, we have many clever ideas, but still do not have the perfect theory. A perfect theory of the firm would need to be able to explain why firms are the size they are, why they own what they do, why they are organized as they are, why they persist over time, and why interfirm incentives look the way they do. It almost certainly would need its mechanisms to work if we assumed all agents were highly, or perfectly, rational. Since patterns of asset ownership are fundamental, it needs to go well beyond the type of hand-waving that makes up many “resource” type theories. (Firms exist because they create a corporate culture! Firms exist because some firms just are better at doing X and can’t be replicated! These are outcomes, not explanations.) I believe that there are reasons why the costs of maintaining relationships – transaction costs – endogenously differ within and outside firms, and that Hart is correct in focusing our attention on how asset ownership and decision making authority affects incentives to invest, but these theories even in their most endogenous form cannot do everything we wanted a theory of the firm to accomplish. I think that somehow reputation – and hence relational contracts – must play a fundamental role, and that the nexus of conflicting incentives among agents within an organization, as described by Holmstrom, must as well. But we still lack the precise insight to clear up this muddle, and give us a straightforward explanation for why we seem to need “little Communist bureaucracies” to assist our otherwise decentralized and almost magical market system.

Nobel Prize 2016 Part I: Bengt Holmstrom

The Nobel Prize in Economics has been announced, and what a deserving prize it is: Bengt Holmstrom and Oliver Hart have won for the theory of contracts. The name of this research weblog is “A Fine Theorem”, and it would be hard to find two economists whose work is more likely to elicit such a description! Both are incredibly deserving; more than five years ago on this site, I discussed how crazy it was that Holmstrom had yet to win! The only shock is the combination: a more natural prize would have been Holmstrom with Paul Milgrom and Robert Wilson for modern applied mechanism design, and Oliver Hart with John Moore and Sandy Grossman for the theory of the firm. The contributions of Holmstrom and Hart are so vast that I’m splitting this post into two, so as to properly cover the incredible intellectual accomplishments of these two economists.

The Finnish economist Bengt Holmstrom did his PhD in operations research at Stanford, advised by Robert Wilson, and began his career at my alma mater, the tiny department of Managerial Economics and Decision Sciences at Northwestern’s Kellogg School. To say MEDS struck gold with their hires in this era is an extreme understatement: in 1978 and 1979 alone, they hired Holmstrom and his classmate Paul Milgrom (another Wilson student from Stanford), hired Nancy Stokey, promoted future Nobel laureate Roger Myerson to Associate Professor, and tenured an adviser of mine, Mark Satterthwaite. And this list doesn’t even include other faculty in the late 1970s and early 1980s like eminent contract theorist John Roberts, behavioralist Colin Camerer, mechanism designer John Ledyard or game theorist Ehud Kalai. This group was essentially put together by two senior economists at Kellogg, Nancy Schwartz and Stanley Reiter, who had the incredible foresight to realize both that applied game theory was finally showing promise of tackling first-order economic questions in a rigorous way, and that the folks with the proper mathematical background to tackle these questions were largely going unhired since they often did their graduate work in operations or mathematics departments rather than traditional economics departments. This market inefficiency, as it were, allowed Nancy and Stan to hire essentially every young scholar in what would become the field of mechanism design, and to develop a graduate program which combined operations, economics, and mathematics in a manner unlike any other place in the world.

From that fantastic group, Holmstrom’s contribution lies most centrally in the area of formal contract design. Imagine that you want someone – an employee, a child, a subordinate division, an aid contractor, or more generally an agent – to perform a task. How should you induce them to do this? If the task is “simple”, meaning the agent’s effort and knowledge about how to perform the task most efficiently is known and observable, you can simply pay a wage, cutting off payment if effort is not being exerted. When only the outcome of work can be observed, if there is no uncertainty in how effort is transformed into outcomes, knowing the outcome is equivalent to knowing effort, and hence optimal effort can be achieved via a bonus payment made on the basis of outcomes. All straightforward so far. The trickier situations, which Holmstrom and his coauthors analyzed at great length, are when neither effort nor outcomes are directly observable.

Consider paying a surgeon. You want to reward the doctor for competent, safe work. However, it is very difficult to observe perfectly what the surgeon is doing at all times, and basing pay on outcomes has a number of problems. First, the patient outcome depends on the effort of not just one surgeon, but on others in the operating room and prep table: team incentives must be provided. Second, the doctor has many ways to shift the balance of effort between reducing costs to the hospital, increasing patient comfort, increasing the quality of the medical outcome, and mentoring young assistant surgeons, so paying on the basis of one or two tasks may distort effort away from other harder-to-measure tasks: there is a multitasking problem. Third, the number of medical mistakes, or the cost of surgery, that a hospital ought expect from a competent surgeon depends on changes in training and technology that are hard to know, and hence a contract may want to adjust payments for its surgeons on the performance of surgeons elsewhere: contracts ought take advantage of relevant information when it is informative about the task being incentivized. Fourth, since surgeons will dislike risk in their salary, the fact that some negative patient outcomes are just bad luck means that you will need to pay the surgeon very high bonuses to overcome their risk aversion: when outcome measures involve uncertainty, optimal contracts will weigh “high-powered” bonuses against “low-powered” insurance against risk. Fifth, the surgeon can be incentivized either by payments today or by keeping their job tomorrow, and worse, these career concerns may cause the surgeon to waste the hospital’s money on tasks which matter to the surgeon’s career beyond the hospital.

Holmstrom wrote the canonical paper on each of these topics. His 1979 paper in the Bell Journal of Economics shows that any information which reduces the uncertainty about what an agent actually did should feature in a contract, since by reducing uncertainty, you reduce the risk premium needed to incentivize the agent to accept the contract. It might seem strange that contracts in many cases do not satisfy this “informativeness principle”. For instance, CEO bonuses are often not indexed to the performance of firms in the same industry. If oil prices rise, essentially all oil firms will be very profitable, and this is true whether or not a particular CEO is a good one. Bertrand and Mullainathan argue that this is because many firms with diverse shareholders are poorly governed!

The simplicity of contracts in the real world may have more prosaic explanations. Jointly with Paul Milgrom, the famous “multitasking” paper published in JLEO in 1991 notes that contracts shift incentives across different tasks in addition to serving as risk-sharing mechanisms and as methods for inducing effort. Since bonuses on task A will cause agents to shift effort away from hard-to-measure task B, it may be optimal to avoid strong incentives at all (just pay teachers a salary rather than a bonus based only on test performance) or to split job tasks (pay bonuses to teacher A who is told to focus only on math test scores, and pay salary to teacher B who is meant to serve as a mentor). That outcomes are generated by teams also motivates simpler contracts. Holmstrom’s 1982 article on incentives in teams, published in the Bell Journal, points out that if both my effort and yours are required to produce a good outcome, then the marginal products of our efforts are both equal to the entire value of what is produced, hence there is not enough output to pay each of us our marginal product. What can be done? Alchian and Demsetz had noticed this problem in 1972, arguing that firms exist to monitor the effort of individuals working in teams. With perfect knowledge of who does what, you can simply pay the workers a wage sufficient to induce the optimal effort, then collect the residual as profit. Holmstrom notes that the monitoring isn’t the important bit: rather, even shareholder controlled firms where shareholders do no monitoring at all are useful. The reason is that shareholders can be residual claimants for profit, and hence there is no need to fully distribute profit to members of the team. Free-riding can therefore be eliminated by simply paying team members a wage of X if the team outcome is optimal, and 0 otherwise. 
Even a slight bit of shirking by a single agent drops their payment precipitously (which is impossible if all profits generated by the team are shared by the team), so the agents will not shirk. Of course, when there is uncertainty about how team effort transforms into outcomes, this harsh penalty will not work, and hence incentive problems may require team sizes to be smaller than that which is first-best efficient. A third justification for simple contracts is career concerns: agents work hard today to try to signal to the market that they are high quality, and do so even if they are paid a fixed wage. This argument had been made less formally by Fama, but Holmstrom (in a 1982 working paper finally published in 1999 in RESTUD) showed that this concern about the market only completely mitigates moral hazard if outcomes within a firm are fully observable to the market, or the future is not discounted at all, or there is no uncertainty about agents’ abilities. Indeed, career concerns can make effort provision worse; for example, agents may take actions to signal quality to the market which are negative for their current firm! A final explanation for simple contracts comes from Holmstrom’s 1987 paper with Milgrom in Econometrica. They argue that simple “linear” contracts, with a wage and a bonus based linearly on output, are more “robust” methods of solving moral hazard because they are less susceptible to manipulation by agents when the environment is not perfectly known. Michael Powell, a student of Holmstrom’s now at Northwestern, has a great set of PhD notes providing details of these models.
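For the curious, the risk-incentive tradeoff in the linear-contracts setting reduces to a single formula under the standard textbook assumptions (CARA utility with risk aversion r, output e plus normal noise with variance sigma2, quadratic effort cost c*e**2/2): the agent's best response to a bonus rate beta is e = beta/c, and maximizing total surplus gives the familiar slope beta* = 1/(1 + r*c*sigma2). A quick sketch of that textbook formula, with invented parameter values:

```python
# Textbook CARA-normal linear contract: wage = alpha + beta * output,
# output = e + eps with eps ~ N(0, sigma2), effort cost c*e**2/2.
# The agent's certainty equivalent is alpha + beta*e - c*e**2/2
# - (r/2)*beta**2*sigma2, so e = beta/c; substituting into total
# surplus and taking the first-order condition yields beta*.

def optimal_slope(r, c, sigma2):
    """Optimal bonus rate: weaker incentives when risk aversion (r),
    effort cost (c), or output noise (sigma2) is higher."""
    return 1.0 / (1.0 + r * c * sigma2)

# As the task gets noisier, the optimal contract moves from
# high-powered (slope near 1) toward a flat salary (slope near 0) --
# the informativeness-versus-insurance logic described above.
for sigma2 in [0.0, 1.0, 10.0]:
    print(sigma2, optimal_slope(r=2.0, c=0.5, sigma2=sigma2))
```

With no noise the agent is made full residual claimant; with lots of noise the bonus rate collapses toward a pure salary, which is exactly why jobs with hard-to-measure output tend to pay flat wages.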

These ideas are reasonably intuitive, but the way Holmstrom reached them is not. Think about how an economist before the 1970s, like Adam Smith in his famous discussion of the inefficiency of sharecropping, might have dealt with these problems. These economists had few tools to deal with asymmetric information, so although economists like George Stigler analyzed the economic value of information, the question of how to elicit information useful to a contract could not be discussed in any systematic way. These economists would have been burdened by the fact that the number of contracts one could write are infinite, so beyond saying that a contract of type X does not equate marginal cost to marginal revenue, the question of which “second-best” contract is optimal is extraordinarily difficult to answer in the absence of beautiful tricks like the revelation principle partially developed by Holmstrom himself. To develop those tricks, a theory of how individuals would respond to changes in their joint incentives over time was needed; the ideas of Bayesian equilibria and subgame perfection, developed by Harsanyi and Selten, were unknown before the 1960s. The accretion of tools developed by pure theory finally permitted, in the late 1970s and early 1980s, an absolute explosion of developments of great use to understanding the economic world. Consider, for example, the many results in antitrust provided by Nobel winner Jean Tirole, discussed here two years ago.

Holmstrom’s work has provided me with a great deal of understanding of why innovation management looks the way it does. For instance, why would a risk neutral firm not work enough on high-variance moonshot-type R&D projects, a question Holmstrom asks in his 1989 JEBO Agency Costs and Innovation? Four reasons. First, in Holmstrom and Milgrom’s 1987 linear contracts paper, optimal risk sharing leads to more distortion by agents the riskier the project being incentivized, so firms may choose lower expected value projects even if they themselves are risk neutral. Second, firms build reputation in capital markets just as workers do with career concerns, and high variance output projects are more costly in terms of the future value of that reputation when the interest rate on capital is lower (e.g., when firms are large and old). Third, when R&D workers can potentially pursue many different projects, multitasking suggests that workers should be given small and very specific tasks so as to lessen the potential for bonus payments to shift worker effort across projects. Smaller firms with fewer resources may naturally have limits on the types of research a worker could pursue, which surprisingly makes it easier to provide strong incentives for research effort on the remaining possible projects. Fourth, multitasking suggests agents’ tasks should be limited, and that high variance tasks should be assigned to the same agent, which provides a role for decentralizing research, with large firms providing incremental, safe research and small firms performing high-variance research. That many aspects of firm organization depend on the swirl of conflicting incentives the firm and the market provide is a topic Holmstrom has also discussed at length, especially in his beautiful paper “The Firm as an Incentive System”; I shall reserve discussion of that paper for a subsequent post on Oliver Hart.
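For readers who like to see the mechanics behind the first reason, here is a minimal numerical sketch using the standard textbook rendering of the Holmstrom-Milgrom linear contract (CARA agent, normally distributed output noise, quadratic effort cost); all parameter values and names below are illustrative assumptions, not taken from the paper itself:

```python
# Toy Holmstrom-Milgrom setup: output = effort + noise with variance sigma2,
# wage = alpha + beta * output, agent has CARA risk aversion r and effort
# cost c * e^2 / 2. The agent chooses e = beta / c, so total surplus is
# effort benefit minus effort cost minus the agent's risk premium.
r, c = 2.0, 1.0  # risk aversion and effort-cost curvature (assumed values)

def total_surplus(beta, sigma2):
    e = beta / c  # agent's privately optimal effort given bonus slope beta
    return e - c * e**2 / 2 - r * beta**2 * sigma2 / 2

def best_beta(sigma2):
    # grid search standing in for the first-order condition beta* = 1/(1 + r*c*sigma2)
    return max((b / 1000 for b in range(1001)),
               key=lambda b: total_surplus(b, sigma2))

# The optimal bonus slope falls as the project gets riskier, which is the
# "riskier project -> weaker incentives -> more distortion" logic above:
assert best_beta(0.5) > best_beta(2.0)
assert abs(best_beta(0.5) - 1 / (1 + r * c * 0.5)) < 0.01
```

The grid search is just a stand-in for the closed-form slope; the point is only that the optimal bonus coefficient shrinks with output variance.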

Two final light notes on Holmstrom. First, he is the source of one of my favorite stories about Paul Samuelson, the greatest economic theorist of all time. Samuelson was known for having a steel trap of a mind. At a light trivia session during a house party for young faculty at MIT, Holmstrom snuck in a question, as a joke, asking for the name of the third President of independent Finland. Samuelson not only knew the name, but apparently was also able to digress on the man’s accomplishments! Second, I mentioned at the beginning of this post the illustrious roster of theorists who once sat at MEDS. Business school students are often very hesitant to deal with formal models, partially because they lack a technical background but also because there is a trend of “dumbing down” in business education whereby many schools (of course, not including my current department at The University of Toronto Rotman!) are more worried about student satisfaction than student learning. With perhaps Stanford GSB as an exception, it is inconceivable that any school today, Northwestern included, would gather such an incredible collection of minds working on abstract topics whose applicability to tangible business questions might lie years in the future. Indeed, I could name a number of so-called “top” business schools who have nobody on their faculty who has made any contribution of note to theory! There is a great opportunity for a Nancy Schwartz or Stan Reiter of today to build a business school whose students will have the ultimate reputation for rigorous analysis of social scientific questions.

Reinhard Selten and the making of modern game theory

Reinhard Selten, it is no exaggeration, is a founding father of two massive branches of modern economics: experiments and industrial organization. He passed away last week after a long and idiosyncratic life. Game theory as developed by the three co-Nobel laureates Selten, Nash, and Harsanyi is so embedded in economic reasoning today that, to a great extent, it has replaced price theory as the core organizing principle of our field. That this would happen was not always so clear, however.

Take a look at some canonical papers before 1980. Arrow’s Possibility Theorem simply assumed true preferences can be elicited; not until Gibbard and Satterthwaite do we answer the question of whether there is even a social choice rule that can elicit those preferences truthfully! Rothschild and Stiglitz’s celebrated 1976 essay on imperfect information in insurance markets defines equilibria in terms of individual rationality, best responses in the Cournot sense, and free entry. How odd this seems today – surely the natural equilibrium in an insurance market depends on beliefs about the knowledge held by others, and on beliefs about those beliefs! Analyses of bargaining before Rubinstein’s 1982 breakthrough nearly always rely on axioms of psychology rather than strategic reasoning. Discussions of predatory pricing until the 1970s, at the very earliest, relied on arguments that we now find unacceptably loose in their treatment of beliefs.

What happened? Why didn’t modern game-theoretic treatment of strategic situations – principally those involving more than one agent but fewer than an infinite number, although even situations of perfect competition are now often motivated game theoretically – arrive soon after the proofs of von Neumann, Morgenstern, and Nash? Why wasn’t the Nash program, of finding justification in self-interested noncooperative reasoning for cooperative or axiom-driven behavior, immediately taken up? The problem was that the core concept of the Nash equilibrium simply permits too great a multiplicity of outcomes, some of which feel natural and others less so. As such, a long search, driven essentially by a small community of mathematicians and economists, attempted to find the “right” refinements of Nash. And a small community it was: I recall Drew Fudenberg telling a story about a harrowing bus ride at an early game theory conference, where a fellow rider mentioned offhand that should they crash, the vast majority of game theorists in the world would be wiped out in one go!

Selten’s most renowned contribution came in the idea of perfection. The concept of subgame perfection was first proposed in a German-language journal in 1965 (making it one of the rare modern economic classics inaccessible to English speakers in the original, alongside Maurice Allais’ 1953 French-language paper in Econometrica which introduces the Allais paradox). Selten’s background up to 1965 is quite unusual. A young man during World War II, raised Protestant but with one Jewish parent, Selten fled Germany to work on farms, and only finished high school at 20 and college at 26. His two interests were mathematics, in which he worked on the then-unusual extensive form game for his doctoral degree, and experimentation, inspired by the small team of young professors at Frankfurt trying to pin down behavior in oligopoly through small lab studies.

In the 1965 paper, on demand inertia (paper is gated), Selten wrote a small game theoretic model to accompany the experiment, but realized there were many equilibria. The term “subgame perfect” was not introduced until 1974, also by Selten, but the idea itself is clear in the ’65 paper. He proposed that attention should focus on equilibria where, after every action, each player continues to act rationally from that point forward; that is, he proposed that in every “subgame”, or every game that could conceivably occur after some actions have been taken, equilibrium actions must remain an equilibrium. Consider predatory pricing: a firm considers lowering price below cost today to deter entry. It is a Nash equilibrium for entrants to believe the price would continue to stay low should they enter, and hence to not enter. But it is not subgame perfect: the entrant should reason that after entering, it is not worthwhile for the incumbent to continue to lose money once the entry has already occurred.

Complicated strings of deductions which rule out some actions based on faraway subgames can seem paradoxical, of course, and did even to Selten. In his famous Chain Store paradox, he considers a firm with stores in many locations choosing whether to price aggressively to deter entry, with one potential entrant in each town choosing, one at a time, whether to enter. Entrants prefer to enter if pricing is not aggressive, but prefer to remain out otherwise; incumbents prefer to price nonaggressively whether or not entry occurs. Reasoning backward, in the final town we have the simple one-shot predatory pricing case analyzed above, where we saw that entry is the only subgame perfect equilibrium. Therefore, the entrant in the second-to-last town knows that the incumbent will not fight entry aggressively in the final town, hence there is no benefit to doing so in the second-to-last town, hence entry occurs again. Reasoning similarly, entry occurs everywhere. But if the incumbent could commit in advance to pricing aggressively in, say, the first 10 towns, it would deter entry in those towns and hence its profits would improve. Such commitment may not be possible, but what if the incumbent’s reasoning ability is limited, and it doesn’t completely understand why aggressive pricing in early stages won’t deter the entrant in the 16th town? And what if entrants reason that the incumbent’s reasoning ability is not perfectly rational? Then aggressive pricing to deter entry can occur.
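The backward induction driving the paradox can be sketched in a few lines; the payoff numbers below are purely illustrative assumptions, chosen only to match the preferences described above (entrants enter iff pricing is accommodating, incumbents prefer accommodation whether or not entry occurs):

```python
# Per-town payoffs as (incumbent, entrant) pairs, all values assumed:
# entrant stays Out -> incumbent earns monopoly profit; if entry occurs,
# the incumbent chooses between Fight and Accommodate.
ENTRANT_OUT = (5, 0)
ENTER_FIGHT = (0, -1)
ENTER_ACC = (2, 1)
N = 20  # number of towns

def town_outcome():
    # Last mover first: the incumbent compares its own payoff from Fight vs Accommodate
    incumbent_choice = max([ENTER_FIGHT, ENTER_ACC], key=lambda pay: pay[0])
    # The entrant anticipates that choice and compares entering with staying out
    return incumbent_choice if incumbent_choice[1] > ENTRANT_OUT[1] else ENTRANT_OUT

# Because the final town resolves to accommodation, every earlier town
# unravels the same way: entry occurs everywhere in the unique
# subgame perfect equilibrium.
assert all(town_outcome() == ENTER_ACC for _ in range(N))
```

Nothing in the code depends on the town index, which is exactly Selten’s point: the logic of the last town propagates, town by town, all the way back to the first.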

That behavior may not be perfectly rational but rather bounded had been an idea of Selten’s since he read Herbert Simon as a young professor, but in his Nobel Prize biography, he argues that progress on a suitable general theory of bounded rationality has been hard to come by. The closest Selten comes to formalizing the idea is in his paper on trembling hand perfection in 1974, inspired by conversations with John Harsanyi. The problem with subgame perfection had been noted: if an opponent takes an action off the equilibrium path, that action is “irrational”, so why should rationality of the opponent be assumed in the subgame that follows? Selten’s answer is to assume that tiny mistakes can happen, putting even rational players into off-path subgames. Taking the limit as mistakes become infinitesimally rare produces the idea of trembling-hand perfection. The idea of trembles implicitly introduces the idea that players have beliefs at various information sets about what has happened in the game. Kreps and Wilson’s sequential equilibrium recasts trembles as beliefs under uncertainty, and they showed that a slight modification of the trembling hand leads to an easier decision-theoretic interpretation of trembles, an easier computation of equilibria, and an outcome nearly identical to Selten’s original idea. Sequential equilibrium, of course, went on to become the workhorse solution concept in dynamic economics, a concept which underlies essentially all of modern industrial organization.
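To see trembling-hand perfection at work, consider a toy coordination game with assumed payoffs: each player picks A or B, both earn 1 if both pick A, and 0 otherwise. Both (A, A) and (B, B) are Nash equilibria, but only the first survives trembles, as this minimal sketch checks:

```python
# Expected payoff for one player when the opponent plays A with probability p_A;
# payoffs are the assumed toy values: 1 iff both pick A, else 0.
def u(own, p_A):
    return p_A * 1.0 if own == "A" else 0.0

eps = 0.01  # the opponent "trembles" onto A with at least this probability

# Against any trembling opponent, A is strictly better than B. So the (B, B)
# equilibrium, which relies on B being only a weak best response, fails
# Selten's test, while (A, A) is trembling-hand perfect.
assert u("A", eps) > u("B", eps)
```

The general definition takes a limit over vanishing trembles; the sketch only shows the key step, that an arbitrarily small tremble already breaks the indifference sustaining the bad equilibrium.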

That Harsanyi, inventor of the Bayesian game, is credited by Selten with inspiring the trembling hand paper is no surprise. The two had met at a conference in Jerusalem in the mid-1960s, and they’d worked together both on applied projects for the US military and on pure theory research while Selten was visiting Berkeley. A classic 1972 paper of theirs on Nash bargaining with incomplete information (article is gated) begins the field of cooperative games with incomplete information. And this was no minor field: Roger Myerson, in his paper introducing mechanism design under incomplete information – the famous Bayesian revelation principle paper – shows that there exists a unique Selten-Harsanyi bargaining solution under incomplete information which is incentive compatible.

Myerson’s example is amazing. Consider building a bridge which costs $100. Two people will use the bridge. One values the bridge at $90. The other values the bridge at $90 with probability .9 and at $30 with probability .1, where that valuation is the private knowledge of the second person. Note that in either case the bridge is worth building. But who should pay? If you propose a 50/50 split, the bridge will simply not be built 10% of the time. If you propose an 80/20 split, where even in his worst case each person gets a surplus value of ten dollars, the outcome is unfair to player 1 90% of the time (where “unfair” means it violates certain principles of fairness that Nash, and later Selten and Harsanyi, set out axiomatically). What of the 53/47 split that, on average, gives each party the same surplus? Again, this is not “interim incentive compatible”: player 2 will refuse to pay in the case he is the type that values the bridge at only $30. Myerson shows mathematically that both players will agree, once they know their private valuations, to the following deal, and that the deal satisfies the Selten-Harsanyi fairness axioms: when player 2 claims to value the bridge at $90, the payment split is 49.5/50.5 and the bridge is always built, but when player 2 claims to value it at $30, the entire cost is paid by player 1 and the bridge is built with probability only .439. Under this split, player 2 has the correct incentives to always reveal his true willingness to pay. The mechanism means that there is a 5.61 percent chance the bridge isn’t built, but the split of surplus from the bridge nonetheless does better than any other split which satisfies all of Harsanyi and Selten’s fairness axioms.
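Since the numbers above fully pin down the mechanism, the incentive claims can be checked directly; this is just arithmetic on the figures quoted above, not a general implementation of Myerson’s solution:

```python
build_prob_low = 0.439   # probability the bridge is built after a "$30" report
pay2_high_report = 50.5  # player 2's payment after a "$90" report

def payoff_player2(true_value, report):
    if report == 90:
        return true_value - pay2_high_report  # bridge always built, pay 50.5
    return build_prob_low * true_value        # player 1 pays the whole cost

# The low type strictly prefers the truth...
assert payoff_player2(30, 30) > payoff_player2(30, 90)
# ...and the high type (weakly, up to Myerson's rounding) prefers the truth:
assert payoff_player2(90, 90) >= payoff_player2(90, 30) - 0.02
# The bridge goes unbuilt exactly .1 * (1 - .439) = 5.61% of the time:
assert abs(0.1 * (1 - build_prob_low) - 0.0561) < 1e-12
```

Note that the high type is nearly indifferent between reporting truthfully and lying: the incentive constraint binds, which is characteristic of optimal mechanisms of this kind.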

Selten’s later work is, it appears to me, more scattered. His attempt with Harsanyi to formalize “the” equilibrium refinement, in a 1988 book, was a valiant but in the end misguided one. His papers on theoretical biology, inspired by his interest in long walks among the wildflowers, are rather tangential to his economics. And what of his experimental work? To understand Selten’s thinking, read this fascinating dialogue with himself that Selten gave as a Schwartz Lecture at Northwestern MEDS. In this dialogue, he imagines a debate between a Bayesian economist, an experimentalist, and an evolutionary biologist. The economist argues that “theory without theorems” is doomed to fail, that Bayesianism is normatively “correct”, and that Bayesian reasoning can easily be extended to include costs of reasoning or reasoning mistakes. The experimentalist argues that ad hoc assumptions are better than incorrect ones: just as human anatomy is complex and cannot be reduced to a few axioms, neither can social behavior. The biologist argues that learning a la Nelson and Winter is descriptively accurate as to how humans behave, whereas high level reasoning is not. The “chairman”, perhaps representing Selten himself, sums up the argument as saying that experiments which simply contradict Bayesianism are a waste of time, but that human social behavior surely depends on bounded rationality and hence empirical work ought be devoted to constructing a foundation for such a theory (shall we call this the “Selten program”?). And yet, this essay was from 1990, and we seem no closer to having such a theory, nor does it seem to me that behavioral research has fundamentally contradicted most of our core empirical understanding derived from theories with pure rationality. Selten’s program, it seems, remains not only incomplete, but perhaps not even first order; the same cannot be said of his theoretical constructs, as without perfection a great part of modern economics simply could not exist.

The Economics of John Nash

I’m in the midst of a four week string of conferences and travel, and terribly backed up with posts on some great new papers, but I can’t let the tragic passing today of John Nash go by without comment. When nonacademics ask what I do, I often say that I work in a branch of applied math called game theory; if you say you are an economist, the man on the street expects you to know when unemployment will go down, which stocks to buy, or whether monetary expansion will lead to inflation, questions which the applied theorist has little ability to answer in a satisfying way. But then, if you mention game theory, without question the most common way your interlocutor knows the field is via Russell Crowe’s John Nash character in A Beautiful Mind, so surely, and rightfully, no game theorist has greater popular name recognition.

Now Nash’s contributions to economics are few in number, though enormously influential. He was a pure mathematician who took only one course in economics in his studies; more on this fortuitous course shortly. The contributions are simple to state: Nash founded the theory of non-cooperative games, and he instigated an important, though ultimately unsuccessful, literature on bargaining. Nash has essentially only two short papers on each topic, each of which is easy to follow for a modern reader, so I will generally discuss some background on the work rather than the well-known results directly.

First, non-cooperative games. Robert Leonard has a very interesting intellectual history of the early days of game theory, the formal study of strategic interaction, which begins well before Nash. Many like to cite von Neumann’s “Zur Theorie der Gesellschaftsspiele” (“A Theory of Parlor Games”), from which we have the minimax theorem, but Emile Borel in the early 1920s, and Ernst Zermelo with his eponymous theorem a decade earlier, surely form relevant prehistory as well. These earlier attempts, including von Neumann’s book with Morgenstern, did not allow general investigation of what we now call noncooperative games, or strategic situations where players do not attempt to collude. The most famous situation of this type is the Prisoner’s Dilemma, a simple example, yet a shocking one: competing agents, be they individuals, firms or countries, may (in a sense) rationally find themselves taking actions which both parties think are worse than some alternative. Given the U.S. government’s interest in how a joint nuclear world with the Soviets would play out, analyzing situations of that type was not simply a “Gesellschaftsspiele” in the late 1940s; Nash himself was funded by the Atomic Energy Commission, and RAND, site of a huge amount of important early game theory research, was linked to the military.

Nash’s insight was, in retrospect, very simple. Consider a soccer penalty kick, where the only options are to kick left or right for the shooter, and to simultaneously dive left or right for the goalie. At first glance, it seems like there can be no equilibrium: if the shooter will kick left, then the goalie will jump to that side, in which case the shooter would prefer to shoot right, in which case the goalie would prefer to switch as well, and so on. In real life, then, what do we expect to happen? Well, surely we expect that the shooter will sometimes shoot left and sometimes right, and likewise the goalie will mix which way she dives. That is, instead of two strategies for each player, we have a continuum of mixed strategies, where a mixed strategy is simply a probability distribution over the pure strategies “Left, Right”. This idea of mixed strategies “convexifies” the strategy space so that we can use fixed point theorems to guarantee that an equilibrium exists in every finite-strategy noncooperative game under expected utility (Kakutani’s fixed point theorem in the initial one-page paper in PNAS which Nash wrote in his very first year of graduate school, and Brouwer’s fixed point theorem in the Annals of Mathematics article which more rigorously lays out Nash’s noncooperative theory). Because of Nash, we are able to analyze essentially whatever strategic situation we want under what seems to be a reasonable solution concept (I optimize given my beliefs about what others will do, and my beliefs are in the end correct). More importantly, the fixed point theorems Nash used to generate his equilibria are now so broadly applied that no respectable economist should get a PhD without understanding how they work.
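The penalty-kick logic can be verified in a few lines, with assumed zero-one payoffs (the shooter scores exactly when the goalie guesses wrong); this is only a sketch of the symmetric case:

```python
# Shooter's payoff: 1 if the goalie dives the wrong way, 0 otherwise
# (the zero-sum goalie gets the complement). Payoffs are assumed.
shooter_payoff = {("L", "L"): 0, ("L", "R"): 1, ("R", "L"): 1, ("R", "R"): 0}

def expected_shooter(p, q):
    # p: prob. the shooter kicks Left; q: prob. the goalie dives Left
    return (p * q * shooter_payoff[("L", "L")]
            + p * (1 - q) * shooter_payoff[("L", "R")]
            + (1 - p) * q * shooter_payoff[("R", "L")]
            + (1 - p) * (1 - q) * shooter_payoff[("R", "R")])

# When the goalie mixes 50/50, the shooter is exactly indifferent between
# kicking Left (p = 1) and Right (p = 0), and symmetrically for the goalie,
# so mixing 50/50 by both players is an equilibrium in mixed strategies:
assert abs(expected_shooter(1.0, 0.5) - expected_shooter(0.0, 0.5)) < 1e-12
```

The indifference condition is the whole trick: once each player randomizes so as to make the other indifferent, neither has a profitable deviation, and Nash’s fixed point argument guarantees such a point always exists in finite games.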

(A quick aside: it is quite interesting to me that game theory, as opposed to Walrasian/Marshallian economics, does not derive from physics or other natural sciences, but rather from a program at the intersection of formal logic and mathematics, primarily in Germany, primarily in the early 20th century. I still have a mind to write a proper paper on this intellectual history at some point, but there is a very real sense in which economics post-Samuelson, von Neumann and Nash forms a rather continuous methodology with earlier social science in the sense of qualitative deduction, whereas it is our sister social sciences which, for a variety of reasons, go on writing papers without the powerful tools of modern logic and the mathematics which followed Hilbert. When Nash makes claims about the existence of equilibria due to Brouwer, the mathematics is merely the structure holding up and extending ideas concerning the interaction of agents in noncooperative systems that would have been totally familiar to earlier generations of economists who simply didn’t have access to tools like the fixed point theorems, in the same way that Samuelson and Houthakker’s ideas on utility are no great break from earlier work aside from their explicit incorporation of deduction on the basis of relational logic, a tool unknown to economists in the 19th century. That is, I claim the mathematization of economics in the mid 20th century represents no major methodological break, nor an attempt to ape the natural sciences. Back to Nash’s work in the direct sense.)

Nash only applies his theory to one game: a simplified version of poker known as Kuhn Poker, after Harold Kuhn. It turned out that the noncooperative solution was not immediately applicable, at least to the types of applied situations where it is now commonplace, without a handful of modifications. In my read of the intellectual history, noncooperative games were a bit of a failure outside the realm of pure math in their first 25 years because we still needed Harsanyi’s purification theorem and Bayesian equilibria to understand what exactly was going on with mixed strategies, Reinhard Selten’s idea of subgame perfection to reasonably analyze games with multiple stages, and the idea of mechanism design of Gibbard, Vickrey, Myerson, Maskin, and Satterthwaite (among others) to make it possible to discuss how institutions affect outcomes which are determined in equilibrium. It is not simply economists that Nash influenced; among many others, his work directly leads to the evolutionary games of Maynard Smith and Price in biology and linguistics, the upper and lower values of his 1953 results have been used to prove other mathematical results and to discuss what is meant by truth in philosophy, and Nash equilibrium is widespread in the analysis of voting behavior in political science and international relations.

The bargaining solution is a trickier legacy. Recall Nash’s sole economics course, which he took as an undergraduate. In that course, he wrote a term paper, eventually to appear in Econometrica, where he attempted to axiomatize what will happen when two parties bargain over some outcome. The idea is simple. Whatever the bargaining outcome is, we want it to satisfy a handful of reasonable assumptions. First, since von Neumann-Morgenstern utility is invariant to positive affine transformations of a utility function, the bargaining outcome should not be affected by these types of transformations: the solution should not depend on which cardinal representation of preferences we choose. Second, the outcome should be Pareto optimal: the players would have to be mighty spiteful to throw away part of the pie rather than give it to at least one of them. Third, given their utility functions, players should be treated symmetrically. Fourth (and a bit controversially, as we will see), Nash insisted on Independence of Irrelevant Alternatives, meaning that if f(T) is the set of “fair bargains” when T is the set of all potential bargains, then if the set of potential bargains shrinks to some S strictly contained in T which still contains f(T), f(T) must remain the bargaining outcome. It turns out that under these assumptions, there is a unique outcome which maximizes (u(x)-u(d))*(v(x)-v(d)), where u and v are each player’s utility functions, x is the vector of payoffs under the eventual bargain, and d the “status-quo” payoff if no bargain is made. This is natural in many cases. For instance, if two identical agents are splitting a dollar, then 50-50 is the only Nash bargaining outcome. Uniqueness is not at all obvious: recall the Edgeworth box and you will see that individual rationality and Pareto optimality alone leave many potential equilibria. Nash’s result is elegant and surprising, and it is no surprise that Nash’s grad school recommendation letter famously was only one sentence long: “This man is a genius.”
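As a quick numerical illustration of the dollar-splitting case, here is a minimal sketch assuming linear utility u(z) = z, with the grid search standing in for the analytic first-order condition:

```python
# Nash product for splitting one dollar, linear utilities assumed;
# d1, d2 are the players' disagreement ("status-quo") payoffs.
def nash_split(d1, d2):
    # player 1's share x maximizing (x - d1) * ((1 - x) - d2)
    return max((i / 1000 for i in range(1001)),
               key=lambda x: (x - d1) * ((1 - x) - d2))

assert abs(nash_split(0.0, 0.0) - 0.5) < 1e-9  # identical agents: 50-50
```

The same function handles any disagreement point, which is exactly where the trouble discussed next comes from: shifting d shifts the split, whether or not the disagreement payoff is a credible threat.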

One problem with Nash bargaining, however. Nash famously suffered from mental illness in real life, and there is an analogous split personality between the idea of Nash equilibrium and the idea of Nash bargaining: where exactly are threats in Nash’s bargaining theory? That is, Nash bargaining as an idea sits completely within the cooperative theory of von Neumann and Morgenstern. Consider two identical agents splitting a dollar once more. Imagine that one of the agents already has 30 cents, so that only 70 cents are actually in the middle of the table. The Nash solution is that the person who starts with the thirty cents eventually winds up with 65 cents, and the other person with 35. But play this out in your head.

Player 1: “I, already having the 30 cents, should get half of what remains. It is only fair, and if you don’t give me 65 I will walk away from this table and we will each get nothing more.”

Player 2: “What does that have to do with it? The fair outcome is 50 cents each, which leaves you with more than your original thirty, so you can take your threat and go jump off a bridge.”

That is, 50/50 might be a reasonable solution here, right? This might make even more sense if we take a more concrete example: bargaining over wages. Imagine the prevailing wage for CEOs in your industry is $250,000. Two identical CEOs will generate $500,000 in value for the firm if hired. CEO Candidate One has no other job offer. CEO Candidate Two has an offer from a job with similar prestige and benefits, paying $175,000. Surely we can’t believe that the second CEO will wind up with higher pay, right? It is a completely noncredible threat to take the $175,000 offer, hence it shouldn’t affect the bargaining outcome. A pet peeve of mine is that many applied economists are still using Nash bargaining – often in the context of the labor market! – despite this well-known problem.

Nash was quite aware of this, as can be seen in his 1953 Econometrica paper, where he attempts to give a noncooperative bargaining game that reaches the earlier axiomatic outcome. Indeed, this paper inspired an enormous research agenda called the Nash Program devoted to finding noncooperative games which generate well-known or reasonable-sounding cooperative solution outcomes. In some sense, the idea of “implementation” in mechanism design, where we investigate whether there exists a game which can generate socially or coalitionally preferred outcomes noncooperatively, can be thought of as a successful modern branch of the Nash Program. Nash’s ’53 noncooperative game simply involves adding a bit of noise into the set of possible outcomes. Consider splitting a dollar again. Let a third party ask each player to name how many cents they want. If the joint requests are feasible, then the dollar is split (with any remainder thrown away), else each player gets nothing. Clearly every split of the dollar on the Pareto frontier is a Nash equilibrium, as is each player requesting the full dollar and getting nothing. However, if there is a tiny bit of noise about whether there is exactly one dollar, or 99 cents, or $1.01, etc., then when deciding whether to ask for more money, I will have to weigh the higher payoff if the joint demand is feasible against the payoff of zero if my increased demand makes the split infeasible and hence neither of us earns anything. In a rough sense, Nash shows that as the distribution of noise becomes degenerate around the true bargaining frontier, players will demand exactly their Nash bargaining outcome. Of course it is interesting that there exists some bargaining game that generates the Nash solution, and the idea that we should study noncooperative games which implement cooperative solution concepts is without a doubt seminal, but this particular game seems very strange to me, as I don’t understand what the source of the noise is, why it becomes degenerate, etc.

On the shoulders of Nash, however, bargaining progressed a huge amount. Three papers in particular are worth your time, although hopefully you have seen these before: Kalai and Smorodinsky’s 1975 paper, which retains the axiomatic approach but drops IIA; Rubinstein’s famous 1982 Econometrica on noncooperative bargaining with alternating offers; and Binmore, Rubinstein and Wolinsky on implementation of bargaining solutions, which deals with the idea of threats as I did above.

You can read all four Nash papers in the original literally during your lunch hour; this seems to me a worthy way to tip your cap toward a man who helped make modern economics possible.

“The Contributions of the Economics of Information to Twentieth Century Economics,” J. Stiglitz (2000)

There have been three major methodological developments in economics since 1970. First, following the Lucas Critique we are reluctant to accept policy advice which is not derived from optimizing behavior on the part of individuals and firms. Second, developments in game theory have made it possible to reformulate questions like “why do firms exist?”, “what will result from regulating a particular industry in a particular way?”, and “what can I infer about the state of the world from an offer to trade?”, among many others. Third, imperfect and asymmetric information was shown to be of first-order importance for analyzing economic problems.

Why is information so important? Prices, Hayek taught us, solve the problem of dispersed information about scarcity. The price vector is a sufficient statistic for everything about production processes in every firm, as far as generating efficient behavior is concerned. The simple existence of asymmetric information, then, is not obviously a problem for economic efficiency. And if asymmetric information about big things like scarcity across society does not obviously matter, then how could imperfect information about minor things matter? A shopper, for instance, may not know exactly the price of every car at every dealership. But “Natura non facit saltum,” Marshall once claimed: nature does not make leaps. Tiny deviations from the assumptions of general equilibrium should not have large consequences.

But Marshall was wrong: nature does make leaps when it comes to information. The search model of Peter Diamond, most famously, showed that arbitrarily small search costs lead to firms charging the monopoly price in equilibrium, hence a welfare loss completely out of proportion to the search costs. That is, information costs and asymmetries, even very small ones, can theoretically be very problematic for the Arrow-Debreu welfare properties.
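The flavor of Diamond’s argument can be compressed into a few lines; the setup below (unit-demand consumers with valuation 1, zero production cost, a fixed cost of searching another firm) is an assumed toy version, not Diamond’s actual model:

```python
# Assumed toy market: consumers value the good at v = 1, production is
# costless, and visiting a second firm costs s > 0.
v, s = 1.0, 0.01

def profit_per_consumer(price):
    return price if price <= v else 0.0  # consumers buy iff price <= valuation

p = 0.7                    # any candidate common price below the monopoly price v
p_dev = min(p + s / 2, v)  # a deviation smaller than the search cost

# Raising price by less than s loses no customers (searching elsewhere is not
# worth it), so any common price below v admits a profitable deviation:
assert profit_per_consumer(p_dev) > profit_per_consumer(p)
# Only the monopoly price v survives this unraveling:
assert min(v + s / 2, v) == v
```

Iterating the deviation argument upward, the unique equilibrium price is the monopoly price v, however small s is, which is the discontinuity Diamond identified: welfare jumps discretely the moment the search cost departs from zero.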

Even more interesting, we learned that prices are more powerful than we’d believed. They convey information about scarcity, yes, but also information about other people’s private information or effort. Consider, for instance, efficiency wages. A high wage is not merely a signal of scarcity for a particular type of labor, but is simultaneously an effort-inducement mechanism. Given this dual role, it is perhaps not surprising that general equilibrium is no longer Pareto optimal, even if the planner is as constrained informationally as each agent.

How is this? Decentralized economies may, given information cost constraints, exert too much effort searching, or generate inefficient separating equilibria that unravel trades. The beautiful equity/efficiency separation of the Second Welfare Theorem does not hold in a world of imperfect information. A simple example on this point is that it is often useful to allow some agents suffering moral hazard worries to “buy the firm”, mitigating the incentive problem, but limited liability means this may not happen unless those particular agents begin with a large endowment. That is, a different endowment, where the agents suffering extreme moral hazard problems begin with more money and are able to “buy the firm”, leads to more efficient production (potentially in a Pareto sense) than an endowment where those workers must be provided with information rents in an economy-distorting manner.

It is a strange fact that many social scientists feel economics to some extent stopped progressing by the 1970s. All the important basic results were, in some sense, known. How untrue this is! Imagine labor without search models, trade without monopolistically competitive equilibria, IO or monetary policy without mechanism design, finance without formal models of price discovery and equilibrium noise trading: all would be impossible given the tools we had in 1970. The explanations that preceded modern game-theoretic and information-laden ones are quite extraordinary: Marshall observed that managers have interests different from owners, yet nonetheless are “well-behaved” in running firms in a way acceptable to the owners. His explanation was to credit British upbringing and morals! As Stiglitz notes, this is not an explanation we would accept today. Rather, firms use a number of intriguing mechanisms to structure incentives in a way that limits agency problems, and we now possess the tools to analyze these mechanisms rigorously.

Final 2000 QJE version (RePEc IDEAS)

“Competition, Imitation and Growth with Step-by-Step Innovation,” P. Aghion, C. Harris, P. Howitt, & J. Vickers (2001)

(One quick PSA before I get to today’s paper: if you happen, by chance, to be a graduate student in the social sciences in Toronto, you are more than welcome to attend my PhD seminar in innovation and entrepreneurship at the Rotman school which begins on Wednesday, the 7th. I’ve put together a really wild reading list, so hopefully we’ll get some very productive discussions out of the course. The only prerequisite is that you know some basic game theory, and my number one goal is forcing the economists to read sociology, the sociologists to write formal theory, and the whole lot to understand how many modern topics in innovation have historical antecedents. Think of it as a high-variance cross-disciplinary educational lottery ticket! If interested, email me at for more details.)

Back to Aghion et al. Let’s kick off 2015 with one of the nicer pieces to come out of the ridiculously productive decade or so of theoretical work on growth put together by Philippe Aghion and his coauthors; I wish I could capture the famous alacrity of Aghion’s live presentation of his work, but I fear that’s impossible to do in writing! This paper is built around a useful theory that speaks to two of the oldest questions in the economics of innovation: is more competition in product markets good or bad for R&D, and is there something strange about granting a firm IP (literally a grant of market power meant to spur innovation via excess rents) at the same time as we enforce antitrust (generally a restriction on market power meant to reduce excess rents)?

Aghion et al come to a few very surprising conclusions. First, the Schumpeterian idea that firms with market power do more R&D is misleading because it ignores the “escape the competition” effect, whereby firms have a high incentive to innovate when there is a large market that can be captured by doing so. Second, maximizing that “escape the competition” motive may involve making it not too easy to catch up to market technological leaders (by IP or other means). These two theoretical results imply that antitrust (making sure there are a lot of firms competing in a given market, spurring new innovation to take market share from rivals) and IP policy (ensuring that R&D actually needs to be performed in order to gain a lead) are in a sense complements! The fundamental theoretical driver is that the incentive to innovate depends not on the total rents of an innovation, but on the incremental rents; if innovators include firms that are already active in an industry, then policy that makes your current technological state less valuable (because you are in a more competitive market, say) and policy that makes jumping to a better technological state more valuable both increase the size of the incremental rent, and hence the incentive to perform R&D.

Here are the key aspects of a simplified version of the model. An industry is a duopoly where consumers spend exactly 1 dollar per period. The duopolists produce partially substitutable goods, where the more similar the goods, the more “product market competition” there is. Each duopolist produces its good at a firm-specific unit cost and competes in Bertrand fashion with its rival. At the minimal amount of product market competition, each firm earns constant profit regardless of its own or its rival’s cost. Firms can invest in R&D, which gives some flow probability of lowering their unit cost. Technological laggards sometimes catch up to the unit cost of leaders with exogenous probability; lower IP protection (or more prevalent spillovers) means this probability is higher. We’ll look only at the stochastic distribution of technological leadership and lags, which is a steady state if there are infinitely many such duopolistic industries.
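To fix ideas, here is a minimal, hedged sketch of the steady-state industry distribution in a version of the model where technology gaps are capped at one step. The R&D intensities x0 (each neck-and-neck firm) and x1 (the laggard) are taken here as fixed numbers rather than derived from equilibrium behavior, and h is the exogenous catch-up (spillover) rate, so all parameter values below are assumptions for illustration only.

```python
# Steady state of a simplified step-by-step model: industries flow from
# neck-and-neck (gap 0) to leader/laggard (gap 1) when either leveled firm
# innovates (rate 2*x0), and flow back when the laggard innovates (x1) or
# catches up exogenously (h). Weaker IP means a higher h.

def steady_state(x0: float, x1: float, h: float):
    """Return (mu0, mu1), the steady-state fractions of neck-and-neck and
    leader/laggard industries, from flow balance: mu0*2*x0 = mu1*(x1 + h)."""
    mu0 = (x1 + h) / (x1 + h + 2 * x0)
    return mu0, 1 - mu0

def aggregate_innovation(x0: float, x1: float, h: float) -> float:
    """Economy-wide flow of innovations (leaders do not innovate when the
    gap is already maximal in this capped version)."""
    mu0, mu1 = steady_state(x0, x1, h)
    return mu0 * 2 * x0 + mu1 * x1

# Higher h (weaker IP) pushes more industries into the neck-and-neck state,
# where the "escape the competition" motive operates:
print(steady_state(x0=0.5, x1=0.2, h=0.1))
print(steady_state(x0=0.5, x1=0.2, h=0.8))
```

In the full model the intensities x0 and x1 are themselves equilibrium objects that depend on competition and IP policy, which is where the inverted-U and the antitrust/IP complementarity come from; this sketch only shows the mechanical steady-state bookkeeping those results sit on top of.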

In a model with these features, you always want at least a little competition, essentially for Arrow (1962) reasons: the size of the market is small when market power is large because total unit sales are low, hence the benefit of reducing unit costs is low, hence no one will bother to do any innovation in the limit. More competition can also be good because it increases the probability that two firms are at similar technological levels, in which case each wants to double down on research intensity to gain a lead. At very high levels of competition, the old Schumpeterian story might bind again: goods are so substitutable that R&D to increase rents is pointless since almost all rents are competed away, especially if IP is weak so that rival firms catch up to your unit cost quickly no matter how much R&D you do. What of the optimal level of IP? It’s always best to ensure IP is not too strong, or that spillovers are not too weak, because the benefit of increased R&D effort when firms are at similar technological levels following the spillover exceeds the lost incentive to gain a lead in the first place when IP is not perfectly strong. When markets are really competitive, however, the Schumpeterian insight that some rents need to exist militates in favor of somewhat stronger IP than in less competitive product markets.

Final working paper (RePEc IDEAS); the paper was published in 2001 in the Review of Economic Studies. This is the more theoretically detailed paper, but if the insight sounds familiar, you may already know the hugely influential follow-up by Aghion, Bloom, Blundell, Griffith and Howitt, “Competition and Innovation: An Inverted U Relationship”, published in the QJE in 2005. That paper gives some empirical evidence for the idea that innovation is maximized at intermediate values of product market competition; the Schumpeterian “we need some rents” motive and the “firms innovate to escape competition” motive both play a role. I am actually not a huge fan of that paper – as an empirical matter, much cost-reducing innovation in many industries will never show up in patent statistics (principally for reasons that Eric von Hippel made clear in The Sources of Innovation, which is freely downloadable at that link!). But this is a discussion for another day! One more related paper we have previously discussed is Goettler and Gordon’s 2012 structural work on processor chip innovation at AMD and Intel, which has a very similar within-industry motivation.

Nobel Prize 2014: Jean Tirole

A Nobel Prize for applied theory – now this is something I can get behind! Jean Tirole’s prize announcement credits him for his work on market power and regulation, and there is no question that he is among the leaders, if not the world leader, in the application of mechanism design theory to industrial organization; indeed, the idea of doing IO in the absence of this theoretical toolbox seems so strange to me that it’s hard to imagine anyone ever did it! Economics is sometimes defined by a core principle: agents – people or firms – respond to incentives. Incentives are endogenous; how my bank or my payment processor or my lawyer wants to act depends on how other banks or other processors or other lawyers act. Regulation is therefore a game. Optimal regulation is therefore a problem of mechanism design, and we now have mathematical tools that allow investigation across the entire space of potential regulating mechanisms, even those that are counterfactual. That is an incredibly powerful methodological advance, so powerful that there will be at least one more Nobel (Milgrom and Holmstrom?) based on this literature.

Because Tirole’s toolbox is theoretical, he has written an enormous amount of “high theory” on the implications of the types of models modern IO economists use. I want to focus in this post on a particular problem where Tirole has stood on both sides of the divide: that of the seemingly obscure question of what can be contracted on.

This literature goes back to a very simple question: what is a firm, and why do they exist? And when they exist, why don’t they grow so large that they become one giant firm a la Schumpeter’s belief in Capitalism, Socialism, and Democracy? One answer is that given by Coase and updated by Williamson, among many others: transaction costs. There are some costs of haggling or similar involved in getting things done with suppliers or independent contractors. When these costs are high, we integrate that factor into the firm. When they are low, we avoid the bureaucratic costs needed to manage all those factors.

For a theorist trained in mechanism design, this is a really strange idea. For one, what exactly are these haggling or transaction costs? Without specifying what precisely is meant, it is very tough to write a model incorporating them and exploring their implications. But worse, why would we think these costs are higher outside the firm than inside? A series of papers by Sandy Grossman, Oliver Hart and John Moore points out, quite rightly, that firms cannot make their employees do anything. They can tell them to do something, but the employees will respond to incentives like anyone else. Given that, why would we think the problem of incentivizing employees within an organization is any easier or harder than incentivizing them outside the organization? The solution they propose is the famous Property Rights Theory of the firm (the 1986 Grossman-Hart paper could fairly be considered the most important ever published in the illustrious JPE). This theory says that firms are defined by the assets they control. If we could contract on every future state of the world, then this control wouldn’t matter, but when unforeseen contingencies arise, the firm still has “residual control” of its capital. Efficiency therefore depends on the allocation of scarce residual control rights, and hence whether those rights sit inside or outside of a firm matters. Now that is a theory of the firm – one well-specified and based on incentives – that I can understand. (An interesting sidenote: when people think economists don’t really understand the economy because, hey, they’re not rich, we can at least point to Sandy Grossman. Sandy, a very good theorist, left academia to start his own firm, and as far as I know, he is now a billionaire!)

Now you may notice one problem with Grossman, Hart and Moore’s papers. Just as there was an assumption of nebulous transaction costs in Coase and his followers, there is a nebulous assumption of “incomplete contracts” in GHM. This seems reasonable at first glance: there is no way we could possibly write a contract that covers every possible contingency or future state of the world. I have to imagine that everyone who has ever rented an apartment, leased a car, or run a small business has first-hand experience with the nature of residual control rights when some contingency arises. Here is where Tirole comes in. Throughout the 80s and 90s, Tirole wrote many papers using incomplete contracts: his 1994 paper with Aghion on contracts for R&D is right within this literature. In complete contracting, the courts can verify and enforce any contract that relies on observable information, though adverse selection (hidden information held by agents) or moral hazard (unverifiable actions by agents) may still exist. Incomplete contracting further restricts the set of feasible contracts to a generally simple set of possibilities. In the late 1990s, however, Tirole, along with his fellow Nobel winner Eric Maskin, realized in an absolute blockbuster of a paper that there is a serious problem with these incomplete contracts as usually modeled.

Here is why: even if we can’t ex-ante describe all the future states of the world, we may still ex-post be able to elicit information about the payoffs we each get. As Tirole has noted, firms do not care about indescribable contingencies per se; they only care about how those contingencies affect their payoffs. That means that, at an absolute minimum, the optimal “incomplete contract” had better be at least as good as the optimal contract which conditions on elicited payoffs. These payoffs may be stochastic realizations of all of our actions, of course, and hence this insight does not by itself mean we can achieve first-best efficiency when the future is really hard to describe. Maskin and Tirole’s 1999 paper shows, incredibly, that indescribability of states is irrelevant: even if we can’t write down a contract on states of the world, we can contract on payoff realizations in a way that is just as good as if we could actually write the complete contract.

How could this be? Imagine (here via a simple example of Maskin’s) two firms contracting for R&D. Firm 1 exerts effort e1 and produces a good with value v(e1). Firm 2 invests e2 in some process that lowers the production cost of firm 1’s new good to c(e2). Payoffs, then, are u1(p-c(e2)-e1) and u2(v(e1)-p-e2). If we knew u1 and u2 and could contract upon them, then the usual Nash implementation literature tells us how to generate efficient levels of e1 and e2 (call them e1*, e2*) by writing a contract: if the product doesn’t have the characteristics of v(e1*), or the production process doesn’t have the characteristics of c(e2*), then we fine the party who cheated. If effort generated stochastic values rather than deterministic ones, the standard mechanism design literature tells us exactly when we can still get the first best.

Now, what if v and c are state-dependent, and there are a huge number of states of the world? That is, the efficient e1* and e2* are now functions of the state of the world realized after we write the initial contract. Incomplete contracting assumed that we cannot foresee all the possible v and c, and hence won’t write a contract incorporating all of them. But, aha!, we can still write a contract that says: look, whatever happens tomorrow, we are going to play a game tomorrow where I say what my v is and you say what your c is. It turns out that there exists such a game which generates truthful revelation of v and c (Maskin and Tirole do this using an idea similar to that of the subgame implementation literature, but the exact features are not terribly important for our purposes). Since the only part of the indescribable state I care about is the part that affects my payoffs, we are essentially done: no matter how many possible v’s and c’s there could be in the future, as long as I can write a contract specifying how we each get the other to truthfully reveal those parameters, this indescribability doesn’t matter.

Whoa. That is a very, very, very clever insight. Frankly, it is convincing enough that the only role left for property rights theories of the firm is some kind of behavioral theory which restricts even contracts of the Maskin-Tirole sort – and since these contracts are in some ways quite a bit simpler than the hundreds of pages of legalese we see in many real-world contracts on important issues, it’s not clear that bounded rationality or similar theories will get us far.

Where to go from here? Firms, and organizations more generally, exist, and I am sure the reason has to do with incentives. But exactly why they exist, and why they take the forms they do, is a question on which we still have a lot of work ahead of us – and Tirole has played a major role in that work.

Tirole’s Walras-Bowley lecture, published in Econometrica in 1999, is a fairly accessible introduction to his current view of incomplete contracts. He has many other fantastic papers, across a wide variety of topics. I particularly like his behavioral theory written mainly with Roland Benabou; see, for instance, their 2003 ReStud on when monetary rewards are bad for incentives.

“Aggregation in Production Functions: What Applied Economists Should Know,” J. Felipe & F. Fisher (2003)

Consider a firm that takes heterogeneous labor and capital inputs L1, L2… and K1, K2…, using these to produce some output Y. Define a firm production function Y=F(K1, K2…, L1, L2…) as the maximal output that can be produced using the given vector of inputs – and note the implicit optimization condition in that definition, which means that production functions are not simply technical relationships. What conditions are required to construct an aggregated production function Y=F(K,L) for the firm, or, more broadly, to aggregate across firms into an economy-wide production function Y=F(K,L)? Note that the question is not about the definition of capital per se, since defining “labor” is equally problematic when man-hours are clearly heterogeneous, and this question is also not about the more general capital controversy worries, like reswitching (see Samuelson’s champagne example) or the dependence of the return to capital on the distribution of income which, itself, depends on the return to capital.

(A brief aside: on that last worry, why the Cambridge UK types and their modern day followers are so worried about the circularity of the definition of the interest rate, yet so unconcerned about the exact same property of the object we call “wage”, is quite strange to me, since surely if wages equal marginal product, and marginal product in dollars is a function of aggregate demand, and aggregate demand is a function of the budget constraint determined by wages, we are in an identical philosophical situation. I think it’s pretty clear that the focus on “r” rather than “w” is because of the moral implications of capitalists “earning their marginal product” which are less than desirable for people of a certain political persuasion. But I digress; let’s return to more technical concerns.)

It turns out, and this should be fairly well-known, that the conditions under which factors can be aggregated are ridiculously stringent. If we literally want to add up K or L when firms use different production functions, the condition (due to Leontief) is that the marginal rate of substitution between different types of factors within one aggregate, e.g. capital, must not depend on the level of factors outside that aggregate, e.g. labor. Surely this is a condition that rarely holds: how much I want to use different types of trucks, in an example due to Solow, will depend on how much labor I have at hand. A follow-up by Nataf in the 1940s is even more discouraging. Assume every firm uses homogeneous labor, every firm uses capital which, though homogeneous within each firm, differs across firms, and every firm has identical constant returns to scale production technology. When can I now write an aggregate production function Y=F(K,L) summing up the capital in each firm K1, K2…? That aggregate function exists if and only if every firm’s production function is additively separable in capital and labor (in which case the aggregation function is pretty obvious)! Pretty stringent, indeed.
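A quick numerical check makes clear why additive separability is doing the work. If an aggregate production function of the sums K1+K2 and L1+L2 exists, total output cannot change when capital is reallocated across firms holding those sums fixed. The hedged sketch below (the two technologies and all numbers are made up for illustration) shows this invariance holds for a linear, additively separable technology and fails for firm-level Cobb-Douglas.

```python
# If Y = F(K1+K2, L1+L2) exists, reallocating capital across firms at fixed
# aggregates must leave total output unchanged.

def total_linear(K1, K2, L1, L2, a=2.0, b=1.0):
    """Both firms share Y_i = a*K_i + b*L_i (additively separable)."""
    return (a * K1 + b * L1) + (a * K2 + b * L2)

def total_cobb_douglas(K1, K2, L1, L2):
    """Both firms use Y_i = K_i**0.5 * L_i**0.5."""
    return K1**0.5 * L1**0.5 + K2**0.5 * L2**0.5

# Two allocations with the SAME aggregates K = 10 and L = 10:
alloc_a = dict(K1=5.0, K2=5.0, L1=5.0, L2=5.0)
alloc_b = dict(K1=9.0, K2=1.0, L1=5.0, L2=5.0)

print(total_linear(**alloc_a), total_linear(**alloc_b))              # equal
print(total_cobb_douglas(**alloc_a), total_cobb_douglas(**alloc_b))  # differ
```

Fisher's refinement discussed next narrows the needed invariance to efficient allocations only, which is exactly why competitive factor markets buy a little extra room.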

Fisher helps things just a bit in a pair of papers from the 1960s. Essentially, he points out that we don’t want to aggregate for all vectors K and L, but rather we need to remember that production functions measure the maximum output possible when all inputs are used most efficiently. Competitive factor markets guarantee that this assumption will hold in equilibrium. That said, even assuming only one type of labor, efficient factor markets, and a constant returns to scale production function, aggregation is possible if and only if every firm has the same production function Y=F(b(v)K(v),L), where v denotes a given firm and b(v) is a measure of how efficiently capital is employed in that firm. That is, aside from capital efficiency, every firm’s production function must be identical if we want to construct an aggregate production function. This is somewhat better than Nataf’s result, but still seems highly unlikely across a sector (to say nothing of an economy!).

Why, then, do empirical exercises using, say, an aggregate Cobb-Douglas seem to give such reasonable parameters, even though the above theoretical results suggest that parameters like the “aggregate elasticity of substitution between labor and capital” don’t even exist? That is, when we estimate elasticities or total factor productivities from Y=AK^a*L^b, using some measure of aggregated capital, what are we even estimating? Two things. First, Nelson and Winter in their seminal book generate aggregate data which can almost perfectly be fitted by a Cobb-Douglas function even though their model is completely evolutionary and does not even involve maximizing behavior by firms, so the existence of a “good fit” alone is, and this should go without saying, not great evidence in support of a model. Second, since ex-post output Y must equal the wage bill plus capital payments plus profits, Felipe notes that this identity can be algebraically manipulated into the form Y=AF(K,L), where the form of F depends on the nature of the factor shares. That is, the good fit of Cobb-Douglas or CES can simply reflect an accounting identity even when nothing is known about micro-level elasticities or similar.
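The accounting-identity point is easy to demonstrate. In the hedged sketch below (factor prices, input ranges, and sample size are all arbitrary assumptions), "data" is generated purely from the identity Y = wL + rK with constant factor prices, with no production function anywhere, yet a log-linear Cobb-Douglas regression fits it almost perfectly, with an estimated "capital elasticity" near the capital share.

```python
# Fit a constant-returns Cobb-Douglas, log(Y/L) = log(A) + a*log(K/L), to
# data that is nothing but the accounting identity Y = w*L + r*K.
import math
import random

random.seed(0)
w, r = 1.0, 0.5                     # constant factor prices (assumed)
xs, ys = [], []
for _ in range(500):
    L = random.uniform(90, 110)     # inputs vary over a modest range, so
    K = random.uniform(90, 110)     # factor shares stay roughly stable
    Y = w * L + r * K               # the identity, NOT a technology
    xs.append(math.log(K / L))
    ys.append(math.log(Y / L))

# Simple OLS of log(Y/L) on log(K/L).
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
ss_res = sum((y - (my + slope * (x - mx))) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - my) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot

print(f"estimated 'capital elasticity' a = {slope:.3f}")  # near the share 1/3
print(f"R^2 = {r_squared:.5f}")                           # near-perfect fit
```

With these parameter choices the capital share rK/Y hovers around 1/3, and that is exactly what the regression recovers, despite there being no technology behind the data at all.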

So what to do? I am not totally convinced we should throw out aggregate production functions – it surely isn’t a coincidence that estimated Solow residuals are high in places where our intuition says technological change has been rapid. Because of results like this, it doesn’t strike me that aggregate production functions are measuring arbitrary things. However, if we are using parameters from these functions to do counterfactual analysis, we really ought to know exactly what approximations or assumptions are being baked into the cake, and it doesn’t seem that we are quite there yet. Until we are, a great deal of care should be taken in assigning interpretations to estimates based on aggregate production models. I’d be grateful for any pointers in the comments to recent work on this problem.

Final published version (RePEc IDEAS). The “F. Fisher” on this paper is the former Clark Medal winner and well-known IO economist Franklin Fisher; rare is it to find a nice discussion of capital issues written by someone who is firmly part of the economics mainstream and completely aware of the major theoretical results from “both Cambridges”. Tip of the cap to Cosma Shalizi for pointing out this paper.
