Category Archives: Classics

Nobel Prize 2014: Jean Tirole

A Nobel Prize for applied theory – now this is something I can get behind! Jean Tirole’s prize announcement credits him for his work on market power and regulation, and there is no question that he is among the leaders, if not the world leader, in the application of mechanism design theory to industrial organization; indeed, the idea of doing IO in the absence of this theoretical toolbox seems so strange to me that it’s hard to imagine anyone had ever done it! Economics is sometimes defined by a core principle that agents – people or firms – respond to incentives. Incentives are endogenous; how my bank or my payment processor or my lawyer wants to act depends on how other banks or other processors or other lawyers act. Regulation is therefore a game. Optimal regulation is therefore a problem of mechanism design, and we now have mathematical tools that allow investigation across the entire space of potential regulating mechanisms, even those that are counterfactual. That is an incredibly powerful methodological advance, so powerful that there will be at least one more Nobel (Milgrom and Holmstrom?) based on this literature.

Because Tirole’s toolbox is theoretical, he has written an enormous amount of “high theory” on the implications of the types of models modern IO economists use. I want to focus in this post on a particular problem where Tirole has stood on both sides of the divide: that of the seemingly obscure question of what can be contracted on.

This literature goes back to a very simple question: what is a firm, and why do firms exist? And when they exist, why don’t they grow so large that they become one giant firm, a la Schumpeter’s belief in Capitalism, Socialism, and Democracy? One answer is that given by Coase and updated by Williamson, among many others: transaction costs. There are costs of haggling and similar frictions involved in getting things done with suppliers or independent contractors. When these costs are high, we integrate that factor into the firm. When they are low, we avoid the bureaucratic costs needed to manage all those factors.

For a theorist trained in mechanism design, this is a really strange idea. For one, what exactly are these haggling or transaction costs? Without specifying what precisely is meant, it is very tough to write a model incorporating them and exploring their implications. But worse, why would we think these costs are higher outside the firm than inside? A series of papers by Sandy Grossman, Oliver Hart and John Moore points out, quite rightly, that firms cannot make their employees do anything. They can tell them to do something, but the employees will respond to incentives like anyone else. Given that, why would we think the problem of incentivizing employees within an organization is any easier or harder than incentivizing them outside the organization? The solution they propose is the famous Property Rights Theory of the firm (which could fairly be considered the most important paper ever published in the illustrious JPE). This theory says that firms are defined by the assets they control. If we can contract on every future state of the world, then this control shouldn’t matter, but when unforeseen contingencies arise, the firm still has “residual control” of its capital. Therefore, efficiency depends on the allocation of scarce residual control rights, and hence the allocation of these rights inside or outside of a firm is important. Now that is a theory of the firm – one well-specified and based on incentives – that I can understand. (An interesting sidenote: when people think economists don’t really understand the economy because, hey, they’re not rich, we can at least point to Sandy Grossman. Sandy, a very good theorist, left academia to start his own firm, and as far as I know, he is now a billionaire!)

Now you may notice one problem with Grossman, Hart and Moore’s papers. Just as there was an assumption of nebulous transaction costs in Coase and his followers, there is a nebulous assumption of “incomplete contracts” in GHM. This seems reasonable at first glance: there is no way we could possibly write a contract that covers every possible contingency or future state of the world. I have to imagine everyone that has ever rented an apartment or leased a car or run a small business has first-hand experience with the nature of residual control rights when some contingency arises. Here is where Tirole comes in. Throughout the 80s and 90s, Tirole wrote many papers using incomplete contracts: his 1994 paper with Aghion on contracts for R&D is right within this literature. In complete contracting, the courts can verify and enforce any contract that relies on observable information, though adverse selection (hidden information by agents) or moral hazard (unverifiable action by agents) may still exist. Incomplete contracting further restricts the set of contracts to a generally simple set of possibilities. In the late 1990s, however, Tirole, along with his fellow Nobel winner Eric Maskin, realized in an absolute blockbuster of a paper that there is a serious problem with these incomplete contracts as usually modeled.

Here is why: even if we can’t ex-ante describe all the future states of the world, we may still ex-post be able to elicit information about the payoffs we each get. As Tirole has noted, firms do not care about indescribable contingencies per se; they only care about how those contingencies affect their payoffs. That means that, at an absolute minimum, the optimal “incomplete contract” better be at least as good as the optimal contract which conditions on elicited payoffs. These payoffs may be stochastic realizations of all of our actions, of course, and hence this insight might not actually mean we can reach first-best efficiency when the future is really hard to describe. Maskin and Tirole’s 1999 paper shows, incredibly, that indescribability of states is irrelevant, and that even if we can’t write down a contract on states of the world, we can contract on payoff realizations in a way that is just as good as if we could actually write the complete contract.

How could this be? Imagine (here via a simple example of Maskin’s) two firms contracting for R&D. Firm 1 exerts effort e1 and produces a good with value v(e1). Firm 2 invests in some process that will lower the production cost of firm 1’s new good, investing e2 to make production cost equal to c(e2). Payoffs, then, are u1(p-c(e2)-e1) and u2(v(e1)-p-e2). If we knew u1 and u2 and could contract upon them, then the usual Nash implementation literature tells us how to generate efficient levels of e1 and e2 (call them e1*, e2*) by writing a contract: if the product doesn’t have the characteristics of v(e1*) or the production process doesn’t have the characteristics of c(e2*), then we fine the person who cheated. If effort generated stochastic values rather than deterministic ones, the standard mechanism design literature tells us exactly when we can still get the first best.
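To pin down the efficiency benchmark that contract is trying to hit, here is a minimal statement assuming both firms are risk neutral (linear u1 and u2; a simplifying assumption of mine, not part of Maskin’s general setup). The price p is then a pure transfer, and the efficient efforts maximize joint surplus:

\[
\max_{e_1, e_2}\; v(e_1) - c(e_2) - e_1 - e_2
\quad\Longrightarrow\quad
v'(e_1^*) = 1, \qquad -c'(e_2^*) = 1,
\]

so firm 1 raises quality until the marginal value of its effort equals the cost of that effort, and firm 2 invests in the cost-reducing process until the marginal cost saving equals its cost of investing.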

Now, what if v and c are state-dependent, and there is a huge number of states of the world? That is, efficient e1* and e2* are now functions of the state of the world realized after we write the initial contract. Incomplete contracting assumed that we cannot foresee all the possible v and c, and hence won’t write a contract incorporating all of them. But, aha!, we can still write a contract that says, look, whatever happens tomorrow, we are going to play a game tomorrow where I say what my v is and you say what your c is. It turns out that there exists such a game which generates truthful revelation of v and c (Maskin and Tirole do this using an idea similar to that of the subgame implementation literature, but the exact features are not terribly important for our purposes). Since the only part of the indescribable state I care about is the part that affects my payoffs, we are essentially done: no matter how many v’s and c’s there could be in the future, as long as I can write a contract specifying how we each get the other to truthfully say what those parameters are, this indescribability doesn’t matter.

Whoa. That is a very, very, very clever insight. Frankly, it is convincing enough that the only role left for property rights theories of the firm is some kind of behavioral theory which restricts even contracts of the Maskin-Tirole type – and since these contracts are in some ways quite a bit simpler than the hundreds of pages of legalese which we see in a lot of real-world contracts on important issues, it’s not clear that bounded rationality or similar theories will get us far.

Where to go from here? Firms, and organizations more generally, exist. I am sure the reason has to do with incentives. But exactly why – well, we still have a lot of work to do in answering that question, and Tirole has played a major role in the progress made so far.

Tirole’s Walras-Bowley lecture, published in Econometrica in 1999, is a fairly accessible introduction to his current view of incomplete contracts. He has many other fantastic papers, across a wide variety of topics. I particularly like his behavioral theory written mainly with Roland Benabou; see, for instance, their 2003 ReStud on when monetary rewards are bad for incentives.

“Aggregation in Production Functions: What Applied Economists Should Know,” J. Felipe & F. Fisher (2003)

Consider a firm that takes heterogeneous labor and capital inputs L1, L2… and K1, K2…, using these to produce some output Y. Define a firm production function Y=F(K1, K2…, L1, L2…) as the maximal output that can be produced using the given vector of inputs – and note the implicit optimization condition in that definition, which means that production functions are not simply technical relationships. What conditions are required to construct an aggregated production function Y=F(K,L), or, more broadly, to aggregate across firms into an economy-wide production function Y=F(K,L)? Note that the question is not about the definition of capital per se, since defining “labor” is equally problematic when man-hours are clearly heterogeneous, and this question is also not about the more general capital controversy worries, like reswitching (see Samuelson’s champagne example) or the dependence of the return to capital on the distribution of income which, itself, depends on the return to capital.

(A brief aside: on that last worry, why the Cambridge UK types and their modern day followers are so worried about the circularity of the definition of the interest rate, yet so unconcerned about the exact same property of the object we call “wage”, is quite strange to me, since surely if wages equal marginal product, and marginal product in dollars is a function of aggregate demand, and aggregate demand is a function of the budget constraint determined by wages, we are in an identical philosophical situation. I think it’s pretty clear that the focus on “r” rather than “w” is because of the moral implications of capitalists “earning their marginal product” which are less than desirable for people of a certain political persuasion. But I digress; let’s return to more technical concerns.)

It turns out, and this should be fairly well-known, that the conditions under which factors can be aggregated are ridiculously stringent. If we literally want to add up K or L when firms use different production functions, the condition (due to Leontief) is that the marginal rate of substitution between different types of factors in one aggregate, e.g. capital, does not depend on the level of factors outside that aggregate, e.g. labor. Surely this is a condition that rarely holds: how much I want to use, in an example due to Solow, different types of trucks will depend on how much labor I have at hand. A follow-up by Nataf in the 1940s is even more discouraging. Assume every firm uses homogenous labor, every firm uses capital which, though homogenous within each firm, differs across firms, and every firm has identical constant returns to scale production technology. When can I now write an aggregate production function Y=F(K,L) summing up the capital in each firm K1, K2…? That aggregate function exists if and only if every firm’s production function is additively separable in capital and labor (in which case the aggregation function is pretty obvious)! Pretty stringent, indeed.
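To see why the separability condition bites, here is a minimal illustration of my own (in the spirit of Nataf’s theorem, not his proof). In a conveniently separable case, say

\[
Y_f = \phi_f(K_f) + \beta L_f \quad\text{for each firm } f, \qquad
Y = \sum_f Y_f = \underbrace{\sum_f \phi_f(K_f)}_{\equiv K} + \beta L,
\]

so defining aggregate capital as the sum of the firm-level indices ϕf(Kf) gives an exact aggregate production function in K and total labor L. By contrast, if capital and labor interact inside any firm’s technology (take Yf=(Kf*Lf)^(1/2)), then shifting a fixed total amount of labor between firms with different Kf changes total output, so no function of total labor and any single capital aggregate can reproduce the data for all allocations.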

Fisher helps things just a bit in a pair of papers from the 1960s. Essentially, he points out that we don’t want to aggregate for all vectors K and L, but rather we need to remember that production functions measure the maximum output possible when all inputs are used most efficiently. Competitive factor markets guarantee that this assumption will hold in equilibrium. That said, even assuming only one type of labor, efficient factor markets, and a constant returns to scale production function, aggregation is possible if and only if every firm has the same production function Y=F(b(v)K(v),L), where v denotes a given firm and b(v) is a measure of how efficiently capital is employed in that firm. That is, aside from capital efficiency, every firm’s production function must be identical if we want to construct an aggregate production function. This is somewhat better than Nataf’s result, but still seems highly unlikely across a sector (to say nothing of an economy!).

Why, then, do empirical exercises using, say, aggregate Cobb-Douglas seem to give such reasonable parameters, even though the above theoretical results suggest that parameters like “aggregate elasticity of substitution between labor and capital” don’t even exist? That is, when we estimate elasticities or total factor productivities from Y=AK^a*L^b, using some measure of aggregated capital, what are we even estimating? Two things are going on. First, Nelson and Winter in their seminal book generate aggregate data which can almost perfectly be fitted using Cobb-Douglas even though their model is completely evolutionary and does not even involve maximizing behavior by firms, so the existence of a “good fit” alone is, and this should go without saying, not great evidence in support of a model. Second, since ex-post production Y must equal the wage bill plus capital payments plus profits, Felipe notes that this identity can be algebraically manipulated into the form Y=AF(K,L), where the form of F depends on the nature of the factor shares. That is, the good fit of Cobb-Douglas or CES can simply reflect an accounting identity even when nothing is known about micro-level elasticities or similar.
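Here is a toy simulation of the identity point (my own construction, not an exercise from the paper): the “data” below are built purely from the accounting identity Y = wL + rK with roughly stable factor shares, with no production function anywhere in sight, yet a fitted Cobb-Douglas with a time trend matches it almost perfectly and returns exponents near the factor shares.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 40
t = np.arange(T)

# Factor payments grow smoothly; labor's share hovers around 0.7.
wage_bill   = 100 * np.exp(0.03 * t + rng.normal(0, 0.01, T))
capital_pay = (0.3 / 0.7) * wage_bill * np.exp(rng.normal(0, 0.02, T))

# Smooth factor prices; quantities are backed out from the payments.
w = np.exp(0.02 * t)     # wage path
r = 0.1                  # constant rental rate
L = wage_bill / w
K = capital_pay / r

# Value added is DEFINED by the identity, not by any technology.
Y = wage_bill + capital_pay

# "Estimate" an aggregate Cobb-Douglas with a time trend standing in for A(t).
X = np.column_stack([np.ones(T), t, np.log(K), np.log(L)])
b, *_ = np.linalg.lstsq(X, np.log(Y), rcond=None)
resid = np.log(Y) - X @ b
r2 = 1 - resid.var() / np.log(Y).var()

print("capital exponent:", round(b[2], 3))  # close to capital's share (~0.3)
print("labor exponent:  ", round(b[3], 3))  # close to labor's share (~0.7)
print("R-squared:       ", round(r2, 4))    # essentially 1
```

Nothing in this simulated economy is Cobb-Douglas; the near-perfect fit is the identity plus stable shares doing all of the work.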

So what to do? I am not totally convinced we should throw out aggregate production functions – it surely isn’t a coincidence that Solow residuals for TFP are estimated to be high in places where our intuition says technological change has been rapid. Because of results like this, it doesn’t strike me that aggregate production functions are measuring arbitrary things. However, if we are using parameters from these functions to do counterfactual analysis, we really ought to know exactly what approximations or assumptions are being baked into the cake, and it doesn’t seem that we are quite there yet. Until we are, a great deal of care should be taken in assigning interpretations to estimates based on aggregate production models. I’d be grateful for any pointers in the comments to recent work on this problem.

Final published version (RePEc IDEAS). The “F. Fisher” on this paper is the former Clark Medal winner and well-known IO economist Franklin Fisher; rare is it to find a nice discussion of capital issues written by someone who is firmly part of the economics mainstream and completely aware of the major theoretical results from “both Cambridges”. Tip of the cap to Cosma Shalizi for pointing out this paper.

On Gary Becker

Gary Becker, as you must surely know by now, has passed away. This is an incredible string of bad luck for the University of Chicago. With Coase and Fogel having passed recently, and Director, Stigler and Friedman dying a number of years ago, perhaps Lucas and Heckman are the only remaining giants from Chicago’s Golden Age.

Becker is of course known for using economic methods – by which I mean constrained rational choice – to expand economics beyond questions of pure wealth and prices to questions of interest to social science at large. But this contribution is too broad, and he was certainly not the only one pushing such an expansion; the Chicago Law School clearly was doing the same. For an economist, Becker’s principal contribution can be summarized very simply: individuals and households are producers as well as consumers, and rational decisions in production are as interesting to analyze as rational decisions in consumption. As firms must purchase capital to realize their productive potential, humans must purchase human capital to improve their own possible utilities. As firms take actions today which alter constraints tomorrow, so do humans. These may seem to be trite statements, but they are absolutely not: human capital, and dynamic optimization of fixed preferences, offer a radical framework for understanding everything from topics close to Becker’s heart, like educational differences across cultures or the nature of addiction, to the great questions of economics, like how the world was able to break free from the dreadful Malthusian constraint.

Today, the fact that labor can augment itself with education is taken for granted, which is a huge shift in how economists think about production. Becker, in his Nobel Prize speech: “Human capital is so uncontroversial nowadays that it may be difficult to appreciate the hostility in the 1950s and 1960s toward the approach that went with the term. The very concept of human capital was alleged to be demeaning because it treated people as machines. To approach schooling as an investment rather than a cultural experience was considered unfeeling and extremely narrow. As a result, I hesitated a long time before deciding to call my book Human Capital, and hedged the risk by using a long subtitle. Only gradually did economists, let alone others, accept the concept of human capital as a valuable tool in the analysis of various economic and social issues.”

What do we gain by considering the problem of human capital investment within the household? A huge amount! By using human capital along with economic concepts like “equilibrium” and “private information about types”, we can answer questions like the following. Does racial discrimination wholly reflect differences in tastes? (No – because of statistical discrimination, underinvestment in human capital by groups that suffer discrimination can be self-fulfilling, and, as in Becker’s original discrimination work, different types of industrial organization magnify or ameliorate tastes for discrimination in different ways.) Is the difference between men and women in traditional labor roles a biological matter? (Not necessarily – with gains to specialization, even very small biological differences can generate very large behavioral differences.) What explains many of the strange features of labor markets, such as jobs with long tenure, firm boundaries, etc.? (Firm-specific human capital requires investment, and following that investment there can be scope for hold-up in a world without complete contracts.) The parenthetical explanations in this paragraph require completely different policy responses from previous, more naive explanations of the phenomena at play.

Personally, I find human capital most interesting in understanding the Malthusian world. Malthus conjectured the following: as productivity improves for some reason, excess food will appear. With excess food, people will have more children and population will grow, necessitating even more food. To generate more food, people will begin farming marginal land, until we wind up with precisely the living standards per capita that prevailed before the productivity improvement. We know, by looking out our windows, that the world in 2014 has broken free from Malthus’ dire calculus. But how? The critical factor must be that as productivity improves, population does not grow, or else grows more slowly than the continued endogenous increases in productivity. Why might that be? The quantity-quality tradeoff. A productivity improvement generates surplus, leading to demand for non-agricultural goods. Increased human capital generates more productivity in producing those goods. Parents have fewer kids but invest more heavily in their human capital so that they can work in the new sector. Such substitution is only partial, so in order to get wealthy, we need a big initial productivity improvement to generate demand for the goods in the new sector. And thus Malthus is defeated by knowledge.

Finally, a brief word on the origin of human capital. The idea that people take deliberate and costly actions to improve their productivity, and that formal study of this object may be useful, is modern: Mincer and Schultz in the 1950s, and then Becker with his 1962 article and famous 1964 book. That said, economists (to the chagrin of some other social scientists!) have treated humans as a type of capital for much longer. A fascinating 1966 JPE [gated] traces this early history. Petty, Smith, Senior, Mill, von Thunen: they all thought an accounting of national wealth required accounting for the productive value of the people within the nation, and 19th century economists frequently mention that parents invest in their children. These early economists made such claims knowing they were controversial; Walras clarifies that in pure theory “it is proper to abstract completely from considerations of justice and practical expediency” and to regard human beings “exclusively from the point of view of value in exchange.” That is, don’t think we are imagining humans as being nothing other than machines for production; rather, human capital is just a useful concept when discussing topics like national wealth. Becker, unlike the caricature where he is the arch-neoliberal, was absolutely not the first to “dehumanize” people by rationalizing decisions like marriage or education in a cost-benefit framework; rather, he is great because he was the first to show how powerful an analytical concept such dehumanization could be!

“X-Efficiency,” M. Perelman (2011)

Do people still read Leibenstein’s fascinating 1966 article “Allocative Efficiency vs. X-Efficiency”? They certainly did at one time: Perelman notes that in the 1970s, this article was the third-most cited paper in all of the social sciences! Leibenstein essentially made two points. First, as Harberger had previously shown, distortions like monopoly, simply as a matter of mathematics, can’t have large welfare impacts. Take monopoly, for instance. The deadweight loss is simply the change in price times the change in quantity supplied times .5 times the percentage of the economy run by monopolist firms. Under reasonable looking demand curves, those deadweight triangles are rarely going to be even ten percent of the total social welfare created in a given industry. If, say, twenty percent of the final goods economy is run by monopolists, then we only get a two percent change in welfare (and this can be extended to intermediate goods with little empirical change in the final result). Why, then, worry about monopoly?
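A quick back-of-the-envelope version of the Harberger arithmetic (my numbers, chosen only for illustration): with a linear approximation the deadweight loss triangle is

\[
\text{DWL} \approx \tfrac12\,\Delta p\,\Delta q
\qquad\Longrightarrow\qquad
\frac{\text{DWL}}{pq} \approx \tfrac12 \cdot \frac{\Delta p}{p} \cdot \frac{\Delta q}{q},
\]

so even a hefty 40 percent markup that cuts quantity by 40 percent destroys only about 0.5 x 0.4 x 0.4 = 8 percent of one industry’s revenue, and if a fifth of the economy is monopolized to that degree, the aggregate loss is on the order of 1.6 percent of output.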

The reason to worry is Leibenstein’s second point: firms in the same industry often have enormous differences in productivity, and there is tons of empirical evidence that firms do a better job of minimizing costs when under the selection pressures of competition (Schmitz’s 2005 JPE on iron ore producers provides a fantastic demonstration of this). Hence “X-inefficiency”, which Perelman notes is named after Tolstoy’s “X-factor” in the performance of armies in War and Peace, and not just allocative efficiency, may be important. Draw a simple supply-demand graph and you will immediately see that big “X-inefficiency rectangles” can swamp little Harberger deadweight loss triangles in their welfare implications. So far, so good. These claims, however, turned out to be incredibly controversial.

The problem is that just claiming waste is really a broad attack on a fundamental premise of economics, profit maximization. Stigler, in his well-named X-istence of X-efficiency (gated pdf), argues that we need to be really careful here. Essentially, he is suggesting that information differences, principal-agent contracting problems, and many other factors can explain dispersion in costs, and that we ought focus on those factors before blaming some nebulous concept called waste. And of course he’s correct. But this immediately suggests a shift from traditional price theory to a mechanism design based view of competition, where manager and worker incentives interact with market structure to produce outcomes. I would suggest that this project is still incomplete, that the firm is still too much of a black box in our basic models, and that this leads to a lot of misleading intuition.

For instance, most economists will agree that perfectly price discriminating monopolists have the same welfare impact as perfect competition. But this intuition is solely based on black box firms, without any investigation of how those two market structures affect the incentive for managers to collect costly information about efficiency improvements, the optimal labor contracts under the two scenarios, and so on. “Laziness” of workers is an equilibrium outcome of worker contracts, management monitoring, and worker disutility of effort. Just calling that “waste”, as Leibenstein does, is not terribly effective analysis. It strikes me, though, that Leibenstein is correct when he implicitly suggests that selection in the marketplace is more primitive than profit maximization: I don’t need to know much about how manager and worker incentives work to understand that more competition means inefficient firms are more likely to go out of business. Even in perfect competition, we need to be careful about assuming that selection automatically selects away bad firms: it is not at all obvious that the efficient firms can expand efficiently to steal business from the less efficient, as Chad Syverson has rigorously discussed.

So I’m with Perelman. Yes, Leibenstein’s evidence for X-inefficiency was weak, and yes, he conflates many constraints with pure waste. But on the basic points – that minimized costs depend on the interaction of incentives with market structure instead of simply on technology, and that heterogeneity in measured firm productivity is critical to economic analysis – Leibenstein is far more convincing than his critics. And while Syverson, Bloom, Griffith, van Reenen and many others are opening up the firm empirically to investigate the issues Leibenstein raised, there is still great scope for us theorists to more carefully integrate price theory and mechanism problems.

Final article in JEP 2011 (RePEc IDEAS). As always, a big thumbs up to the JEP for making all of their articles ungated and free to read.

On Coase’s Two Famous Theorems

Sad news today that Ronald Coase has passed away; he was still working, often on the Chinese economy, at the incredible age of 102. Coase is best known to economists for two statements: that transaction costs explain many puzzles in the organization of society, and that pricing for durable goods presents a particular worry since even a monopolist selling a durable good needs to “compete” with its future and past selves. Both of these statements are horribly, horribly misunderstood, particularly the first.

Let’s talk first about transaction costs, as in “The Nature of the Firm” and “The Problem of Social Cost”, which are to my knowledge the most cited and the second most cited papers in economics. The Problem of Social Cost leads with its famous cattle versus crops example. A farmer wishes to grow crops, and a rancher wishes his cattle to roam where the crops grow. Should we make the rancher liable for damage to the crops (or restrain the rancher from letting his cattle roam at all!), or indeed ought we restrain the farmer from building a fence where the cattle wish to roam? Coase points out that in some sense both parties are causally responsible for the externality, that there is some socially efficient amount of cattle grazing and crop planting, and that if a bargain can be reached costlessly, then there is some set of side payments where the rancher and the farmer are both better off than having the crops eaten or the cattle fenced. Further, it doesn’t matter whether you give grazing rights to the cattle and force the farmer to pay for the “right” to fence and grow crops, or whether you give farming rights and force the rancher to pay for the right to roam his cattle.
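To make the bargaining logic concrete, here is a toy version with numbers of my own (not Coase’s): suppose letting the herd roam is worth $100 to the rancher and the roaming destroys $60 of crops. Since 100 exceeds 60, the efficient outcome is for the cattle to roam. If the farmer holds the right to be free of damage, the rancher pays him any amount between $60 and $100 for permission and both are better off; if the rancher holds the right to roam, the farmer will not pay the $100 or more it would take to stop him, so the cattle roam anyway. Flip the numbers so that the crop damage is $150 and the cattle end up fenced under either assignment of rights. With costless bargaining, only the distribution of the surplus, not the outcome, depends on who holds the right.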

This basic principle applies widely in law, where Coase had his largest impact. He cites a case where confectioner machines shake a doctor’s office, making it impossible for the doctor to perform certain examinations. The court restricts the ability of the confectioner to use the machine. But Coase points out that if the value of the machine to the confectioner exceeds the harm of shaking to the doctor, then there is scope for a mutually beneficial side payment whereby the machine is used (at some level) and one or the other is compensated. A very powerful idea indeed.

Powerful, but widely misunderstood. I deliberately did not mention property rights above. Coase is often misunderstood (and, to be fair, he does at points in the essay imply this misunderstanding) as saying that property rights are important, because once we have property rights, we have something that can “be priced” when bargaining. Hence property rights + externalities + no transaction costs should lead to no inefficiency if side payments can be made. Dan Usher famously argued that this is “either tautological, incoherent, or wrong”. Costless bargaining is efficient tautologically; if I assume people can agree on socially efficient bargains, then of course they will. The fact that side payments can be agreed upon is true even when there are no property rights at all. Coase says that “[i]t is necessary to know whether the damaging business is liable or not for damage since without the establishment of this initial delimitation of rights there can be no market transactions to transfer and recombine them.” Usher is correct: that statement is wrong. In the absence of property rights, a bargain establishes a contract between parties with novel rights that needn’t exist ex-ante.

But all is not lost for Coase, because the real point of his paper begins with Section VI, where he notes that the case without transaction costs isn’t the interesting one. The interesting case is when transaction costs make bargaining difficult. What you should take from Coase is that social efficiency can be enhanced by institutions (including the firm!) which allow socially efficient bargains to be reached by removing restrictive transaction costs, and particularly that the assignment of property rights to different parties can either help or hinder those institutions. One more thing to keep in mind about the Coase Theorem (which Samuelson famously argued was not a theorem at all…): Coase implicitly is referring to Pareto efficiency in his theorem, but since property rights are an endowment, we know from the Welfare Theorems that “benefits exceed costs” is not sufficient for maximizing social welfare.

Let’s now consider the Coase Conjecture: this conjecture comes, I believe, from a very short 1972 paper, Durability and Monopoly. The idea is simple and clever. Let a monopolist own all of the land in the US. If there were a competitive market in land, the price per unit would be P and all Q units would be sold. Surely a monopolist will sell a reduced quantity Q2 less than Q at price P2 greater than P? But once those are sold, we are in trouble, since the monopolist still has Q-Q2 units of land. Unless the monopolist can commit to never sell that additional land, we all realize he will try to sell it sometime later, at a new maximizing price P3 which is greater than P but less than P2. He then still has some land left over, which he will sell even cheaper in the next period. Hence, why should anyone buy in the first period, knowing the price will fall (and note that a seller who discounts the future has the incentive to make the length between periods of price cutting arbitrarily short)? The monopolist with a durable good is thus unable to make rents. Now, Coase essentially never uses mathematical theorems in his papers, and you game theorists surely can see that there are many auxiliary assumptions about beliefs and the like running in the background here.

Luckily, given the importance of this conjecture to pricing strategies, antitrust, auctions, etc., there has been a ton of work on the problem since 1972. Nancy Stokey (article gated) has a famous paper written here at MEDS showing that the conjecture only holds strictly when the seller is capable of selling in continuous time and the buyers are updating beliefs continuously, though approximate versions of the conjecture hold when periods are discrete. Gul, Sonnenschein and Wilson flesh out the model more completely, generally showing the conjecture to hold in well-defined stationary equilibrium across various assumptions about the demand curve. McAfee and Wiseman show in a recent ReStud that even the tiniest amount of “capacity cost”, or a fee that must be paid in any period for X amount of capacity (i.e., the need to hire sales agents for the land), destroys the Coase reasoning. The idea is that in the final few periods, when I am selling to very few people, even a small capacity cost is large relative to the size of the market, so I won’t pay it; backward inducting, then, agents in previous periods know it is not necessarily worthwhile to wait, and hence they buy earlier at the higher price. It goes without saying that there are many more papers in the formal literature.

(Some final notes: Coase’s Nobel lecture is well worth reading, as it summarizes the most important thread in his work: “there [are] costs of using the pricing mechanism.” It is these costs that explain why, though markets in general have such amazing features, even in capitalist countries there are large firms run internally as something resembling a command state. McCloskey has a nice brief article which generally blames Stigler for the misunderstanding of Coase’s work. Also, while gathering some PDFs for this article, I was shocked to see that Ithaka, who run JSTOR, is now filing DMCA takedowns with Google against people who host some of these legendary papers (like “Problem of Social Cost”) on their academic websites. What ridiculousness from a non-profit that claims its mission is to “help the academic community use digital technologies to preserve the scholarly record.”)

“A Penny for your Quotes: Patent Citations and the Value of Innovation,” M. Trajtenberg (1990)

This is one of those classic papers where the result is so well-known I’d never bothered to actually look through the paper itself. Manuel Trajtenberg, in the late 1980s, wrote a great book about Computed Tomography, or CAT scans. He gathered exhaustive data on sales by year to hospitals across the US, the products/attributes available at any time, and the prices paid. Using some older results from economic theory, a discrete choice model can be applied to infer willingness-to-pay for various types of CAT scanners over time, and from there to infer the total social surplus being generated at any time. Even better, Trajtenberg was able to calculate the lifetime discounted value of innovations occurring during any given period by looking at the eventual diffusion path of those technologies; that is, if a representative consumer is willing to pay Y in 1981 for CAT scanner C, and the CAT scanner diffused to 50 percent market share over the next five years, we can integrate the willingness to pay over the diffusion curve to get a rough estimate of the social surplus generated. CAT innovations during their heyday (roughly the 1970s, before MRI began to diffuse) generated about 17 billion dollars of surplus in 1982 dollars.
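As a rough illustration of the “integrate willingness-to-pay over the diffusion curve” step, here is a toy calculation with made-up numbers (mine, not Trajtenberg’s data): a quality improvement worth some extra amount per adopting hospital, a logistic diffusion path topping out at half the market, and discounting back to the year of introduction.

```python
import numpy as np

# All numbers below are hypothetical, for illustration only.
wtp_per_adopter = 400.0      # incremental willingness-to-pay per adopting hospital
market_size = 5000           # number of hospitals
ceiling = 0.5                # long-run diffusion share of the improvement
speed, midpoint = 1.0, 3.0   # logistic diffusion parameters
discount = 0.05              # annual discount rate

years = np.arange(0, 15)
# Cumulative adoption share along a logistic diffusion curve.
share = ceiling / (1 + np.exp(-speed * (years - midpoint)))
# Hospitals adopting in each year.
new_adopters = market_size * np.diff(share, prepend=0.0)
# Discounted sum of willingness-to-pay over the diffusion path.
surplus = np.sum(new_adopters * wtp_per_adopter / (1 + discount) ** years)

print("discounted surplus attributed to this period's innovation:", round(surplus))
```

Trajtenberg’s actual exercise recovers the willingness-to-pay from a discrete choice model over scanner attributes rather than assuming it, but the bookkeeping from willingness-to-pay plus diffusion to lifetime surplus has this flavor.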

That alone is interesting, but Trajtenberg takes this fact one step further. There has long been a debate about whether patent citations tell you much about actual innovation. We know from a variety of sources that most important inventions are not patented, that many low-quality inventions of little social value are patented, and that patents are used in enormously different ways depending on market structure. Since Trajtenberg has an actual measure of social welfare created by newly-introduced products in each period, and a measure of industry R&D in each period, and a measure counting patents issued in CT in each period (nearly 500 in total), he can actually check: is patenting activity actually correlated with socially beneficial innovation?

The answer, it turns out, is no. A count of patents, at any reasonable lag and any restriction to “core” CT firms or otherwise, never has a correlation with the change in total social value of more than .13. On the other hand, patents lagged five months have a correlation of .933 with industry R&D. No surprise: R&D appears to buy patents at a pretty constant rate, but not to buy important breakthroughs. This doesn’t, however, mean patent data is worthless to the analyst. Instead of looking at patents, we can look at citation-weighted patents. A patent that gets cited 10 times is surely more important than one which is issued and never heard from again. Weighting patents by citation count, the correlation between the number of weighted patents (lagged a few months to give products time to reach the market) and total social welfare created is in the area of .75! This result has been confirmed many, many, many times since Trajtenberg’s paper. Harhoff et al (1999) found, using survey data, that each single patent citation for highly-cited patents is a signal that the patent has an additional private value of a million US dollars. Hall, Jaffe and Trajtenberg (2005) found, using Tobin’s Q on stock market data and holding firm R&D and the total number of patents constant, that an additional patent citation improves firm value by an average of 3%.

Final 1990 RAND copy (IDEAS page).

“Price Formation of Fish,” A.P. Barten & L.J. Bettendorf (1989)

I came across this nice piece of IO in a recent methodological book by John Sutton, which I hope to cover soon. Sutton recalls Lionel Robbins’ famous Essay on the Nature and Significance of Economic Science. In that essay, Robbins claims the goal of the empirically-minded economist is to estimate stable (what we now call “structural”) parameters whose stability we know a priori from theory. (As an aside, it is tragic that Hurwicz’ 1962 “On the Structural Form of Interdependent Systems”, from which Robbins’ idea gets its modern treatment, is not freely available online; all I see is a snippet from the conference volume it appeared in here). Robbins gives the example of an empiricist trying to estimate the demand for haddock by measuring prices and quantities each day, controlling for weather and the like, and claiming that the average elasticity has some long-run meaning; this, he says, is a fool’s errand.

Sutton points out how interesting that example is: if anything, fish are an easy good to examine! They are a good with easy-to-define technical characteristics sold in competitive wholesale markets. Barten and Bettendorf point out another interesting property: fish are best described by an inverse demand system, where consumers determine the price paid as a function of the quantity of fish in the market rather than vice versa, since quantity in the short run is essentially fixed. To the theorist, there is no difference between demand and inverse demand, but to the empiricist, that little error term must be added to the exogenous variables if we are to handle statistical variation correctly. Any IO economist worth their salt knows how to estimate common demand systems like AIDS, but how should we interpret parameters in inverse demand systems?

Recall that, in theory, Marshallian demand is a homogeneous of degree zero function of total expenditure and prices. Using the homogeneity, we have that the vector of quantities demanded q is a function of P, the fraction of total expenditure paid for each unit of each good. Inverting that function gives P as a function of q. Since inverse demand is the result of a first-order condition from utility maximization, we can restate P as a function of marginal utilities and quantities. Taking the derivative of P, with some judicious algebra, one can state the (normalized) inverse demand as the sum of moves along an indifference surface and moves across indifference surfaces; in particular, dP=gP'dq+Gdq, where g is a scalar and G is an analogue of the Slutsky matrix for inverse demand, symmetric and negative semidefinite. All we need to do now is difference our data and estimate that system (although the authors do a bit more judicious algebra to simplify the computational estimation).

One more subtle step is required. When we estimate an inverse demand system, we may wish to know how substitutable or complementary any two goods are. Further, we want such an estimate to be invariant to arbitrary monotone increasing changes in an underlying utility function (the form of which is not assumed here). It turns out that Allais (in his 1943 text on “pure economics” which, as far as I know, is yet to be translated!) has shown how to construct just such a measure. Yet another win for theory, and for Robbins’ intuition: it is hopeless to estimate cross-price elasticities or similar measures of substitutability atheoretically, since these parameters are determined simultaneously. It is only as a result of theory (here, nothing more than “demand comes from utility maximizers” is used) that we can even hope to tease out underlying parameters like these elasticities. The huge number of “reduced-form” economists these days who do not understand what the problem is here really need to read through papers of this type; atheoretical training is, in my view, a serious danger to the grand progress made by economics since Haavelmo and Samuelson.

It is the methodology that is important here; the actual estimates are secondary. But let’s state them anyway: the fish sold in the Belgian markets are quite own-price elastic, have elasticities that are consistent with utility-maximizing consumers, and have patterns of cross-price elasticities across fish varieties that are qualitatively reasonable (bottom-feeders are highly substitutable with each other, etc.) and fairly constant across a period of two decades.

Final version in EER (No IDEAS version). This paper was in the European Economic Review, an Elsevier journal that is quickly being killed off since the European Economic Association pulled out of their association with Elsevier to run their own journal, the JEEA. The editors of the main journal in environmental economics have recently made the same type of switch, and of course, a group of eminent theorists made a similar exit when Theoretical Economics began. Jeff Ely has recently described how TE came about; that example makes it quite clear that journals are actually quite inexpensive to run. Even though we economists are lucky to have nearly 100% “green” open access, where preprints are self-archived by authors, we still have lots of work to do to get to a properly ungated world. The Econometric Society, for example, spends about $900,000 for all of its activities aside from physically printing journals, a cost that could still be recouped in an open access world. Much of that is for running conferences, giving honoraria, etc, but let us be very conservative and estimate no income is received aside from subscriptions to its three journals, including archives. This suggests that a complete open access journal and archives for the 50 most important journals in the field requires, very conservatively, revenue of $15 million per year, and probably much less. This seems a much more effective use of NSF and EU moneys than funding a few more graduate research assistants.

“Without Consent or Contract,” R. W. Fogel (1989)

Word comes that Bob Fogel, an absolute giant in economic history and a Nobel Prize winner, passed away today. I first encountered Fogel in a class a decade or so ago taught by Robert Margo, another legendary scholar of the economics of American history.

Fogel’s most famous contribution is summarized in the foreword to the very readable Without Consent or Contract. “Although the slave system was horribly retrogressive in its social, political, and ideological aspects, it was quite advanced by the standards of the time in its technology and economic organization. The paradox is only apparent…because the paradox rests on the widely held assumption that technological efficiency is inherently good. It is this beguiling assumption that is false and, when applied to slavery, insidious.”

Roughly, it was political change alone, not economic change, which could have led to the end of slavery in America. The plantation system was, in fact, a fairly efficient system in the economic sense, and was not in danger of petering out of its own accord. Evidence on this point was laid out in technical detail in Fogel and Engerman’s “Time on the Cross”. In that text, evidence from an enormous number of sources is brought to bear on the value of a slave over time; McCloskey has called Fogel “a carpenter of history…measure, measure again, measure again.” The idea that the economic effects of history can be (and are) wildly different from the moral or political effects remains misunderstood; Melissa Dell’s wonderful paper on the Peruvian mita is a great example of a terrible social policy which nonetheless had positive long-run economic effects. As historians disdain “Whig history”, the idea that things improve as time marches on, economists ought disdain “Whig economics”, the idea that growth-inducing policies are somehow linked to moral ones.

There is much beyond the slavery research, of course. In one of the most famous papers in economic history, Fogel studied the contribution of the railroad to American economic growth (Google has this at only 86 citations; how is such a low number possible?). He notes that, as economists, we should care about the marginal benefit, not the absolute benefit, of the railroad. In the absence of rail, steamboats and canals were still possible (and would likely have been built in the midwest). He famously claims that the US would have reached its income in January 1890 by the end of March 1890 had there been no rail at all, a statement very much contrary to traditional historical thinking.

Fogel’s later life was largely devoted to his project on the importance of improved nutrition and its interaction with economic growth, particularly since the 1700s. If you’ve not seen these statistics, it is amazing just how short and skinny the average human was before the modern era. There has been an enormous debate over the relative role of nutrition, vis-a-vis technologies, knowledge like germ theory, or embodied or diffused knowledge, in the increased stature of man: Angus Deaton summarizes the literature nicely. In particular, my read is that the thesis whereby better nutrition causes a great rise in human incomes is on fairly shaky ground, though the debate is by no means settled.

Amazon has Without Consent or Contract for sale for under 15 bucks, well worth it. Some quick notes: Fogel was by no means a lone voice in cliometrics; for example, Conrad and Meyer in a 1958 JPE make very much the same point as Fogel concerning the economic success of slavery, using tools from capital theory in the argument. Concerning the railroad, modern work suggests Fogel may have understated its importance. Donaldson and Hornbeck, two of the best young economic historians in the world, use some developments in modern trade theory to argue that increased market access due to rail, measured by how market access is capitalized into farmland values, was far more important to GDP growth than Fogel suggested.

Paul Samuelson’s Contributions to Welfare Economics, K. Arrow (1983)

I happened to come across a copy of a book entitled “Paul Samuelson and Modern Economic Theory” when browsing the library stacks recently. Clear evidence of his incredible breadth is in the section titles: Arrow writes about his work on social welfare, Houthakker on consumption theory, Patinkin on money, Tobin on fiscal policy, Merton on financial economics, and so on. Arrow’s chapter on welfare economics was particularly interesting. This book comes from the early 80s, which is roughly the end of social welfare as a major field of study in economics. I was never totally clear on the reason for this – is it simply that Arrow’s Possibility Theorem, Sen’s Liberal Paradox, and the Gibbard-Satterthwaite Theorem were so devastating to any hope of “general” social choice rules?

In any case, social welfare is today little studied, but Arrow mentions a number of interesting results which really ought be better known. Bergson-Samuelson, conceived when the two were in graduate school together, is rightfully famous. After a long interlude of confused utilitarianism, Pareto had us all convinced that we should dismiss cardinal utility and interpersonal utility comparisons. This seems to suggest that all we can say about social welfare is that we should select a Pareto-optimal state. Bergson and Samuelson were unhappy with this – we suggest individuals should have preferences which represent an order (complete and transitive) over states, and the old utilitarians had a rule which imposed a real number for society’s value of any state (hence an order). Being able to order states from a social point of view seems necessary if we are to make decisions. Some attempts to extend Pareto did not give us an order. (Why is an order important? Arrow does not discuss this, but consider earlier attempts at extending Pareto like Kaldor-Hicks efficiency: going from state s to state s’ is KH-efficient if there exist ex-post transfers under which the change is Paretian. Let person a value the bundle (1,1)>(2,0)>(1,0)>all else, and person b value the bundle (1,1)>(0,2)>(0,1)>all else. In state s, person a is allocated (2,0) and person b (0,1). In state s’, person a is allocated (1,0) and person b is allocated (0,2). Note that going from s to s’ is a Kaldor-Hicks improvement, but going from s’ to s is also a Kaldor-Hicks improvement!)
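To see the cycle explicitly (my own bookkeeping of the example above, not something spelled out in Arrow’s chapter): moving from s to s’, the available goods are (1,0)+(0,2)=(1,2); reallocate them as (1,1) to person a and (0,1) to person b, and a is strictly better off than at s while b is no worse off. Moving from s’ back to s, the available goods are (2,0)+(0,1)=(2,1); reallocate them as (1,0) to a and (1,1) to b, and now b is strictly better off than at s’ while a is no worse off. Each state therefore Kaldor-Hicks dominates the other, so the Kaldor-Hicks criterion cannot give us an order over states.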

Bergson and Samuelson wanted to respect individual preferences – society can’t prefer s to s’ if s’ is a Pareto improvement on s in the individual preference relations. Take the relation RU. We will say that sRUs’ if all individuals weakly prefer s to s’. Note that though RU is not complete, it is transitive. Here’s the great, and non-obvious, trick. The Polish mathematician Szpilrajn proved in 1930 that if R is a transitive relation, then there exists a complete relation R2 which extends R; that is, if sRs’ then sR2s’, plus we complete the relation by adding some more elements. This is not a terribly easy proof, it turns out. That is, there exist social welfare orders which are entirely ordinal and which respect Pareto dominance. Of course, there may be lots of them, and which you pick is a problem of philosophy more than economics, but they exist nonetheless. Note why Arrow’s theorem doesn’t apply: we are starting with given sets of preferences and constructing a social preference, rather than attempting to find a rule that maps any individual preferences into a social rule. There have been many papers arguing that this difference doesn’t matter, so all I can say is that Arrow himself, in this very essay, accepts that difference completely. (One more sidenote here: if you wish to start with individual utility functions, we can still do everything in an ordinal way. It is not obvious that every indifference map can be mapped to a utility function, and it is not even true without some type of continuity assumption, especially if we want the utility functions themselves to be continuous. A nice proof of how we can do so using a trick from probability theory is in Neuefeind’s 1972 paper, which was followed up in more generality by Mount and Reiter here at MEDS and then by Chichilnisky in a series of papers. Now just sum up these mapped individual utilities, and I have a Paretian social utility function which was constructed entirely in an ordinal fashion.)

Now, this Bergson-Samuelson seems pretty unusable. What do we learn that we don’t know from a naive Pareto property? Here are two great insights. First, choose any social welfare function from the set we have constructed above. Let individuals have non-identical utility functions. In general, there is no social welfare function which is maximized by always keeping every individual’s income identical in all states of the world! The proof of this is very easy if we use Harsanyi’s extension of Bergson-Samuelson: if agents are Expected Utility maximizers, then any B-S social welfare function can be written as a weighted linear combination of individual utility functions. As relative prices or the social production possibilities frontier changes, the weights are constant, but the individual marginal utilities are (generically) not. Hence if it was socially optimal to endow everybody with equal income before the relative price change, it (generically) is not later, no matter which Pareto-respecting measure of social welfare your society chooses to use! That is, I think, an astounding result for naive egalitarianism.
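Compressed into symbols (my notation, not Arrow’s): write the Harsanyi form of the welfare function and maximize it over income allocations for a fixed total M,

\[
W = \sum_i \lambda_i u_i(m_i), \qquad
\max_{\{m_i\}} W \;\text{ s.t. } \sum_i m_i = M
\;\;\Longrightarrow\;\;
\lambda_i u_i'(m_i) = \mu \;\text{ for all } i,
\]

so weighted marginal utilities of income, not incomes themselves, are equalized. With heterogeneous utility functions an equal split satisfies this condition only by coincidence, and since the relevant u_i are indirect utilities whose marginal value of income moves with relative prices, a price change generically breaks any such coincidence.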

Here’s a second one. Surely any good economist knows policies should be evaluated according to cost-benefit analysis. If, for instance, the summed willingness-to-pay for a public good exceeds the cost of the public good, then society should buy it. When, however, does a B-S social welfare function allow us to make such an inference? Generically, such an inference is only possible if the distribution of income is itself socially optimal, since willingness-to-pay depends on the individual budget constraints. Indeed, even if demand estimation or survey evidence suggests that there is very little willingness-to-pay for a public good, society may wish to purchase the good. This is true even if the underlying basis for choosing the particular social welfare function we use has nothing at all to do with equity, and further since the B-S social welfare function respects individual preferences via the Paretian criterion, the reason we build the public good also has nothing to do with paternalism. Results of this type are just absolutely fundamental to policy analysis, and are not at all made irrelevant by the impossibility results which followed Arrow’s theorem.
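The wedge is easy to see in symbols (again my own notation): for a public good G with per-unit cost c, the planner’s test and the naive cost-benefit test are

\[
\sum_i \lambda_i \frac{\partial u_i}{\partial G} \;\gtrless\; \mu c
\qquad\text{versus}\qquad
\sum_i \underbrace{\frac{\partial u_i/\partial G}{\partial u_i/\partial m_i}}_{\text{willingness to pay of } i} \;\gtrless\; c,
\]

where μ is the social marginal value of a dollar. The two tests agree exactly when λ_i times each person’s marginal utility of income equals μ for every person, which is precisely the condition that the existing income distribution is optimal under the chosen welfare function; otherwise summed willingness-to-pay can point in the wrong direction.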

This is a book chapter, so I’m afraid I don’t have an online version. The book is here. Arrow is amazingly still publishing at the age of 91; he had an interesting article with the underrated Partha Dasgupta in the EJ a couple years back. In surveys, people claim that relative consumption a la Veblen matters to them, yet it is hard to find such effects in the data. Why is this? Assume I wish to keep up with the Joneses when I move to a richer place. If I increase consumption today, I am decreasing savings, which decreases consumption even more tomorrow. How much I want to change my consumption today when I have richer peers therefore depends on that dynamic tradeoff, which Arrow and Dasgupta completely characterize.

“Returns to Scale in Research & Development: What Does the Schumpeterian Hypothesis Imply?,” F. Fisher & P. Temin (1973)

Schumpeter famously argued for the economic importance of market power. Even though large firms cause static inefficiency, they have dynamic benefits: large firms demand more invention, since they can extract more revenue from each new product, and they supply more invention, Schumpeter hypothesized, since the rate of invention has increasing returns to scale in the number of inventors and in the number of other employees at the firm (call these Axioms A and B). The second part of that statement may hold for many reasons; for instance, if the output of a research project could be many potential products, a larger firm has the ability to capitalize on many of those new projects, whereas a small firm might have more limited complementary capabilities. Often, this hypothesis has been tested by checking whether larger firms are more research intensive, meaning that larger firms have a higher percentage of their workforce doing research (Hypothesis 1). Alternatively, a direct reading of Schumpeter is that a 1% increase in the non-research staff of a firm leads to a more than 1% increase in the total R&D output of the firm, where output is just the number of research workers times each worker’s average output as a function of firm size (Hypothesis 2).

And here is where theory comes into play. Are axioms A and B necessary or sufficient for either hypothesis 1 or 2? If they don’t imply hypothesis 1, then the idea of testing the Schumpeterian axioms about increasing returns to scale by examining researcher employment is wrong-headed. If they don’t imply hypothesis 2, then Schumpeter’s qualitative argument is incomplete in the first place. Fisher and Temin (that’s Franklin Fisher and Peter Temin, two guys who, it goes without saying, have had quite some careers since they wrote this paper in the early 70s!) show that, in fact, for both hypotheses the axioms are neither necessary nor sufficient.

An even more basic problem wasn’t noticed by Fisher and Temin, but instead was pointed out by Carlos Rodriguez in a 1979 comment. If Axiom A holds, so that the average product per researcher is increasing in the number of researchers, then marginal product always exceeds average product. If market equilibrium means I pay all research workers their marginal product, then I will be making a loss if I operate at the “optimal” quantity. Hence I will hire no research workers at all. So step one in interpreting Schumpeter, then, is to restate his two axioms. A weaker condition might be that if the number of research workers and the number of nonresearch workers increase at the same rate, then average product per research worker is increasing. This is implied by Axioms A and B, but doesn’t rely on an always-increasing average product per research worker (call this Axiom C). This is good for checking our two hypotheses, since anything that would have been implied by Axioms A and B is still implied by our more theoretically-grounded Axiom C.
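Rodriguez’s point in one line (my restatement): if the average product per researcher F is increasing in R, then the marginal product of a researcher is

\[
\frac{\partial}{\partial R}\big[R\,F(R,S)\big] \;=\; F + R\,\frac{\partial F}{\partial R} \;>\; F,
\]

so paying every researcher that marginal product means a wage bill strictly larger than total research output R times F, and the research operation loses money at any positive scale.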

So what does our axiom imply about the link between research staff size and firm size? Unsurprisingly, nothing at all! Surely the optimal quantity of research workers depends on the marginal product of additional research workers as firm size grows, and not on the average product of those workers. Let’s prove it. Let F(R,S) be the average product per research worker as a function of R, the number of researchers, and S, the number of other employees at the firm. I hire research workers as long as their marginal product exceeds the researcher wage rate. The marginal product of total research output is the derivative of R*F(R,S) with respect to R, or F+R*dF/dR. As S increases, this marginal product goes up if and only if dF/dS+R*d^2F/dRdS>0. That is, I hire more research workers in equilibrium if my non-research staff is bigger according to a condition that depends on the second derivative of the average output per researcher. But my axioms had only to do with the first derivative! Further, if dF/dS+R*d^2F/dRdS>0, then larger firms have a larger absolute number of scientists than smaller firms, but this implication is completely independent of the Schumpeterian axioms. What’s worse, even that stronger assumption involving the second derivative does not imply anything about the share of research workers on the staff.
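Written out (same notation as in the paragraph above), the comparative static that matters is

\[
\frac{\partial}{\partial S}\!\left[\frac{\partial\,(R\,F)}{\partial R}\right]
= \frac{\partial F}{\partial S} + R\,\frac{\partial^2 F}{\partial R\,\partial S},
\]

a second-derivative object about which Axioms A, B, and C (all statements about levels or first derivatives of the average product) say nothing.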

The moral is the same one you were probably taught your first day of economics class: using reasoning about averages to talk about equilibrium behavior, which depends on marginals, can lead you astray very quickly!

1971 working paper; the final version was published in JPE 1973 (IDEAS). Related to the comment by Rodriguez, Fisher and Temin point out here that the problem with increasing returns to scale does not ruin their general intuition, for the reasons I stated above. What about the empirics of Schumpeter’s prediction? Broadly, there is not much support for a link between firm size and research intensity, though the literature on this is quite contentious. Perhaps I will cover it in another post.
