
“The Limits of Price Discrimination,” D. Bergemann, B. Brooks and S. Morris (2013)

Rakesh Vohra, who much to the regret of many of us at MEDS has recently moved on to a new and prestigious position, pointed out a clever paper today by Bergemann, Brooks and Morris (the first and third names you surely know, the second is a theorist on this year’s market). Beyond some clever uses of linear algebra in the proofs, the results of the paper are in and of themselves very interesting. The question is the following: if a regulator, or a third party, can segment consumers by willingness-to-pay and provide that information to a monopolist, what are the effects on welfare and profits?

In a limited sense, this is an old question. Monopolies generate deadweight loss as they sell at a price above marginal cost. Monopolies that can perfectly price discriminate remove that deadweight loss but also steal all of the consumer surplus. Depending on your social welfare function, this may be a good or bad thing. When markets can be segmented (i.e., third degree price discrimination) with no chance of arbitrage, we know that monopolist profits are weakly higher since the uniform monopoly price could be maintained in both markets, but the effect on consumer surplus is ambiguous.

Bergemann et al provide two really interesting results. First, if you can choose the segmentation, it is always possible to segment consumers such that monopoly profits are just the profits gained under the uniform price, but quantity sold is nonetheless efficient. Further, there exist segmentations such that producer surplus P is anything between the uniform price profit P* and the perfect price discrimination profit P**, and such that producer plus consumer surplus P+C is anything between P* and P**! This seems like magic, but the method is actually pretty intuitive.

Let’s generate the first case, where producer profit is the uniform price profit P* and consumer surplus is maximal, C=P**-P*. In any segmentation, the monopolist can always charge P* to every segment. So if we want consumers to capture all of the surplus, there can’t be “too many” high-value consumers in any one segment, since otherwise the monopolist would raise its price above P*. Let there be 3 consumer types, with the total market uniformly distributed across the three, such that valuations are 1, 2 and 3. Let marginal cost be constant at zero. The profit-maximizing uniform price is 2, earning the monopolist 2*(2/3)=4/3. But what if we tell the monopolist whether each consumer belongs to Class A or Class B? Class A consists of all consumers with willingness-to-pay 1 and exactly enough consumers with WTP 2 and 3 that the monopolist is just indifferent between choosing price 1 or price 2 for Class A. Class B consists of the rest of the type 2 and 3 consumers (and since the relative proportion of types 2 and 3 in this class is the same as in the market as a whole, where we already know the profit-maximizing price is 2 with only types 2 and 3 buying, the profit-maximizing price remains 2 here). Some quick algebra shows that if Class A consists of all of the WTP 1 consumers and exactly half of the WTP 2 and WTP 3 consumers, then the monopolist is indifferent between charging 1 and 2 to Class A (and so can be taken to charge 1), and charges 2 to Class B. Therefore, it is an equilibrium for all consumers to buy the good, for the monopolist to earn uniform price profits P*, and for consumer surplus to be maximized. The paper formally proves that this intuition holds for general assumptions about (possibly continuous) consumer valuations.
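To see that the “quick algebra” checks out, here is a minimal Python sketch (using the numbers from this example, not anything taken from the paper itself) that computes the monopolist’s revenue from each candidate price in the whole market and in each class:

```python
from fractions import Fraction as F

def revenue(price, segment):
    """Revenue from charging a single price to a segment given as {value: mass}:
    only consumers whose value is at least the price buy."""
    return price * sum(mass for value, mass in segment.items() if value >= price)

market  = {1: F(1, 3), 2: F(1, 3), 3: F(1, 3)}   # the whole market
class_a = {1: F(1, 3), 2: F(1, 6), 3: F(1, 6)}   # all the 1s, half the 2s and 3s
class_b = {2: F(1, 6), 3: F(1, 6)}               # the remaining 2s and 3s

# Uniform monopoly: price 2 is optimal, earning 4/3 (prices 1 and 3 earn only 1).
print({p: revenue(p, market) for p in (1, 2, 3)})

# Class A: prices 1 and 2 both earn 2/3, so the monopolist can be taken to
# charge 1 and serve everyone in the segment.
print({p: revenue(p, class_a) for p in (1, 2, 3)})

# Class B: price 2 is strictly optimal, earning 2/3.
print({p: revenue(p, class_b) for p in (2, 3)})

# Total profit 2/3 + 2/3 = 4/3, the uniform monopoly profit, yet every consumer
# buys, so consumers keep all of the remaining surplus.
```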

The other two “corner cases” for bundles of consumer and producer surplus are also easy to construct. Maximal producer surplus P** with consumer surplus 0 is simply the case of perfect price discrimination: the producer knows every consumer’s exact willingness-to-pay. Uniform price producer surplus P* and consumer surplus 0 is constructed by mixing the very low WTP consumers with all of the very high types (along with some subset of consumers with less extreme valuations), such that the monopolist is indifferent between charging the monopolist price or just charging the high type price so that everyone below the high type does not buy. Then mix the next highest WTP types with low but not quite as low WTP types, and continue iteratively. A simple argument based on a property of convex sets allows mixtures of P and C outside the corner cases; Rakesh has provided an even more intuitive proof than that given in the paper.

Now how do we use this result in policy? At a first pass, since information is always (weakly) good for the seller and ambiguous for the consumer, a policymaker should be particularly worried about bundlers providing information about willingness-to-pay that is expected to drastically lower consumer surplus while only improving rent extraction by sellers a small bit. More work needs to be done in specific cases, but the mathematical setup in this paper provides a very straightforward path for such applied analysis. It seems intuitive that precise information about consumers with willingness-to-pay below the monopoly price is unambiguously good for welfare, whereas information bundles that contain a lot of high WTP consumers but also a relatively large number of lower WTP consumers will lower total quantity sold and hence social surplus.

I am also curious about the limits of price discrimination in the oligopoly case. In general, the ability to price discriminate (even perfectly!) can be very good for consumers under oligopoly. The intuition is that under uniform pricing, I trade-off stealing your buyers by lowering prices against earning less from my current buyers; the ability to price discriminate allows me to target your buyers without worrying about the effect on my own current buyers, hence the reaction curves are steeper, hence consumer surplus tends to increase (see Section 7 of Mark Armstrong’s review of the price discrimination literature). With arbitrary third degree price discrimination, however, I imagine mathematics similar to that in the present paper could prove similarly elucidating.

2013 Working Paper (IDEAS version).

“On the Origin of States: Stationary Bandits and Taxation in Eastern Congo,” R. S. de la Sierra (2013)

The job market is yet again in full swing. I won’t be able to catch as many talks this year as I would like to, but I still want to point out a handful of papers that I consider particularly elucidating. This article, by Columbia’s de la Sierra, absolutely fits that category.

The essential question is, why do states form? Would that all young economists interested in development put their effort toward such grand questions! The old Rousseauian idea you learned your first year of college, where individuals come together voluntarily for mutual benefit, seems contrary to lots of historical evidence. Instead, war appears to be a prime mover for state formation; armed groups establish a so-called “monopoly on violence” in an area for a variety of reasons, and proto-state institutions evolve. This basic idea is widespread in the literature, but it is still not clear which conditions within an area lead armed groups to settle rather than to pillage. Further, examining these ideas empirically seems quite problematic, for two reasons: first, because states themselves are the ones who collect data, so we rarely observe anything before states have formed; and second, because most of the planet has long since been under the rule of a state (with apologies to James Scott!).

De la Sierra brings some economics to this problem. What is the difference between pillaging and sustained state-like forms? The pillager can only extract assets on its way through, while the proto-state can establish “taxes”. What taxes will it establish? If the goal is long-run revenue maximization, Ramsey long ago told us that it is optimal to tax elements that are inelastic. If labor can flee, but the output of the mine can not, then you ought tax the output of the mine highly and set a low poll tax. If labor supply is inelastic but output can be hidden from the taxman, then use a high poll tax. Thus, when will bandits form a state instead of just pillaging? When there is a factor which can be dynamically taxed at such a rate that the discounted tax revenue exceeds what can be pillaged today. Note that the ability to, say, restrict movement along roads, or to expand output through state-owned capital, changes the relevant tax elasticities, so at a more fundamental level, rebel capacity along these margins is also important (and I imagine that extending de la Sierra’s paper will involve the evolutionary development of these types of capacities).
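To fix ideas, here is a toy numerical version of that tradeoff (my own stylization with made-up numbers, not de la Sierra’s model): a group compares a one-time pillage value against a discounted stream of tax revenue, where the taxable base shrinks with the elasticity of whatever is being taxed and with the chance of losing control of the area.

```python
# Toy comparison of pillaging vs. taxing like a proto-state. All numbers are
# hypothetical; the point is only the comparison of a one-shot grab against a
# discounted stream whose size depends on how elastic each tax base is.

def discounted_tax_revenue(tax_rate, base, elasticity, survival, discount, horizon=50):
    """Per-period revenue = rate * base * (1 - elasticity * rate): a higher
    elasticity means more of the base is hidden or flees as the rate rises.
    'survival' is the per-period chance the armed group keeps control."""
    per_period = tax_rate * base * max(0.0, 1 - elasticity * tax_rate)
    return sum((survival * discount) ** t * per_period for t in range(horizon))

pillage_value = 60.0   # what can be carried off today
discount = 0.95

# Coltan-like base: output is bulky and hard to hide, so taxable output is inelastic.
coltan = discounted_tax_revenue(tax_rate=0.5, base=20.0, elasticity=0.2,
                                survival=0.95, discount=discount)

# Gold-like base: output is easy to hide, so an output tax collapses the base;
# the group would have to fall back on a (smaller) poll tax instead.
gold = discounted_tax_revenue(tax_rate=0.5, base=20.0, elasticity=1.8,
                              survival=0.95, discount=discount)

print(f"coltan-style taxation: {coltan:.1f} vs pillage {pillage_value}")  # settle and tax
print(f"gold-style taxation:   {gold:.1f} vs pillage {pillage_value}")    # pillage instead
```

A government crackdown works through the `survival` parameter in this sketch: fewer expected years of control pushes the comparison back toward pillaging, which is the comparative static the empirical section exploits.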

This is really an important idea. It is not that there is a tradeoff between producing and pillaging. Instead, there is a three way tradeoff between producing in your home village, joining an armed group to pillage, and joining an armed group that taxes like a state! The armed group that taxes will, as a result of its desire to increase tax revenue, perhaps introduce institutions that increase production in the area under its control. And to the extent that institutions persist, short-run changes that cause potential bandits to form taxing relationships may actually lead to long-run increases in productivity in a region.

De la Sierra goes a step beyond theory, investigating these ideas empirically in the Congo. Eastern Congo during and after the Second Congo War was characterized by a number of rebel groups that occasionally just pillaged, but occasionally formed stable tax relationships with villages that could last for years. That is, the rebels occasionally implemented something looking like states. The theory above suggests that exogenous changes in the ability to extract tax revenue (over a discounted horizon) will shift the rebels from pillagers to proto-states. And, incredibly, there were a number of interesting exogenous changes that had exactly that effect.

Coltan and gold prices both experienced shocks during the war. Coltan is heavy, hard to hide, and must be shipped by plane in the absence of roads. Gold is light, easy to hide, and can simply be carried from the mine on jungle footpaths. When the price of coltan rises, the maximal tax revenue of a state increases since taxable coltan production is relatively inelastic. This is particularly true near airstrips, where the coltan can actually be sold. When the price of gold increases, the maximal tax revenue does not change much, since gold is easy to hide, and hence the optimal tax is on labor rather than on output. An exogenous rise in coltan prices should encourage proto-state formation in areas with coltan, then, while an exogenous rise in gold prices should have little impact on the pillage vs. state tradeoff. Likewise, a government initiative to root out rebels (be they stationary or pillaging) decreases the expected number of years a proto-state can extract rents, hence makes pillaging relatively more lucrative.

How to confirm these ideas, though, when there was no data collected on income, taxes, labor supply, or proto-state existence? Here is the crazy bit – 11 locals were hired in Eastern Congo to travel to a large number of villages, spend a week there querying families and village elders about their experiences during the war, the existence of mines, etc. The “state formation” in these parts of Congo is only a few years in the past, so it is at least conceivable that memories, suitably combined, might actually be reliable. And indeed, the data do seem to match aggregate trends known to monitors of the war. What of the model predictions? They all seem to hold, and quite strongly: the ability to extract more tax revenue is important for proto-state formation, and areas where proto-states existed do appear to have retained higher productive capacity years later, perhaps as a result of the proto-institutions those states developed. Fascinating. Even better, because there is a proposed mechanism rather than an identified treatment effect, we can have some confidence that the result is, to some extent, externally valid!

December 2013 working paper (No IDEAS page). You may wonder what a study like this costs (particularly if you are, like me, a theorist using little more than chalk and a chalkboard); I have no idea, but de la Sierra’s CV lists something like a half million dollars of grants, an incredible total for a graduate student. On a personal level, I spent a bit of time in Burundi a number of years ago, including visiting a jungle camp where rebels from the Second Congo War were still hiding. It was pretty amazing how organized even these small groups were in the areas they controlled; there was nothing anarchic about it.

Dale Mortensen as Micro Theorist

Northwestern’s sole Nobel Laureate in economics, Dale Mortensen, passed away overnight; he remained active as a teacher and researcher over the past few years, though I’d been hearing word through the grapevine about his declining health over the past few months. Surely everyone knows Mortensen the macroeconomist for his work on search models in the labor market. There is something odd here, though: Northwestern has really never been known as a hotbed of labor research. To the extent that researchers rely on their coworkers to generate and work through ideas, how exactly did Mortensen become such a productive and influential researcher?

Here’s an interpretation: Mortensen’s critical contribution to economics is as the vector by which important ideas in micro theory entered real world macro; his first well-known paper was literally published in a 1970 book called “Microeconomic Foundations of Employment and Inflation Theory.” Mortensen had the good fortune to be a labor economist working in the 1970s and 1980s at a school with a frankly incredible collection of microeconomic theorists; during those two decades, Myerson, Milgrom, Loury, Schwartz, Kamien, Judd, Matt Jackson, Kalai, Wolinsky, Satterthwaite, Reinganum and many others were associated with Northwestern. And this was a rare condition! Game theory is everywhere today, and pioneers in that field (von Neumann, Nash, Blackwell, etc.) were active in the middle of the century. Nonetheless, by the late 1970s, game theory in the social sciences was close to dead. Paul Samuelson, the great theorist, wrote essentially nothing using game theory between the early 1950s and the 1990s. Quickly scanning the American Economic Review from 1970-1974, I find, at best, one article per year that can be called game-theoretic.

What is the link between Mortensen’s work and developments in microeconomic theory? The essential labor market insight of search models (an insight which predates Mortensen) is that the number of hires and layoffs is substantial even in the depth of a recession. That is, the rise in the unemployment rate cannot simply be because the marginal revenue of the potential workers is always less than the cost, since huge numbers of the unemployed are hired during recessions (as others are fired). Therefore, a model which explains changes in churn rather than changes in the aggregate rate seems qualitatively important if we are to develop policies to address unemployment. This suggests that there might be some use in a model where workers and firms search for each other, perhaps with costs or other frictions. Early models along this line by Mortensen and others were generally one-sided and hence non-strategic: they had the flavor of optimal stopping problems.
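For readers who have never seen the “optimal stopping” flavor of those early one-sided models, here is a minimal McCall-style sketch (a textbook stand-in, not Mortensen’s own model): an unemployed worker draws i.i.d. wage offers and accepts anything above a reservation wage pinned down by a simple indifference condition.

```python
import random

def reservation_wage(offers, unemp_benefit, beta, tol=1e-8):
    """Solve w_R - b = beta/(1-beta) * E[max(w - w_R, 0)] by bisection:
    the left side rises in w_R while the right side (the option value of
    waiting for a better draw) falls, so there is a unique crossing."""
    lo, hi = unemp_benefit, max(offers)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        option_value = sum(max(w - mid, 0.0) for w in offers) / len(offers)
        if mid - unemp_benefit < beta / (1 - beta) * option_value:
            lo = mid   # still worth being pickier
        else:
            hi = mid
    return (lo + hi) / 2

random.seed(0)
offers = [random.uniform(0.5, 1.5) for _ in range(10_000)]   # hypothetical offer distribution
print(reservation_wage(offers, unemp_benefit=0.4, beta=0.95))
# The worker turns down offers well above the unemployment benefit: rejected
# offers, quits, and re-matches generate churn even in a stationary environment.
```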

Unfortunately, Diamond in a 1971 JET pointed out that Nash equilibrium in two-sided search leads to a conclusion that all workers are paid their reservation wage: all employers pay the reservation wage, workers believe this to be true hence do not engage in costly search to switch jobs, hence the belief is accurate and nobody can profitably deviate. Getting around the “Diamond Paradox” involved enriching the model of who searches when and the extent to which old offers can be recovered; Mortensen’s work with Burdett is a nice example. One also might ask whether laissez faire search is efficient or not: given the contemporaneous work of micro theorists like Glenn Loury on mathematically similar problems like the patent race, you might imagine that efficient search is unlikely.

Beyond the efficiency of matches themselves is the question of how to split surplus. Consider a labor market. In the absence of search frictions, Shapley (first with Gale, later with Shubik) had shown in the 1960s and early 1970s the existence of stable two-sided matches even when “wages” are included. It turns out these stable matches are tightly linked to the cooperative idea of a core. But what if this matching is dynamic? Firms and workers meet with some probability over time. A match generates surplus. Who gets this surplus? Surely you might imagine that the firm should have to pay a higher wage (more of the surplus) to workers who expect to get good future offers if they do not accept the job today. Now we have something that sounds familiar from non-cooperative game theory: wage is based on the endogenous outside options of the two parties. It turns out that noncooperative game theory had very little to say about bargaining until Rubinstein’s famous bargaining game in 1982 and the powerful extensions by Wolinsky and his coauthors. Mortensen’s dynamic search models were a natural fit for those theoretic developments.
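As a reminder of the Rubinstein (1982) punchline that these dynamic matching models lean on: with alternating offers over a unit surplus and discount factors d1 and d2, the first proposer receives (1 - d2)/(1 - d1*d2) in the unique subgame perfect equilibrium, so a more patient responder, who has a better continuation value, commands a larger share. A two-line check:

```python
def rubinstein_proposer_share(d1, d2):
    """Proposer's share of a unit surplus under alternating offers (Rubinstein 1982)."""
    return (1 - d2) / (1 - d1 * d2)

for d2 in (0.5, 0.9, 0.99):
    print(f"responder discount factor {d2}: proposer gets {rubinstein_proposer_share(0.9, d2):.3f}")
# 0.909, 0.526, 0.092: as the responder's continuation value improves, the split
# moves in her favor, the same force that links wages to expected future offers
# in a dynamic search model.
```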

I imagine that when people hear “microfoundations”, they have in mind esoteric calibrated rational expectations models. But microfoundations in the style of Mortensen’s work is much more straightforward: we simply cannot understand even the qualitative nature of counterfactual policy in the absence of models that account for strategic behavior. And thus the role for even high micro theory, which investigates the nature of uniqueness of strategic outcomes (game theory) and the potential for a planner to improve welfare through alternative rules (mechanism design). Powerful tools indeed, and well used by Mortensen.

Tunzelmann and the Nature of Social Savings from Steam

Research Policy, the premier journal for innovation economists, recently produced a symposium on the work of Nick von Tunzelmann. Tunzelmann is best known for exploring the social value of the invention of steam power. Many historians had previously granted great importance to the steam engine as a driver of the Industrial Revolution. However, as with Fogel’s argument that the railroad was less important to the American economy than previously believed (though see Donaldson and Hornbeck’s amendment claiming that market access changes due to rail were very important), the role of steam in the Industrial Revolution may have been overstated.

This is surprising. To my mind, the four most important facts for economics to explain are why the world economy (in per capita terms) stagnated until the early 1800s, why cumulative per-capita growth began then in a corner of Northwest Europe, why growth at the frontier has continued to the present, and why growth at the frontier has been so consistent over this period. The consistency is really surprising, given that individual non-frontier country growth rates, and World GDP growth, have vacillated pretty wildly on a decade-by-decade basis.

Malthus still explains the first puzzle best. But there remain many competing explanations for how exactly the Malthusian trap was broken. The idea that a thrifty culture or expropriation of colonies was critical sees little support from economic historians; as McCloskey writes, “Thrifty self-discipline and violent expropriation have been too common in human history to explain a revolution utterly unprecedented in scale and unique to Europe around 1800.” The problem, more generally, of explaining a large economic X on the basis of some invention/program/institution Y is that basically everything in the economic world is a complement. Human capital absent good institutions has little value, modern management techniques absent large markets are ineffective, etc. The problem is tougher when it comes to inventions. Most “inventions” that you know of have very little immediate commercial importance, and a fair ex-post reckoning of the critical parts of the eventual commercial product often leaves little role for the famous inventor.

What Tunzelmann and later writers in his tradition point out is that even though Watt’s improvement to the steam engine was patented in 1769, steam still provided less horsepower than water in the UK as late as 1830, and in the US as late as the Civil War. Indeed, even today, hydropower based on the age-old idea of the turbine is still an enormous factor in the siting of electricity-hungry industries. It wasn’t until the invention of high-pressure steam engines like the Lancashire boiler in the 1840s that textile mills really saw steam power as an economically viable source of energy. Most of the important inventions in the textile industry were designed originally for non-steam power sources.

The economic historian Nicholas Crafts supports Tunzelmann’s original argument against the importance of steam using a modern growth accounting framework. Although the cost of steam power fell rapidly following Watt, and especially after the Corliss engine in the mid 19th century, steam was still a relatively small part of the economy until the mid-to-late 19th century. Therefore, even though productivity growth within steam was quick, only a tiny portion of overall TFP growth in the early Industrial Revolution can be explained by steam. Growth accounting exercises have a nice benefit over partial equilibrium social savings calculations because the problem that “everything is a complement” is taken care of so long as you believe the Cobb-Douglas formulation.
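Crafts’ point is easy to see in a back-of-the-envelope growth accounting calculation: under the Cobb-Douglas/Domar-weight logic, a sector’s contribution to aggregate TFP growth is roughly its share of the economy times its own productivity growth. The numbers below are purely illustrative placeholders, not Crafts’ estimates, but they show why rapid productivity growth inside a tiny steam sector barely moves the aggregate.

```python
# Back-of-the-envelope growth accounting: a sector's contribution to aggregate
# TFP growth ~= (its share of the economy) * (its own productivity growth).
# Illustrative numbers only; these are not Crafts' actual estimates.

def tfp_contribution(share, own_productivity_growth):
    return share * own_productivity_growth

steam_share_early = 0.015          # hypothetical: steam a tiny share of the economy, early 1800s
steam_share_late = 0.10            # hypothetical: much larger share, late 1800s
steam_productivity_growth = 0.03   # hypothetical: 3% a year, fast for the era

print(f"early contribution: {tfp_contribution(steam_share_early, steam_productivity_growth):.4%} per year")
print(f"late contribution:  {tfp_contribution(steam_share_late, steam_productivity_growth):.4%} per year")
# Even with fast within-steam productivity growth, the early contribution is a
# few hundredths of a percentage point per year; steam only matters for the
# aggregate once its share of the economy is large.
```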

The December 2013 issue of Research Policy (all gated) is the symposium on Tunzelmann. For some reason, Tunzelmann’s “Steam Power and British Industrialization Until 1860” is quite expensive used, but any decent library should have a copy.

Site Note: 2014 Job Market

Apologies for the slow rate of posting throughout the fall. Though I have a big backlog of posts coming, the reason for the delay has been that I’ve spent the last few months preparing for the job market. As the academic job market is essentially a matching problem, below I describe my basic research program; if you happen to be a reader of the blog on the “demand side” from an economics department, business school or policy school, and my research looks like it might be a good fit, I would be eternally grateful for a quick email (k-bryan AT kellogg.northwestern.edu).

Most broadly, I study the process of innovation. Innovation (inclusive of the diffusion of technology) is fundamental to growth, and growth is fundamental to improving human welfare. What, then, could be a more exciting topic to study? Methodologically, I tend to find theory the most useful way to answer the questions I want to answer. Because the main benefit of theory is generalizability (to counterfactuals, welfare estimates, and the like), I try to ensure that my theory is well grounded both by using detailed historical examples within the papers, and by drawing heavily on existing empirical work by economists, historians and sociologists. Beyond innovation, I have a side interest in pure theory and in the history of thought; both these areas provide the “tools” that an applied theorist uses.

Recently, I’ve worked primarily on two questions: why do firms work on the type of research they do, and how does government policy affect the diffusion of invention? On the first question, I have three papers.

My coauthor Jorge Lemus and I have developed an analytically tractable model of direction choice in invention, where there are many inventions available at any time, and successful invention by some firm affects which research targets become available next. We shut down all sources of inefficient firm behavior in the existing literature, and still find three sources of inefficiency generated by direction choice alone. We fully characterize how this inefficiency operates on a number of “invention graphs”. This is actually a pretty cool model which is really easy to use if you are familiar with the patent race literature.

In my job market paper, I use the invention graph model to study how government R&D works when firms may have distorted directional incentives. The principal result is a bit sobering: many policies like patents and R&D tax credits that are effective at increasing the rate of invention on socially valuable projects will, in general, exacerbate distortions on the direction of invention. Essentially, firms may distort to projects that are easy even though those projects are not terribly profitable. With this type of distortion, any policy that increases the payoff to R&D generally will increase the payoff of the inefficient research target by a larger percentage than the payoff of the efficient research target. I show how these policy distortions may have operated in the early nuclear reactor industry, where historians have long worried that the eventual dominant technology, light water reactors, was ex-ante inefficient.

My third paper on directional inefficiency is more purely historical. How can a country invent a pioneer technology but wind up having no important firms in the commercial industry building on that technology? I suggest that commercial products are made up of a set of inventions. A nation may have within its borders everything necessary for a technological breakthrough, but lack comparative advantage in the remaining steps between that breakthrough and a commercial product; roughly, Teece’s notion of complementary assets operates at a national and not only a firm level. Worse, patent policy can give pioneer inventors incentives to limit the diffusion of knowledge necessary for the eventual commercial product. I show how these ideas operate in the airplane industry, where ten years after the Wright Brothers’ first flight, there was essentially no domestic production of US-designed planes.

On diffusion more purely, I have three papers in progress. The first, which we are preparing for a medical journal and hence cannot put online, suggests that open access to medical research makes that research much more likely to be used in an eventual invention. My coauthor Yasin Ozcan and I generate this result from a dataset that merges every research article in 46 top medical journals since 2005 with every US patent application since that date; if you’ve ever worked with the raw patent application files, for which there is no standard citation practice, you will understand the data challenge here! We have a second paper in the works taking this merged dataset back to the 1970s, with the goal of providing a better descriptive understanding of the trends in lab-to-market diffusion. My third paper on diffusion is more theoretical. I consider processes that diffuse simultaneously across multiple networks (or a “multigraph”): inventions may diffuse via trade routes or pure geography, recessions may spread across geography or the input-output chain, diseases may spread via sexual or non-sexual contact, etc. I provide an axiomatic measure of the “importance” of each network even when only a single instance of the diffusion can be seen, and show how this measure can answer counterfactual questions like “what if the diffusion began in a different place?”

My CV, a teaching statement, and an extended research statement can be found on my academic website. I am quite excited about the research program above; I would love to chat if you are at a convivial department filled with bright students and curious academics that is hiring this year.

“Identifying Technology Spillovers and Product Market Rivalry,” N. Bloom, M. Schankerman & J. Van Reenen (2013)

R&D decisions are not made in a vacuum: my firm both benefits from information about new technologies discovered by others, and is harmed when other firms create new products that steal from my firm’s existing product lines. Almost every workhorse model in innovation is concerned with these effects, but measuring them empirically, and understanding how they interact, is difficult. Bloom, Schankerman and van Reenen have a new paper with a simple but clever idea for understanding these two effects (and it will be no surprise to readers given how often I discuss their work that I think these three are doing some of the world’s best applied micro work these days).

First, note that firms may be in the same technology area but not in the same product area; Intel and Motorola work on similar technologies, but compete on very few products. In a simple model, firms first choose R&D, knowledge is produced, and then firms compete on the product market. The qualitative results of this model are as you might expect: firms in a technology space with many other firms will be more productive due to spillovers, and may or may not actually perform more R&D depending on the nature of diminishing returns in the knowledge production function. Product market rivalry is always bad for profits, does not affect productivity, and increases R&D only if research across firms is a strategic complement; this strategic complementarity could be something like a patent race model, where if firms I compete with are working hard trying to invent the Next Big Thing, then I am incentivized to do even more R&D so I can invent first.

On the empirical side, we need a measure of “product market similarity” and “technological similarity”. Let there be M product classes and N patent classes, and construct vectors for each firm of their share of sales across product classes and share of R&D across patent classes. There are many measures of similarity between two such vectors, of course, including a well-known measure in innovation from Jaffe. Bloom et al, after my heart, note that we really ought use measures that have proper axiomatic microfoundations; though they do show the properties of a variety of measures of similarity, they don’t actually show the existence (or impossibility) of their optimal measure of similarity. This sounds like a quick job for a good microtheorist.
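For concreteness, the Jaffe measure referred to above is the uncentered correlation (cosine) between two firms’ patent-class share vectors; the same construction on sales shares across product classes gives product market closeness. A minimal sketch with made-up share vectors:

```python
import math

def jaffe_similarity(shares_i, shares_j):
    """Uncentered correlation (cosine) between two firms' share vectors
    across patent classes (or product classes), as in Jaffe (1986)."""
    dot = sum(a * b for a, b in zip(shares_i, shares_j))
    norm_i = math.sqrt(sum(a * a for a in shares_i))
    norm_j = math.sqrt(sum(b * b for b in shares_j))
    return dot / (norm_i * norm_j)

# Hypothetical firms: shares of patenting across four technology classes.
intel_like    = [0.6, 0.3, 0.1, 0.0]
motorola_like = [0.5, 0.4, 0.0, 0.1]
pharma_like   = [0.0, 0.1, 0.2, 0.7]

print(jaffe_similarity(intel_like, motorola_like))  # high: close in technology space
print(jaffe_similarity(intel_like, pharma_like))    # low: far apart
```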

With similarity measures in hand, all that’s left to do is run regressions of outcomes like R&D performed, productivity (measured using patents or out of a Cobb-Douglas equation) and market value (via the Griliches-style Tobin’s Q) on technological and product market similarity, plus all sorts of fixed effects. These guys know their econometrics, so I’m omitting many details here, but I should mention that they do use the idea from Wilson’s 2009 ReSTAT of basically random changes in state R&D tax laws as an IV for the cost of R&D; this is a great technique, and very well implemented by Wilson, but getting these state-level R&D costs is really challenging and I can easily imagine a future where the idea is abused by naive implementation.

The results are actually pretty interesting. Qualitatively, the empirical results look quite like the theory, and in particular, the impact of technological similarity looks really important; having lots of firms working on similar technologies but working in different industries is really good for your firm’s productivity and profits. Looking at a handful of high-tech sectors, Bloom et al estimate that the marginal social return on R&D is on the order of 40 percentage points higher than the marginal private return of R&D, implying (with some huge caveats) that R&D in the United States might be only about a third as large as it ought to be. This estimate is actually quite similar to what researchers using other methods have estimated. Interestingly, since bigger firms tend to work in more dense parts of the technology space, they tend to generate more spillovers, hence the common policy prescription of giving smaller firms higher R&D tax credits may be a mistake.

Three caveats. First, as far as I can tell, the model does not allow a role for absorptive capacity, where a firm’s ability to integrate outside knowledge is endogenous to its existing R&D stock. Second, the estimated marginal private rate of return on R&D is something like 20 percent for the average firm; many other papers have estimated very high private benefits from research, but I have a hard time interpreting these estimates. If there really are 20% rates of return lying around, why aren’t firms cranking up their research? At least anecdotally, you hear complaints from industries like pharma about low returns from R&D. Third, there are some suggestive comments near the end about how government subsidies might be used to increase R&D given these huge social returns. I would be really cautious here, since there is quite a bit of evidence that government-sponsored R&D generates a much lower private and social rate of return than other forms of R&D.

Final July 2013 Econometrica version (IDEAS version). Thumbs up to Nick Bloom for making the final version freely available on his website. The paper has an exhaustive appendix with technical details, as well as all of the data freely available for you to play with.

“Maximal Revenue with Multiple Goods: Nonmonotonicity and Other Observations,” S. Hart and P. Reny (2013)

One of the great theorems in economics is Myerson’s optimal auction when selling one good to any number of buyers with private valuations of the good. Don’t underrate this theorem just because it is mathematically quite simple: the question it answers is how to sell any object, using any mechanism (an auction, a price, a bargain, a two step process involving eliciting preferences first, and so on). That an idea as straightforward as the revelation principle makes that question possible to answer is truly amazing.

Myerson’s paper was published 32 years ago. Perhaps more amazing than the simplicity of the Myerson auction is how difficult it has been to derive similar rules for selling more than one good at a time, selling goods more than once, or selling goods when the sellers compete. The last two have well-known problems that make analysis difficult: dynamic mechanisms suffer from the “ratchet effect” (a buyer won’t reveal information if she knows a seller will use it in subsequent sales), and competing mechanisms can have an IR constraint which is generated as an equilibrium condition. Hart and Reny, in a new note, show some great examples of the difficulties in the first problem, selling multiple goods. In particular, increases in the distribution of private values (in the sense of first order stochastic dominance) can lower the optimal revenue, and randomized mechanisms can raise it.

Consider first increases in the private value distribution. This is strange: if for any state of the world, I value your goods more, it seems reasonable that there is “more surplus to extract” for the seller. And indeed, not only does the Myerson optimal auction with a single good have the property that increases in private values lead to increased seller revenue, but Hart and Reny show that any incentive-compatible mechanism with a single good has this property.

(This is actually very easy to prove, and a property I’d never seen stated for the general IC case, so I’ll show it here. If q(x) is the probability I get the good given the buyer’s reported type x, and s(x) is the seller revenue given this report, then incentive compatibility when the true type is x requires that

q(x)x-s(x)>=q(y)x-s(y)

and incentive compatibility when the true type is y requires that

q(y)y-s(y)>=q(x)y-s(x)

Combining these inequalities gives

(q(x)-q(y))x>=s(x)-s(y)>=(q(x)-q(y))y

Hence when x>y, q(x)-q(y)>=0, and since valuations are nonnegative, s(x)>=s(y) whenever x>y. Therefore a FOSD shift in the buyer valuations increases the expected value of seller revenue s. Nice!)

When selling multiple goods, however, this nice property doesn’t hold. Why not? Imagine the buyer might have values for (one unit of each of) 2 goods I am trying to sell of (1,1), (1,2), (2,2) or (2,3). Imagine (2,3) is a very common set of private values. If I knew this buyer’s type, I would extract 5 from him, though if I price the bundle including one of each good at 5, then none of the other types will buy. Also, if I want to extract 5 from type (2,3), then I also can’t sell unit 1 alone for less than 2 or unit 2 alone for less than 3, in which case the only other buyer will be (2,2) buying good 1 alone for price 2. Let’s try lowering the price of the bundle to 4. Now, satisfying (2,3)’s incentive compatibility constraints, we can charge as little as 1 for the first good bought by itself and 2 for the second good bought by itself: at those prices, (2,3) will buy the bundle, (2,2) will buy the first good for 1, and (1,2) will buy the second good for 2. This must look strange to you already: when the buyer’s type goes up from (1,2) to (2,2), the revenue to the seller falls from 2 to 1! And it turns out the prices and allocations described are the optimal mechanism when (2,2) is much less common than the other types. Essentially, across the whole distribution of private values the most revenue can be extracted by selling the second good, so the optimal way to satisfy IC constraints involves making the IC tightest for those whose relative value of the second good is high. Perhaps we ought call the rents (2,2) earns in this example “uncommon preference rents”!
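The example is easy to check by brute force: hand each type its favorite item from the menu {good 1 at 1, good 2 at 2, bundle at 4}, breaking ties in the seller’s favor, and watch revenue fall when the type rises from (1,2) to (2,2). A quick sketch using the numbers above:

```python
# Menu from the example: good 1 alone at price 1, good 2 alone at price 2,
# the bundle at price 4. Each buyer type picks the surplus-maximizing option,
# with ties broken in favor of the seller (the higher payment).

menu = {"good 1": ((1, 0), 1), "good 2": ((0, 1), 2), "bundle": ((1, 1), 4), "nothing": ((0, 0), 0)}

def best_choice(v1, v2):
    def surplus(item):
        (q1, q2), price = menu[item]
        return q1 * v1 + q2 * v2 - price
    return max(menu, key=lambda item: (surplus(item), menu[item][1]))

for v in [(1, 1), (1, 2), (2, 2), (2, 3)]:
    choice = best_choice(*v)
    print(v, "->", choice, "revenue", menu[choice][1])
# (1,2) -> good 2, revenue 2; (2,2) -> good 1, revenue 1:
# a "higher" buyer type yields strictly less revenue.
```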

Even crazier is that an optimal sale of multiple goods might involve using random mechanisms. It is easy to show that with a single good, a random mechanism (say, if you report your value for the good is 1 dollar, the mechanism assigns you the good with probability .5 for a payment of 50 cents) does no better than a deterministic mechanism. A footnote in the Hart and Reny paper credits Aumann for the idea that this is actually pretty strange: a mechanism is a sequential game where the designer moves first. It is intuitive that being able to randomize would be useful in these types of situations; in a matching pennies game, I would love to be able to play .5 heads and .5 tails when moving first! But the optimal mechanism with a single good does not have this property, for an intuitive reason. Imagine I will sell the good for X. Every type with private value below X does not buy, and those with types V>=X earn V-X in information rents. Offering to sell the good with probability .5 for price X/2 does not induce anybody new to buy, and selling with probability .5 for price Y less than X/2 causes some of the types close to X to switch to buying the lottery, lowering the seller revenue. Indeed, it can be verified that the revenue from the lottery just described is exactly the revenue from a mechanism which offered to sell the good with probability 1 for price 2Y<X.

With multiple goods, however, let high and low type buyers be equally common. Let the high type buyer be indifferent between buying two goods for X, buying the first good only for Y, and buying the second good only for Z (where IC requires that X generate the most seller revenue as long as buyer values are nonnegative). Let the low type value only the first good, at value W less than Y. How can I sell to the low type without violating the high type’s IC constraints? Right now, if the high type buyer has values V1 and V2 for the two goods, the indifference assumptions means V1+V2-X=V1-Y=V2-Z. Can I offer a fractional sale (with probability p) of the first good at price Y2 such that pW-Y2>=0, yet pV1-Y2<V1+V2-X=V1-Y? Sure. The low value buyer is just like the low value buyers in the single good case, but the high value buyer dislikes fractional sales because in order to buy the lottery, she is giving up her purchase of the second good. Giving up buying the second good costs her V2 in welfare whether a fraction or the whole of good 1 is sold, but the benefit of deviating is lower with the lottery.
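Here is a numerical instance of that construction (my own numbers chosen to satisfy the indifference conditions above, not taken from Hart and Reny): the high type values the goods at (3, 2), the low type at (1, 0), and the deterministic menu prices good 1 at 2, good 2 at 1 and the bundle at 4, leaving the high type exactly indifferent. A lottery on good 1 with probability 0.4 at price 0.4 then picks up the low type without tempting the high type away from the bundle, which no deterministic price for good 1 can do.

```python
def surplus(values, allocation, price):
    v1, v2 = values
    q1, q2 = allocation
    return q1 * v1 + q2 * v2 - price

def seller_revenue(menu, types):
    """menu: list of (allocation, price), where the allocation gives the
    probability of receiving each good; buying nothing is always available.
    Each type picks its surplus-maximizing option, ties broken toward the
    higher payment."""
    outside = ((0.0, 0.0), 0.0)
    rev = 0.0
    for values, prob in types:
        best = max(menu + [outside], key=lambda opt: (surplus(values, *opt), opt[1]))
        rev += prob * best[1]
    return rev

high, low = (3.0, 2.0), (1.0, 0.0)
types = [(high, 0.5), (low, 0.5)]

deterministic = [((1, 0), 2.0), ((0, 1), 1.0), ((1, 1), 4.0)]
with_lottery  = deterministic + [((0.4, 0), 0.4)]              # 40% chance of good 1 for 0.4
cheap_good_1  = [((1, 0), 1.0), ((0, 1), 1.0), ((1, 1), 4.0)]  # deterministic attempt to reach the low type

print(seller_revenue(deterministic, types))  # 2.0: the low type is excluded
print(seller_revenue(with_lottery, types))   # 2.2: the lottery adds the low type, high type still buys the bundle
print(seller_revenue(cheap_good_1, types))   # 1.0: the high type deviates to the cheap good
```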

April 2013 working paper (IDEAS version). Update: Sergiu notes there is a new version as of December 21 on his website. (Don’t think the paper is all bad news for deriving general properties of optimal mechanisms. In addition to the results above, Hart and Reny also show a nice result about mechanisms where there are multiple optimal responses by buyers, some good for the seller and some less so. It turns out that whenever you have a mechanism of this type, there is another mechanism that uniquely generates revenue arbitrarily close to the seller-optimal from among those multiple potential buyer actions in the first mechanism.)

“Is Knowledge Trapped Inside the Ivory Tower?,” M. Bikard (2013)

Simultaneous discovery, as famously discussed by Merton, is really a fascinating idea. On the one hand, we have famous examples like Bell and Gray sending in patents for a telephone on exactly the same day. On the other hand, when you investigate supposed examples of simultaneous discovery more closely, it is rarely the case that the discoveries are that similar. The legendary Jacob Schmookler described – in a less-than-politically-correct way! – historians who see patterns of simultaneous discovery everywhere as similar to tourists who think “all Chinamen look alike.” There is sufficient sociological evidence today that Schmookler largely seems correct: simultaneous discovery, like the “lucky” invention, is much less common than the man on the street believes (see, e.g., Simon Schaffer’s article on the famous story of the dragon dream and the invention of benzene for a typical reconstruction of how “lucky” inventions actually happen).

Michaël Bikard thinks we are giving simultaneous discovery too little credit as a tool for investigating important topics in the economics of innovation. Even if simultaneous discovery is uncommon, it still exists. If there were an automated process to generate a large set of simultaneous inventions (on relatively more minor topics than the telephone), there would be tons of interesting questions we could answer, since we would have compelling evidence of the same piece of knowledge existing in different places at the same time. For instance, how important are agglomeration economies? Does a biotech invention get developed further if it is invented on Route 128 in Massachusetts instead of in Lithuania?

Bikard has developed an automated process to do this (and that linked paper also provides a nice literature review concerning simultaneous discovery). Just scrape a huge number of articles and their citations, look for pairs of papers which were published at almost the same time and cited frequently in the future, and then limit further to articles with a “Jaccard index” implying that they are frequently cited together if they are cited at all. Applying this technique to the life sciences, he finds 578 examples of simultaneous discovery; when he chatted with a randomly selected sample of the researchers, most mentioned the simultaneous discovery without being asked, though at least one claimed his idea had been stolen! 578 is a ton: this is more than double the number that the historical analysis in Merton discovered, and as noted, many of the Merton multiples are not really examples of simultaneous discovery at all.
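My guess at the co-citation filter, based on the description above, is a Jaccard index computed over the two papers’ sets of citing articles: the size of the overlap divided by the size of the union, which is close to one when the papers are almost never cited separately. A sketch with hypothetical IDs:

```python
def jaccard(citers_a, citers_b):
    """Jaccard index of two papers' sets of citing articles:
    |intersection| / |union|, near 1 when the papers are almost always cited together."""
    a, b = set(citers_a), set(citers_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical citing-article IDs for a candidate pair of simultaneous discoveries
paper_1_citers = {"c1", "c2", "c3", "c4", "c5"}
paper_2_citers = {"c2", "c3", "c4", "c5", "c6"}

print(jaccard(paper_1_citers, paper_2_citers))  # 4/6 ~ 0.67: plausibly the same finding
print(jaccard(paper_1_citers, {"c7", "c8"}))    # 0.0: unrelated papers
```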

He then applies this dataset in a second paper, asking whether inventions in academia are used more often (because of the culture of openness) or whether private sector inventions are used more often in follow-up inventions (because the control rights can help even follow-up inventors extract rents). It turns out that private-sector inventors of the identical invention are three times more likely to patent, but even excluding the inventors themselves, the private sector inventions are cited 10-20% more frequently in future patents. The sample size of simultaneous academic-private discovery is small, so this evidence is only suggestive. You might imagine that the private sector inventors are more likely to be colocated near other private sector firms in the same area; we think that noncodified aspects of knowledge flow locally, so it wouldn’t be surprising that the private sector multiple was cited more often in future patents.

Heavy caveats are also needed on the sample. This result certainly doesn’t suggest that, overall, private sector workers are doing more “useful” work than Ivory Tower researchers, since restricting the sample to multiple discoveries limits the potential observations to areas where academia and the private sector are working on the same type of discovery. Certainly, academics and the private sector often work on different types of research, and openness is probably more important in more basic discoveries (where transaction or bargaining costs on follow-up uses are more distortionary). In any case, the method for identifying simultaneous discoveries is quite interesting indeed; if you are empirically minded, there are tons of interesting questions you could investigate with such a dataset.

September 2012 working paper (No IDEAS version). Forthcoming in Management Science.

“Back to Basics: Basic Research Spillovers, Innovation Policy and Growth,” U. Akcigit, D. Hanley & N. Serrano-Velarde (2013)

Basic and applied research, you might imagine, differ in a particular manner: basic research has unexpected uses in a variety of future applied products (though it sometimes has immediate applications), while applied research is immediately exploitable but has fewer spillovers. An interesting empirical fact is that a substantial portion of firms report that they do basic research, though subject to a caveat I will mention at the end of this post. Further, you might imagine that basic and applied research are complements: success in basic research in a given area expands the size of the applied ideas pond which can be fished by firms looking for new applied inventions.

Akcigit, Hanley and Serrano-Velarde take these basic facts and, using some nice data from French firms, estimate a structural endogenous growth model with both basic and applied research. Firms hire scientists then put them to work on basic or applied research, where the basic research “increases the size of the pond” and occasionally is immediately useful in a product line. The government does “Ivory Tower” basic research which increases the size of the pond but which is never immediately applied. The authors give differential equations for this model along a balanced growth path, have the government perform research equal to .5% of GDP as in existing French data, and estimate the remaining structural parameters like innovation spillover rates, the mean “jump” in productivity from an innovation, etc.

The pretty obvious benefit of structural models as compared to estimating simple treatment effects is counterfactual analysis, particularly welfare calculations. (And if I may make an aside, the argument that structural models are too assumption-heavy and hence non-credible is nonsense. If the mapping from existing data to the actual questions of interest is straightforward, then surely we can write a straightforward model generating that external validity. If the mapping from existing data to the actual question of interest is difficult, then it is even more important to formally state what mapping you have in mind before giving policy advice. Just estimating a treatment effect off some particular dataset and essentially ignoring the question of external validity because you don’t want to take a stand on how it might operate makes me wonder why I, the policymaker, should take your treatment effect seriously in the first place. It seems to me that many in the profession already take this stance – Deaton, Heckman, Whinston and Nevo, and many others have published papers on exactly this methodological point – and therefore a decade from now, you will find it just as tough to publish a paper that doesn’t take external validity seriously as it is to publish a paper with weak internal identification today.)

Back to the estimates: the parameters here suggest that the main distortion is not that firms perform too little R&D, but that they misallocate between basic and applied R&D; the basic R&D spills over to other firms by increasing the “size of the pond” for everybody, hence it is underperformed. This spillover, estimated from data, is of substantial quantitative importance. The problem, then, is that uniform subsidies like R&D tax credits will just increase total R&D without alleviating this misallocation. I think this is a really important result (and not only because I have a theory paper myself, coming at the question of innovation direction from the patent race literature rather than the endogenous growth literature, which generates essentially the same conclusion). What you really want to do to increase welfare is increase the amount of basic research performed. How to do this? Well, you could give heterogeneous subsidies to basic and applied research, but this would involve firms reporting correctly, which is a very difficult moral hazard problem. Alternatively, you could just do more research in academia, but if this is never immediately exploited, it is less useful than the basic research performed in industry which at least sometimes is used in products immediately (by assumption); shades of Aghion, Dewatripont and Stein (2008 RAND) here. Neither policy performs particularly well.

I have two small quibbles. First, basic research in the sense reported by national statistics following the Frascati manual is very different from basic research in the sense of “research that has spillovers”; there is a large literature on this problem, and it is particularly severe when it comes to service sector work and process innovation. Second, the authors suggest at one point that Bayh-Dole style university licensing of research is a beneficial policy: when academic basic research can now sometimes be immediately applied, we can easily target the optimal amount of basic research by increasing academic funding and allowing academics to license. But this prescription ignores the main complaint about Bayh-Dole, which is that academics begin, whether for personal or institutional reasons, to shift their work from high-spillover basic projects to low-spillover applied projects. That is, it is not obvious the moral hazard problem concerning targeting of subsidies is any easier at the academic level than at the private firm level. In any case, this paper is very interesting, and well worth a look.

September 2013 Working Paper (RePEc IDEAS version).

“X-Efficiency,” M. Perelman (2011)

Do people still read Leibenstein’s fascinating 1966 article “Allocative Efficiency vs. X-Efficiency”? They certainly did at one time: Perelman notes that in the 1970s, this article was the third-most cited paper in all of the social sciences! Leibenstein essentially made two points. First, as Harberger had previously shown, distortions like monopoly simply as a matter of mathematics can’t have large welfare impacts. Take monopoly, for instance. The deadweight loss is simply the change in price times the change in quantity supplied times .5 times the percentage of the economy run by monopolist firms. Under reasonable looking demand curves, those deadweight triangles are rarely going to be even ten percent of the total social welfare created in a given industry. If, say, twenty percent of the final goods economy is run by monopolists, then we only get a two percent change in welfare (and this can be extended to intermediate goods with little empirical change in the final result). Why, then, worry about monopoly?

The reason to worry is Leibenstein’s second point: firms in the same industry often have enormous differences in productivity, and there are tons of empirical evidence that firms do a better job of minimizing costs when under the selection pressures of competition (Schmitz’ 2005 JPE on iron ore producers provides a fantastic demonstration of this). Hence “X-inefficiency”, which Perelman notes is named after Tolstoy’s “X-factor” in the performance of armies in War and Peace, and not just allocative efficiency, may be important. Draw a simple supply-demand graph and you will immediately see that big “X-inefficiency rectangles” can swamp little Harberger deadweight loss triangles in their welfare implications. So far, so good. These claims, however, turned out to be incredibly controversial.
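The arithmetic behind the triangles-versus-rectangles comparison is worth writing out. The sketch below uses the illustrative shares from the Harberger-style calculation in the previous paragraph, plus a purely hypothetical cost gap for the X-inefficiency rectangle; the point is only that a loss incurred on every unit produced can easily exceed the small triangle.

```python
# Harberger-style triangle: deadweight loss is ~10% of surplus within a
# monopolized industry, and ~20% of the economy is monopolized (the
# illustrative numbers used above, not Harberger's own estimates).
triangle_within_industry = 0.10
monopolized_share = 0.20
triangle_loss = triangle_within_industry * monopolized_share
print(f"Harberger triangle: {triangle_loss:.1%} of aggregate welfare")        # 2.0%

# X-inefficiency rectangle: suppose (hypothetically) costs in those same
# industries run 30% above the attainable minimum. That loss applies to every
# unit produced, not just to the units lost to the price distortion.
excess_cost = 0.30
rectangle_loss = excess_cost * monopolized_share
print(f"X-inefficiency rectangle: {rectangle_loss:.1%} of aggregate welfare")  # 6.0%
```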

The problem is that just claiming waste is really a broad attack on a fundamental premise of economics, profit maximization. Stigler, in his well-named X-istence of X-efficiency (gated pdf), argues that we need to be really careful here. Essentially, he is suggesting that information differences, principal-agent contracting problems, and many other factors can explain dispersion in costs, and that we ought focus on those factors before blaming some nebulous concept called waste. And of course he’s correct. But this immediately suggests a shift from traditional price theory to a mechanism design based view of competition, where manager and worker incentives interact with market structure to produce outcomes. I would suggest that this project is still incomplete, that the firm is still too much of a black box in our basic models, and that this leads to a lot of misleading intuition.

For instance, most economists will agree that perfectly price discriminating monopolists have the same welfare impact as perfect competition. But this intuition is solely based on black box firms without any investigation of how those two market structures affect the incentive for managers to collect costly information about efficiency improvements, the optimal labor contracts under the two scenarios, etc. “Laziness” of workers is an equilibrium outcome of worker contracts, management monitoring, and worker disutility of effort. Just calling that “waste” as Leibenstein does is not terribly effective analysis. It strikes me, though, that Leibenstein is correct when he implicitly suggests that selection in the marketplace is more primitive than profit maximization: I don’t need to know much about how manager and worker incentives work to understand that more competition means inefficient firms are more likely to go out of business. Even in perfect competition, we need to be careful about assuming that selection automatically selects away bad firms: it is not at all obvious that the efficient firms can expand efficiently to steal business from the less efficient, as Chad Syverson has rigorously discussed.

So I’m with Perelman. Yes, Leibenstein’s evidence for X-inefficiency was weak, and yes, he conflates many constraints with pure waste. But on the basic points – that minimized costs depend on the interaction of incentives with market structure instead of simply on technology, and that heterogeneity in measured firm productivity is critical to economic analysis – Leibenstein is far more convincing than his critics. And while Syverson, Bloom, Griffith, van Reenen and many others are opening up the firm empirically to investigate the issues Leibenstein raised, there is still great scope for us theorists to more carefully integrate price theory and mechanism problems.

Final article in JEP 2011 (RePEc IDEAS). As always, a big thumbs up to the JEP for making all of their articles ungated and free to read.
