Category Archives: Mechanisms

“The Limits of Price Discrimination,” D. Bergemann, B. Brooks and S. Morris (2013)

Rakesh Vohra, who much to the regret of many of us at MEDS has recently moved on to a new and prestigious position, pointed out a clever paper today by Bergemann, Brooks and Morris (the first and third names you surely know, the second is a theorist on this year’s market). Beyond some clever uses of linear algebra in the proofs, the results of the paper are in and of themselves very interesting. The question is the following: if a regulator, or a third party, can segment consumers by willingness-to-pay and provide that information to a monopolist, what are the effects on welfare and profits?

In a limited sense, this is an old question. Monopolies generate deadweight loss as they sell at a price above marginal cost. Monopolies that can perfectly price discriminate remove that deadweight loss but also steal all of the consumer surplus. Depending on your social welfare function, this may be a good or bad thing. When markets can be segmented (i.e., third degree price discrimination) with no chance of arbitrage, we know that monopolist profits are weakly higher, since the uniform monopoly price could be maintained in every segment, but the effect on consumer surplus is ambiguous.

Bergemann et al provide two really interesting results. First, if you can choose the segmentation, it is always possible to segment consumers such that monopoly profits are just the profits gained under the uniform price, but the quantity sold is nonetheless efficient. Further, there exist segmentations such that producer surplus P is anything between the uniform price profit P* and the perfect price discrimination profit P**, and such that producer surplus plus consumer surplus P+C is anything between P* and P**! This seems like magic, but the method is actually pretty intuitive.

Let’s generate the first case, where producer profit is the uniform price profit P* and consumer surplus is maximal, C=P**-P*. In any segmentation, the monopolist can always charge P* to every segment. So if we want consumers to capture all of the surplus, there can’t be “too many” high-value consumers in a segment, since otherwise the monopolist would raise their price above P*. Let there be 3 consumer types, with the total market uniformly distributed across the three, such that valuations are 1, 2 and 3. Let marginal cost be constant at zero. The profit-maximizing price is 2, earning the monopolist 2*(2/3)=4/3. But suppose we tell the monopolist that each consumer is either Class A or Class B. Class A consists of all consumers with willingness-to-pay 1 and exactly enough consumers with WTP 2 and 3 that the monopolist is just indifferent between choosing price 1 or price 2 for Class A. Class B consists of the rest of the types 2 and 3 (and since the relative proportion of type 2 and 3 in this Class is the same as in the market as a whole, where we already know the profit maximizing price is 2 with only types 2 and 3 buying, the profit maximizing price remains 2 here). Some quick algebra shows that if Class A consists of all of the WTP 1 consumers and exactly 1/2 of the WTP 2 and 3 consumers, then the monopolist is indifferent between charging 1 and 2 to Class A (and so can break the indifference in favor of price 1), and charges 2 to Class B. Therefore, it is an equilibrium for all consumers to buy the good, the monopolist to earn uniform price profits P*, and consumer surplus to be maximized. The paper formally proves that this intuition holds for general assumptions about (possibly continuous) consumer valuations.
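As a sanity check, here is a minimal Python sketch of the example’s arithmetic (the numbers are hardcoded from the example above; the choice of price 1 in Class A is the assumed tie-breaking selection):

```python
def revenue(price, masses):
    """Revenue from charging `price` to a segment with type masses {value: mass}."""
    return price * sum(m for v, m in masses.items() if v >= price)

market = {1: 1/3, 2: 1/3, 3: 1/3}
print([revenue(p, market) for p in (1, 2, 3)])  # [1.0, 1.33, 1.0]: uniform price 2, P* = 4/3

A = {1: 1/3, 2: 1/6, 3: 1/6}   # all of type 1, half of types 2 and 3
B = {2: 1/6, 3: 1/6}           # the remaining halves of types 2 and 3
print([revenue(p, A) for p in (1, 2, 3)])       # [0.67, 0.67, 0.5]: indifferent between 1 and 2
print([revenue(p, B) for p in (1, 2, 3)])       # [0.33, 0.67, 0.5]: price 2 is optimal

# Charging 1 in A and 2 in B: profit 2/3 + 2/3 = 4/3 = P*, everyone buys
# (the efficient quantity), and C = 2 - 4/3 = 2/3 = P** - P*.
```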

The other two “corner cases” for bundles of consumer and producer surplus are also easy to construct. Maximal producer surplus P** with consumer surplus 0 is simply the case of perfect price discrimination: the producer knows every consumer’s exact willingness-to-pay. Uniform price producer surplus P* with consumer surplus 0 is constructed by mixing the very low WTP consumers with all of the very high types (along with some subset of consumers with less extreme valuations), such that the monopolist is indifferent between charging the uniform monopoly price and charging the high type price at which everyone below the high type does not buy. Then mix the next highest WTP types with low but not quite as low WTP types, and continue iteratively. A simple argument based on a property of convex sets then delivers every combination of P and C between these corner cases; Rakesh has provided an even more intuitive proof than that given in the paper.

Now how do we use this result in policy? At a first pass, since information is always (weakly) good for the seller and ambiguous for the consumer, a policymaker should be particularly worried about bundlers providing information about willingness-to-pay that is expected to drastically lower consumer surplus while only slightly improving rent extraction by sellers. More work needs to be done in specific cases, but the mathematical setup in this paper provides a very straightforward path for such applied analysis. It seems intuitive that precise information about consumers with willingness-to-pay below the monopoly price is unambiguously good for welfare, whereas information bundles that contain a lot of high WTP consumers but also a relatively large number of lower WTP consumers will lower total quantity sold and hence social surplus.

I am also curious about the limits of price discrimination in the oligopoly case. In general, the ability to price discriminate (even perfectly!) can be very good for consumers under oligopoly. The intuition is that under uniform pricing, I trade off stealing your buyers by lowering prices against earning less from my current buyers; the ability to price discriminate allows me to target your buyers without worrying about the effect on my own current buyers, hence the reaction curves are steeper, hence consumer surplus tends to increase (see Section 7 of Mark Armstrong’s review of the price discrimination literature). With arbitrary third degree price discrimination, however, I imagine mathematics similar to that in the present paper could prove similarly illuminating.

2013 Working Paper (IDEAS version).

“Maximal Revenue with Multiple Goods: Nonmonotonicity and Other Observations,” S. Hart and P. Reny (2013)

One of the great theorems in economics is Myerson’s optimal auction when selling one good to any number of buyers with private valuations of the good. Don’t underrate this theorem just because it is mathematically quite simple: the question it answers is how to sell any object, using any mechanism (an auction, a price, a bargain, a two step process involving eliciting preferences first, and so on). That an idea as straightforward as the revelation principle makes that question possible to answer is truly amazing.

Myerson’s paper was published 32 years ago. Perhaps more amazing than the simplicity of the Myerson auction is how difficult it has been to derive similar rules for selling more than one good at a time, selling goods more than once, or selling goods when the sellers compete. The last two have well known problems that make analysis difficult: dynamic mechanisms suffer from the “ratchet effect” (a buyer won’t reveal information if she knows a seller will use it in subsequent sales), and competing mechanisms can have an IR constraint which is generated as an equilibrium condition. Hart and Reny, in a new note, show some great examples of the difficulty with the first problem, selling multiple goods. In particular, increases in the distribution of private values (in the sense of first order stochastic dominance) can lower the optimal revenue, and randomized mechanisms can raise it.

Consider first increases in the private value distribution. This is strange: if for any state of the world, I value your goods more, it seems reasonable that there is “more surplus to extract” for the seller. And indeed, not only does the Myerson optimal auction with a single good have the property that increases in private values lead to increased seller revenue, but Hart and Reny show that any incentive-compatible mechanism with a single good has this property.

(This is actually very easy to prove, and a property I’d never seen stated for the general IC case, so I’ll show it here. If q(x) is the probability the buyer gets the good given a reported type x, and s(x) is the seller revenue given this report, then incentive compatibility when the true type is x requires that

q(x)x-s(x)>=q(y)x-s(y)

and incentive compatibility when the true type is y requires that

q(y)y-s(y)>=q(x)y-s(x)

Combining these inequalities gives

(q(x)-q(y))x>=s(x)-s(y)>=(q(x)-q(y))y

Hence when x>y, subtracting gives (q(x)-q(y))(x-y)>=0, so q(x)-q(y)>=0; and since valuations are nonnegative, s(x)-s(y)>=(q(x)-q(y))y>=0. Therefore s is nondecreasing in the buyer’s type, and a fosd shift in the buyer valuations increases the expected value of seller revenue s. Nice!)

When selling multiple goods, however, this nice property doesn’t hold. Why not? Imagine the buyer’s values for (one unit of each of) the 2 goods I am trying to sell might be (1,1), (1,2), (2,2) or (2,3). Imagine (2,3) is a very common set of private values. If I knew this buyer’s type, I would extract 5 from him, though if I price the bundle including one of each good at 5, then none of the other types will buy. Also, if I want to extract 5 from type (2,3), then I also can’t sell unit 1 alone for less than 2 or unit 2 alone for less than 3, in which case the only other buyer will be (2,2) buying good 1 alone for price 2. Let’s try lowering the price of the bundle to 4. Now, satisfying (2,3)’s incentive compatibility constraints, we can charge as little as 1 for the first good bought by itself and 2 for the second good bought by itself: at those prices, (2,3) will buy the bundle, (2,2) will buy the first good for 1, and (1,2) will buy the second good for 2. This must look strange to you already: when the buyer’s type goes up from (1,2) to (2,2), the revenue to the seller falls from 2 to 1! And it turns out the prices and allocations described are the optimal mechanism when (2,2) is much less common than the other types. Essentially, across the whole distribution of private values the most revenue can be extracted by selling the second good, so the optimal way to satisfy IC constraints involves making the IC tightest for those whose relative value of the second good is high. Perhaps we ought call the rents (2,2) earns in this example “uncommon preference rents”!
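A quick Python sketch of the menu just described makes the nonmonotonicity concrete (tie-breaking toward the seller-preferred option is my assumption, chosen to match the purchases listed above):

```python
# Menu from the example: the bundle at 4, good 1 alone at 1, good 2 alone at 2.
menu = [((1, 1), 4.0), ((1, 0), 1.0), ((0, 1), 2.0), ((0, 0), 0.0)]

def purchase(v1, v2):
    """Surplus-maximizing menu item; ties broken toward the option paying the seller most."""
    return max(menu, key=lambda item: (v1 * item[0][0] + v2 * item[0][1] - item[1], item[1]))

for v in [(1, 1), (1, 2), (2, 2), (2, 3)]:
    alloc, paid = purchase(*v)
    print(v, "buys", alloc, "paying", paid)
# (1,2) pays 2 while the higher type (2,2) pays only 1:
# seller revenue is nonmonotone in the buyer's type.
```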

Even crazier is that an optimal sale of multiple goods might involve using random mechanisms. It is easy to show that with a single good, a random mechanism (say, if you report your value for the good is 1 dollar, the mechanism assigns you the good with probability .5 for a payment of 50 cents) does no better than a deterministic mechanism. A footnote in the Hart and Reny paper credits Aumann for the idea that this is actually pretty strange: a mechanism is a sequential game where the designer moves first. It is intuitive that being able to randomize would be useful in these types of situations; in a matching pennies game, I would love to be able to play .5 heads and .5 tails when moving first! But the optimal mechanism with a single good does not have this property, for an intuitive reason. Imagine I will sell the good for X. Every type with private value below X does not buy, and those with types V>=X earn V-X in information rents. Offering to sell the good with probability .5 for price X/2 does not induce anybody new to buy, and selling with probability .5 for price Y less than X/2 causes some of the types close to X to switch to buying the lottery, lowering the seller revenue. Indeed, it can be verified that the revenue from the lottery just described is exactly the revenue from a mechanism which offered to sell the good with probability 1 for price 2Y<X.

With multiple goods, however, let high and low type buyers be equally common. Let the high type buyer be indifferent between buying both goods for X, buying the first good only for Y, and buying the second good only for Z (where IC requires that X generate the most seller revenue as long as buyer values are nonnegative). Let the low type value only the first good, at value W less than Y. How can I sell to the low type without violating the high type’s IC constraints? Right now, if the high type buyer has values V1 and V2 for the two goods, the indifference assumption means V1+V2-X=V1-Y=V2-Z. Can I offer a fractional sale (with probability p) of the first good at price Y2 such that pW-Y2>=0, yet pV1-Y2<V1+V2-X=V1-Y? Sure. The low value buyer is just like the low value buyers in the single good case, but the high value buyer dislikes fractional sales because in order to buy the lottery, she is giving up her purchase of the second good. Giving up buying the second good costs her V2 in welfare whether a fraction or the whole of good 1 is sold, but the benefit of deviating is lower with the lottery.
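Here is a hypothetical numeric instance of this argument (all numbers are my own illustrative choices, not from the paper):

```python
# High type values (V1, V2) = (2, 2); menu prices X = 3.5 (bundle),
# Y = Z = 1.5 (each good alone), so the high type is indifferent across
# all three purchases with surplus 0.5. The low type values only good 1, at W = 1.
V1, V2, X, Y, W = 2, 2, 3.5, 1.5, 1
high_surplus = V1 + V2 - X                 # = V1 - Y = 0.5

# A deterministic sale of good 1 at the low type's value W would break IC:
print(V1 - W > high_surplus)               # True: the high type would deviate

# A lottery (probability p = 0.4 of good 1, for price 0.4) does not:
p, Y2 = 0.4, 0.4
print(p * W - Y2 >= 0)                     # True: the low type (just) buys
print(p * V1 - Y2 < high_surplus)          # True: the high type sticks with the bundle
```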

April 2013 working paper (IDEAS version). Update: Sergiu notes there is a new version as of December 21 on his website. (Don’t think the paper is all bad news for deriving general properties of optimal mechanisms. In addition to the results above, Hart and Reny also show a nice result about mechanisms where there are multiple optimal responses by buyers, some good for the seller and some less so. It turns out that whenever you have a mechanism of this type, there is another mechanism that uniquely generates revenue arbitrarily close to the seller-optimal from among those multiple potential buyer actions in the first mechanism.)

“X-Efficiency,” M. Perelman (2011)

Do people still read Leibenstein’s fascinating 1966 article “Allocative Efficiency vs. X-Efficiency”? They certainly did at one time: Perelman notes that in the 1970s, this article was the third-most cited paper in all of the social sciences! Leibenstein essentially made two points. First, as Harberger had previously shown, distortions like monopoly can’t, simply as a matter of mathematics, have large welfare impacts. Take monopoly, for instance. The deadweight loss is simply the change in price times the change in quantity supplied times .5, scaled by the percentage of the economy run by monopolist firms. Under reasonable looking demand curves, those deadweight triangles are rarely going to be even ten percent of the total social welfare created in a given industry. If, say, twenty percent of the final goods economy is run by monopolists, then we only get a two percent change in welfare (and this can be extended to intermediate goods with little empirical change in the final result). Why, then, worry about monopoly?
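To see the Harberger arithmetic, here is a back-of-the-envelope Python sketch (the elasticity and markup numbers are purely illustrative assumptions):

```python
# Harberger's approximation: DWL ~ 0.5 * dP * dQ. With demand elasticity e and
# a proportional markup t over marginal cost, dQ/Q ~ e*t, so deadweight loss
# as a share of industry revenue is roughly 0.5 * e * t**2.
e, t = 1.0, 0.30                    # unit elasticity, 30 percent markup (illustrative)
dwl_share = 0.5 * e * t ** 2        # = 0.045: under 5% of industry revenue
monopoly_share = 0.20               # suppose 20% of the economy is monopolized
print(dwl_share * monopoly_share)   # ~ 0.009: under 1% of total welfare
```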

The reason to worry is Leibenstein’s second point: firms in the same industry often have enormous differences in productivity, and there is tons of empirical evidence that firms do a better job of minimizing costs when under the selection pressures of competition (Schmitz’ 2005 JPE on iron ore producers provides a fantastic demonstration of this). Hence “X-inefficiency”, which Perelman notes is named after Tolstoy’s “X-factor” in the performance of armies from War and Peace, may be important, and not just allocative inefficiency. Draw a simple supply-demand graph and you will immediately see that big “X-inefficiency rectangles” can swamp little Harberger deadweight loss triangles in their welfare implications. So far, so good. These claims, however, turned out to be incredibly controversial.

The problem is that just claiming waste is really a broad attack on a fundamental premise of economics, profit maximization. Stigler, in his well-named X-istence of X-efficiency (gated pdf), argues that we need to be really careful here. Essentially, he is suggesting that information differences, principal-agent contracting problems, and many other factors can explain dispersion in costs, and that we ought focus on those factors before blaming some nebulous concept called waste. And of course he’s correct. But this immediately suggests a shift from traditional price theory to a mechanism design based view of competition, where manager and worker incentives interact with market structure to produce outcomes. I would suggest that this project is still incomplete, that the firm is still too much of a black box in our basic models, and that this leads to a lot of misleading intuition.

For instance, most economists will agree that perfectly price discriminating monopolists have the same welfare impact as perfect competition. But this intuition is solely based on black box firms, without any investigation of how those two market structures affect the incentive for managers to collect costly information about efficiency improvements, the optimal labor contracts under the two scenarios, etc. “Laziness” of workers is an equilibrium outcome of worker contracts, management monitoring, and worker disutility of effort. Just calling that “waste” as Leibenstein does is not terribly effective analysis. It strikes me, though, that Leibenstein is correct when he implicitly suggests that selection in the marketplace is more primitive than profit maximization: I don’t need to know much about how manager and worker incentives work to understand that more competition means inefficient firms are more likely to go out of business. Even in perfect competition, we need to be careful about assuming that selection automatically selects away bad firms: it is not at all obvious that the efficient firms can expand efficiently to steal business from the less efficient, as Chad Syverson has rigorously discussed.

So I’m with Perelman. Yes, Leibenstein’s evidence for X-inefficiency was weak, and yes, he conflates many constraints with pure waste. But on the basic points – that minimized costs depend on the interaction of incentives with market structure instead of simply on technology, and that heterogeneity in measured firm productivity is critical to economic analysis – Leibenstein is far more convincing than his critics. And while Syverson, Bloom, Griffith, van Reenen and many others are opening up the firm empirically to investigate the issues Leibenstein raised, there is still great scope for us theorists to more carefully integrate price theory and mechanism problems.

Final article in JEP 2011 (RePEc IDEAS). As always, a big thumbs up to the JEP for making all of their articles ungated and free to read.

“Incentives for Unaware Agents,” E.L. von Thadden & X. Zhao (2012)

There is a paradox that troubles a lot of applications of mechanism design: complete contracts (or, indeed, conditional contracts of any kind!) appear to be quite rare in the real world. One reason for this may be that agents are simply unaware of what they can do, an idea explored by von Thadden and Zhao in this article as well as by Rubinstein and Glazer in a separate 2012 paper in the JPE. I like the opening example in Rubinstein and Glazer:

“I went to a bar and was told it was full. I asked the bar hostess by what time one should arrive in order to get in. She said by 12 PM and that once the bar is full you can only get in if you are meeting a friend who is already inside. So I lied and said that my friend was already inside. Without having been told, I would not have known which of the possible lies to tell in order to get in.”

The contract itself gave the agent the necessary information. If I don’t specify the rule that patrons whose friend is inside are allowed entry, then only those who are aware of that possibility will ask. Of course, some patrons who I do wish to allow in, because their friend actually is inside, won’t know to ask unless I tell them. If the harm to the bar from previously unaware patrons learning the rule and then lying overwhelms the gain from admitting unaware patrons whose friends really are inside, then the bar is better off not giving an explicit “contract”. Similar problems occur all the time. There are lots of behavioral explanations (recall the famous Israeli daycare which was said to have primed people into an “economic relationship” state of mind by setting a fine for picking kids up late, leading to more lateness, not less). But the bar story above relies on no behavioral assumption aside from agents having a default action (ask about the friend clause if aware of it, don’t ask if unaware), a default which disappears once agents are informed of their real possible actions when given a contract.

When all agents are unaware, the tradeoff is simple, as above: I make everyone aware of their true actions if the cost of providing incentive rents is exceeded by the benefit of agents switching to actions I prefer more. Imagine that agents can not clean, partially clean, or fully clean their tools at the end of the workday (giving some stochastic output of cleanliness). They get no direct utility out of cleaning, and indeed get disutility the more time they spend cleaning. If there is no contract, they default to partially cleaning. If there is a contract, then if all cleaning pays the same the agent will exert zero effort and not clean. The only reason I might offer high-powered incentives, then, is if the benefit of getting agents to fully clean their tools exceeds the IC rents I will have to pay them once the contract is in place.
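A bare-bones numeric version of this tradeoff might look as follows (the values, costs, and rent figure are all my own illustrative assumptions, not from the paper):

```python
value = {"none": 0, "partial": 6, "full": 10}   # firm's value of each cleaning level
cost = {"none": 0, "partial": 1, "full": 3}     # agent's effort disutility per level

# Unaware agent: defaults to partial cleaning; his wage just covers his
# disutility (participation constraint binds).
profit_if_unaware = value["partial"] - cost["partial"]          # = 5

# Aware agent: a flat wage would induce "none"; implementing "full" requires
# covering the disutility plus an incentive (IC) rent, assumed here to be 3.
ic_rent = 3
profit_if_aware = value["full"] - cost["full"] - ic_rent        # = 4

print(profit_if_unaware, profit_if_aware)   # 5 vs 4: better to leave agents unaware
```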

More interesting is the case with aware and unaware agents, when I don’t know which agent is which. The unaware agents get contracts that pay the same wage no matter what their output, and the aware agents can get high-powered incentives. Solving the contracting problem involves a number of technical difficulties (standard envelope theorem arguments won’t work), but the solution is fairly intuitive. Offer two incomplete wage contracts w(x) and v(x). Let v(x) just fully insure: no matter what the output, the wage is the same. Let w(x) increase with better outputs. Choose the full insurance wage v low enough that the unaware agents’ participation constraint just binds. Then offer just enough rents in w(x) that the aware agents, who can take any action they want, actually take the planner preferred action. Unlike in a standard screening problem, I can manipulate this problem by just telling unaware agents about their possible actions: it turns out that profits only increase by making these agents aware if there are sufficiently few unaware agents in the population.

Some interesting sidenotes. Unawareness is “stable” in the sense that unaware agents will never be told they are unaware, and hence if we played this game for two periods, they would remain unaware. It is not optimal for aware agents to make unaware agents aware, since the aware earn information rents as a result of that unawareness. It is not optimal for the planner to make unaware agents aware: the firm is maximizing total profit, announcements strictly decrease wages of aware agents (by taking their information rents), and don’t change unaware agents’ rents (they get zero since their wage is always chosen to make their PC bind, as is usual for “low types” in screening problems). Interesting.

2009 working paper (IDEAS). Final version in REStud 2012. The Rubinstein/Glazer paper takes a slightly different tack. Roughly, it says that contract designers can write a codex of rules, where you are accepted if you satisfy all the rules. An agent made aware of the rules can figure out how to lie if it involves only lying about one rule. A patient, for instance, may want a painkiller prescription. He can lie about any (unverifiable) condition, but he is only smart enough to lie once. The question is, which codices are not manipulable?

“On the Equivalence of Bayesian and Dominant Strategy Implementation,” A. Gershkov et al (2012)

Let’s take a quick break from the job market. Bayesian incentive compatibility is not totally satisfying when trying to implement some mechanism, as each agent must care about the rationality and actions of other agents. Dominant strategy incentive compatibility (DSIC) is much cleaner: no matter what other agents do, the action that makes me best off is just to state my preferences truthfully. For example, the celebrated Vickrey-Clarke-Groves mechanism is dominant strategy incentive compatible. Even if other agents lie or make mistakes, the transfers I pay (or receive) are set so that I have just enough incentive to report my true type.

Many clever mechanisms are not DSIC, however. Consider the Cremer-McLean auction with correlated types. There are a bunch of bidders with correlated private values for an oil field. If the values were independent, in the optimal mechanism, the winning bidder gets an information rent. Consider a first price auction, where all of our values are distributed uniformly on [0,1]. I draw .7 for my value. I won’t actually bid .7, because bidding slightly lower increases my profit when I win, but only slightly decreases my probability of winning. If the auctioneer wants to use a scheme that reveals my true value, he better give me some incentive to reveal; the second-price auction does just that, by making me pay only the second-highest bid if I win.

Cremer-McLean does the following. If I want to enter the auction, I need to make a side bet that gives me zero expected payoff if both I and everyone else report our true types, and a very negative payoff if somebody lies. In particular, charge me an entry fee that depends only on what I think other people’s bids will be. Conditional on everyone else telling the truth, I am perfectly happy as a Bayesian to report truthfully. My willingness to accept the bet reveals, because of correlated private values, something about my own private value. But now the auctioneer knows our true values, and hence can extract the full surplus in the auction without paying any information rents. Many, many people – Milgrom and Wilson, famously – find this mechanism rather unsatisfying in the real world. Certainly, it’s tough to think of any social choice mechanism that uses the Cremer-McLean strategy, or even something similar. This may be because it relies heavily on knife-edge Bayesian reasoning and common knowledge of rationality among the players, much stronger conditions than we need for dominant strategy incentive compatibility.
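Here is a Python sketch of the side-bet construction with two correlated types (the beliefs and stakes are my own illustrative numbers, not from the paper):

```python
# My true type determines my belief about the opponent's report:
# P(opponent reports H | my type). The menu of bets is designed so that each
# bet has zero expected value under the belief of the type it is meant for.
P_opp_H = {"H": 0.7, "L": 0.3}
bets = {"H": {"H": 30, "L": -70},   # bet selected by reporting H
        "L": {"H": -70, "L": 30}}   # bet selected by reporting L

def expected_bet(report, true_type):
    p = P_opp_H[true_type]
    return p * bets[report]["H"] + (1 - p) * bets[report]["L"]

for true_type in ("H", "L"):
    print(true_type, {r: round(expected_bet(r, true_type), 1) for r in ("H", "L")})
# Truthful report: expected value 0. Lying: expected value -40. Scaling the
# stakes up makes any information rent from lying in the auction not worth it.
```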

Manelli and Vincent have a 2010 Econometrica with a phenomenal result: in many simple cases, Bayesian IC gives me nothing that I can’t get from a DSIC mechanism. This makes sense in some ways: consider the equivalence of the Bayesian mechanism of a sealed-bid first price auction and the dominant strategy mechanism of a second price auction. What Gershkov et al do is extend that equivalence to a much broader class of social choice implementation. In particular, take any social choice function with one dimensional independent types and quasi-linear utility for each agent. If there is a Bayesian IC mechanism to implement some social choice, then I can write down allocations and transfers which are DSIC and give the exact same interim (meaning after individuals learn their private types) expected utility for every agent. That is, in any auction with independent private values and linear utility, there is nothing I can do with Bayesian mechanisms that I can’t do with much more plausible mechanisms.

How does this work? Recall that the biggest difference between Bayesian IC and DSIC is that a mechanism is Bayesian IC (on a connected type space) if expected utility from an allocation rule is non-decreasing in my own type, and DSIC if utility from the allocation rule is non-decreasing in my own type no matter what the types of the other agents. Gershkov et al give the following example. I want to give an object to the agent with the highest value, as long as her value is not more than .5 higher than the other agent’s. Both agents’ values are independently drawn from U[0,1]. If the difference between the two is more than .5, I want to allocate the good to no one. Just giving an agent the good with probability 1 if his type is higher than the other agent’s report and lower than the other agent’s report plus .5 is Bayesian incentive compatible (the interim expected allocation probability is nondecreasing in my type, so there must exist transfer payments that implement), but not DSIC: if the other agent reports his type minus .1, then I want to shade my own report by an equal amount. However, consider just giving an agent the good with probability equal to the minimum of his report and .5. If my type is .7, then I get the object with probability .5. This is exactly the interim probability I would get the object in the Bayesian mechanism. Further, the allocation probability is increasing in my own type no matter what the other agent’s type, so there must exist transfer payments that implement in dominant strategies. The general proof relies on extending a mathematical result from the early 1990s: if a bounded, non-negative function of several variables generates monotone, one-dimensional marginals, then there must exist a non-negative function with the same bound, and the same marginals, that is monotone in each coordinate. The first function looks a lot like the condition on allocation rules for Bayesian IC, and the second the condition on allocation rules for DSIC…
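A quick Monte Carlo sketch (my own check, not from the paper) confirms that the two rules give identical interim allocation probabilities:

```python
import random

random.seed(0)
draws = [random.random() for _ in range(200_000)]   # opponent types, U[0,1]

for x in (0.2, 0.5, 0.7, 0.9):
    # Bayesian rule: I get the good when y < x < y + 0.5
    bayes_prob = sum(1 for y in draws if y < x < y + 0.5) / len(draws)
    dsic_prob = min(x, 0.5)                         # dominant strategy rule
    print(x, round(bayes_prob, 3), dsic_prob)
# The simulated interim probabilities match min(x, .5) for every type x.
```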

Final Econometrica preprint (IDEAS version)

“Learning About the Future and Dynamic Efficiency,” A. Gershkov & B. Moldovanu (2009)

How am I to set a price when buyers arrive over time and I have a good that will expire, such as a baseball ticket or an airplane seat? “Yield management” pricing is widespread in industries like these, but the standard methods tend to assume nonstrategic agents. Yet a lack of myopia can sometimes be very profitable for a buyer. Consider a home sale. Buyers arrive slowly, and the seller doesn’t know the distribution of potential buyer values. It’s possible that if I report a high value when I arrive first, the seller will Bayesian update about the future and will not sell me the house, since they believe that other buyers also value the house highly. If I report a low value, however, I may get the house.

Consider the following numerical example from Gershkov and Moldovanu. There are two agents, one arriving now and one arriving tomorrow. The seller doesn’t know whether the agent values are IID in [0,1] or IID in [1,2], but puts 50 percent weight on each possibility. With complete information about the distribution, the dynamically efficient thing to do would be to sell to the first agent if she reports a value in [.5,1]U[1.5,2]. With incomplete information, however, there is no transfer that can simultaneously get the first agent to tell the truth when her value is in [.5,1] and tell the truth when her value is in [1,1.5]. By the revelation principle, then, there can be no dynamically efficient pricing mechanism.
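A tiny Python sketch of why no such transfer exists: write out what the efficient rule demands of the allocation as a function of the first report alone.

```python
# Efficient rule if the seller knew the distribution: sell to the first agent
# iff her value exceeds the expected value of tomorrow's agent.
def sell_to_first(v, low, high):
    return v >= (low + high) / 2

# A report below 1 reveals U[0,1]; a report above 1 reveals U[1,2]. So the
# efficient allocation as a function of the report alone is:
print(sell_to_first(0.8, 0, 1))   # True: a report of 0.8 gets the good
print(sell_to_first(1.2, 1, 2))   # False: the higher report 1.2 does not

# The allocation is non-monotone in the report, and (by the same single-good
# IC argument as in the Hart-Reny discussion above) a non-monotone allocation
# cannot be implemented truthfully with any transfers.
```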

Consider a more general problem, with N goods with qualities q1,q2..qN, and one buyer arriving each period. The buyer has a value x(i) drawn from a distribution F, and he gets utility x(i)*q(j) if he receives good j. Incomplete information by itself turns out not to be a major problem, as long as the seller knows the distribution: just find the optimal history-dependent cutoffs using a well-known result from Operations Research, then choose VCG style payments to ensure each agent reports truthfully. If the distribution from which buyer values are drawn is unknown, as in the example above, then the seller learns what the optimal cutoffs should be from the buyers’ reports. Unsurprisingly, we will need something like the following: since cutoffs depend on my report, implementation requires that the cutoffs move less than one-for-one with my reported type (a derivative less than one). If the derivative is less than one, then the multiplicative nature of buyer utilities means that there will be no incentive to lie about your valuation in order to alter the seller’s beliefs about the buyer value distribution.

http://www.econ2.uni-bonn.de/moldovanu/pdf/learning-about-the-future-and-dynamic-efficiency.pdf (IDEAS version).  Final version published in the September 2009 AER. I previously wrote about a followup by the same authors for the case where the seller does not observe the arrival time of potential buyers, in addition to not knowing the buyer’s values.  

“Decentralization, Hierarchies and Incentives: A Mechanism Design Perspective,” D. Mookherjee (2006)

Lerner, Hayek, Lange and many others in the middle of the 20th century wrote exhaustively about the possibility for centralized systems like communism to perform better than decentralized systems like capitalism. The basic tradeoff is straightforward: in a centralized system, we can account for distributional concerns, negative externalities, etc., while a decentralized system can more effectively use local information. This type of abstract discussion about ideal worlds actually has great applications even to the noncommunist world: we often have to decide between centralization or decentralization within the firm, or within the set of regulators. I am continually amazed by how often the important Hayekian argument is misunderstood. The benefit of capitalism can’t have much to do with profit incentives per se, since (almost) every employee of a modern firm is not an owner, and hence is incentivized to work hard only by her labor contract. A government agency could conceivably use precisely the same set of contracts and get precisely the same outcome as the private firm (the principal-agent problem is identical in the two cases). The big difference is thus not profit incentive but the use of dispersed information.

Mookherjee, in a recent JEL survey, considers decentralization from the perspective of mechanism design. What is interesting here is that, if the revelation principle applies, there is no reason to use any decentralized decisionmaking system over a centralized one where the boss tells everyone exactly what they should do. That is, any contract where I could subcontract to A who then subsubcontracts to B is weakly dominated by a contract where I get both A and B to truthfully reveal their types and then contract with each myself. The same logic applies, for example, to whether a firm should have middle management or not. This suggests that if we want to explain decentralization in firms, we have only two roads to go down: first, show conditions where decentralization is equally good to centralization, or second, investigate cases where the revelation principle does not apply. In the context of recent discussions on this site of what “good theory” is, I would suggest that this is a great example of a totally nonpredictive theorem (revelation) being quite useful (in narrowing down potential explanations of decentralization) to a specific set of users (applied economic theorists).

(I am assuming most readers of a site like this are familiar with the revelation principle, but if not, it is just a couple lines of math to prove. Assume agents have information or types a in a set A. If I write them a contract F, they will tell me their type is G(a)=a’, where G is just a function that, for all a in A, chooses the report a’ maximizing u(F(a’)), where u is the utility the agent gets from the contract F by reporting a’. The contract given to an agent of type a, then, leads to outcome F(G(a)). If this contract exists, then just let H be the composition F(G(.)). H is now a “truthful” contract, since it is in each agent’s interest just to reveal their true type. That is, the revelation principle guarantees that any outcome from a mechanism, no matter how complicated or involving how many side payments or whatever, can be replicated by a contract where each agent just states what they know truthfully to the principal.)
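A toy version of this construction in Python (the types, outcomes, and utilities are arbitrary labels of my own, just to make the composition concrete):

```python
types = ["a1", "a2", "a3"]
reports = ["r1", "r2", "r3"]
F = {"r1": 10, "r2": 20, "r3": 30}        # outcome assigned to each report
ideal = {"a1": 30, "a2": 12, "a3": 19}    # each type's favorite outcome
u = lambda t, outcome: -abs(outcome - ideal[t])

# G maps each type to its optimal report under F; H is the composed contract F(G(.)).
G = {t: max(reports, key=lambda r: u(t, F[r])) for t in types}
H = {t: F[G[t]] for t in types}

# Truthfulness check: no type prefers to announce another type under H.
print(all(u(t, H[t]) >= u(t, H[s]) for t in types for s in types))   # True
```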

First, when can we do just as well with decentralization as with centralization, even when the revelation principle applies? Consider choosing whether to (case 1) hire A, who also subcontracts some work to B, or (case 2) just hire both A and B directly. If A is the only one who knows B’s production costs, then A will need to get informational rents in case 1 unless A and B produce perfectly complementary goods: without such rents, A has an incentive to produce a larger share of production by reporting that B is a high cost producer. Indeed, A is essentially “extracting” information rents both from B and from the principal by virtue of holding information that the principal cannot access. A number of papers have shown that this problem can be eliminated if A is risk-neutral and not protected by limited liability (so I can tax away ex-ante information rents), contracting is top-down (I contract with A before she learns B’s costs), and A’s production quantity is known (so I can optimally subsidize or tax this production).

More interesting is to consider when revelation fails. Mookherjee notes that the proof of the revelation principle requires 1) noncollusion among agents, 2) absence of communication costs, information processing costs, or contract complexity costs, and 3) no possibility of ex-post contract renegotiation by the principal. I note here that both the present paper and the hierarchy literature in general tend to shy away from ongoing relationships, but these are obviously relevant in many cases, and we know that in dynamic mechanism design, the revelation principle will not hold. The restricted message space literature is still rather limited, mainly because mechanism design theory at this point does not give any simple results like the revelation principle when the message space is restricted. It’s impossible to go over every result Mookherjee describes – this is a survey paper after all – but here is a brief summary. Limited message spaces are not a panacea, since the restrictions required for limited message spaces to motivate decentralization, and particularly middle management, are quite strong. Collusion among agents does offer some promise, though. Imagine A and B are next to each other on an assembly line, and B can see A’s effort. The principal just sees whether the joint production is successful or not. For a large number of parameters, Baliga and Sjostrom (1998) proved that delegation is optimal: for example, pay B a wage conditional on output, and let him and A negotiate on the side how to divvy up that payment.

Much more work on the design of organizations is needed, that is for sure.

http://people.bu.edu/dilipm/publications/jeldecsurvrev.pdf (Final working paper – published in June 2006 JEL)

“Dynamic Costs and Moral Hazard,” G. Arie (2011)

(A quick note: it’s almost job market time here in the world of economics. Today’s paper is a job market paper by Guy Arie, a theorist here in MEDS who I saw present a really interesting model the other day. If your department is looking for a micro theorist this year, definitely give a skim through the Northwestern candidates’ papers; we have 8-10 students in theory, and the job market talks I’ve been to have been very nice overall. OK, on to Arie’s paper…)

All but the simplest dynamic mechanism design problems prove difficult, particularly because the intuition of the revelation principle fails. In a dynamic problem, I can not only lie about my type, but I can also lie about my type conditional on the history generated by the mechanism. A particular worry among agents is that if, in a truthful mechanism, the principal can elicit my type in period 1, he will then use that information to punish me in future periods (so-called “ratcheting”). The set of constraints in dynamic mechanism design, then, is much larger: I am going to need to worry about history-dependent deviations, and I am going to need to be sure that ratcheting is not a concern.

Arie studies what looks like a straightforward contracting problem. Imagine a risk-neutral salesman selling over a year (a technical note: the salesman has limited liability, so I can’t just sell him the firm right away). He has N potential buyers he can try to sell to. If he tries to sell to one in a given period, a sale is made with probability p. Under any sensible contract, the salesman will try to sell to the easiest clients first, so that the effort needed to make a sales attempt is increasing in the number of sales attempts made previously. The salesman’s boss can only observe the number of sales made, but does not know how many sales have been attempted. A successful sale is worth V to the boss.

Now hopefully you see the problem: if I knew how many sales had been attempted, I would pay the salesman just enough to cover his cost of effort at any given time, and I would ask him to keep making sales until the wage payment, adjusted for the probability of success, was higher than the value of the additional sale. The salesman will make exactly zero profits after accounting for his cost of effort. But when the boss does not know how many sales are attempted, the salesman has a nice deviation: just sit on his laurels in period 1, and only then start trying to make sales. The payment for each successful sale is now higher than the cost to the salesman of attempting the sale, because the salesman has “delayed” making the easy sales until later in the year when the “bonus” is higher. Worse, if I try to condition payments on past sales, a salesman who is unlucky with his first few “easy” sale attempts will just stop working altogether, because he will be getting paid for “easy” sales but will in fact be trying to make relatively hard sales. Evidently these problems actually come up empirically.
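A stylized two-period version (my own illustrative numbers, not Arie’s model) shows how profitable the delay deviation is:

```python
# Each attempt succeeds with probability p; the easy (first) client costs the
# salesman c1 in effort, the next client costs c2 > c1. A naive boss pays a
# per-success bonus of cost/p in each period, which exactly covers effort
# if the salesman works in both periods.
p, c1, c2 = 0.5, 1.0, 2.0
b1, b2 = c1 / p, c2 / p                             # bonuses of 2 and 4

work_both_periods = (p * b1 - c1) + (p * b2 - c2)   # intended plan: surplus 0
delay_then_work = 0 + (p * b2 - c1)                 # shirk, then pitch the easy client
print(work_both_periods, delay_then_work)           # 0.0 vs 1.0: delaying pays
```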

So what can the boss do? The standard dynamic moral hazard solution without increasing costs but with limited liability would be something like the following: have the salesman try to sell, giving him “credit” for each sale. Once he has enough credit, give him the firm (meaning give him the profits from each future sale). The promise of this big bonus in the future is incentive enough for the employee to work hard now. If he gets unlucky and does not make early sales, fire him, because no matter how lucky he gets in the future, he’ll never make back enough credit to get the firm.

With increasing costs, things are not so simple. Arie manages to write the problem as a linear program – binary effort and risk neutrality are quite important here – and then notes that it is not obvious that we can apply a one-stage deviation principle; indeed, a simple example shows that checking one-stage deviations is not equivalent to checking incentive compatibility here. But the problem, like all linear programs (as well as many other mathematical programs!), can be transformed into a dual. The dual has nice properties. Essentially, the choice variables of the dual will be the shadow prices on the ex-ante probability I will ask the salesman to try to sell after a given public history. If that shadow price is strictly positive, then asking the salesman to try harder after a given history increases the principal’s profits, even accounting for the dynamic effects such a request has on incentives to shirk in the past. The dual formulation offers a number of other nice interpretations which make the potential solution easier to see – the usefulness of shadow prices should come as no surprise to economists, given that it has been the critical element in results going back at least to Ramsey’s taxation paper and Hotelling’s resource extraction model, both in the early part of the 20th century.

So what is the optimal contract? Don’t pay the agent anything except “credit”. When this credit gets too low, fire the agent. When credit is sufficiently high, pay the agent a fixed rate per success over the next N periods, where the pay each period is just high enough to incentivize the agent to exert effort N periods in the future. Essentially, the agent is paid a very high piece rate at the end of the contract. Until that point, successful sales are rewarded only by conditions making it easier for the agent to begin getting his bonuses; e.g., “if you make a sale today, then after two more consecutive sales, I’ll put you on the bonus schedule for a month, but if you don’t make a sale today, you’ll need four more sales, and will only get the bonuses for two weeks”. And why do I tell the employee only to work for N more periods rather than just give him the firm and let him work until the moment it is too costly to exert any more effort? Essentially, I am destroying the “value” of the future firm after every failed sale in order to keep the agent from shirking; by reducing the future value of the firm after a failure, I can still give the agent who was unlucky some reason to keep working, but in such a way that the destroyed future value of the firm makes shirking unprofitable.

http://www.kellogg.northwestern.edu/faculty/arie/GuyArie_DynamicCosts.pdf (Job market working paper)

“Collaborating,” A. Bonatti & J. Horner (2011)

(Apologies for the long delay since the last post. I’ve been in that tiniest of Southeast Asian backwaters, East Timor, talking to UN and NGO folks about how the new democracy is coming along. The old rule of thumb is that you need 25 years of free and fair elections before society consolidates a democracy, but we still have a lot to learn about how that process takes place. I have some theoretical ideas about how to avoid cozy/corrupt links between government ministers and the private sector in these unconsolidated democracies, and I wanted to get some anecdotes which might guide that theory. And in case you’re wondering: I would give pretty high odds that, for a variety of reasons, the Timorese economy is going absolutely nowhere fast. Now back to the usual new research summaries…)

Teamwork is essential, you’re told from kindergarten on. But teamwork presents a massive moral hazard problem: how do I make sure the other guy does his share? In the static setting, Alchian-Demsetz (1972) and a series of papers by Holmstrom (May He Win His Deserved Nobel) long ago discussed why people will free ride when their effort is hidden, and what contracts can be written to avoid this problem. Bonatti and Horner make the problem dynamic, and with a few pretty standard tricks from optimal control develop some truly counterintuitive results.

The problem is the following. N agents are engaged in working on a project which is “good” with probability p. Agents exert costly effort continuously over time. Depending on the effort exerted by agents at any given time, a breakthrough occurs with some probability if the project is good, but never occurs if the project is bad. Over time, given effort along the equilibrium path, agents become more and more pessimistic about the project being good if no breakthrough occurs. The future is discounted. Agents only observe their own effort choice (but have correct beliefs about the effort of others in equilibrium). This means that off-path, beliefs of effort exertion are not common knowledge: if I deviate and work harder now, and no breakthrough occurs, then I am more pessimistic than others about the goodness of the project since I know, and they don’t, that a higher level of effort was put in.
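The belief dynamics can be sketched in Python with an assumed exponential breakthrough technology (the functional form is my assumption, a standard one in this literature):

```python
import math

def posterior_good(prior, cumulative_effort, lam=1.0):
    """P(project is good | no breakthrough yet), with breakthrough hazard
    proportional to effort when the project is good."""
    no_breakthrough_if_good = math.exp(-lam * cumulative_effort)
    return prior * no_breakthrough_if_good / (
        prior * no_breakthrough_if_good + (1 - prior))

for effort in (0.0, 0.5, 1.0, 2.0):
    print(effort, round(posterior_good(0.5, effort), 3))
# 0.5, 0.378, 0.269, 0.119: the more I (believe I) have worked without news,
# the more pessimistic I become, which is the source of the off-path asymmetry.
```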

In this setting, not only do agents shirk (hoping the other agents will pick up the slack), but they also procrastinate. Imagine a two-period world: I can shift some effort to period 2, in the hope that the other agent’s period 1 effort will lead to a success. I don’t want to work extremely hard in period 1 when all that this leads to is wasted effort because my teammate has already solved the problem in that period. Note that this procrastination motive vanishes when the team is of size 1: you need a coauthor to justify your slacking! Better monitoring here does not help, surprisingly. If I can see how much effort my teammate puts in each period, then what happens? If I decrease my period 1 effort, and this is observable by both agents, then my teammate will not be so pessimistic about the success of the project in period 2. Hence, she will work harder in period 2. Hence, each agent has an incentive to work less in period 1 vis-a-vis the hidden action case. (Of course, you may wonder why this is an equilibrium; that is, why doesn’t the teammate play grim trigger and punish me for shirking? It turns out there are a number of reasonable equilibria in the case with observable actions, some of which give higher welfare and some of which give lower welfare than under hidden action. The point is just that allowing observability doesn’t necessarily help things.)

So what have we learned? Three things in particular. First, work in teams gives extra incentive to procrastinate compared to solo work. Second, this means that setting binding deadlines can be welfare improving; the authors further show that the larger the team, the tighter the deadline necessary. Third, letting teams observe how hard the other is working is not necessarily optimal. Surely observability by a principal would be welfare-enhancing – the contract could be designed to look like dynamic Holmstrom – but observability between the agents is not necessarily so. Interesting stuff.

http://cowles.econ.yale.edu/P/cd/d16b/d1695.pdf (Final Cowles Foundation WP – paper published in April 2011 AER)

“Contracting with Repeated Moral Hazard and Private Evaluations,” W. Fuchs (2007)

Firms often want to evaluate employees subjectively or using private information – feedback from an employee’s clients, for instance – not available to the agent. Solving repeated games with private monitoring and no verification is difficult. Using some clever mathematics, William Fuchs merges the results of MacLeod (2003), where in a one-shot game firms must burn money sometimes if they are to incentivize workers, and Levin (2003), where optimal infinite reputational contracts are considered. Fuchs is different from MacLeod in that he considers a finite repeated game, rather than a one-shot game, and different from Levin in that he shows that the “full review” property Levin uses to solve for a pseudo-optimal contract is actually restrictive: a firm can do better by bundling reviews and termination periods such that reviews are only held every T periods.

Under risk neutrality, the usual tradeoff will guide any solution: firms need to punish workers for bad realizations of output, but firms must not have an incentive to lie to the employee, since otherwise they will report output is low when it is actually high, and the proposed contract will not be an equilibrium.

In either the finite or the infinite period case, the intuition above suggests that after any realization, the agent’s continuation value must be higher if the output was higher (to incentivize him to work) and the principal’s continuation value must be the same no matter the output (to incentivize her not to lie). In the finite case, this requires money to be burned at some point if the agent is going to be incentivized to always put in effort, since sometimes that effort will result in low output, and the principal can’t earn surplus by reporting low output: rather, she just has to burn the money. A footnote in this paper notes that this type of money burning actually does sometimes occur: in professional baseball, when players are fined by their teams, the team gives the money to charity rather than keeping it.

It is also straightforward to show that for any finite-period relational contract in the Fuchs setting, there is a payoff-equivalent contract that just pays efficiency wages to the agent each period (i.e., pays the agent his expected production given full effort) until he is fired. No bonuses are necessary. Essentially, the principal’s full value from the original contract in period 0 is paid to her by the agent; the relational contract thereafter makes the principal indifferent between firing and not firing the agent in every period. That is, the principal has no incentive problem. Let the agent collect the remaining surplus in every period. The agent will not want to quit because he collects all of the surplus after period 0. Making the agent work with full effort until termination just requires setting the termination date such that the appropriate amount of money burning occurs.

The previous results lead immediately to the following link between unlimited money-burning in dynamic games and equilibrium results in static games: if I can burn as much value in the last period as I want, and I can also just pay the agent (accounting for discounting) his wage in the final period, then in this equilibrium there is no need to give the agent updates about how he is doing (since I expect full effort in every period anyway), and the whole problem just collapses to a static game.

In the infinite-period game, the finite results suggest that as the length of the game goes to infinity, an optimal contract burns an arbitrarily large amount of money arbitrarily far in the future. This isn’t satisfying; for one, we only release information to the agent “at the end”, which is infinitely far away. Fuchs instead endogenizes money burning by capping the amount of money burned at the total surplus of the game. He then extends Levin by considering T-period review contracts, where the principal reveals her evaluation of the agent every T periods, rather than every period as in Levin. The results above, that termination contracts can be found which are payoff-equivalent to contracts involving complicated wage and bonus schemes, still hold, so let a T-period review contract fire the agent with probability B if performance is “unsatisfactory” after T periods. If the employee passes evaluation, a new T-period review starts with a clean slate. Linking as many periods as possible together is optimal because the amount of money the principal needs to burn in each evaluation period is independent of the length of the period; the intuition here is that if a pledge to burn money far in the future elicits effort from the agent today, then that same pledge provides even stronger incentives in later periods, since the date at which money would be burned is no longer so far away.

There are two simple notes at the end of the paper on how to avoid money burning. If there are two agents, as in Lazear and Rosen tournaments, the principal can credibly commit to just pay whichever agent produces higher output, so no burning is necessary. Alternatively, the principal can hire a manager, pay her a fixed wage, and have the manager report publicly whether output was good or not; since the manager’s wage is independent of the report, there are no longer any incentive problems.

Even with these nice results, the general problem of optimal relational contracting is still open; dynamic mechanisms with imperfect monitoring are hard.

ftp://ftp.cemfi.es/pdf/papers/wshop/paper%20W.Fuchs.pdf (Working paper – Final version published in AER 2007)
