
The 2018 Fields Medal and its Surprising Connection to Economics!

The Fields Medal and Nevanlinna Prizes were given out today. They represent the highest honor possible for young mathematicians and theoretical computer scientists, and are granted only once every four years. The mathematics involved is often very challenging for outsiders. Indeed, the most prominent of this year’s winners, the German Peter Scholze, is best known for his work on “perfectoid spaces”, and I honestly have no idea how to begin explaining them aside from saying that they are useful in a number of problems in algebraic geometry (the lovely field mapping results in algebra – what numbers solve y=2x – and geometry – noting that those solutions to y=2x form a line). Two of this year’s prizes, however, the Fields given to Alessio Figalli and the Nevanlinna to Constantinos Daskalakis, have a very tight connection to an utterly core question in economics. Indeed, both of those men have published work in economics journals!

The problem of interest concerns how best to sell an object. If you are a monopolist hoping to sell one item to one consumer, where the consumer’s valuation of the object is only known to the consumer but commonly known to come from a distribution F, the mechanism that maximizes revenue is of course the Myerson auction from his 1981 paper in Math OR. The solution is simple: make a take it or leave it offer at a minimum price (or “reserve price”) which is a simple function of F. If you are selling one good and there are many buyers, then revenue is maximized by running a second-price auction with the exact same reserve price. In both cases, no potential buyer has any incentive to lie about their true valuation (the auction is “dominant strategy incentive compatible”). And further, seller revenue and expected payments for all players are identical to the Myerson auction in any other mechanism which allocates goods the same way in expectation, with minor caveats. This result is called “revenue equivalence”.
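
To make "a simple function of F" concrete: under the usual regularity assumption (the "virtual value" v - (1 - F(v))/f(v) is increasing in v), and with a seller who values the object at zero, the reserve price is the point where the virtual value crosses zero. The sketch below finds that point by bisection. The function names and the two example distributions are my own illustrative choices, not anything taken from Myerson's paper.

```python
import math

def virtual_value(v, cdf, pdf):
    """Myerson virtual value: v - (1 - F(v)) / f(v)."""
    return v - (1.0 - cdf(v)) / pdf(v)

def reserve_price(cdf, pdf, lo, hi, tol=1e-10):
    """Bisection for the zero of the virtual value on [lo, hi].

    Assumes a regular distribution (increasing virtual value) and a
    seller who values the object at zero.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if virtual_value(mid, cdf, pdf) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Uniform on [0, 1]: F(v) = v, f(v) = 1
uniform_reserve = reserve_price(lambda v: v, lambda v: 1.0, 0.0, 1.0)

# Exponential with mean 1: F(v) = 1 - exp(-v), f(v) = exp(-v)
exp_reserve = reserve_price(
    lambda v: 1.0 - math.exp(-v), lambda v: math.exp(-v), 0.0, 10.0
)

print(round(uniform_reserve, 6), round(exp_reserve, 6))
```

For the uniform case the virtual value is 2v - 1, so the reserve is one half; for the exponential case it is v - 1, so the reserve equals the mean. In both examples the reserve depends only on F and not on the number of bidders, which is the sense in which the one-buyer and many-buyer problems share "the exact same reserve price".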

The Myerson paper is an absolute blockbuster. The revelation principle, the revenue equivalence theorem, and a solution to the optimal selling mechanism problem all in the same paper? I would argue it’s the most important result in economics since Arrow-Debreu-McKenzie, with the caveat that many of these ideas were “in the air” in the 1970s with the early ideas of mechanism design and Bayesian game theory. The Myerson result is also really worrying if you are concerned with general economic efficiency. Note that the reserve price means that the seller is best off sometimes not selling the good to anyone, in case all potential buyers have private values below the reserve price. But this is economically inefficient! We know that there exists an allocation mechanism which is socially efficient even when people have private information about their willingness to pay: the Vickrey-Clarke-Groves mechanism. This means that market power plus asymmetric information necessarily destroys social surplus. You may be thinking we know this already: an optimal monopoly price in classic price theory generates deadweight loss. But recall that a perfectly-price-discriminating monopolist sells to everyone whose willingness-to-pay exceeds the seller’s marginal cost of production, hence the only reason monopoly generates deadweight loss in a world with perfect information is that we constrain them to a “mechanism” called a fixed price. Myerson’s result is much worse: letting a monopolist use any mechanism, and price discriminate however they like, asymmetric information necessarily destroys surplus!

Despite this great result, there remain two enormous open problems. First, how should we sell a good when we will interact with the same buyer(s) in the future? Recall the Myerson auction involves bidders truthfully revealing their willingness to pay. Imagine that tomorrow, the seller will sell the same object. Will I reveal my willingness to pay truthfully today? Of course not! If I did, tomorrow the seller would charge the bidder with the highest willingness-to-pay exactly that amount. Ergo, today bidders will shade down their bids. This is called the “ratchet effect”, and despite a lot of progress in dynamic mechanism design, we have still not fully solved for the optimal dynamic mechanism in all cases.

The other challenging problem is one seller selling many goods, where willingness to pay for one good is related to willingness to pay for the others. Consider, for example, selling cable TV. Do you bundle the channels together? Do you offer a menu of possible bundles? This problem is often called “multidimensional screening”, because you are attempting to “screen” buyers such that those with high willingness to pay for a particular good actually pay a high price for that good. The optimal multidimensional screen is a devil of a problem. And it is here that we return to the Fields and Nevanlinna prizes, because they turn out to speak precisely to this problem!

What could possibly be the connection between high-level pure math and this particular pricing problem? The answer comes from the 18th century mathematician Gaspard Monge, founder of the Ecole Polytechnique. He asked the following question: what is the cheapest way to move mass from X to Y, such as moving apples from a bunch of distribution centers to a bunch of supermarkets? It turns out that without convexity or linearity assumptions, this problem is very hard, and it was not solved until the late 20th century. Leonid Kantorovich, the 1975 Nobel winner in economics, paved the way for this result by showing that there is a “dual” problem where instead of looking for the map from X to Y, you look for the probability that a given mass in Y comes from X. This dual turns out to be useful in that there exists an object called a “potential” which helps characterize the optimal transport problem solution in a much more tractable way than searching across any possible map.
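
A toy version of Monge's question can be written down directly. The sketch below assumes equal unit masses sitting at points on a line and a cost of |x - y| per unit moved, then searches over every one-to-one map from sources to targets. The locations are made-up numbers and the function name is hypothetical.

```python
from itertools import permutations

def cheapest_assignment(sources, targets):
    """Brute-force Monge problem for equal unit masses on a line.

    Tries every one-to-one map from sources to targets and returns
    (total_cost, mapping), with cost |x - y| per unit moved.
    """
    best_cost, best_perm = None, None
    for perm in permutations(range(len(targets))):
        cost = sum(abs(x - targets[j]) for x, j in zip(sources, perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    mapping = [(sources[i], targets[j]) for i, j in enumerate(best_perm)]
    return best_cost, mapping

centers = [0.0, 2.0, 5.0]   # hypothetical distribution centers
markets = [6.0, 1.0, 3.0]   # hypothetical supermarkets

cost, plan = cheapest_assignment(centers, markets)
print(cost, plan)
```

With three points this is only six candidate maps, but the count grows as n!, and with continuous mass there is no list of maps to enumerate at all. That is the practical reason a characterization via a potential is so much more tractable than "searching across any possible map".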

Note the link between this problem and our optimal auction problem above, though! Instead of moving mass most cheaply from X to Y, we are looking to maximize revenue by assigning objects Y to people with willingness-to-pay drawn from X. So no surprise, the solution to the optimal transport problem when X has a particular structure and the solution to the revenue maximizing mechanism problem are tightly linked. And luckily for us economists, many of the world’s best mathematicians, including 2010 Fields winner Cedric Villani, and this year’s winner Alessio Figalli, have spent a great deal of effort working on exactly this problem. Ivar Ekeland has a nice series of notes explaining the link between the two problems in more detail.

In a 2017 Econometrica, this year’s Nevanlinna winner Daskalakis and his coauthors Alan Deckelbaum and Christos Tzamos show precisely how to use strong duality in the optimal transport problem to solve the general optimal mechanism problem when selling multiple goods. The paper is very challenging, requiring some knowledge of measure theory, duality theory, and convex analysis. That said, the conditions they give to check an optimal solution, and the method to find the optimal solution, involve a reasonably straightforward series of inequalities. In particular, the optimal mechanism involves dividing the hypercube of potential types into (perhaps infinitely many) regions, with all types in a region assigned the same prices and goods (for example, “you get good A and good B together with probability p at price X”, or “if you are unwilling to pay p1 for A, p2 for B, or p for both together, you get nothing”).

This optimal mechanism has some unusual properties. Remember that the Myerson auction for one buyer is “simple”: make a take it or leave it offer at the reserve price. You may think that if you are selling many items to one buyer, you would likewise choose a reserve price for the whole bundle, particularly when the number of goods with independently distributed values becomes large. For instance, if there are 1000 cable channels, and a buyer has value distributed uniformly between 0 and 10 cents for each channel, then by a limit theorem type argument it’s clear that the willingness to pay for the whole bundle is quite close to 50 bucks. So you may think, just price at a bit lower than 50. However, Daskalakis et al show that when there are sufficiently many goods with i.i.d. uniformly-distributed values, it is never optimal to just set a price for the whole bundle! It is also possible to show that the best mechanism often involves randomization, where buyers who report that they are willing to pay X for item a and Y for item b will only get the items with probability less than 1 at a specified price. This is quite contrary to my intuition, which is that in most mechanism problems, we can restrict focus to deterministic assignment. It was well-known that multidimensional screening has weird properties; for example, Hart and Reny show that an increase in buyer valuations can cause seller revenue from the optimal mechanism to fall. The techniques Daskalakis and coauthors develop allow us to state exactly what we ought do in situations where the answer was previously unknown in the literature, such as when we know we need mechanisms more complicated than “sell the whole bundle at price p”.
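
The limit-theorem intuition in the cable example is easy to check numerically. The sketch below uses the numbers from the text (1000 channels, values uniform between 0 and 10 cents) and computes the exact mean and standard deviation of the bundle value, then simulates buyers facing a single bundle price. The price of three standard deviations below the mean is an arbitrary choice of mine, as are the seed and sample size.

```python
import math
import random

N_CHANNELS = 1000
LOW, HIGH = 0.0, 10.0   # cents per channel, uniform as in the text

# Exact moments of a sum of independent uniform draws
mean_cents = N_CHANNELS * (LOW + HIGH) / 2
sd_cents = math.sqrt(N_CHANNELS * (HIGH - LOW) ** 2 / 12)

def simulate_bundle_values(n_buyers, seed=0):
    """Draw each buyer's total value for all channels, in cents."""
    rng = random.Random(seed)
    return [
        sum(rng.uniform(LOW, HIGH) for _ in range(N_CHANNELS))
        for _ in range(n_buyers)
    ]

values = simulate_bundle_values(1000)

# Hypothetical bundle price: three standard deviations below the mean
price_cents = mean_cents - 3 * sd_cents
share_buying = sum(v >= price_cents for v in values) / len(values)

print(f"mean ${mean_cents / 100:.2f}, sd ${sd_cents / 100:.2f}")
print(f"price ${price_cents / 100:.2f}, share buying {share_buying:.1%}")
```

The bundle is worth $50 on average with a standard deviation of roughly 91 cents, so a price a few dollars below 50 sells to nearly everyone and captures most of the surplus. That is why pure bundling looks so tempting. The calculation says nothing about whether it is optimal, which is exactly where the Daskalakis et al result bites.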

The history of economics has been a long series of taking tools from the frontier of mathematics, from the physics-based analogues of the “marginalists” in the 1870s, to the fixed point theorems of the early game theorists, the linear programming tricks used to analyze competitive equilibrium in the 1950s, and the tropical geometry recently introduced to auction theory by Elizabeth Baldwin and Paul Klemperer. We are now making progress on pricing issues that have stumped some of the great theoretical minds in the history of the field. Multidimensional screening is an incredibly broad topic: how ought we regulate a monopoly with private fixed and marginal costs, how ought we tax agents who have private costs of effort and opportunities, how ought a firm choose wages and benefits, and so on. Knowing the optimum is essential when it comes to understanding when we can use simple, nearly-correct mechanisms. Just in the context of pricing, using related tricks to Daskalakis, Gabriel Carroll showed in a recent Econometrica that bundling should be avoided when the principal has limited knowledge about the correlation structure of types, and my old grad school friend Nima Haghpanah has shown, in a paper with Jason Hartline, that firms should only offer high-quality and low-quality versions of their products if consumers’ values for the high-quality good and their relative value for the low versus high quality good are positively correlated. Neither of these results is trivial to prove. Nonetheless, a hearty cheers to our friends in pure mathematics who continue to provide us with the tools we need to answer questions at the very core of economic life!


Nobel Prize 2016 Part II: Oliver Hart

The Nobel Prize in Economics was given yesterday to two wonderful theorists, Bengt Holmstrom and Oliver Hart. I wrote a day ago about Holmstrom’s contributions, many of which are simply foundational to modern mechanism design and its applications. Oliver Hart’s contribution is more subtle and hence more of a challenge to describe to a nonspecialist; I am sure of this because no concept gives my undergraduate students more headaches than Hart’s “residual control right” theory of the firm. Even stranger, much of Hart’s recent work repudiates the importance of his most famous articles, a point that appears to have been entirely lost on every newspaper discussion of Hart that I’ve seen (including otherwise very nice discussions like Applebaum’s in the New York Times). A major reason he has changed his beliefs, and his research agenda, so radically is not simply the whims of age or the pressures of politics, but rather the impact of a devastatingly clever, and devastatingly esoteric, argument made by the Nobel winners Eric Maskin and Jean Tirole. To see exactly what’s going on in Hart’s work, and why there remain many very important unsolved questions in this area, let’s quickly survey what economists mean by “theory of the firm”.

The fundamental strangeness of firms goes back to Coase. Markets are amazing. We have wonderful theorems going back to Hurwicz about how competitive market prices coordinate activity efficiently even when individuals only have very limited information about how various things can be produced by an economy. A pencil somehow involves graphite being mined, forests being explored and exploited, rubber being harvested and produced, the raw materials brought to a factory where a machine puts the pencil together, ships and trains bringing the pencil to retail stores, and yet this decentralized activity produces a pencil costing ten cents. This is the case even though not a single individual anywhere in the world knows how all of those processes up the supply chain operate! Yet, as Coase pointed out, a huge amount of economic activity (including the majority of international trade) is not coordinated via the market, but rather through top-down Communist-style bureaucracies called firms. Why on Earth do these persistent organizations exist at all? When should firms merge and when should they divest themselves of their parts? These questions make up the theory of the firm.

Coase’s early answer is that something called transaction costs exist, and that they are particularly high outside the firm. That is, market transactions are not free. Firm size is determined at the point where the problems of bureaucracy within the firm overwhelm the benefits of reducing transaction costs from regular transactions. There are two major problems here. First, who knows what a “transaction cost” or a “bureaucratic cost” is, and why they differ across organizational forms: the explanation borders on tautology. Second, as the wonderful paper by Alchian and Demsetz in 1972 points out, there is no reason we should assume firms have some special ability to direct or punish their workers. If your supplier does something you don’t like, you can keep them on, or fire them, or renegotiate. If your in-house department does something you don’t like, you can keep them on, or fire them, or renegotiate. The problem of providing suitable incentives – the contracting problem – does not simply disappear because some activity is brought within the boundary of the firm.

Oliver Williamson, a recent Nobel winner joint with Elinor Ostrom, has a more formal transaction cost theory: some relationships generate joint rents higher than could be generated if we split ways, unforeseen things occur that make us want to renegotiate our contract, and the cost of that renegotiation may be lower if workers or suppliers are internal to a firm. “Unforeseen things” may include anything which cannot be measured ex-post by a court or other mediator, since that is ultimately who would enforce any contract. It is not that everyday activities have different transaction costs, but that the negotiations which produce contracts themselves are easier to handle in a more persistent relationship. As in Coase, the question of why firms do not simply grow to an enormous size is largely dealt with by off-hand references to “bureaucratic costs” whose nature was largely informal. Though informal, the idea that something like transaction costs might matter seemed intuitive and had some empirical support – firms are larger in the developing world because weaker legal systems mean more “unforeseen things” will occur outside the scope of a contract, hence the differential costs of holdup or renegotiation inside and outside the firm are first order when deciding on firm size. That said, the Alchian-Demsetz critique, and the question of what a “bureaucratic cost” is, are worrying. And as Eric van den Steen points out in a 2010 AER, can anyone who has tried to order paper through their procurement office versus just popping in to Staples really believe that the reason firms exist is to lessen the cost of intrafirm activities?

Grossman and Hart (1986) argue that the distinction that really makes a firm a firm is that it owns assets. They retain the idea that contracts may be incomplete – at some point, I will disagree with my suppliers, or my workers, or my branch manager, about what should be done, either because a state of the world has arrived not covered by our contract, or because it is in our first-best mutual interest to renegotiate that contract. They retain the idea that there are relationship-specific rents, so I care about maintaining this particular relationship. But rather than rely on transaction costs, they simply point out that the owner of the asset is in a much better bargaining position when this disagreement occurs. Therefore, the owner of the asset will get a bigger percentage of rents after renegotiation. Hence the person who owns an asset should be the one whose incentive to improve the value of the asset is most sensitive to that future split of rents.

Baker and Hubbard (2004) provide a nice empirical example: when on-board computers to monitor how long-haul trucks were driven began to diffuse, ownership of those trucks shifted from owner-operators to trucking firms. Before the computer, if the trucking firm owns the truck, it is hard to contract on how hard the truck will be driven or how poorly it will be treated by the driver. If the driver owns the truck, it is hard to contract on how much effort the trucking firm dispatcher will exert ensuring the truck isn’t sitting empty for days, or following a particularly efficient route. The computer solves the first problem, meaning that only the trucking firm is taking actions relevant to the joint relationship which are highly likely to be affected by whether they own the truck or not. In Grossman and Hart’s “residual control rights” theory, then, the introduction of the computer should mean the truck ought, post-computer, be owned by the trucking firm. If these residual control rights are unimportant – there is no relationship-specific rent and no incompleteness in contracting – then the ability to shop around for the best relationship is more valuable than the control rights asset ownership provides. Hart and Moore (1990) extend this basic model to the case where there are many assets and many firms, suggesting critically that sole ownership of assets which are highly complementary in production is optimal. Asset ownership affects outside options when the contract is incomplete by changing bargaining power, and splitting ownership of complementary assets gives multiple agents weak bargaining power and hence little incentive to invest in maintaining the quality of, or improving, the assets. Hart, Shleifer and Vishny (1997) provide a great example of residual control rights applied to the question of why governments should run prisons but not garbage collection. (A brief aside: note the role that bargaining power plays in all of Hart’s theories. We do not have a “perfect” – in a sense that can be made formal – model of bargaining, and Hart tends to use bargaining solutions from cooperative game theory like the Shapley value. After Shapley’s prize alongside Roth a few years ago, this makes multiple prizes heavily influenced by cooperative games applied to unexpected problems. Perhaps the theory of cooperative games ought still be taught with vigor in PhD programs!)
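
Since the Shapley value carries so much weight here, a tiny worked example may help. The sketch below computes it by averaging each player's marginal contribution over all arrival orders, then applies it to a made-up two-party version of the truck story: together the firm and driver produce 10, whoever owns the truck can produce 2 alone, and the non-owner produces nothing alone. Those numbers and the function names are hypothetical, chosen only to show how ownership moves the bargaining split.

```python
from itertools import permutations

def shapley_values(players, worth):
    """Average marginal contribution of each player over all arrival orders.

    `worth` maps a frozenset of players to the value that coalition
    can produce on its own.
    """
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += worth[joined] - worth[coalition]
            coalition = joined
    return {p: t / len(orders) for p, t in totals.items()}

PLAYERS = ("firm", "driver")

def worth_when_owned_by(owner):
    # Hypothetical numbers: joint value 10, the owner's outside option is 2,
    # the non-owner's outside option is 0.
    worth = {frozenset(): 0.0, frozenset(PLAYERS): 10.0}
    for p in PLAYERS:
        worth[frozenset({p})] = 2.0 if p == owner else 0.0
    return worth

firm_owns = shapley_values(PLAYERS, worth_when_owned_by("firm"))
driver_owns = shapley_values(PLAYERS, worth_when_owned_by("driver"))
print(firm_owns)
print(driver_owns)
```

Whoever owns the truck walks away with 6 of the 10 rather than 4. Total surplus is the same in both cases in this toy example; the residual control rights argument is that the split changes each party's incentive to make noncontractible investments beforehand, which the example deliberately leaves out.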

There are, of course, many other theories of the firm. The idea that firms in some industries are big because there are large fixed costs to enter at the minimum efficient scale goes back to Marshall. The agency theory of the firm going back at least to Jensen and Meckling focuses on the problem of providing incentives for workers within a firm to actually profit maximize; as I noted yesterday, Holmstrom and Milgrom’s multitasking is a great example of this, with tasks being split across firms so as to allow some types of workers to be given high powered incentives and others flat salaries. More recent work by Bob Gibbons, Rebecca Henderson, Jon Levin and others on relational contracting discusses how the nexus of self-enforcing beliefs about how hard work today translates into rewards tomorrow can substitute for formal contracts, and how the credibility of these “relational contracts” can vary across firms and depend on their history.

Here’s the kicker, though. A striking blow was dealt to all theories which rely on the incompleteness or nonverifiability of contracts by a brilliant paper of Maskin and Tirole (1999) in the Review of Economic Studies. Theories relying on incomplete contracts generally just hand-waved that there are always events which are unforeseeable ex-ante or impossible to verify in court ex-post, and hence there will always be scope for disagreement about what to do when those events occur. But, as Maskin and Tirole correctly point out, agents don’t care about anything in these unforeseeable/unverifiable states except for what the states imply about our mutual valuations from carrying on with a relationship. Therefore, every “incomplete contract” should just involve the parties deciding in advance that if a state of the world arrives where you value keeping our relationship in that state at 12 and I value it at 10, then we should split that joint value of 22 at whatever level induces optimal actions today. Do this same ex-ante contracting for all future profit levels, and we are done. Of course, there is still the problem of ensuring incentive compatibility – why would the agents tell the truth about their valuations when that unforeseen event occurs? I will omit the details here, but you should read the original paper where Maskin and Tirole show a (somewhat convoluted but still working) mechanism that induces truthful revelation of private value by each agent. Taking the model’s insight seriously but the exact mechanism less seriously, the paper basically suggests that incomplete contracts don’t matter if we can truthfully figure out ex-post who values our relationship at what amount, and there are many real-world institutions like mediators who do precisely that. If, as Maskin and Tirole prove (and Maskin described more simply in a short note), incomplete contracts aren’t a real problem, we are back to square one – why have persistent organizations called firms?

What should we do? Some theorists have tried to fight off Maskin and Tirole by suggesting that their precise mechanism is not terribly robust to, for instance, assumptions about higher-order beliefs (e.g., Aghion et al (2012) in the QJE). But these quibbles do not contradict the far more basic insight of Maskin and Tirole, that situations we think of empirically as “hard to describe” or “unlikely to occur or be foreseen”, are not sufficient to justify the relevance of incomplete contracts unless we also have some reason to think that all mechanisms which split rent on the basis of future profit, like a mediator, are unavailable. Note that real world contracts regularly include provisions that ex-ante describe how contractual disagreement ex-post should be handled.

Hart’s response, and this is clear both from his CV and from his recent papers and presentations, is to ditch incompleteness as the fundamental reason firms exist. Hart and Moore’s 2007 AER P&P and 2006 QJE are very clear:

Although the incomplete contracts literature has generated some useful insights about firm boundaries, it has some shortcomings. Three that seem particularly important to us are the following. First, the emphasis on noncontractible ex ante investments seems overplayed: although such investments are surely important, it is hard to believe that they are the sole drivers of organizational form. Second, and related, the approach is ill suited to studying the internal organization of firms, a topic of great interest and importance. The reason is that the Coasian renegotiation perspective suggests that the relevant parties will sit down together ex post and bargain to an efficient outcome using side payments: given this, it is hard to see why authority, hierarchy, delegation, or indeed anything apart from asset ownership matters. Finally, the approach has some foundational weaknesses [pointed out by Maskin and Tirole (1999)].

To my knowledge, Oliver Hart has written zero papers since Maskin-Tirole was published which attempt to explain any policy or empirical fact on the basis of residual control rights and their necessary incomplete contracts. Instead, he has been primarily working on theories which depend on reference points, a behavioral idea that when disagreements occur between parties, the ex-ante contracts are useful because they suggest “fair” divisions of rent, and induce shading and other destructive actions when those divisions are not given. These behavioral agents may very well disagree about what the ex-ante contract means for “fairness” ex-post. The primary result is that flexible contracts (e.g., contracts which deliberately leave lots of incompleteness) can adjust easily to changes in the world but will induce spiteful shading by at least one agent, while rigid contracts do not permit this shading but do cause parties to pursue suboptimal actions in some states of the world. This perspective has been applied by Hart to many questions over the past decade, such as why it can be credible to delegate decision making authority to agents; if you try to seize it back, the agent will feel aggrieved and will shade effort. These responses are hard, or perhaps impossible, to justify when agents are perfectly rational, and of course the Maskin-Tirole critique would apply if agents were purely rational.

So where does all this leave us concerning the initial problem of why firms exist in a sea of decentralized markets? In my view, we have many clever ideas, but still do not have the perfect theory. A perfect theory of the firm would need to be able to explain why firms are the size they are, why they own what they do, why they are organized as they are, why they persist over time, and why interfirm incentives look the way they do. It almost certainly would need its mechanisms to work if we assumed all agents were highly, or perfectly, rational. Since patterns of asset ownership are fundamental, it needs to go well beyond the type of hand-waving that makes up many “resource” type theories. (Firms exist because they create a corporate culture! Firms exist because some firms just are better at doing X and can’t be replicated! These are outcomes, not explanations.) I believe that there are reasons why the costs of maintaining relationships – transaction costs – endogenously differ within and outside firms, and that Hart is correct in focusing our attention on how asset ownership and decision making authority affect incentives to invest, but these theories even in their most endogenous form cannot do everything we wanted a theory of the firm to accomplish. I think that somehow reputation – and hence relational contracts – must play a fundamental role, and that the nexus of conflicting incentives among agents within an organization, as described by Holmstrom, must as well. But we still lack the precise insight to clear up this muddle, and give us a straightforward explanation for why we seem to need “little Communist bureaucracies” to assist our otherwise decentralized and almost magical market system.

Nobel Prize 2016 Part I: Bengt Holmstrom

The Nobel Prize in Economics has been announced, and what a deserving prize it is: Bengt Holmstrom and Oliver Hart have won for the theory of contracts. The name of this research weblog is “A Fine Theorem”, and it would be hard to find two economists whose work is more likely to elicit such a description! Both are incredibly deserving; more than five years ago on this site, I discussed how crazy it was that Holmstrom had yet to win! The only shock is the combination: a more natural prize would have been Holmstrom with Paul Milgrom and Robert Wilson for modern applied mechanism design, and Oliver Hart with John Moore and Sandy Grossman for the theory of the firm. The contributions of Holmstrom and Hart are so vast that I’m splitting this post into two, so as to properly cover the incredible intellectual accomplishments of these two economists.

The Finnish economist Bengt Holmstrom did his PhD in operations research at Stanford, advised by Robert Wilson, and began his career at my alma mater, the tiny department of Managerial Economics and Decision Sciences at Northwestern’s Kellogg School. To say MEDS struck gold with their hires in this era is an extreme understatement: in 1978 and 1979 alone, they hired Holmstrom and his classmate Paul Milgrom (another Wilson student from Stanford), hired Nancy Stokey, promoted Nobel laureate Roger Myerson to Associate Professor, and tenured an adviser of mine, Mark Satterthwaite. And this list doesn’t even include other faculty in the late 1970s and early 1980s like eminent contract theorist John Roberts, behavioralist Colin Camerer, mechanism designer John Ledyard or game theorist Ehud Kalai. This group was essentially put together by two senior economists at Kellogg, Nancy Schwartz and Stanley Reiter, who had the incredible foresight to realize both that applied game theory was finally showing promise of tackling first-order economic questions in a rigorous way, and that the folks with the proper mathematical background to tackle these questions were largely going unhired since they often did their graduate work in operations or mathematics departments rather than traditional economics departments. This market inefficiency, as it were, allowed Nancy and Stan to hire essentially every young scholar in what would become the field of mechanism design, and to develop a graduate program which combined operations, economics, and mathematics in a manner unlike any other place in the world.

From that fantastic group, Holmstrom’s contribution lies most centrally in the area of formal contract design. Imagine that you want someone – an employee, a child, a subordinate division, an aid contractor, or more generally an agent – to perform a task. How should you induce them to do this? If the task is “simple”, meaning the agent’s effort and knowledge about how to perform the task most efficiently are known and observable, you can simply pay a wage, cutting off payment if effort is not being exerted. When only the outcome of work can be observed, if there is no uncertainty in how effort is transformed into outcomes, knowing the outcome is equivalent to knowing effort, and hence optimal effort can be achieved via a bonus payment made on the basis of outcomes. All straightforward so far. The trickier situations, which Holmstrom and his coauthors analyzed at great length, are when neither effort nor outcomes are directly observable.

Consider paying a surgeon. You want to reward the doctor for competent, safe work. However, it is very difficult to observe perfectly what the surgeon is doing at all times, and basing pay on outcomes has a number of problems. First, the patient outcome depends on the effort of not just one surgeon, but on others in the operating room and prep table: team incentives must be provided. Second, the doctor has many ways to shift the balance of effort between reducing costs to the hospital, increasing patient comfort, increasing the quality of the medical outcome, and mentoring young assistant surgeons, so paying on the basis of one or two tasks may distort effort away from other harder-to-measure tasks: there is a multitasking problem. Third, the number of medical mistakes, or the cost of surgery, that a hospital ought expect from a competent surgeon depends on changes in training and technology that are hard to know, and hence a contract may want to adjust payments for its surgeons on the performance of surgeons elsewhere: contracts ought take advantage of relevant information when it is informative about the task being incentivized. Fourth, since surgeons will dislike risk in their salary, the fact that some negative patient outcomes are just bad luck means that you will need to pay the surgeon very high bonuses to overcome their risk aversion: when outcome measures involve uncertainty, optimal contracts will weigh “high-powered” bonuses against “low-powered” insurance against risk. Fifth, the surgeon can be incentivized either by payments today or by keeping their job tomorrow, and worse, these career concerns may cause the surgeon to waste the hospital’s money on tasks which matter to the surgeon’s career beyond the hospital.

Holmstrom wrote the canonical paper on each of these topics. His 1979 paper in the Bell Journal of Economics shows that any information which reduces the uncertainty about what an agent actually did should feature in a contract, since by reducing uncertainty, you reduce the risk premium needed to incentivize the agent to accept the contract. It might seem strange that contracts in many cases do not satisfy this “informativeness principle”. For instance, CEO bonuses are often not indexed to the performance of firms in the same industry. If oil prices rise, essentially all oil firms will be very profitable, and this is true whether or not a particular CEO is a good one. Bertrand and Mullainathan argue that this is because many firms with diverse shareholders are poorly governed!

The simplicity of contracts in the real world may have more prosaic explanations. The famous “multitasking” paper, written jointly with Paul Milgrom and published in JLEO in 1991, notes that contracts shift incentives across different tasks in addition to serving as risk-sharing mechanisms and as methods for inducing effort. Since bonuses on task A will cause agents to shift effort away from hard-to-measure task B, it may be optimal to avoid strong incentives altogether (just pay teachers a salary rather than a bonus based only on test performance) or to split job tasks (pay bonuses to teacher A who is told to focus only on math test scores, and pay salary to teacher B who is meant to serve as a mentor). That outcomes are generated by teams also motivates simpler contracts. Holmstrom’s 1982 article on incentives in teams, published in the Bell Journal, points out that if both my effort and yours are required to produce a good outcome, then the marginal products of our efforts are both equal to the entire value of what is produced, hence there is not enough output to pay each of us our marginal product. What can be done? Alchian and Demsetz had noticed this problem in 1972, arguing that firms exist to monitor the effort of individuals working in teams. With perfect knowledge of who does what, you can simply pay the workers a wage sufficient to make the optimal effort, then collect the residual as profit. Holmstrom notes that the monitoring isn’t the important bit: rather, even shareholder controlled firms where shareholders do no monitoring at all are useful. The reason is that shareholders can be residual claimants for profit, and hence there is no need to fully distribute profit to members of the team. Free-riding can therefore be eliminated by simply paying team members a wage of X if the team outcome is optimal, and 0 otherwise. Even a slight bit of shirking by a single agent drops their payment precipitously (which is impossible if all profits generated by the team are shared by the team), so the agents will not shirk. Of course, when there is uncertainty about how team effort transforms into outcomes, this harsh penalty will not work, and hence incentive problems may require team sizes to be smaller than that which is first-best efficient. A third justification for simple contracts is career concerns: agents work hard today to try to signal to the market that they are high quality, and do so even if they are paid a fixed wage. This argument had been made less formally by Fama, but Holmstrom (in a 1982 working paper finally published in 1999 in RESTUD) showed that this concern about the market only completely mitigates moral hazard if outcomes within a firm are fully observable to the market, or the future is not discounted at all, or there is no uncertainty about agents’ abilities. Indeed, career concerns can make effort provision worse; for example, agents may take actions to signal quality to the market which are negative for their current firm! A final explanation for simple contracts comes from Holmstrom’s 1987 paper with Milgrom in Econometrica. They argue that simple “linear” contracts, with a wage and a bonus based linearly on output, are more “robust” methods of solving moral hazard because they are less susceptible to manipulation by agents when the environment is not perfectly known. Michael Powell, a student of Holmstrom’s now at Northwestern, has a great set of PhD notes providing details of these models.
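To see the team free-riding logic in numbers, here is a minimal worked sketch. The functional forms are my own assumptions for illustration, not Holmstrom's general setup: two agents, team output x = a_1 + a_2 with no noise, and a private effort cost of a_i^2/2 for each agent.

```latex
\begin{align*}
\text{First best:}\quad & \max_{a_1,a_2}\; a_1 + a_2 - \tfrac{1}{2}a_1^2 - \tfrac{1}{2}a_2^2 \;\Rightarrow\; a_i^* = 1,\quad x^* = 2. \\
\text{Equal shares:}\quad & \max_{a_i}\; \tfrac{1}{2}(a_1 + a_2) - \tfrac{1}{2}a_i^2 \;\Rightarrow\; a_i = \tfrac{1}{2} < a_i^*. \\
\text{Group wage:}\quad & \text{pay } w \text{ if } x \ge 2,\ 0 \text{ otherwise; given } a_j = 1,\ a_i = 1 \text{ yields } w - \tfrac{1}{2},\ \text{while any } a_i < 1 \text{ yields at most } 0.
\end{align*}
```

Under these assumptions, any wage w of at least 1/2 sustains first-best effort as an equilibrium, and total wages 2w can sit below the output of 2. The gap goes to the shareholders, which is exactly what a team that must split all of its output among its members cannot arrange.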

These ideas are reasonably intuitive, but the way Holmstrom answered them is not. Think about how an economist before the 1970s, like Adam Smith in his famous discussion of the inefficiency of sharecropping, might have dealt with these problems. These economists had few tools to deal with asymmetric information, so although economists like George Stigler analyzed the economic value of information, the question of how to elicit information useful to a contract could not be discussed in any systematic way. These economists would have been burdened by the fact that the number of contracts one could write is infinite, so beyond saying that a contract of type X does not equate marginal cost to marginal revenue, the question of which “second-best” contract is optimal is extraordinarily difficult to answer in the absence of beautiful tricks like the revelation principle partially developed by Holmstrom himself. To develop those tricks, a theory of how individuals would respond to changes in their joint incentives over time was needed; the ideas of Bayesian equilibria and subgame perfection, developed by Harsanyi and Selten, were unknown before the 1960s. The accretion of tools developed by pure theory finally permitted, in the late 1970s and early 1980s, an absolute explosion of developments of great use to understanding the economic world. Consider, for example, the many results in antitrust provided by Nobel winner Jean Tirole, discussed here two years ago.

Holmstrom’s work has provided me with a great deal of understanding of why innovation management looks the way it does. For instance, why would a risk neutral firm not work enough on high-variance moonshot-type R&D projects, a question Holmstrom asks in his 1989 JEBO paper “Agency Costs and Innovation”? Four reasons. First, in Holmstrom and Milgrom’s 1987 linear contracts paper, optimal risk sharing leads to more distortion by agents the riskier the project being incentivized, so firms may choose lower expected value projects even if they themselves are risk neutral. Second, firms build reputation in capital markets just as workers do with career concerns, and high variance output projects are more costly in terms of the future value of that reputation when the interest rate on capital is lower (e.g., when firms are large and old). Third, when R&D workers can potentially pursue many different projects, multitasking suggests that workers should be given small and very specific tasks so as to lessen the potential for bonus payments to shift worker effort across projects. Smaller firms with fewer resources may naturally have limits on the types of research a worker could pursue, which surprisingly makes it easier to provide strong incentives for research effort on the remaining possible projects. Fourth, multitasking suggests agents’ tasks should be limited, and that high variance tasks should be assigned to the same agent, which provides a role for decentralizing research into large firms providing incremental, safe research, and small firms performing high-variance research. That many aspects of firm organization depend on the swirl of conflicting incentives the firm and the market provide is a topic Holmstrom has also discussed at length, especially in his beautiful paper “The Firm as an Incentive System”; I shall reserve discussion of that paper for a subsequent post on Oliver Hart.

Two final light notes on Holmstrom. First, he is the source of one of my favorite stories about Paul Samuelson, the greatest economic theorist of all time. Samuelson was known for having a steel trap of a mind. At a light trivia session during a house party for young faculty at MIT, Holmstrom snuck in a question, as a joke, asking for the name of the third President of independent Finland. Samuelson not only knew the name, but apparently was also able to digress on the man’s accomplishments! Second, I mentioned at the beginning of this post the illustrious roster of theorists who once sat at MEDS. Business school students are often very hesitant to deal with formal models, partially because they lack a technical background but also because there is a trend of “dumbing down” in business education whereby many schools (of course, not including my current department at The University of Toronto Rotman!) are more worried about student satisfaction than student learning. With perhaps Stanford GSB as an exception, it is inconceivable that any school today, Northwestern included, would gather such an incredible collection of minds working on abstract topics whose applicability to tangible business questions might lie years in the future. Indeed, I could name a number of so-called “top” business schools who have nobody on their faculty who has made any contribution of note to theory! There is a great opportunity for a Nancy Schwartz or Stan Reiter of today to build a business school whose students will have the ultimate reputation for rigorous analysis of social scientific questions.

“Bonus Culture: Competitive Pay, Screening and Multitasking,” R. Benabou & J. Tirole (2014)

Empirically, bonus pay as a component of overall remuneration has become more common over time, especially in highly competitive industries which involve high levels of human capital; think of something like management of Fortune 500 firms, where the managers now have their salary determined globally rather than locally. This doesn’t strike most economists as a bad thing at first glance: as long as we are measuring productivity correctly, workers who are compensated based on their actual output will both exert the right amount of effort and have the incentive to improve their human capital.

In an intriguing new theoretical paper, however, Benabou and Tirole point out that many jobs involve multitasking, where workers can take hard-to-measure actions for intrinsic reasons (e.g., I put effort into teaching because I intrinsically care, not because academic promotion really hinges on being a good teacher) or take easy-to-measure actions for which there might be some kind of bonus pay. Many jobs also involve screening: I don’t know who is high quality and who is low quality, and although I would optimally pay people a bonus exactly equal to their cost of effort, I am unable to do so since I don’t know what that cost is. Multitasking and worker screening interact among competitive firms in a really interesting way, since how other firms incentivize their workers affects how workers will respond to my contract offers. Benabou and Tirole show that this interaction means that more competition in a sector, especially when there is a big gap between the quality of different workers, can actually harm social welfare even in the absence of any other sort of externality.

Here is the intuition. For multitasking reasons, when different things workers can do are substitutes, I don’t want to give big bonus payments for the observable output, since if I do the worker will put in too little effort on the intrinsically valuable task: if you pay a trader big bonuses for financial returns, she will not put as much effort into ensuring all the laws and regulations are followed. If there are other finance firms, though, they will make it known that, hey, we pay huge bonuses for high returns. As a result, workers will sort, with all of the high quality traders moving to the high bonus firm, leaving only the low quality traders at the firm with low bonuses. Bonuses are used not only to motivate workers, but also to differentially attract high quality workers when quality is otherwise tough to observe. There is a tradeoff, then: you can either have only low productivity workers but get the balance between hard-to-measure tasks and easy-to-measure tasks right, or you can retain some high quality workers with large bonuses that make those workers exert too little effort on hard-to-measure tasks. When the latter is more profitable, all firms inefficiently begin offering large, effort-distorting bonuses, something they wouldn’t do if they didn’t have to compete for workers.

How can we fix things? One easy method is with a bonus cap: if the bonus is capped at the monopsony optimal bonus, then no one can try to screen high quality workers away from other firms with a higher bonus. This isn’t as good as it sounds, however, because there are other ways to screen high quality workers (such as offering lower clawbacks if things go wrong) which introduce even worse distortions, hence bonus caps may simply cause less efficient methods to perform the same screening and same overincentivization of the easy-to-measure output.

When the individual rationality or incentive compatibility constraints in a mechanism design problem are determined in equilibrium, based on the mechanisms chosen by other firms, we sometimes call this a “competing mechanism”. It seems to me that there are quite a number of open questions concerning how to make these sorts of problems tractable; a talented young theorist looking for a fun summer project might find it profitable to investigate this as-yet small literature.

Beyond the theoretical result on screening plus multitasking, Tirole and Benabou also show that their results hold for market competition more general than just perfect competition versus monopsony. They do this through a generalized version of the Hotelling line which appears to have some nice analytic properties, at least compared to the usual search-theoretic models which you might want to use when discussing imperfect labor market competition.

Final copy (RePEc IDEAS version), forthcoming in the JPE.

“The Limits of Price Discrimination,” D. Bergemann, B. Brooks and S. Morris (2013)

Rakesh Vohra, who much to the regret of many of us at MEDS has recently moved on to a new and prestigious position, pointed out a clever paper today by Bergemann, Brooks and Morris (the first and third names you surely know, the second is a theorist on this year’s market). Beyond some clever uses of linear algebra in the proofs, the results of the paper are in and of themselves very interesting. The question is the following: if a regulator, or a third party, can segment consumers by willingness-to-pay and provide that information to a monopolist, what are the effects on welfare and profits?

In a limited sense, this is an old question. Monopolies generate deadweight loss as they sell at a price above marginal cost. Monopolies that can perfectly price discriminate remove that deadweight loss but also steal all of the consumer surplus. Depending on your social welfare function, this may be a good or bad thing. When markets can be segmented (i.e., third degree price discrimination) with no chance of arbitrage, we know that monopolist profits are weakly higher since the uniform monopoly price could be maintained in both markets, but the effect on consumer surplus is ambiguous.

Bergemann et al provide two really interesting results. First, if you can choose the segmentation, it is always possible to segment consumers such that monopoly profits are just the profits gained under the uniform price, but quantity sold is nonetheless efficient. Further, there exist segmentations such that producer surplus P is anything between the uniform price profit P* and the perfect price discrimination profit P**, and such that producer surplus plus consumer surplus P+C is anything between P* and P**! This seems like magic, but the method is actually pretty intuitive.

Let’s generate the first case, where producer profit is the uniform price profit P* and consumer surplus is maximal, C=P**-P*. In any segmentation, the monopolist can always charge the uniform monopoly price to every segment and so earn at least P*. So if we want consumers to capture all of the remaining surplus, there can’t be “too many” high-value consumers in a segment, since otherwise the monopolist would raise their price above the efficient one. Let there be 3 consumer types, with the total market uniformly distributed across the three, such that valuations are 1, 2 and 3. Let marginal cost be constant at zero. The profit-maximizing price is 2, earning the monopolist 2*(2/3)=4/3. But what if we tell the monopolist that consumers can either be Class A or Class B? Class A consists of all consumers with willingness-to-pay 1 and exactly enough consumers with WTP 2 and 3 that the monopolist is just indifferent between choosing price 1 or price 2 for Class A. Class B consists of the rest of the types 2 and 3 (and since the relative proportion of type 2 and 3 in this Class is the same as in the market as a whole, where we already know the profit maximizing price is 2 with only types 2 and 3 buying, the profit maximizing price remains 2 here). Some quick algebra shows that if Class A consists of all of the WTP 1 consumers and exactly 1/2 of the WTP 2 and 3 consumers, then the monopolist is indifferent between charging 1 and 2 to Class A, and charges 2 to Class B. Therefore, it is an equilibrium for all consumers to buy the good, the monopolist to earn uniform price profits P*, and consumer surplus to be maximized. The paper formally proves that this intuition holds for general assumptions about (possibly continuous) consumer valuations.
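The “quick algebra” in the three-type example can be checked mechanically. The snippet below is a minimal sketch, not anything from the paper: the helper names `best_price` and `consumer_surplus` are hypothetical, marginal cost is zero as in the text, and I assume the monopolist breaks ties toward the lower price (the consumer-friendly selection the example relies on).

```python
from fractions import Fraction as F

def best_price(masses):
    """Return (price, profit) maximizing price * mass of buyers with WTP >= price.

    `masses` maps each willingness-to-pay to the mass of consumers holding it.
    Ties are broken toward the lower price (an assumption, see the lead-in).
    """
    candidates = sorted(masses)
    profits = {p: p * sum(m for v, m in masses.items() if v >= p) for p in candidates}
    top = max(profits.values())
    price = min(p for p in candidates if profits[p] == top)
    return price, top

def consumer_surplus(masses, price):
    """Total surplus kept by consumers who buy at `price`."""
    return sum((v - price) * m for v, m in masses.items() if v >= price)

# Whole market: valuations 1, 2, 3, each held by one third of consumers.
market = {1: F(1, 3), 2: F(1, 3), 3: F(1, 3)}
uniform_price, uniform_profit = best_price(market)

# Class A: every WTP-1 consumer plus half of the WTP-2 and WTP-3 consumers.
class_a = {1: F(1, 3), 2: F(1, 6), 3: F(1, 6)}
# Class B: the remaining WTP-2 and WTP-3 consumers.
class_b = {2: F(1, 6), 3: F(1, 6)}

price_a, profit_a = best_price(class_a)
price_b, profit_b = best_price(class_b)
total_profit = profit_a + profit_b
total_cs = consumer_surplus(class_a, price_a) + consumer_surplus(class_b, price_b)
efficient_surplus = sum(v * m for v, m in market.items())  # everyone served at zero cost

print("uniform:", uniform_price, uniform_profit)
print("segmented prices:", price_a, price_b)
print("profit, consumer surplus, total:", total_profit, total_cs, efficient_surplus)
```

Exact fractions are used so the indifference in Class A between prices 1 and 2 shows up as a true tie rather than a floating-point near-miss.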

The other two “corner cases” for bundles of consumer and producer surplus are also easy to construct. Maximal producer surplus P** with consumer surplus 0 is simply the case of perfect price discrimination: the producer knows every consumer’s exact willingness-to-pay. Uniform price producer surplus P* and consumer surplus 0 is constructed by mixing the very low WTP consumers with all of the very high types (along with some subset of consumers with less extreme valuations), such that the monopolist is indifferent between charging the monopolist price or just charging the high type price so that everyone below the high type does not buy. Then mix the next highest WTP types with low but not quite as low WTP types, and continue iteratively. A simple argument based on a property of convex sets allows mixtures of P and C outside the corner cases; Rakesh has provided an even more intuitive proof than that given in the paper.

Now how do we use this result in policy? At a first pass, since information is always good for the seller (weakly) and ambiguous for the consumer, a policymaker should be particularly worried about bundlers providing information about willingness-to-pay that is expected to drastically lower consumer surplus while only improving rent extraction by sellers a small bit. More work needs to be done in specific cases, but the mathematical setup in this paper provides a very straightforward path for such applied analysis. It seems intuitive that precise information about consumers with willingness-to-pay below the monopoly price is unambiguously good for welfare, whereas information bundles that contain a lot of high WTP consumers but also a relatively large number of lower WTP consumers will lower total quantity sold and hence social surplus.

I am also curious about the limits of price discrimination in the oligopoly case. In general, the ability to price discriminate (even perfectly!) can be very good for consumers under oligopoly. The intuition is that under uniform pricing, I trade off stealing your buyers by lowering prices against earning less from my current buyers; the ability to price discriminate allows me to target your buyers without worrying about the effect on my own current buyers, hence the reaction curves are steeper, hence consumer surplus tends to increase (see Section 7 of Mark Armstrong’s review of the price discrimination literature). With arbitrary third degree price discrimination, however, I imagine mathematics similar to that in the present paper could prove similarly elucidating.

2013 Working Paper (IDEAS version).

“Maximal Revenue with Multiple Goods: Nonmonotonicity and Other Observations,” S. Hart and P. Reny (2013)

One of the great theorems in economics is Myerson’s optimal auction when selling one good to any number of buyers with private valuations of the good. Don’t underrate this theorem just because it is mathematically quite simple: the question it answers is how to sell any object, using any mechanism (an auction, a price, a bargain, a two step process involving eliciting preferences first, and so on). That an idea as straightforward as the revelation principle makes that question possible to answer is truly amazing.

Myerson’s paper was published 32 years ago. Perhaps more amazing than the simplicity of the Myerson auction is how difficult it has been to derive similar rules for selling more than one good at a time, selling goods more than once, or selling goods when the sellers compete. The last two have well known problems that make analysis difficult: dynamic mechanisms suffer from the “ratchet effect” (a buyer won’t reveal information if she knows a seller will use it in subsequent sales), and competing mechanisms can have an IR constraint which is generated as an equilibrium condition. Hart and Reny, in a new note, show some great examples of the difficulty with the first difficult mechanism problem, selling multiple goods. In particular, increases in the distribution of private values (in the sense of first order stochastic dominance) can lower the optimal revenue, and randomized mechanisms can raise it.

Consider first increases in the private value distribution. This is strange: if for any state of the world, I value your goods more, it seems reasonable that there is “more surplus to extract” for the seller. And indeed, not only does the Myerson optimal auction with a single good have the property that increases in private values lead to increased seller revenue, but Hart and Reny show that any incentive-compatible mechanism with a single good has this property.

(This is actually very easy to prove, and a property I’d never seen stated for the general IC case, so I’ll show it here. If q(x) is the probability I get the good given the buyer’s reported type x, and s(x) is the seller revenue given this report, then incentive compatibility when the true type is x requires that

q(x)x-s(x)>=q(y)x-s(y)

and incentive compatibility when the true type is y requires that

q(y)y-s(y)>=q(x)y-s(x)

Combining these inequalities gives

(q(x)-q(y))x>=s(x)-s(y)>=(q(x)-q(y))y

Hence when x>y, q(x)-q(y)>=0, and therefore (with nonnegative valuations) s(x)>=s(y) whenever x>y, so a first order stochastic dominance shift in the buyer valuations increases the expected value of seller revenue s. Nice!)
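For readers who want the “combining” step spelled out, here is the same argument written as a short derivation. It uses only the two incentive compatibility constraints above; the one thing I am adding is the explicit assumption that valuations are nonnegative, which the text leaves implicit.

```latex
\begin{align*}
q(x)x - s(x) \ge q(y)x - s(y) &\iff \bigl(q(x)-q(y)\bigr)x \ge s(x)-s(y), \\
q(y)y - s(y) \ge q(x)y - s(x) &\iff s(x)-s(y) \ge \bigl(q(x)-q(y)\bigr)y, \\
\text{so}\quad \bigl(q(x)-q(y)\bigr)(x-y) \ge 0 &\implies q(x) \ge q(y) \text{ whenever } x>y, \\
\text{and}\quad s(x)-s(y) \ge \bigl(q(x)-q(y)\bigr)y &\ge 0 \text{ whenever } x>y\ge 0.
\end{align*}
```

The third line comes from chaining the first two, and the fourth is just the second line with monotonicity of q plugged in. Since s is then nondecreasing in the buyer's type, its expectation can only rise when the type distribution shifts up in the first order stochastic dominance sense.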

When selling multiple goods, however, this nice property doesn’t hold. Why not? Imagine the buyer might have values for (one unit of each of) 2 goods I am trying to sell of (1,1), (1,2), (2,2) or (2,3). Imagine (2,3) is a very common set of private values. If I knew this buyer’s type, I would extract 5 from him, though if I price the bundle including one of each good at 5, then none of the other types will buy. Also, if I want to extract 5 from type (2,3), then I also can’t sell unit 1 alone for less than 2 or unit 2 alone for less than 3, in which case the only other buyer will be (2,2) buying good 1 alone for price 2. Let’s try lowering the price of the bundle to 4. Now, satisfying (2,3)’s incentive compatibility constraints, we can charge as little as 1 for the first good bought by itself and 2 for the second good bought by itself: at those prices, (2,3) will buy the bundle, (2,2) will buy the first good for 1, and (1,2) will buy the second good for 2. This must look strange to you already: when the buyer’s type goes up from (1,2) to (2,2), the revenue to the seller falls from 2 to 1! And it turns out the prices and allocations described are the optimal mechanism when (2,2) is much less common than the other types. Essentially, across the whole distribution of private values the most revenue can be extracted by selling the second good, so the optimal way to satisfy IC constraints involves making the IC tightest for those whose relative value of the second good is high. Perhaps we ought call the rents (2,2) earns in this example “uncommon preference rents”!
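The nonmonotonicity is easy to verify by brute force. What follows is a small sketch rather than anything from Hart and Reny: the `MENU` dictionary and the `choice` helper are hypothetical names, buyer values are assumed additive across the two goods, and ties are broken in the seller's favor, which is the selection the allocation in the text implies.

```python
# Menu from the text: good 1 alone for 1, good 2 alone for 2, the bundle for 4.
# Each entry maps an option to ((units of good 1, units of good 2), price).
MENU = {
    "nothing": ((0, 0), 0),
    "good 1": ((1, 0), 1),
    "good 2": ((0, 1), 2),
    "bundle": ((1, 1), 4),
}

def choice(values):
    """Return (option, payment) for a buyer with additive values (v1, v2).

    The buyer maximizes value minus price; among utility-maximizing options
    the one with the highest payment is chosen (seller-favorable tie-break).
    """
    def utility(option):
        (g1, g2), price = MENU[option]
        return g1 * values[0] + g2 * values[1] - price

    best = max(utility(o) for o in MENU)
    option = max((o for o in MENU if utility(o) == best), key=lambda o: MENU[o][1])
    return option, MENU[option][1]

for buyer in [(1, 1), (1, 2), (2, 2), (2, 3)]:
    print(buyer, choice(buyer))
```

Comparing the lines for (1,2) and (2,2) shows the strange feature directly: the buyer's values weakly rise in every coordinate, yet the payment collected falls.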

Even crazier is that an optimal sale of multiple goods might involve using random mechanisms. It is easy to show that with a single good, a random mechanism (say, if you report your value for the good is 1 dollar, the mechanism assigns you the good with probability .5 for a payment of 50 cents) does no better than a deterministic mechanism. A footnote in the Hart and Reny paper credits Aumann for the idea that this is actually pretty strange: a mechanism is a sequential game where the designer moves first. It is intuitive that being able to randomize would be useful in these types of situations; in a matching pennies game, I would love to be able to play .5 heads and .5 tails when moving first! But the optimal mechanism with a single good does not have this property, for an intuitive reason. Imagine I will sell the good for X. Every type with private value below X does not buy, and those with types V>=X earn V-X in information rents. Offering to sell the good with probability .5 for price X/2 does not induce anybody new to buy, and selling with probability .5 for price Y less than X/2 causes some of the types close to X to switch to buying the lottery, lowering the seller revenue. Indeed, it can be verified that the revenue from the lottery just described is exactly the revenue from a mechanism which offered to sell the good with probability 1 for price 2Y<X.

With multiple goods, however, let high and low type buyers be equally common. Let the high type buyer be indifferent between buying two goods for X, buying the first good only for Y, and buying the second good only for Z (where IC requires that X generate the most seller revenue as long as buyer values are nonnegative). Let the low type value only the first good, at value W less than Y. How can I sell to the low type without violating the high type’s IC constraints? Right now, if the high type buyer has values V1 and V2 for the two goods, the indifference assumption means V1+V2-X=V1-Y=V2-Z. Can I offer a fractional sale (with probability p) of the first good at price Y2 such that pW-Y2>=0, yet pV1-Y2<V1+V2-X=V1-Y? Sure. The low value buyer is just like the low value buyers in the single good case, but the high value buyer dislikes fractional sales because in order to buy the lottery, she is giving up her purchase of the second good. Giving up buying the second good costs her V2 in welfare whether a fraction or the whole of good 1 is sold, but the benefit of deviating is lower with the lottery.
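A numerical instance makes the lottery argument concrete. The numbers below are my own assumptions, chosen only to satisfy the indifference condition V1+V2-X=V1-Y=V2-Z and W<Y; they are not taken from the paper, and `lottery_ok` is a hypothetical helper. Setting the lottery price to exactly pW, the high type's constraint reduces to p < (V1-Y)/(V1-W).

```python
from fractions import Fraction as F

# Assumed illustrative numbers (not from the paper).
V1, V2 = F(2), F(3)           # high type's values for goods 1 and 2
X, Y, Z = F(4), F(1), F(2)    # bundle price, good-1 price, good-2 price
W = F(1, 2)                   # low type's value for good 1, with W < Y

high_rent = V1 + V2 - X
assert high_rent == V1 - Y == V2 - Z   # the indifference assumption in the text

def lottery_ok(p, price):
    """True if the low type accepts (pW - price >= 0) and the high type
    strictly prefers her current rent to the lottery (pV1 - price < rent)."""
    low_accepts = p * W - price >= 0
    high_stays = p * V1 - price < high_rent
    return low_accepts and high_stays

# With price = p*W the high type stays put iff p < (V1 - Y) / (V1 - W).
p_bound = (V1 - Y) / (V1 - W)

print("largest admissible p (exclusive):", p_bound)
print("lottery p=1/2 at price 1/4 works:", lottery_ok(F(1, 2), F(1, 4)))
print("sure sale p=1 at price W works:", lottery_ok(F(1), W))
```

The last line is the point of the exercise: selling good 1 outright to the low type at a price she would accept tempts the high type away from the bundle, while a sufficiently fractional sale does not.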

April 2013 working paper (IDEAS version). Update: Sergiu notes there is a new version as of December 21 on his website. (Don’t think the paper is all bad news for deriving general properties of optimal mechanisms. In addition to the results above, Hart and Reny also show a nice result about mechanisms where there are multiple optimal responses by buyers, some good for the seller and some less so. It turns out that whenever you have a mechanism of this type, there is another mechanism that uniquely generates revenue arbitrarily close to the seller-optimal from among those multiple potential buyer actions in the first mechanism.)

“X-Efficiency,” M. Perelman (2011)

Do people still read Leibenstein’s fascinating 1966 article “Allocative Efficiency vs. X-Efficiency”? They certainly did at one time: Perelman notes that in the 1970s, this article was the third-most cited paper in all of the social sciences! Leibenstein essentially made two points. First, as Harberger had previously shown, distortions like monopoly simply as a matter of mathematics can’t have large welfare impacts. Take monopoly, for instance. The deadweight loss is simply the change in price times the change in quantity supplied times .5 times the percentage of the economy run by monopolist firms. Under reasonable looking demand curves, those deadweight triangles are rarely going to be even ten percent of the total social welfare created in a given industry. If, say, twenty percent of the final goods economy is run by monopolists, then, we only get a two percent change in welfare (and this can be extended to intermediate goods with little empirical change in the final result). Why, then, worry about monopoly?

The reason to worry is Leibenstein’s second point: firms in the same industry often have enormous differences in productivity, and there is tons of empirical evidence that firms do a better job of minimizing costs when under the selection pressures of competition (Schmitz’ 2005 JPE on iron ore producers provides a fantastic demonstration of this). Hence “X-inefficiency”, which Perelman notes is named after Tolstoy’s “X-factor” in the performance of armies from War and Peace, and not just allocative efficiency, may be important. Draw a simple supply-demand graph and you will immediately see that big “X-inefficiency rectangles” can swamp little Harberger deadweight loss triangles in their welfare implications. So far, so good. These claims, however, turned out to be incredibly controversial.
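If you would rather compute than draw, the rectangle-versus-triangle comparison is a two-line calculation. This is only a back-of-the-envelope sketch: the function names are hypothetical, everything is normalized so that competitive price and quantity equal 1, and the ten percent figures are assumptions I picked for illustration, not estimates from Leibenstein, Harberger or Perelman.

```python
def harberger_triangle(price_markup, quantity_drop):
    """Deadweight loss as a share of competitive revenue: 0.5 * dP * dQ."""
    return 0.5 * price_markup * quantity_drop

def x_inefficiency_rectangle(cost_excess, quantity_sold):
    """Wasted cost as a share of competitive revenue: excess unit cost times output."""
    return cost_excess * quantity_sold

# Assumed illustration: price 10% above the competitive level, output 10% below it,
# and unit costs 10% above the minimum on the 90% of output still produced.
triangle = harberger_triangle(0.10, 0.10)
rectangle = x_inefficiency_rectangle(0.10, 0.90)

print("triangle:", triangle)
print("rectangle:", rectangle)
print("rectangle / triangle:", rectangle / triangle)
```

The triangle is second order (a small number times a small number), while the rectangle is a small number times nearly all of output, which is why it dominates under almost any plausible choice of inputs.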

The problem is that just claiming waste is really a broad attack on a fundamental premise of economics, profit maximization. Stigler, in his well-named X-istence of X-efficiency (gated pdf), argues that we need to be really careful here. Essentially, he is suggesting that information differences, principal-agent contracting problems, and many other factors can explain dispersion in costs, and that we ought focus on those factors before blaming some nebulous concept called waste. And of course he’s correct. But this immediately suggests a shift from traditional price theory to a mechanism design based view of competition, where manager and worker incentives interact with market structure to produce outcomes. I would suggest that this project is still incomplete, that the firm is still too much of a black box in our basic models, and that this leads to a lot of misleading intuition.

For instance, most economists will agree that perfectly price discriminating monopolists have the same welfare impact as perfect competition. But this intuition is solely based on black box firms without any investigation of how those two market structures affect the incentive for managers to collect costly information on efficiency improvements, on the optimal labor contracts under the two scenarios, etc. “Laziness” of workers is an equilibrium outcome of worker contracts, management monitoring, and worker disutility of effort. Just calling that “waste” as Leibenstein does is not terribly effective analysis. It strikes me, though, that Leibenstein is correct when he implicitly suggests that selection in the marketplace is more primitive than profit maximization: I don’t need to know much about how manager and worker incentives work to understand that more competition means inefficient firms are more likely to go out of business. Even in perfect competition, we need to be careful about assuming that selection automatically selects away bad firms: it is not at all obvious that the efficient firms can expand efficiently to steal business from the less efficient, as Chad Syverson has rigorously discussed.

So I’m with Perelman. Yes, Leibenstein’s evidence for X-inefficiency was weak, and yes, he conflates many constraints with pure waste. But on the basic points – that minimized costs depend on the interaction of incentives with market structure instead of simply on technology, and that heterogeneity in measured firm productivity is critical to economic analysis – Leibenstein is far more convincing than his critics. And while Syverson, Bloom, Griffith, van Reenen and many others are opening up the firm empirically to investigate the issues Leibenstein raised, there is still great scope for us theorists to more carefully integrate price theory and mechanism problems.

Final article in JEP 2011 (RePEc IDEAS). As always, a big thumbs up to the JEP for making all of their articles ungated and free to read.

“Incentives for Unaware Agents,” E.L. von Thadden & X. Zhao (2012)

There is a paradox that troubles a lot of applications of mechanism design: complete contracts (or, indeed, conditional contracts of any kind!) appear to be quite rare in the real world. One reason for this may be that agents are simply unaware of what they can do, an idea explored by von Thadden and Zhao in this article as well as by Rubinstein and Glazer in a separate 2012 paper in the JPE. I like the opening example in Rubinstein and Glazer:

“I went to a bar and was told it was full. I asked the bar hostess by what time one should arrive in order to get in. She said by 12 PM and that once the bar is full you can only get in if you are meeting a friend who is already inside. So I lied and said that my friend was already inside. Without having been told, I would not have known which of the possible lies to tell in order to get in.”

The contract itself gave the agent the necessary information. If I don’t specify the rule that patrons whose friend is inside are allowed entry, then only those who are aware of that possibility will ask. Of course, some patrons who I do wish to allow in, because their friend actually is inside, won’t know to ask unless I tell them. If the harm to the bar from previously unaware people learning and then lying overwhelms the gain from allowing unaware friends in, then the bar is better off not giving an explicit “contract”. Similar problems occur all the time. There are lots of behavioral explanations (recall the famous Israeli daycare which was said to have primed people into an “economic relationship” state of mind by setting a fine for picking kids up late, leading to more lateness, not less). But the bar story above relies on no behavioral action aside from agents having a default (ask about the friend clause if aware, or don’t ask if unaware) which can be removed if agents are informed about their real possible actions when given a contract.

When all agents are unaware, the tradeoff is simple, as above: I make everyone aware of their true actions if the cost of providing incentive rents is exceeded by the benefit of agents switching to actions I prefer more. Imagine that agents can not clean, partially clean, or fully clean their tools at the end of the workday (giving some stochastic output of cleanliness). They get no direct utility out of cleaning, and indeed get disutility the more time they spend cleaning. If there is no contract, they default to partially cleaning. If there is a contract, then if all cleaning pays the same the agent will exert zero effort and not clean. The only reason I might offer high-powered incentives, then, is if the benefit of getting agents to fully clean their tools exceeds the IC rents I will have to pay them once the contract is in place.

More interesting is the case with aware and unaware agents, when I don’t know which agent is which. The unaware agents get contracts that pay the same wage no matter what their output, and the aware agents can get high-powered incentives. Solving the contracting problem involves a number of technical difficulties (standard envelope theorem arguments won’t work), but the solution is fairly intuitive. Offer two incomplete wage contracts w(x) and v(x). Let v(x) just fully insure: no matter what the output, the wage is the same. Let w(x) increase with better outputs. Choose the full insurance wage v low enough that the unaware agents’ participation constraint just binds. Then offer just enough rents in w(x) that the aware agents, who can take any action they want, actually take the planner preferred action. Unlike in a standard screening problem, I can manipulate this problem by just telling unaware agents about their possible actions: it turns out that profits only increase by making these agents aware if there are sufficiently few unaware agents in the population.

Some interesting sidenotes. Unawareness is “stable” in the sense that unaware agents will never be told they are unaware, and hence if we played this game for two periods, they would remain unaware. It is not optimal for aware agents to make unaware agents aware, since the aware earn information rents as a result of that unawareness. It is not optimal for the planner to make unaware agents aware: the firm is maximizing total profit, announcements strictly decrease wages of aware agents (by taking their information rents), and don’t change unaware agents’ rents (they get zero since their wage is always chosen to make their PC bind, as is usual for “low types” in screening problems). Interesting.

2009 working paper (IDEAS). Final version in REStud 2012. The Rubinstein/Glazer paper takes a slightly different tack. Roughly, it says that contract designers can write a codex of rules, where you are accepted if you satisfy all the rules. An agent made aware of the rules can figure out how to lie if it involves only lying about one rule. A patient, for instance, may want a painkiller prescription. He can lie about any (unverifiable) condition, but he is only smart enough to lie once. The question is, which codices are not manipulable?

“On the Equivalence of Bayesian and Dominant Strategy Implementation,” A. Gershkov et al (2012)

Let’s take a quick break from the job market. Bayesian incentive compatibility is not totally satisfying when trying to implement some mechanism, as each agent must care about the rationality and actions of other agents. Dominant strategy incentive compatibility (DSIC) is much cleaner: no matter what other agents do, the action that makes me best off is just to state my preferences truthfully. For example, the celebrated Vickrey-Clarke-Groves mechanism is dominant strategy incentive compatible. Even if other agents lie or make mistakes, the transfers I pay (or receive) are set so that I have just enough incentive to report my true type.

Many clever mechanisms are not DSIC, however. Consider the Cremer-McLean auction with correlated types. There are a bunch of bidders with correlated private values for an oil field. If the values were independent, in the optimal mechanism, the winning bidder gets an information rent. Consider a first price auction, where all of our values are distributed uniformly on [0,1]. I draw .7 for my value. I won’t actually bid .7, because bidding slightly lower increases my profit when I win, but only slightly decreases my probability of winning. If the auctioneer wants to use a scheme that reveals my true value, he better give me some incentive to reveal; the second-price auction does just that, by making me pay only the second-highest bid if I win.
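The bid shading story is easy to check numerically. What follows is only a sketch under assumptions that are mine, not the paper's: two bidders, values i.i.d. uniform on [0,1], and a rival who bids half her value in the first-price auction (the textbook equilibrium for this case). The function names are made up for illustration. The first pair of functions finds my best first-price bid by grid search; the second pair checks, on a grid, that bidding my true value in a second-price auction is never worse than any other bid.

```python
# Illustrative sketch: two bidders, values i.i.d. uniform on [0, 1] (an assumption).

def first_price_payoff(value, bid):
    """Expected payoff when the rival bids half her value (assumed equilibrium rule)."""
    # The rival's bid is uniform on [0, 0.5], so P(win) = P(rival_value / 2 < bid).
    win_prob = min(2.0 * bid, 1.0)
    return (value - bid) * win_prob


def best_first_price_bid(value, steps=1000):
    """Grid search for the bid that maximizes expected first-price payoff."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda b: first_price_payoff(value, b))


def second_price_payoff(value, bid, rival_bid):
    """Ex post payoff in a second-price auction (ties go to the rival)."""
    return value - rival_bid if bid > rival_bid else 0.0


def truthful_is_dominant(value, steps=100):
    """Check on a grid that no bid ever beats bidding the true value."""
    grid = [i / steps for i in range(steps + 1)]
    return all(
        second_price_payoff(value, value, r) >= second_price_payoff(value, b, r)
        for r in grid
        for b in grid
    )


if __name__ == "__main__":
    print(best_first_price_bid(0.7))   # the shaded first-price bid
    print(truthful_is_dominant(0.7))   # whether truth-telling is never beaten
```

With a value of .7, the grid search lands on a bid of half my value, which is exactly the shading described above, while in the second-price auction no deviation from truthful bidding ever helps regardless of what the rival bids.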

Cremer-McLean does the following. If I want to enter the auction, I need to make a side bet that gives me zero expected payoff if both I and everyone else report our true types, and a very negative payoff if somebody lies. In particular, charge me an entry fee that depends only on what I think other people’s bids will be. Conditional on everyone else telling the truth, I am perfectly happy as a Bayesian to report truthfully. My willingness to accept the bet reveals, because of correlated private values, something about my own private value. But now the auctioneer knows our true values, and hence can extract the full surplus in the auction without paying any information rents. Many, many people – Milgrom and Wilson, famously – find this mechanism rather unsatisfying in the real world. Certainly, it’s tough to think of any social choice mechanism that uses the Cremer-McLean strategy, or even something similar. This may be because it relies heavily on knife-edge Bayesian reasoning and common knowledge of rationality among the players, much stronger conditions than we need for dominant strategy incentive compatibility.

Manelli and Vincent have a 2010 Econometrica with a phenomenal result: in many simple cases, Bayesian IC gives me nothing that I can’t get from DSIC. This makes sense in some ways: consider the equivalence of the Bayesian mechanism of a sealed-bid auction and the dominant strategy mechanism of a second price auction. What Gershkov et al do is extend that equivalence to a much broader class of social choice implementation. In particular, take any social choice function with one-dimensional independent types, and quasi-linear utility for each agent. If there is a Bayesian IC mechanism to implement some social choice, then I can write down allocations and transfers which are DSIC and give the exact same interim (meaning after individuals learn their private types) expected utility for every agent. That is, in any auction with independent private values and linear utility, there is nothing I can do with Bayesian mechanisms that I can’t do with much more plausible mechanisms.

How does this work? Recall that the biggest difference between Bayesian IC and DSIC is that a mechanism is Bayesian IC (on a connected type space) if expected utility from an allocation rule is non-decreasing in my own type, and DSIC if utility from the allocation rule is non-decreasing in my own type no matter what the types of the other agents. Gershkov et al give the following example. I want to give an object to the agent with the highest value, as long as her value is not more than .5 higher than the other agent’s. Both agents’ values are independently drawn from U[0,1]. If the difference between the two is more than .5, I want to allocate the good to no one. Just giving an agent the good with probability 1 if his type is higher than the other agent’s report and lower than the other agent’s report plus .5 is Bayesian incentive compatible (the marginal of expected utility is nondecreasing in my type, so there must exist transfer payments that implement), but not DSIC: if the other agent reports his type minus .1, then I want to shade an equal amount. However, consider just giving an agent the good with probability equal to the minimum of his report and .5. If my type is .7, then I get the object with probability .5. This is exactly the interim probability I would get the object in the Bayesian mechanism. Further, the allocation probability is increasing in my own type no matter what the other agent’s type, so there must exist transfer payments that implement in dominant strategies. The general proof relies on extending a mathematical proof from the early 1990s: if a bounded, non-negative function of several variables generates monotone, one-dimensional marginals, then there must exist a non-negative function with the same bound, and the same marginals, that is monotone in each coordinate. The first function looks a lot like the condition on allocation rules for Bayesian IC, and the second the condition on allocation rules for DSIC…
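The two-agent example can be verified in a few lines. This is a minimal sketch, not anything from the paper: the function names are hypothetical, and I approximate the average over the other agent's uniform type with a midpoint grid. It computes the interim probability of winning under the "higher, but by less than .5" rule, compares it to min(type, .5), and checks whether each rule is monotone in my own type for a fixed type of the other agent.

```python
def bayesian_rule(own, other):
    """Allocate to `own` iff own is above other, but by less than 0.5."""
    return 1.0 if other < own < other + 0.5 else 0.0


def dsic_rule(own, other):
    """Allocate with probability min(own, 0.5), ignoring the other agent's type."""
    return min(own, 0.5)


def interim_probability(rule, own, steps=10000):
    """Average the rule over the other agent's type, uniform on [0, 1] (midpoint grid)."""
    total = sum(rule(own, (k + 0.5) / steps) for k in range(steps))
    return total / steps


def is_monotone_in_own_type(rule, other, steps=100):
    """Check that the allocation never falls as own type rises, holding `other` fixed."""
    values = [rule(i / steps, other) for i in range(steps + 1)]
    return all(a <= b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    for t in (0.3, 0.7):
        print(t, interim_probability(bayesian_rule, t), interim_probability(dsic_rule, t))
    print(is_monotone_in_own_type(bayesian_rule, 0.2))  # fails once own exceeds other + 0.5
    print(is_monotone_in_own_type(dsic_rule, 0.2))
```

The interim probabilities agree across the two rules, but only the second is monotone in my type for every fixed type of the other agent: against an opponent at .2, the first rule hands me the good at .6 and takes it away at .8.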

Final Econometrica preprint (IDEAS version)

“Learning About the Future and Dynamic Efficiency,” A. Gershkov & B. Moldovanu (2009)

How am I to set a price when buyers arrive over time and I have a good that will expire, such as a baseball ticket or an airplane seat?  “Yield management” pricing is widespread in industries like these, but the standard methods tend to involve nonstrategic agents.  But a lack of myopia can sometimes be very profitable.  Consider a home sale.  Buyers arrive slowly, and the seller doesn’t know the distribution of potential buyer values.  It’s possible that if I report a high value when I arrive first, the seller will Bayesian update about the future and will not sell me the house, since they believe that other buyers also value the house highly. If I report a low value, however, I may get the house.

Consider the following numerical example from Gershkov and Moldovanu. There are two agents, one arriving now and one arriving tomorrow. The seller doesn’t know whether the agent values are IID in [0,1] or IID in [1,2], but puts 50 percent weight on each possibility. With complete information, the dynamically efficient thing to do would be to sell to the first agent if she reports a value in [.5,1]U[1.5,2]. With incomplete information, however, there is no transfer that can simultaneously get the first agent to tell the truth when her value is in [.5,1] and tell the truth when her value is in [1,1.5]. By the revelation principle, then, there can be no dynamically efficient pricing mechanism.
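One way to see the problem is that the efficient allocation to the first agent is not monotone in her report, and with one-dimensional types and quasi-linear utility a non-monotone allocation cannot be supported by any transfers. Here is a rough sketch, assuming (my assumption, consistent with the cutoffs above) that values are uniform on the relevant interval, so the second buyer's expected value is .5 or 1.5. The function names are hypothetical.

```python
def efficient_sale_to_first(value):
    """Efficient decision once the first report reveals the support (uniform values assumed)."""
    # A report in [0, 1] reveals the low distribution (second buyer's mean is 0.5);
    # a report in (1, 2] reveals the high distribution (mean is 1.5).
    expected_second = 0.5 if value <= 1.0 else 1.5
    return value >= expected_second


def monotonicity_violations(steps=200):
    """Return adjacent grid points where a higher report loses the good."""
    grid = [2.0 * i / steps for i in range(steps + 1)]
    decisions = [efficient_sale_to_first(v) for v in grid]
    return [
        (lo, hi)
        for lo, hi, sold_lo, sold_hi in zip(grid, grid[1:], decisions, decisions[1:])
        if sold_lo and not sold_hi
    ]


if __name__ == "__main__":
    print(monotonicity_violations())
```

The only violation sits right at a value of 1: a first buyer just above 1 would rather report something in [.5,1] and get the house, which is the incentive problem described above.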

Consider a more general problem, with N goods with qualities q1, q2, ..., qN, and one buyer arriving each period. The buyer has a value x(i) drawn from a distribution F, and he gets utility x(i)*q(j) if he receives good j. Incomplete information by itself turns out not to be a major problem, as long as the seller knows the distribution: just find the optimal history-dependent cutoffs using a well-known result from Operations Research, then choose VCG style payments to ensure each agent reports truthfully. If the distribution from which buyer values are drawn is unknown, as in the example above, then the seller learns what the optimal cutoffs should be from the buyers’ reports. Unsurprisingly, we will need something like the following: since cutoffs depend on my report, implementation depends on the maximal amount the cutoff can change having a derivative less than one in my type. If the derivative is less than one, then the multiplicative nature of buyer utilities means that there will be no incentive to lie about your valuation in order to alter the seller’s beliefs about the buyer value distribution.

http://www.econ2.uni-bonn.de/moldovanu/pdf/learning-about-the-future-and-dynamic-efficiency.pdf (IDEAS version).  Final version published in the September 2009 AER. I previously wrote about a followup by the same authors for the case where the seller does not observe the arrival time of potential buyers, in addition to not knowing the buyer’s values.  
