Filed under Game Theory

“Long Cheap Talk,” R. Aumann & S. Hart (2003)

I wonder if Crawford and Sobel knew just what they were starting when they wrote their canonical cheap talk paper – it is incredible how much more we know about the value of cheap communication even when agents are biased. Most importantly, it is not true that bias or self-interest means we must always require people to “put skin in the game” or perform some costly action in order to prove the true state of their private information. A colleague passed along this paper by Aumann and Hart which addresses a question that has long bedeviled students of repeated games: why don’t they end right away? (And fair notice: we once had a full office shrine, complete with votive candles, to Aumann, he of the antediluvian beard and two volume tome, so you could say we’re fans!)

Take a really simple cheap talk game, where only one agent has any useful information. Row knows what game we are playing, and Column only knows the probability distribution of such games. In the absence of conflict (say, where there are two symmetric games, each of which has one Pareto optimal equilibrium), Row first tells Column that which game is the true one, this is credible, and so Column plays the Pareto optimal action. In other cases, we know from Crawford-Sobel logic that partial revelation may be useful even when there are conflicts of interest: Row tells Column with some probability what the true game is. We can also create new equilibria by using talk to reach “compromise”. Take a Battle of the Sexes, with LL payoff (6,2), RR (2,6) and LR=RL=(0,0). The equilibria of the simultaneous game without cheap talk are LL,RR, or randomize 3/4 on your preferred location and 1/4 of the opponent’s preferred location. But a new equilibria is possible if we can use talk to create a public randomization device. We both write down 1 or 2 on a piece of paper, then show the papers to each other. If the sum is even, we both go LL. If the sum is odd, we both then go RR. This gives ex-ante payoff (4,4), which is not an equilibrium payoff without the cheap talk.

So how do multiple rounds help us? They allow us to combine these motives for cheap talk. Take an extended Battle of the Sexes, with a third action A available to Column. LL still pays off (6,2), RR still (2,6) and LR=RL=(0,0). RA or LA pays off (3,0). Before we begin play, we may be playing extended Battle of the Sexes, or we may be playing a game Last Option that pays off 0 to both players unless Column plays A, in which case both players get 4; both games are equally probable ex-ante, and only Row learns which game we actually in. Here, we can enforce a payoff of (4,4) if, when the game is actually extended Battle of the Sexes, we randomize between L and R as in the previous paragraph, but if the game is Last Option, Column always plays A. But the order in which we publicly randomize and reveal information matters! If we first randomize, then reveal which game we are playing, then whenever the public randomization causes us to play RR (giving row player a payoff of 2 in Battle of the Sexes), Row will afterwards have the incentive to claim we are actually playing Last Resort. But if Row first reveals which game we are playing, and then we randomize if we are playing extended Battle of the Sexes, we indeed enforce ex-ante expected payoff (4,4).

Aumann and Hart show precisely what can be achieved with arbitrarily long strings of cheap talk, using a clever geometric proof which is far too complex to even summarize here. But a nice example of how really long cheap talk of this fashion can be used is in a paper by Krishna and Morgan called The Art of Conversation. Take a standard Crawford-Sobel model. The true state of the world is drawn uniformly from [0,1]. I know the true state, and get utility which is maximized when you take action on [0,1] as close as possible to the true state of the world plus .1. Your utility is maximized when you take action as close as possible to the true state of the world. With this “bias”, there is a partially informative one-shot cheap talk equilibrium: I tell you whether we are in [0,.3] or [.3,1] and you in turn take action either .15 or .65. How might we do better with a string of cheap talk? Try the following: first I tell you whether we are in [0,.2] or [.2,1]. If I say we are in the low interval, you take action .1. If I say we are in the high interval, we perform a public randomization which ends the game (with you taking action .6) with probability 4/9 and continues the game with probability 5/9; for example, to publicly randomize we might both shout out numbers between 1 and 9, and if the difference is 4 or less, we continue. If we continue, I tell you whether we are in [.2,.4] or [.4,1]. If I say [.2,.4], you take action .3, else you take action .7. It is easy to calculate that both players are better off ex-ante that in the one-shot cheap talk game. The probabilities 4/9 and 5/9 were chosen so as to make each player indifferent from following the proposed equilibrium after the randomization or not.

The usefulness of the lotteries interspersed with the partial revelation are to let the sender credibly reveal more information. If there were no lottery, but instead we always continued with probability 1, look at what happens when the true state of nature is .19. The sender knows he can say in the first revelation that, actually, we are on [.2,1], then in the second revelation that, actually, we are on [.2,4], in which case the receiver plays .3 (which is almost exactly sender’s ideal point .29). Hence without the lotteries, the sender has an incentive to lie at the first revelation stage. That is, cheap talk can serve to give us jointly controlled lotteries in between successive revelation of information, and in so doing, improve our payoffs.

Final published Econometrica 2003 copy (IDEAS). Sequential cheap talk has had many interesting uses. I particularly enjoyed this 2008 AER by Alonso, Dessein and Matouschek. The gist is the following: it is often thought that the tradeoff between decentralized firms and centralized firms is more local control in exchange for more difficult coordination. But think hard about what information will be transmitted by regional managers who only care about their own division’s profits. As coordination becomes more important, the optimal strategy in my division is more closely linked to the optimal decision in other divisions. Hence I, the regional manager, have a greater incentive to freely share information with other regional managers than in the situation where coordination is less important. You may prefer centralized decision-making when cooperation is least important because this is when individual managers are least likely to freely share useful information with each other.

“Between Proof and Truth,” J. Boyer & G. Sandu (2012)

In the previous post, I promised that game theory has applications in pure philosophy. Some of these applications are economic in nature – a colleague has written a paper using a branch of mechanism design to link the seemingly disjoint methodologies of induction and falsification – but others really are pure. In particular, there is a branch of zero-sum games which deals with the fundamental nature of truth itself!

Verificationist theories of truth like those proposed by Michael Dummett suggest that a statement is true if we can prove it to be true. This may seem trivial, but it absolutely is not: for one, to the extent that there are unprovable statements as in some mathematical systems, we ought say “that statement is neither true nor false” not simply “we cannot prove that statement true or false”. Another school of thought, beginning with Tarski’s formal logics and expanded by Hintikka, says that truth is a property held by sentences. The statement “Snow is white” is true if and only if there is a thing called snow, and it is necessarily white. The statement “A or B” is true if and only if the thing denoted by “A” is true or the thing denoted by “B” is true.

These may seem very different conceptions. The first is a property requiring action – something is true if someone can verify it. The second seems more in the air – something is true if its sentence has certain properties. But take any sentence in formal logic, like “There exists A such that for all B either C or D is true”. We can play a game between Verifier and Falsifier. Move left to right across the sentence, letting the Verifier choose at all existentials and “or” statements, and the Falsifier choose at all universals and “and” statements. Verifier wins if he can get to the end of sentence with the sentence remaining true. That is, semantic games take Tarksi truth and make it playable by someone with agency, at least in principle.

The paper by Boyer and Sandu takes this as a starting point, and discusses when Dunnett’s truth coincides with Tarksi and Hintikka’s truth, restricting ourselves to semantic games played on recursive structures (nonconstructive winning strategies in the semantic game seems problematic if we want to relate truth in semantic games to verificationist truth!) Take statements in Peano arithmetic where all objects chosen are natural numbers (it happens to be truth that in PA, every recursive structure is isomorphic to the natural numbers). When is every statement I can prove also true in the sense of a winning strategy in the recursive semantic game? Conversely, when can the semantic game truth of a sentence by given by a proof? The answer to both is negative. For the first, check that the sentence that all programs x1 and inputs x2, there exists a number of steps y such that the system either halts after y steps or it does not halt. This is the halting problem. It is not decidable, hence there is no winning strategy for Verifier, but the sentence if trivially provable in Peano arithmetic by the law of the excluded middle.

Boyer and Sandu note (as known in an earlier literature) that we can relate the two types of truth by extending the semantic game to allow backward moves. That is, at any node, or at the end of the game, Verifier can go back to any node she played and change her action. Verifier wins if she has a finite winning strategy. It turns out that Verifier can win in the game with backward moves if and only if she can win in the standard game. Further, if a statement can be proven, Verifier can win in the game with backward moves using a recursive strategy. This has some interesting implications for Godel sentences (“This sentence is not provable within the current system.”) which I don’t wish to discuss here.

Note that all of this is just the use of game theory in “games against nature”. We usually think of game theory as being a tool for the analysis of situations with strategic interaction, but the condition that players are rational perfect optimizers means that, in zero sum games, checking whether something is possible for some player just involves checking whether a player called Nature has a winning strategy against him. This technique is so broadly applicable, in economics and otherwise, that we ought really be careful about defining game theory as solely a tool for analyzing the strategies of multiple “actual” agents; e.g., Wikipedia quotes Myerson’s definition that game theory is “the study of mathematical models of conflict and cooperation between intelligent rational decision-makers”. This is too limiting.

Final copy. This article appeared in Synthese 187/March 2012. Philosophers seem to rarely put their working papers online, but Springer has taken down the paywall on Synthese throughout December, so you can read the above link even without a subscription.

Game Theory and History, A. Greif & Friends (1993, 1994)

(This post refers to A. Greif, “Contract Enforceability and Economic Institutions in Early Trade: The Maghribi Traders’ Coalition”, AER 1993, and A. Greif, P. Milgrom & B. Weingast, “Coordination, Commitment and Enforcement: The Case of the Merchant Guild,” JPE 1994.)

Game theory, after a rough start, may actually be fulfilling its role as proposed by Herbert Gintis: unifier of the sciences. It goes without saying that game theoretic analysis is widespread in economics, political science (e.g., voter behavior), sociology (network games), law (antitrust), computer science (defending networks against attacks), biology (evolutionary strategies), pure philosophy (more on this in a post tomorrow!), with occasional appearances in psychology, religion (recall Aumann’s Talmud paper), physics (quantum games), etc. But history? Surely game theory, particularly the more complex recent results, has no place there? Yet Avner Greif, an economic historian at Stanford, has shown that games can play a very interesting role indeed in understanding historical events.

Consider first his Maghribi traders paper. In the 11th and 12th century, a group of Judeo-Arabic traders called the Maghribis traded across the Mediterranean. Two institutional aspects of their trade are interesting. First, they all hired agents in foreign cities to carry out their trade, and second, they generally used other Maghribi merchants as their agents. This is quite different from, for instance, Italy, where merchants tended to hire agents in foreign cities who were not themselves merchants. What explains that difference, and more generally, how can long distance traders insure that traders do not rip you off? For instance, how do I keep them from claiming they sold at a low price when actually they sold at a high one?

To a theorist, this looks like a repeated reputational game with imperfect monitoring. Greif doesn’t go the easy route and just assume there are trustworthy and untrustworthy types. Rather, he assumes that there are a set of potential agents who can be hired in each period, that agents are exogenously separated from merchants with probability p in each period, and that merchants can choose to hire and fire at any wage they choose. You probably know from economics of reputation or from the efficiency wage literature that I need to offer wages higher than the agent’s outside option to keep him from stealing; the value of the continuation game, then, is more than the value of stealing now. Imagine that I fire anyone who steals and never hire him again. How do I ensure that other firms do not then hire that same agent (perhaps the agent will say, “Look, give me a second chance and I will work at a lower wage”)? Well, an agent who has cheated one merchant will never be hired by that merchant again. This means that when he is in the unemployed pool, even if other merchants are willing to hire him, his probability of getting hired is lower, since one merchant will definitely not hire him. That means that the continuation value of the game if he doesn’t steal from me is lower. Therefore, the efficiency wage I must pay him to keep him from stealing is higher than the efficiency wage I can pay someone who hasn’t ever stolen, so I strictly prefer to hire agents who have never stolen. This allows the whole coalition to coordinate. Note that the fewer agents there are, the higher the continuation value from not stealing, and hence the lower the efficiency wage I can pay: it is optimal to keep the set of potential agents small.

What of the Italian merchants? Why do they not hire only each other? Maghribi merchants tended to be involved only in long distance trade, while Italian merchants were also involved in real estate and other pursuits. This means the outside option (continuation value after cheating if no one hires me again) is higher for Italian merchants than Maghribi merchants, which means that hiring merchants at the necessary efficiency wage will be relatively more expensive for Italians than Maghribis.

A followup by Greif, with Milgrom and Weingast, considers the problem of long distance trade from the perspective of cities. Forget about keeping your agent from ripping you off: how do you keep the city from ripping you off? For instance, Genoans in Constantinople had their district overrun by a mob at one point, with no compensation offered. Sicilians raised taxes on sales by Jews at one point after they had brought their goods for sale. You may naively think that reputation alone will be enough; I won’t rip anyone off because I want a reputation of being a safe and fair city for trade.

But again, the literature of repeated games tells us this will not work. Generally, I need to punish deviations from the efficient set of strategies, and punish those who themselves do not punish deviators. In terms of medieval trade, to keep a city from ripping me off, I need not only to punish the city by bringing it less trade, but I also need to make sure the city doesn’t make up for my lost trade by offering a special deal to some other trader. That is, I need to get information about violation against a single trader to other traders, and I need to make sure they are willing to punish the deviating city.

The merchant guild was the institution that solved this problem. Merchant guilds were able to punish their own members by, for example, keeping them from earning rents from special privileges in their own city. In the most general setting, when a guild orders a boycott, cities may be able to attract some trade, but less than the efficient amount, because only by offering a particularly good deal to the merchants who come during a boycott will entice them to come and to credibly believe the city will not steal.

This is all to say that strong guilds may be in the best interest of cities since they allow the city to solve its commitment problem. The historical record confirms many examples of cities encouraging guilds to come trade, and encouraging the strengthening of guilds. Only a reputational model like the above one can explain such city behavior; if guilds are merely extracting rents with monopoly privilege, cities would not encourage them all. Both of these papers, I think, are quite brilliant.

1993 AER (IDEAS version) and 1994 JPE (IDEAS version). Big thumbs up to Avner for having the final published versions of these papers on his website.

“Unbeatable Imitation,” P. Duersch, J. Oechssler & B. C. Schipper (2012)

People, particularly in relatively unimportant situations, rely on heuristics rather than completely rational foresight. But using heuristics in modeling seems to me undesirable, because players using heuristics can easily be abused by more strategic players. For instance, consider the game of fighter pilot Chicken as in the movie Top Gun. Both players prefer going straight while the opponent swerves to swerving when the opponent swerves (hence showing lack of nerve) to swerving when the opponent goes straight (hence showing a unique lack of nerve) to going straight when the opponent goes straight (hence crashing). Consider playing Chicken over and over against an heuristic-based opponent. Perhaps the opponent simply best responds to whatever you did in the previous period. In this case, if I go straight in period 1, the opponent swerves in the next period, and if I swerve, the opponent goes straight. Therefore, I’ll go straight in periods 1 through infinity, knowing my opponent will swerve in every period except possibly the first. The sophisticated player will earn a much higher payoff than the unsophisticated one.

Duersch et al show that, in every 2×2 symmetric game and in a large class of N-by-N symmetric two-player games, a simple heuristic called “imitation” has an undiscounted average payoff identical to that which can be achieved by an opponent playing any strategy at all. In imitation, I retain my strategy each period unless the opponent earned strictly more than I did in the previous period, in which case I copy him. Consider Chicken again. If I go straight and the opponent swerves, then I know he will go straight in the following period. In the next period, then, I can either crash into him (causing him to swerve two periods on) or swerve myself (causing him to go straight two periods on). In any case, I can at best get my opponent to swerve while I go straight once every two periods. By symmetry, in the periods where this doesn’t happen, I can at best get the payoff from swerving when my opponent goes straight, meaning my average payoff is no better than my heuristic-based opponent! This is true no matter what strategy is used against the imitating opponent.

Now imitation will fail in many games, of course. Consider Rock-Paper-Scissors. If I play Rock when you play Scissors, then since you imitate, you will switch to Rock in the next period. Knowing this, I will play Paper, and so on, winning every period. Games that have this type of cycling possibility allow me to extract arbitrarily larger higher payoff than the imitating opponent. What’s interesting is that, in finite symmetric two-player games between an imitator and an agent with perfect rationality, games with a possibility of cycling in some subgame are the only ones in which the imitator does not earn the same average payoff per period as the rational player! Checking this condition is difficult, but games with no pure equilibrium in the relative payoff game (i.e., the game where payoffs for each player are equal to the difference in payoffs between players in the original game, hence making the original game zero-sum) always have a cycle, and games which are quasiconcave never do. Many common games (oligopoly competition, Nash bargaining, etc.) can be written as quasiconcave games.

Imitation is really pretty unique. The authors give the example of a 3×3 symmetric oligopoly game, where strategy 1 is “produce Cournot quantity”, strategy 2 is “produce Stackelberg follower quantity” and strategy 3 is “produce Stackelberg leader quantity.” The game has no subgames with cycles as defined above, and hence imitators and the rational player earn the same average payoff (if rational player plays Stackelberg leader and I play something else, then he earns more than me, hence I imitate him next period, hence he best responds by playing Stackelberg follower). Other heuristics do much worse than imitation. A heuristic where you simply best reply just plays Stackelberg follower forever, for example.

This result is quite interesting, and the paper is short; the “useful insight on the worst page” test of a quality paper is easily satisfied. I like this work too because it is related to some ideas I have about the benefits of going first. Consider shifting a symmetric simultaneous game to a symmetric sequential game. Going first has no benefit except that it allows me to commit to my action (and many negatives, of course, including the inability to mix strategies). Likewise a heuristic rule allows the heuristic player to commit to actions without assuming perfection of the equilibrium. So there is a link between “optimal” heuristics and the desire of a rational player to commit to his action in advance if he could so choose.

November 2011 Working Paper (IDEAS version). Final version published in the September 2012 issue of Games and Economic Behavior.

“Aggregate Comparative Statics,” D. Acemoglu & M. K. Jensen (2011)

Enormous parts of economic theory are devoted to “comparative statics”: if one variable changes (one chickpea firm’s cost structure decreases, demand for pineapples goes up, firms receive better information about the size of an oil field before bidding on rights to that field), does another variable increase or decrease (chickpea prices rise or fall, equilibrium supply of pineapples rises or falls, revenue for the oil field auctioneer rises or falls)?

There are a couple ways we can check comparative statics. The traditional way is using the implicit function theorem. Call t the variable you wish to change, and x the endogenous variable who comparative static you wish to check. If you have a model for which you can solve for x as a function of t, then finding the sign of dx/dt (or the nondifferential analogue) is straightforward. This is rare indeed in many common economic models; for instance, if x is supply of firm 1 in a Cournot duopoly, and t is the marginal cost of firm 2, and all I know is that demand satisfies certain properties, then since I haven’t even specified the functional form of demand, I will not be able to take an explicit derivative. We may still be in luck, however. Let f(x(t),t) be the function an agent is maximizing, which depends on x the agent can choose, and t which is exogenous. If the optimal choice x* lies in the interior of a convex strategy set, and f is concave and twice differentiable, then the first order approach applies, and hence at any optima, f’(x*,t)=0. By the chain rule, f’(x*,t)=0=(df/dx*)(dx*/dt)+(df/dt), or dx*/dt=-(df/dt)/(df/dx*).

So far, so good. But often, our maximand is neither differentiable nor concave, our strategy set is not convex, or our solution is not interior. What are we to do? Here, the monotone comparative statics approach (Athey and Milgrom are the early reference here) allows us to sign without many assumptions. We still don’t have many good results for games of strategic substitutes, however; in these types of games, my best response is decreasing in your action, so if a parameter change causes you to increase your action, I will want to decrease my action. Further, there are many games where player actions are neither strategic complements nor strategic substitutes. A good example is a patent race. If you increase your effort, I may best respond by increasing my effort (in order to “stay close”) or decreasing my effort (because I am now so far behind that I have no chance of inventing first).

Acemoglu and Jensen show that both types of games can be handled nicely if the games are “aggregative games” where my action depends only on the aggregate output of all firms rather than the individual actions of other firms. My choice of production in a Cournot oligopoly depends only on aggregate production by all firms, and my choice of effort in an independent Poisson patent race depends only on the cumulative hazard rate of invention. In such a case, if the game is one of strategic substitutes in the choice variable x, and the aggregate quantity is some increasing function of the sum of all x and t, then an increase in t leads to a decrease in equilibrium choices x, and entry by another player increases the aggregate quantity. If t(i) is an idiosyncratic exogenous variable that affects i’s choice, and only affects other players’ actions through the aggregate x, then an increase in t(i) increases x(i) and decreases x(j) for all other j. [When I say "increase" and "decrease" for the equilibrium quantities, when there are multiple equilibria, this means that the entire set of equilibria quantities shift up or down.]

For example, take a Cournot game, where my profits are s(i)*P(X+t)-c(i)(s(i),t(i)), where s(i) is my supply choice, P(X+t) is price given the total industry supply plus a demand shifter t, and c(i) is my cost function for producing s(i) given some idiosyncratic cost shifter t(i) such that s(i) and t(i) obey the single crossing property. Assume P is twice differentiable. Then taking the derivative of the profit function with respect to other players’ supply, we have that my supply is a strategic substitute with other’s supply iff P’+s(i)P”<0. Note that this condition does not depend on any assumption about the cost function c. Then using the theorem in the last paragraph, we have that, in a Cournot game with *any* cost function, a decrease in the demand curve decreases total industry supply, an additional entrant decreases equilibrium supply from existing firms, and a decrease in costs for one firm increases that firm's equilibrium output and decreases the output of all other firms. All we assumed to get this was the strategic substitutes property, twice-differentiability of P, and the single crossing property of the cost function in s and t. As with many comparative statics results, checking strategic substitutability and making sure the signs of s and t are correct for the interpretation of results is often easier than taking implicit derivatives even if your assumptions are such that the implicit approach could conceivably work.

If the strategic substitutes property does not hold, as in a patent race, then with compact, convex strategy sets, a payoff function pseudoconcave in own player strategies, a boundary condition, and a “local solvability” condition are sufficient to get the same results. Local solvability is roughly the condition which guarantees that the player’s own effect on the aggregate when she reallocates after a change in t or t(i) does not alter the result from the previous paragraph enough to flip the comparative static’s sign.

April 2011 Working Paper, as yet unpublished (IDEAS version).

“Evolutionary Dynamics and Backward Induction,” S. Hart (2002)

Let’s follow up yesterday’s post with an older paper on evolutionary games by the always-lucid Sergiu Hart. As noted in the last post, there are many evolutionary dynamics for which the rest points of the evolutionary game played by completely myopic agents and the Nash equilibria of the equivalent static game played by strategic games coincide, which is really quite phenomenal (and since you know there are payoff-suboptimal Nash equilibria, results of this kind have, since Maynard Smith (1973), fundamentally changed our understanding of biology). Nash equilibria is a low bar, however. Since Kuhn (1953), we have also known that every finite game has a backward induction equilibrium, what we now call the subgame perfect equilibrium, in pure strategies. When does the invariant limit distribution of an extensive form evolutionary game coincide with the backward induction equilibrium? (A quick mathematical note: an evolutionary system with mutation, allowing any strategy to “mutate” on some agent with some probability in each state, means that by pure luck the system can move from any state to any other state. We also allow evolutionary systems to have selection, meaning that with some probability in each state an agent switches from his current strategy to one with a higher payoff. This process defines a Markov chain, and since the game is finite and the mutations allow us to reach any state, it is a finite irreducible Markov chain. Such Markov chains have a unique invariant distribution in the limit.)

In general, we can have limit distributions of evolutionary processes that are not the backward induction equilibrium. Consider the following three step game. Agent 1 chooses C or B, then if C was chosen, agent 2 (in the agent-normal form) chooses C2 or B2, then if C and B2 were chosen, agent 3 chooses C3 or B3. The payoff to each agent when B is chosen is (4,0,0), when C and C2 are chosen is (5,9,0), when C, B2 and C3 are chosen is (0,0,0), and when C, B2 and B3 are chosen is (0,10,1). You can see that (C,C2,C3) and (B,B2,B3) are both Nash, but only (B,B2,B3) is subgame perfect, and hence the backward induction equilibrium. Is (B,B2,B3) the limit distribution of the evolutionary game? In the backward induction equilibrium, agent 1 chooses B at the first node, and hence nodes 2 and 3 are never reached, meaning only mutation, and not selection, affect the distribution of strategies at those nodes. Since the Markov chain is ergodic, with probability 1 the proportion of agents at node 2 playing B2 will fall below .2; when that happens, selection at node 1 will push agents toward C instead of B. When this happens, now both nodes 2 and 3 are reached with positive probability. If less than .9 of the agents in 3 are playing B3, then selection will push agents at node 2 toward C2. Selection can therefore push the percentage of agents playing B2 down to 0, and hence (C,C2,C3) can be part of the limit invariant distribution even though it is not the backward induction solution.

So is backward induction unjustifiable from an evolutionary perspective? No! Hart shows that if the number of agents goes to infinity as the probability of mutation goes to zero, then the backward induction solution, when it is unique, is also the only element in the limit invariant distribution of the evolutionary game. How does letting the number of agents go to infinity help? Let Bi be an element of the backward induction equilibrium at node i somewhere in the game tree. Bi must be a best reply in the subgame beginning with i if Bj is played in all descendant nodes by a sufficiently high proportion of the population, so if Bi is not a best reply (and hence selection does not push us toward Bi) it must be that Bj is not being played further down the game tree. If Bi is a best reply in the subgame beginning with i, then most of the population will play Bi because of selection pressures.

Now here’s the trick. Consider the problematic case in the example, when node i is not being reached in a hypothesized limit distribution (if i is reached, then since the probability of mutation goes to zero, selection is much stronger than mutation, and hence non best replies will go away in the limit). Imagine that there is another node g preceding i which is also not reached, and that i is only reached when some strategy outside the backward induction equilibrium is played in g. When g and i are not reached, there is no selection pressure, and hence no reason that the backward induction equilibrium node will be played. Large populations help here. With some small probability, an agent in g mutates such that he plays the node which reaches i. This still has no effect unless there is also a mutation in the node before g that causes g to be reached. The larger the population, the lower the probability the specific individual who mutated in g mutates back before any individual in the node before g mutates. Hence larger populations make it more likely that rare mutations in unreached nodes will “coordinate” in the way needed for selection to take over.

Final GEB version (IDEAS version). Big up to Sergiu for posting final pdfs of all of his papers on his personal website.

“Survival of Dominated Strategies Under Evolutionary Dynamics,” J. Hofbauer & W. Sandholm (2011)

There is a really interesting tension in a lot of economic rhetoric. On the one hand, we have results that derive from optimal behavior by agents with rational foresight: “price equals marginal cost in competitive markets because of profit-maximizing behavior” or “Policy A improves welfare in a dynamic general equilibrium setting with utility-maximizers”. Alternatively, though, we have explanations that rely on dynamic consequences to even non-maximizing agents: “price equals marginal cost in competitive markets because firms who price about MC are driven out by competition” or “Policy A improves welfare in a dynamic general equilibrium, and the dynamic equilibrium is sensible because firms adjust myopically as if in a tatonnement process.”

These two types of explanation, without further proof, are not necessarily the same. Profit-maximizing firms versus firms disciplined by competition give completely different welfare results under monopoly, since the non profit-maximizing monopolist can be very wasteful and yet still make positive profits. In a dynamic context, firms adjust myopically to excess demand in some markets, rather than profit-maximizing according to rational expectations, will not necessarily converge to equilibrium (a friend mentioned that Lucas made precisely this point in a paper in the 1970s).

How can we square the circle? At least in static games, there has been a lot of work here. Nash and other strategic equilibrium concepts are well known. There is also a branch of game theory going back to the 1950s, evolutionary games, where rather than choosing strategically, a probability vector lists what portion of the players are playing a given strategy at a given time, resulting in some payoffs. A revision rule, perhaps stochastic to allow for “mutations” as in biology, then tells us how the vector of strategies updates conditional on payoffs in the previous round. Fudenberg and Kreps’ learning model from the 1980s is a special case.

Amazingly, it is true for almost all sensible revision rules that the set of rest points of the dynamic includes every Nash equilibrium of the underlying static game, and further that for many revision rules the dynamic rest points are exactly equivalent to the set of Nash equilibria. We have one problem, however: dynamic systems needn’t converge to points at all, but rather may converge to cycles or other outcomes.

Hofbauer and Sandholm – Sandholm being both a graduate of my institution and probably the top economist in the world today on evolutionary games – show that for any revision rule satisfying a handful of common properties, we can construct a game where strictly dominated strategies are played with positive probability. This includes any dynamic meeting the following four properties: the population law of motion is continuous in payoffs and the current population vector, there is positive correlation between strategy growth rates and current payoffs, the dynamic is at rest iff the strategy vector is a Nash equilibrium of the underlying static game, and if an unplayed strategy has sufficiently high rewards, then with positive probability some agents begin using it. These criterion are satisfied by “excess payoff dynamics” like BNN where strategies with higher than average payoffs have higher than average growth rates, and by “pairwise comparison dynamics” where agents switch with positive probability to strategies which have higher payoff than their own current payoff. A myopic best response is not continuous, and indeed, myopic best response has been shown to eliminate strictly dominated strategies.

The proof involves a quite difficult topological construction which I don’t discuss here, but it’s worth discussing the consequence of this result. In strategic situations where we may think agents lack full rationality or rational foresight, and where we observe cycle or other non-rest behavior over time, we should be hesitant to ignore strictly dominated actions (particularly ones that are only dominated by a small amount) in our analysis of the situation. There is also scope for policy improvements: if agents are learning using a dynamic which does not rule out strictly dominated strategies, we may be able to provide information which coerces an alternative dynamic which will rule out such strategies.

Final version in Theoretical Economics 6 (IDEAS version). Another big thumbs up to the journal Theoretical Economics, the Lake Wobegon of econ, where the submissions are free, the turnaround time is fast, and all the articles are totally ungated.

“The Nash Bargaining Solution in Economic Modeling,” K. Binmore, A. Rubinsten & A. Wolinsky (1986)

If we form a joint venture, our two firms will jointly earn a profit of N dollars. If our two countries agree to this costly treaty, total world welfare will increase by the equivalent of N dollars. How should we split the profit in the joint venture case, or the costs in the case of the treaty? There are two main ways of thinking about this problem: the static bargaining approach developed first by John Nash, and bargaining outcomes that form the perfect outcome of a strategic game, for which Rubinstein (1982) really opened the field.

The Nash solution says the following. Let us have some pie of size 1 to divide. Let each of us have a threat point, S1 and S2. Then if certain axioms are followed (symmetry, invariance to unimportant transformations of the utility function, Pareto optimality and something called the IIA condition), the bargain is the one that maximizes (u1(p)-u1(S1))*(u2(1-p)-u2(S2)), where p is the share of the pie of size 1 that accrues to player 1. So if we both have linear utility, player 1 can leave and collect .3, and player 2 can leave and collect 0, but a total of 1 is earned by our joint venture, the Nash bargaining solution is the p that maximizes (p-.3)*(1-p-0); that is, p=.65. This is pretty intuitive: 1-.3-0=.7 of surplus is generated by the joint venture, and we each get our outside option plus half of that surplus.

The static outcome is not very compelling, however, as Tom Schelling long ago pointed out. In particular, the outside option looks like a noncredible threat: If player 2 refused to offer player 1 more than .31, then Player 1 would accept given his outside option is only .3. That is, in a one-shot bargaining game, any p between .3 and 1 looks like an equilibrium. It is also not totally clear how we should interpret the utility functions u1 and u2, and the threat points S1 and S2.

Rubinstein bargaining began to fix this. Let players make offers back and forth, and let there be a time period D between each offer. If no agreement is reached after T periods, we both get our outside options. Under some pretty compelling axioms, there is a unique perfect equilibrium whereby player 1 gets p* if he makes the first offer, and p** if player 2 makes the first offer. Roughly, if the time between offers is D, player 1 must offer player 2 a high enough share that player 2 is indifferent between that share today and the amount he could earn when he makes an offer in the next period. Note that the outside options do not come into play unless, say, player 1′s outside option is higher than min{p*,p**}. Note also that as D goes to 0, all of the difference in bargaining power has to do with who is more patient. Binmore et al modify this game so that, instead of discounting the future, rather there is a small chance that the gains from negotiation will disappear (“breakdown”) in between every period; for instance, we may want to form a joint venture to invent some product, but while we negotiate, another firm may swoop in and invent it. It turns out that this model, with von Neumann-Morganstern utility functions for each player (though perhaps differing levels of risk aversion) is a special case of Rubinstein bargaining.

Binmore et al prove that as D goes to zero, both strategic cases above have unique perfect equilibria equal to a Nash bargaining solution. But a Nash solution for what utility functions and threat points? The Rubinstein game limits to Nash bargaining where the difference in utilities has to do with time preference, and the threat points S1 and S2 are equal to zero. The breakdown game limits to Nash bargaining where the difference in utilities has to do with risk aversion, and the threat points S1 and S2 are equal to whatever utility we would get from the world after breakdown.

Two important points: first, it was well known that a concave transformation of a utility function leads to a worse outcome in Nash bargaining for that player. But we know from the previous paragraph that this concave transformation is equivalent to a more impatient Rubinstein bargainer: a concave transformation of the utilities in the Nash outcome has to do with changing the patience, not the risk aversion, of players. Second, Schelling was right when he argued that the Nash threat points involve noncredible threats. As long as players prefer their Rubinstein equilibrium outcome to their outside option, the outside option does not matter for the bargaining outcome. Take the example above where one player could leave the joint venture and still earn .3. The limit of Rubinstein bargaining is for each player to earn .5 from the joint venture, not .65 and .35. The fact that one player could leave the joint venture and still earn .3 is totally inconsequential to the negotiation, since the other player knows that this threat is not credible whenever the first player could earn at least .31 by staying. This point is often wildly misunderstood when people apply Nash bargaining solutions: properly defining the threat point matters!

Final RAND version (IDEAS). There has been substantial work since the 80s on the problem of bargaining, particularly in trying to construct models where delay is generated, since Rubinstein guarantees agreement immediately and real-world bargaining rarely ends in one step; unsurprisingly, these newer papers tend to rely on difficult manipulation of theorems using asymmetric information.

“Sociology and Game Theory: Contemporary and Historical Perspectives,” R. Swedberg (2001)

Game theory is clearly the dominant theoretical paradigm in economics, and hugely important in political science as well, but does not appear to have made much of an impression on sociology. There is something odd about this. Swedberg quotes Max Weber, father of both our traditions, in an essay written just after 1900:

“The historical agent, to the extent that he is acting, as we are here assuming, in a strictly “rational” way, takes into account those “conditions” of the future course of events which interest him, which are “external” to him and, as far as he knows, given in reality. He then, in his mind, fits into the causal nexus various “possible” courses of action for himself, together with the consequences to be anticipated from them in combination with those “external” conditions, in order to decide on one or another of the courses of action appropriate to his “goal” in accordance with the “possible” outcomes which he has worked out in his mind.”

Weber here gives a very nice definition of the type of methodogically individualist decision theory which we economists use today. Yet 100 years later, game theory is not accepted in the mainstream of sociology. Why?

Richard Swedberg, in an essay for Theory and Society, attempts to answer that question. Tom Schelling, he of the social dilemma and the tipping point, unsurprisingly gained some traction with sociologists in the 1950s and 1960s. Both von Neumann and Morganstern, as well as Luce and Raiffa, were reviewed in the American Sociological Review, though the reviewer in both cases was an economist. Only a handful of models caught on, however. From my reading of Swedberg’s analysis, it seems four reasons were predominant.

First, the mathematical ability of sociologists (in that era, and to a lesser extent now) was too low to follow frontier developments in game theory. Second, many schools of sociology were, and are, uncomfortable with using such an individualist theory to analyze the outcome of social forces. Third, the use of game theory to make empirical predictions was less than successful – as Swedberg notes, game theory is better at explaning “why” than “how much?” Such connection with empirical reality was, and is, considered far more important by sociologists than economists. And fourth, some sociologists felt the axiomatics of game theory were far too demanding, and rather than expand the theory to incorporate incomplete information, or limits on reasoning power, or other factors that economists later studies, sociology simply left the technique behind. Swedberg argues that game theory in sociology is nonetheless a powerful tool for examining counterfactuals (something difficult to examine empirically, of course), and further that sociologists can contribute to game theory by discussing what the action spaces and payoffs look like in certain games, the ex-ante step that many economists do not take seriously enough. I would love to see precisely this sort of work by sociologists.

What is not made clear in the article, and perhaps is not well known to mathematical sociologists, are certain aspects of the history of game theory within economics. Sociologists were by no means alone in recognizing two big flaws in early game theory: the exact specification of the game form, including action spaces and payoffs, is very important, and the multiplicity of equilibria (or a nonunique core in cooperative games) makes analysis difficult. These two problems were to a large extent solved between the late 1970s and the late 1980s. The problem of specifying the game form in all its institutional detail was handled through the field of mechanism design, which roughly asks whether a planner or a principal can implement some outcome as the equilibrium of some game within a very large class. The robustness and refinement of equilibria led to a lot of work which helped us better understand what the Nash fixed point imposes in terms of decisionmaking, but ultimately led to something of a negative result: we will never have an equilibrium concept which is suitable for all types of games, and certainly not one which provides unique solutions.

Mechanism design and the robustness literature are interesting for two reasons. First, more or less 100 percent of the work in those areas was done by economists rather than by other social scientists – it is particularly interesting how little was done concerning the theory of games by practitioners in other fields. Any understanding of why applied game theory is most common in economics surely must be able to explain the absolute dominance of economists in pure game theory. Second, here is as good a place as any to give some dap to my home department, Northwestern MEDS. Some incredible amount of the important work in this game theory golden age was done there. To name only a few, the department at one time was home to Milgrom, Myerson, Stokey, Satterthwaite, Holmstrom, Baron, Ledyard, Kalai, Kamien, Sonnenschein, John Roberts, almost all of whom were young researchers at their productive peaks – ex-post, that must have been the best theoretical economics department of all time, right?

Final article from Theory and Society (No IDEAS link found). Tips of the hat to Richard Swedberg for posting full, final articles on his website, and to the post at orgtheory where I first came across this article.

“Clocks and Trees: Isomorphic Dutch Auctions and Centipede Games,” J. C. Cox & D. James (2012)

There has been a substantial amount of effort by experimental economists showing where lab subjects take actions other than Nash equilibria play in common games. Folks appear to give away money in ultimatum games, bid too high in Dutch auctions, wait too long to end the centipede game, and so on. The most common response to these experimental facts is to propose alternative utility functions (such as forms of social preferences) or alternative equilibrium definitions that incorporate particular types of errors (like the Quantal Response).

Cox and James dig back into Selten’s famous Chain Store Paradox paper for an alternative explanation. (An aside: I imagined this had to be the most cited paper ever to appear in Theory and Decision; it is.) Selten points out that the very first thing an agent in a game must do is understand what her options are, what information can be transmitted, and what the consequences of any tuple of actions will be. Such information must generally be deduced by an agent from the description of the game; it is not obvious. When we teach game theoretic reasoning, we generally make such complicated deduction easier through the use of an extensive-form game tree. That is, how well students will do on a game theory exam is absolutely affected by how the information in an identical game is presented.

Cox and James discuss this point in the context of Dutch auctions and centipede games. In a Dutch auction, agents have a private valuation for a good, the auction starts at some amount higher than that valuation, and then the price drops until some agents “bids”, winning the good at that price, and getting payoff equal to her valuation minus the bid price. Alternatively, we could write a (discrete-interval) Dutch auction out as a game tree. If you have valuation 6 and you bid when the auction is at “5″, you win a dollar. Every possible bid by each player can be written out in the tree, along with consequences. It turns out that presenting information in such a tree led to actions which are roughly the same as predicted by Nash equilibrium, whereas the traditional declining-clock Dutch auction without a tree led to the overbidding seen in other experiments.

The centipede game is a bit different. Traditional experiments present the game as an extensive-form game tree. But there is a nice way to run a centipede experiment using a clock. Start at state 0, giving player 1 the option to stop the game and win his valuation. If the player does not stop the game within 10 seconds, shift to state 2, where player 2 has the option to stop the game and win her valuation plus 2, giving player 1 nothing. If she does not stop the game after 10 seconds, go to stage 3, and so on. The clock version of the centipede game led almost all subjects to stop at 0, which is the unique Nash equilibrium.

There are two things you might take out of these results. One is that play in games involves both the player’s strategic reasoning as well as their comprehension about how the game operates. Both are important. It makes me skeptical of some experimental results showing non-equilibrium outcomes. Second, you might imagine that we could incorporate learning about the options available to players as they play repeatedly into an equilibrium definition. The draft of this paper I read did not refer to the 1990s learning-in-games theoretical literature, which absolutely has discussed this point, but perhaps the final Econometrica version does.

Working paper (IDEAS version). Final paper published in March 2012 Econometrica.

Follow

Get every new post delivered to your Inbox.

Join 87 other followers

%d bloggers like this: