## “The Explanatory Relevance of Nash Equilibrium: One-Dimensional Chaos in Boundedly Rational Learning,” E. Wagner (2013)

The top analytic philosophy journals publish a surprising amount of interesting game and decision theory; the present article, by Wagner in the journal Philosophy of Science, caught my eye recently.

Nash equilibria are stable in a static sense, we have long known; no player wishes to deviate given what others do. Nash equilibria also require fairly weak epistemic conditions: if all players are rational and believe the other players will play the actual strategies they play with probability 1, then the set of outcomes is the Nash equilibrium set. A huge amount of work in the 80s and 90s considered whether players would “learn” to play Nash outcomes, and the answer is by and large positive, at least if we expand from Nash equilibria to correlated equilibria: fictitious play (I think what you do depends on the proportion of actions you took in the past) works pretty well, rules that are based on the relative payoffs of various strategies in the past work with certainty, and a type of Bayesian learning given initial beliefs about the strategy paths that might be used generates Nash in the limit, though note the important followup on that paper by Nachbar in Econometrica 2005. (Incidentally, a fellow student pointed out that the Nachbar essay is a great example of how poor citation measures are for theory. The paper has 26 citations on Google Scholar mainly because it helped kill a literature; the number of citations drastically underestimates how well-known the paper is among the theory community.)

A caution, though! It is not the case that every reasonable evolutionary or learning rule leads to an equilibrium outcome. Consider the “continuous time imitative-logit dynamic”. A continuum of agents exists. At some exponential time for each agent, a buzzer rings, at which point they randomly play another agent. The agent imitates the other agent in the future with probability exp(beta*pi(j)), where beta is some positive number and pi(j) is the payoff to the opponent; if imitation doesn’t occur, a new strategy is chosen at random from all available strategies. A paper by Hofbauer and Weibull shows that as beta grows large, this dynamic is approximately a best-response dynamic, where strictly dominated strategies are driven out; as beta grows small, it looks a lot like a replicator dynamic, where imitation depends on the myopic relative fitness of a strategy. A discrete version of the continuous dynamics above can be generated (all agents simultaneously update rather than individually update) which similarly “ranges” from something like the myopic replicator to something like a best response dynamic as beta grows. Note that strictly dominated strategies are not played for any beta in either the continuous or the discrete time i-logit dynamics.

Now consider a simple two strategy game with the following payoffs:

| | Left | Right |
|---|---|---|
| **Left** | (1,1) | (a,2) |
| **Right** | (2,a) | (1,1) |

The unique symmetric Nash equilibrium is X=1/a, reading X as the share of the population playing Right. Let, say, a=3. When beta is very low (say, beta=1), so that players are “relatively myopic”, and the initial condition is X=.1, the discrete time i-logit dynamic converges to X=1/a. But if beta gets higher, say beta=5, then players are “more rational” yet the dynamic neither converges nor settles into a cycle: indeed, whether the population plays left or right follows a chaotic system! This property can be generated for many initial points X and many values of a.
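For the curious, here is a rough sketch of what that experiment might look like. I am assuming the standard discrete imitative logit map, in which each strategy's share is reweighted by exp(beta × payoff) and then renormalized; Wagner's exact specification may differ, and the function names are mine. With the payoffs above, Left earns 1 + (a−1)X and Right earns 2 − X when a share X plays Right, so the two are equal exactly at X = 1/a.

```python
import math

def payoffs(x_right, a):
    """Expected payoffs to Left and Right when a share x_right plays Right."""
    x_left = 1.0 - x_right
    pay_left = x_left * 1.0 + x_right * a      # Left vs Left = 1, Left vs Right = a
    pay_right = x_left * 2.0 + x_right * 1.0   # Right vs Left = 2, Right vs Right = 1
    return pay_left, pay_right

def step(x_right, a, beta):
    """One synchronous update of the ASSUMED discrete imitative logit map."""
    pay_left, pay_right = payoffs(x_right, a)
    weight_left = (1.0 - x_right) * math.exp(beta * pay_left)
    weight_right = x_right * math.exp(beta * pay_right)
    return weight_right / (weight_left + weight_right)

def trajectory(x0, a, beta, periods):
    """Share playing Right over time, starting from x0."""
    shares = [x0]
    for _ in range(periods):
        shares.append(step(shares[-1], a, beta))
    return shares

if __name__ == "__main__":
    for beta in (1.0, 5.0):
        tail = trajectory(0.1, 3.0, beta, 200)[-5:]
        print(beta, [round(x, 4) for x in tail])
```

Under this assumed map, the slope at the rest point works out to 1 − beta(a−1)/a, so with a=3 the rest point is attracting for beta=1 and repelling for beta=5. The sketch only shows that play fails to settle down in the second case; it does not by itself establish chaos.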

The dynamic here doesn’t seem crazy, and making agents “more rational” in a particular sense makes convergence properties worse, not better. And since play is chaotic, a player hoping to infer what the population will play next is required to know the initial conditions with certainty. Nash or correlated equilibria may have some nice dynamic properties for wide classes of reasonable learning rules, but the point that some care is needed concerning what “reasonable learning rules” might look like is well taken.

Final 2013 preprint. Big thumbs up to Wagner for putting all of his papers on his website, a real rarity among philosophers. Actually, a number of his papers look quite interesting: Do cooperation and fair bargaining evolve in tandem? How do small world networks help the evolution of meaning in Lewis-style sender-receiver games? How do cooperative “stag hunt” equilibria evolve when 2-player stag hunts have such terrible evolutionary properties? I think this guy, though a recent philosophy PhD in a standard philosophy department, would be a very good fit in many quite good economic theory programs…

## “On the Origin of States: Stationary Bandits and Taxation in Eastern Congo,” R. S. de la Sierra (2013)

The job market is yet again in full swing. I won’t be able to catch as many talks this year as I would like to, but I still want to point out a handful of papers that I consider particularly elucidating. This article, by Columbia’s de la Sierra, absolutely fits that category.

The essential question is, why do states form? Would that all young economists interested in development put their effort toward such grand questions! The old Rousseauian idea you learned your first year of college, where individuals come together voluntarily for mutual benefit, seems contrary to lots of historical evidence. Instead, war appears to be a prime mover for state formation; armed groups establish a so-called “monopoly on violence” in an area for a variety of reasons, and proto-state institutions evolve. This basic idea is widespread in the literature, but it is still not clear which conditions within an area lead armed groups to settle rather than to pillage. Further, examining these ideas empirically seems quite problematic for two reasons: first, because states themselves are the ones who collect data, hence we rarely observe anything before states have formed, and second, because most of the planet has long since been under the rule of a state (with apologies to James Scott!).

De la Sierra brings some economics to this problem. What is the difference between pillaging and sustained state-like forms? The pillager can only extract assets on its way through, while the proto-state can establish “taxes”. What taxes will it establish? If the goal is long-run revenue maximization, Ramsey long ago told us that it is optimal to tax elements that are inelastic. If labor can flee, but the output of the mine cannot, then you ought to tax the output of the mine highly and set a low poll tax. If labor supply is inelastic but output can be hidden from the taxman, then use a high poll tax. Thus, when will bandits form a state instead of just pillaging? When there is a factor which can be dynamically taxed at such a rate that the discounted tax revenue exceeds what can be pillaged today. Note that the ability to, say, restrict movement along roads, or to expand output through state-owned capital, changes relevant tax elasticities, so at a more fundamental level, capacity by rebels along these margins is also important (and I imagine that extending de la Sierra’s paper will involve the evolutionary development of these types of capacities).
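To make the pillage-or-settle condition concrete, here is a toy back-of-the-envelope version. It is my own illustration, not the model in the paper: the constant per-period tax take, the discount factor, the horizon, and the function names are all hypothetical.

```python
def discounted_tax_revenue(tax_per_period, discount, horizon):
    """Present value of collecting tax_per_period for `horizon` periods."""
    return sum(tax_per_period * discount ** t for t in range(horizon))

def settles(tax_per_period, discount, horizon, pillage_value):
    """True if taxing over the horizon beats grabbing pillage_value today."""
    return discounted_tax_revenue(tax_per_period, discount, horizon) > pillage_value

if __name__ == "__main__":
    # Long expected tenure: taxation wins.
    print(settles(tax_per_period=10, discount=0.9, horizon=20, pillage_value=50))
    # Short expected tenure: pillage wins.
    print(settles(tax_per_period=10, discount=0.9, horizon=3, pillage_value=50))
```

Anything that raises the sustainable per-period take (a taxable, hard-to-hide output) or lengthens the expected horizon pushes the comparison toward settling; anything that shortens the horizon pushes it back toward pillage.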

This is really an important idea. It is not that there is a tradeoff between producing and pillaging. Instead, there is a three way tradeoff between producing in your home village, joining an armed group to pillage, and joining an armed group that taxes like a state! The armed group that taxes will, as a result of its desire to increase tax revenue, perhaps introduce institutions that increase production in the area under its control. And to the extent that institutions persist, short-run changes that cause potential bandits to form taxing relationships may actually lead to long-run increases in productivity in a region.

De la Sierra goes a step beyond theory, investigating these ideas empirically in the Congo. Eastern Congo during and after the Second Congo War was characterized by a number of rebel groups that occasionally just pillaged, but occasionally formed stable tax relationships with villages that could last for years. That is, the rebels occasionally implemented something looking like states. The theory above suggests that exogenous changes in the ability to extract tax revenue (over a discounted horizon) will shift the rebels from pillagers to proto-states. And, incredibly, there were a number of interesting exogenous changes that had exactly that effect.

Coltan and gold both experienced price shocks during the war. Coltan is heavy, hard to hide, and must be shipped by plane in the absence of roads. Gold is light, easy to hide, and can simply be carried from the mine on jungle footpaths. When the price of coltan rises, the maximal tax revenue of a state increases since taxable coltan production is relatively inelastic. This is particularly true near airstrips, where the coltan can actually be sold. When the price of gold increases, the maximal tax revenue does not change much, since gold is easy to hide, and hence the optimal tax is on labor rather than on output. An exogenous rise in coltan prices should encourage proto-state formation in areas with coltan, then, while an exogenous rise in gold prices should have little impact on the pillage vs. state tradeoff. Likewise, a government initiative to root out rebels (be they stationary or pillaging) decreases the expected number of years a proto-state can extract rents, hence makes pillaging relatively more lucrative.

How to confirm these ideas, though, when there was no data collected on income, taxes, labor supply, or proto-state existence? Here is the crazy bit: 11 locals were hired in Eastern Congo to travel to a large number of villages and spend a week in each querying families and village elders about their experiences during the war, the existence of mines, etc. The “state formation” in these parts of Congo is only a few years in the past, so it is at least conceivable that memories, suitably combined, might actually be reliable. And indeed, the data do seem to match aggregate trends known to monitors of the war. What of the model predictions? They all seem to hold, and quite strongly: the ability to extract more tax revenue is important for proto-state formation, and areas where proto-states existed do appear to have retained higher productive capacity years later, perhaps as a result of the proto-institutions those states developed. Fascinating. Even better, because there is a proposed mechanism rather than an identified treatment effect, we can have some confidence that this result is, to some extent, externally valid!

December 2013 working paper (No IDEAS page). You may wonder what a study like this costs (particularly if you are, like me, a theorist using little more than chalk and a chalkboard); I have no idea, but de la Sierra’s CV lists something like a half million dollars of grants, an incredible total for a graduate student. On a personal level, I spent a bit of time in Burundi a number of years ago, including visiting a jungle camp where rebels from the Second Congo War were still hiding. It was pretty amazing how organized even these small groups were in the areas they controlled; there was nothing anarchic about it.

## “Unbeatable Imitation,” P. Duersch, J. Oechssler & B. C. Schipper (2012)

People, particularly in relatively unimportant situations, rely on heuristics rather than completely rational foresight. But using heuristics in modeling seems to me undesirable, because players using heuristics can easily be abused by more strategic players. For instance, consider the game of fighter pilot Chicken as in the movie Top Gun. Both players prefer going straight while the opponent swerves to swerving when the opponent swerves (hence showing lack of nerve) to swerving when the opponent goes straight (hence showing a unique lack of nerve) to going straight when the opponent goes straight (hence crashing). Consider playing Chicken over and over against a heuristic-based opponent. Perhaps the opponent simply best responds to whatever you did in the previous period. In this case, if I go straight in period 1, the opponent swerves in the next period, and if I swerve, the opponent goes straight. Therefore, I’ll go straight in periods 1 through infinity, knowing my opponent will swerve in every period except possibly the first. The sophisticated player will earn a much higher payoff than the unsophisticated one.
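A small sketch of that exploitation, with made-up payoffs (3, 2, 1, 0) chosen only to respect the ordering just described; the numbers and names are mine.

```python
STRAIGHT, SWERVE = "straight", "swerve"

# PAYOFF[(own, other)]: hypothetical numbers respecting the ordering in the text.
PAYOFF = {
    (STRAIGHT, SWERVE): 3,
    (SWERVE, SWERVE): 2,
    (SWERVE, STRAIGHT): 1,
    (STRAIGHT, STRAIGHT): 0,
}

def best_reply(opponent_action):
    """Myopic best reply to the opponent's previous action."""
    return max((STRAIGHT, SWERVE), key=lambda own: PAYOFF[(own, opponent_action)])

def play(sophisticated_plan, heuristic_first_move):
    """Total payoffs (sophisticated, heuristic) when the heuristic best-replies to last period."""
    heuristic_action = heuristic_first_move
    totals = [0, 0]
    for my_action in sophisticated_plan:
        totals[0] += PAYOFF[(my_action, heuristic_action)]
        totals[1] += PAYOFF[(heuristic_action, my_action)]
        heuristic_action = best_reply(my_action)
    return tuple(totals)

if __name__ == "__main__":
    print(play([STRAIGHT] * 10, STRAIGHT))
```

Even in the worst case, where the heuristic player opens by going straight and we crash once, the always-straight plan collects the top payoff in every later period.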

Duersch et al show that, in every 2×2 symmetric game and in a large class of N-by-N symmetric two-player games, a simple heuristic called “imitation” has an undiscounted average payoff identical to that which can be achieved by an opponent playing any strategy at all. In imitation, I retain my strategy each period unless the opponent earned strictly more than I did in the previous period, in which case I copy him. Consider Chicken again, now from the point of view of the player facing an imitator. If I go straight and the opponent swerves, then I know he will go straight in the following period. In the next period, then, I can either crash into him (we earn the same payoff, so under the rule as stated he keeps going straight) or swerve myself (he earns strictly more than I do, so again he keeps going straight). In either case, once he has copied me I cannot again catch him swerving while I go straight, and in every later period I earn no more than he does. My one-time gain washes out of the undiscounted average, meaning my average payoff is no better than that of my heuristic-based opponent! This is true no matter what strategy is used against the imitating opponent.
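You can check the Chicken claim by brute force. The sketch below uses the imitation rule exactly as stated above, my own hypothetical payoff numbers (3, 2, 1, 0), and simply tries every plan the non-imitating player could follow over a short horizon.

```python
from itertools import product

STRAIGHT, SWERVE = "straight", "swerve"

# PAYOFF[(own, other)]: hypothetical numbers respecting the Chicken ordering.
PAYOFF = {
    (STRAIGHT, SWERVE): 3,
    (SWERVE, SWERVE): 2,
    (SWERVE, STRAIGHT): 1,
    (STRAIGHT, STRAIGHT): 0,
}

def imitate_if_better(own_prev, other_prev):
    """Copy the other player only if they earned strictly more last period."""
    if PAYOFF[(other_prev, own_prev)] > PAYOFF[(own_prev, other_prev)]:
        return other_prev
    return own_prev

def payoff_gap(plan, imitator_first_move):
    """Total payoff of the planner minus total payoff of the imitator."""
    imitator_action = imitator_first_move
    gap = 0
    for my_action in plan:
        gap += PAYOFF[(my_action, imitator_action)] - PAYOFF[(imitator_action, my_action)]
        imitator_action = imitate_if_better(imitator_action, my_action)
    return gap

def max_gap(periods):
    """Largest gap any fixed plan of the given length achieves against the imitator."""
    return max(
        payoff_gap(plan, first)
        for first in (STRAIGHT, SWERVE)
        for plan in product((STRAIGHT, SWERVE), repeat=periods)
    )

if __name__ == "__main__":
    print([max_gap(t) for t in (1, 4, 8)])
```

With these numbers the largest achievable gap does not grow with the horizon, which is why it vanishes from the per-period average.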

Now imitation will fail in many games, of course. Consider Rock-Paper-Scissors. If I play Rock when you play Scissors, then since you imitate, you will switch to Rock in the next period. Knowing this, I will play Paper, and so on, winning every period. Games that have this type of cycling possibility allow me to extract an arbitrarily higher payoff than the imitating opponent. What’s interesting is that, in finite symmetric two-player games between an imitator and an agent with perfect rationality, games with a possibility of cycling in some subgame are the only ones in which the imitator does not earn the same average payoff per period as the rational player! Checking this condition is difficult, but games with no pure equilibrium in the relative payoff game (i.e., the game where payoffs for each player are equal to the difference in payoffs between players in the original game, hence making the original game zero-sum) always have a cycle, and games which are quasiconcave never do. Many common games (oligopoly competition, Nash bargaining, etc.) can be written as quasiconcave games.
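The Rock-Paper-Scissors money pump is easy to sketch as well. I am using the usual +1/0/−1 scoring, and the function names are mine.

```python
# BEATS[x] is the action that x defeats.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(own, other):
    """Standard scoring: +1 for a win, 0 for a tie, -1 for a loss."""
    if own == other:
        return 0
    return 1 if BEATS[own] == other else -1

def imitate_if_better(own_prev, other_prev):
    """Copy the other player only if they earned strictly more last period."""
    if payoff(other_prev, own_prev) > payoff(own_prev, other_prev):
        return other_prev
    return own_prev

def counter(action):
    """The action that defeats `action`."""
    return next(winner for winner, loser in BEATS.items() if loser == action)

def money_pump(periods, imitator_first_move="scissors"):
    """Total winnings from always playing the counter to the imitator's predictable move."""
    imitator_action = imitator_first_move
    my_total = 0
    for _ in range(periods):
        my_action = counter(imitator_action)
        my_total += payoff(my_action, imitator_action)
        imitator_action = imitate_if_better(imitator_action, my_action)
    return my_total

if __name__ == "__main__":
    print(money_pump(100))
```

Because the imitator always copies last period's winner, the exploiter knows the next move in advance and wins every single round, so here the payoff gap grows linearly with the horizon.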

Imitation is really pretty unique. The authors give the example of a 3×3 symmetric oligopoly game, where strategy 1 is “produce Cournot quantity”, strategy 2 is “produce Stackelberg follower quantity” and strategy 3 is “produce Stackelberg leader quantity.” The game has no subgames with cycles as defined above, and hence imitators and the rational player earn the same average payoff (if the rational player plays Stackelberg leader and I play something else, then he earns more than me, hence I imitate him next period, hence he best responds by playing Stackelberg follower). Other heuristics do much worse than imitation. A heuristic where you simply best reply just plays Stackelberg follower forever, for example.

This result is quite interesting, and the paper is short; the “useful insight on the worst page” test of a quality paper is easily satisfied. I like this work too because it is related to some ideas I have about the benefits of going first. Consider shifting a symmetric simultaneous game to a symmetric sequential game. Going first has no benefit except that it allows me to commit to my action (and many negatives, of course, including the inability to mix strategies). Likewise a heuristic rule allows the heuristic player to commit to actions without assuming perfection of the equilibrium. So there is a link between “optimal” heuristics and the desire of a rational player to commit to his action in advance if he could so choose.

November 2011 Working Paper (IDEAS version). Final version published in the September 2012 issue of Games and Economic Behavior.

## “Evolutionary Dynamics and Backward Induction,” S. Hart (2002)

Let’s follow up yesterday’s post with an older paper on evolutionary games by the always-lucid Sergiu Hart. As noted in the last post, there are many evolutionary dynamics for which the rest points of the evolutionary game played by completely myopic agents and the Nash equilibria of the equivalent static game played by strategic agents coincide, which is really quite phenomenal (and since you know there are payoff-suboptimal Nash equilibria, results of this kind have, since Maynard Smith (1973), fundamentally changed our understanding of biology). Nash equilibrium is a low bar, however. Since Kuhn (1953), we have also known that every finite game of perfect information has a backward induction equilibrium, what we now call the subgame perfect equilibrium, in pure strategies. When does the invariant limit distribution of an extensive form evolutionary game coincide with the backward induction equilibrium? (A quick mathematical note: an evolutionary system with mutation, allowing any strategy to “mutate” on some agent with some probability in each state, means that by pure luck the system can move from any state to any other state. We also allow evolutionary systems to have selection, meaning that with some probability in each state an agent switches from his current strategy to one with a higher payoff. This process defines a Markov chain, and since the game is finite and the mutations allow us to reach any state, it is a finite irreducible Markov chain. Such Markov chains have a unique invariant distribution in the limit.)
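If the Markov chain remark feels abstract, a two-state toy chain shows what “unique invariant distribution” means in practice. The transition probabilities below are arbitrary numbers I picked; nothing here comes from Hart's model.

```python
def stationary_distribution(transition, start=None, iterations=1000):
    """Approximate the invariant distribution by repeatedly applying the transition matrix."""
    n = len(transition)
    dist = list(start) if start is not None else [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return dist

if __name__ == "__main__":
    # Row i gives the probabilities of moving from state i to each state.
    chain = [[0.9, 0.1],
             [0.3, 0.7]]
    print(stationary_distribution(chain, start=[1.0, 0.0]))
    print(stationary_distribution(chain, start=[0.0, 1.0]))
```

Both starting points end up at the same long-run distribution, which is the property that lets us talk about “the” limit distribution of the evolutionary process.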

In general, we can have limit distributions of evolutionary processes that are not the backward induction equilibrium. Consider the following three step game. Agent 1 chooses C or B, then if C was chosen, agent 2 (in the agent-normal form) chooses C2 or B2, then if C and B2 were chosen, agent 3 chooses C3 or B3. The payoff to each agent when B is chosen is (4,0,0), when C and C2 are chosen is (5,9,0), when C, B2 and C3 are chosen is (0,0,0), and when C, B2 and B3 are chosen is (0,10,1). You can see that (C,C2,C3) and (B,B2,B3) are both Nash, but only (B,B2,B3) is subgame perfect, and hence the backward induction equilibrium. Is (B,B2,B3) the limit distribution of the evolutionary game? In the backward induction equilibrium, agent 1 chooses B at the first node, and hence nodes 2 and 3 are never reached, meaning only mutation, and not selection, affects the distribution of strategies at those nodes. Since the Markov chain is ergodic, with probability 1 the proportion of agents at node 2 playing B2 will fall below .2; when that happens, selection at node 1 will push agents toward C instead of B. When this happens, now both nodes 2 and 3 are reached with positive probability. If less than .9 of the agents in 3 are playing B3, then selection will push agents at node 2 toward C2. Selection can therefore push the percentage of agents playing B2 down to 0, and hence (C,C2,C3) can be part of the limit invariant distribution even though it is not the backward induction solution.
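The equilibrium claims for this little game are easy to verify mechanically. A minimal sketch, with the payoffs copied from the paragraph above and function names of my own choosing:

```python
from itertools import product

ACTIONS = (("C", "B"), ("C2", "B2"), ("C3", "B3"))

def outcome(a1, a2, a3):
    """Payoff triple for a pure profile of the three agents."""
    if a1 == "B":
        return (4, 0, 0)
    if a2 == "C2":
        return (5, 9, 0)
    if a3 == "C3":
        return (0, 0, 0)
    return (0, 10, 1)

def is_nash(profile):
    """No agent gains strictly from a unilateral deviation."""
    for player, options in enumerate(ACTIONS):
        for deviation in options:
            deviated = list(profile)
            deviated[player] = deviation
            if outcome(*deviated)[player] > outcome(*profile)[player]:
                return False
    return True

def pure_nash():
    return [profile for profile in product(*ACTIONS) if is_nash(profile)]

def backward_induction():
    """Solve from the last node back, assuming each node is reached."""
    a3 = max(ACTIONS[2], key=lambda a: outcome("C", "B2", a)[2])
    a2 = max(ACTIONS[1], key=lambda a: outcome("C", a, a3)[1])
    a1 = max(ACTIONS[0], key=lambda a: outcome(a, a2, a3)[0])
    return (a1, a2, a3)

if __name__ == "__main__":
    print(pure_nash())
    print(backward_induction())
```

Running the enumeration also turns up a third pure Nash profile, (B,B2,C3), which differs from the backward induction solution only at the unreached last node.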

So is backward induction unjustifiable from an evolutionary perspective? No! Hart shows that if the number of agents goes to infinity as the probability of mutation goes to zero, then the backward induction solution, when it is unique, is also the only element in the limit invariant distribution of the evolutionary game. How does letting the number of agents go to infinity help? Let Bi be an element of the backward induction equilibrium at node i somewhere in the game tree. Bi must be a best reply in the subgame beginning with i if Bj is played in all descendant nodes by a sufficiently high proportion of the population, so if Bi is not a best reply (and hence selection does not push us toward Bi) it must be that Bj is not being played further down the game tree. If Bi is a best reply in the subgame beginning with i, then most of the population will play Bi because of selection pressures.

Now here’s the trick. Consider the problematic case in the example, when node i is not being reached in a hypothesized limit distribution (if i is reached, then since the probability of mutation goes to zero, selection is much stronger than mutation, and hence non best replies will go away in the limit). Imagine that there is another node g preceding i which is also not reached, and that i is only reached when some strategy outside the backward induction equilibrium is played in g. When g and i are not reached, there is no selection pressure, and hence no reason that the backward induction equilibrium node will be played. Large populations help here. With some small probability, an agent in g mutates such that he plays the node which reaches i. This still has no effect unless there is also a mutation in the node before g that causes g to be reached. The larger the population, the lower the probability the specific individual who mutated in g mutates back before any individual in the node before g mutates. Hence larger populations make it more likely that rare mutations in unreached nodes will “coordinate” in the way needed for selection to take over.

Final GEB version (IDEAS version). Big up to Sergiu for posting final pdfs of all of his papers on his personal website.

## “Survival of Dominated Strategies Under Evolutionary Dynamics,” J. Hofbauer & W. Sandholm (2011)

There is a really interesting tension in a lot of economic rhetoric. On the one hand, we have results that derive from optimal behavior by agents with rational foresight: “price equals marginal cost in competitive markets because of profit-maximizing behavior” or “Policy A improves welfare in a dynamic general equilibrium setting with utility-maximizers”. Alternatively, though, we have explanations that rely on dynamic consequences to even non-maximizing agents: “price equals marginal cost in competitive markets because firms who price above MC are driven out by competition” or “Policy A improves welfare in a dynamic general equilibrium, and the dynamic equilibrium is sensible because firms adjust myopically as if in a tatonnement process.”

These two types of explanation, without further proof, are not necessarily the same. Profit-maximizing firms versus firms disciplined by competition give completely different welfare results under monopoly, since the non profit-maximizing monopolist can be very wasteful and yet still make positive profits. In a dynamic context, firms that adjust myopically to excess demand in some markets, rather than profit-maximizing according to rational expectations, will not necessarily converge to equilibrium (a friend mentioned that Lucas made precisely this point in a paper in the 1970s).

How can we square the circle? At least in static games, there has been a lot of work here. Nash and other strategic equilibrium concepts are well known. There is also a branch of game theory going back to the 1950s, evolutionary games, where rather than choosing strategically, a probability vector lists what portion of the players are playing a given strategy at a given time, resulting in some payoffs. A revision rule, perhaps stochastic to allow for “mutations” as in biology, then tells us how the vector of strategies updates conditional on payoffs in the previous round. Fudenberg and Kreps’ learning model from the 1980s is a special case.

Amazingly, it is true for almost all sensible revision rules that the set of rest points of the dynamic includes every Nash equilibrium of the underlying static game, and further that for many revision rules the dynamic rest points are exactly equivalent to the set of Nash equilibria. We have one problem, however: dynamic systems needn’t converge to points at all, but rather may converge to cycles or other outcomes.

Hofbauer and Sandholm (Sandholm being both a graduate of my institution and probably the top economist in the world today on evolutionary games) show that for any revision rule satisfying a handful of common properties, we can construct a game where strictly dominated strategies are played with positive probability. This includes any dynamic meeting the following four properties: the population law of motion is continuous in payoffs and the current population vector, there is positive correlation between strategy growth rates and current payoffs, the dynamic is at rest iff the strategy vector is a Nash equilibrium of the underlying static game, and if an unplayed strategy has sufficiently high rewards, then with positive probability some agents begin using it. These criteria are satisfied by “excess payoff dynamics” like BNN where strategies with higher than average payoffs have higher than average growth rates, and by “pairwise comparison dynamics” where agents switch with positive probability to strategies which have higher payoff than their own current payoff. A myopic best response is not continuous, and indeed, myopic best response has been shown to eliminate strictly dominated strategies.
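For readers who have not seen these dynamics written down, here is a sketch of the two families in their textbook forms: BNN as the excess payoff dynamic and the Smith dynamic as the pairwise comparison dynamic. The single-population, linear-payoff setup and the function names are my simplifications, and this does not reproduce the game constructed in the paper.

```python
def payoffs(game, x):
    """Payoff to each pure strategy against population state x (game[i][j] = i vs j)."""
    n = len(x)
    return [sum(game[i][j] * x[j] for j in range(n)) for i in range(n)]

def bnn_velocity(game, x):
    """Excess payoff (BNN) dynamic: growth driven by payoff above the population average."""
    n = len(x)
    pi = payoffs(game, x)
    average = sum(x[i] * pi[i] for i in range(n))
    excess = [max(p - average, 0.0) for p in pi]
    total = sum(excess)
    return [excess[i] - x[i] * total for i in range(n)]

def smith_velocity(game, x):
    """Pairwise comparison (Smith) dynamic: agents switch toward strictly better strategies."""
    n = len(x)
    pi = payoffs(game, x)
    inflow = [sum(x[j] * max(pi[i] - pi[j], 0.0) for j in range(n)) for i in range(n)]
    outflow = [x[i] * sum(max(pi[j] - pi[i], 0.0) for j in range(n)) for i in range(n)]
    return [inflow[i] - outflow[i] for i in range(n)]

if __name__ == "__main__":
    rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # rock, paper, scissors
    print(smith_velocity(rps, [1 / 3, 1 / 3, 1 / 3]))
    print(smith_velocity(rps, [1.0, 0.0, 0.0]))
    print(bnn_velocity(rps, [1.0, 0.0, 0.0]))
```

In standard Rock-Paper-Scissors both dynamics are at rest at the uniform Nash equilibrium, and both let the unplayed strategy Paper start growing when everyone plays Rock, which illustrates the third and fourth properties in the list.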

The proof involves a quite difficult topological construction which I don’t discuss here, but it’s worth discussing the consequence of this result. In strategic situations where we may think agents lack full rationality or rational foresight, and where we observe cycles or other non-rest behavior over time, we should be hesitant to ignore strictly dominated actions (particularly ones that are only dominated by a small amount) in our analysis of the situation. There is also scope for policy improvements: if agents are learning using a dynamic which does not rule out strictly dominated strategies, we may be able to provide information which coerces an alternative dynamic which will rule out such strategies.

Final version in Theoretical Economics 6 (IDEAS version). Another big thumbs up to the journal Theoretical Economics, the Lake Wobegon of econ, where the submissions are free, the turnaround time is fast, and all the articles are totally ungated.

## “What to Maximize if you Must,” A. Heifetz, C. Shannon & Y. Spiegel (2007)

Here’s the other interesting evolution of preferences paper that I mentioned in the previous post. Fair warning: the mathematical level of the paper is very high unless, perhaps, you are used to using the Whitney C(k) topology and relative prevalence on infinite dimensional manifolds in your everyday work! The basic idea is fairly straightforward, though.

We know from many examples that being “irrational” can help you in games. For example, if you are the “crazy” type who always plays “In” in the Centipede Game, your fully rational opponent will wait until the very last stage to end the game, giving both of you much higher payoffs than in the Nash equilibria. Equate payoffs with evolutionary fitness in some sense – perhaps higher payoffs in the game means more influence, or more frequent transactions in the future market, or whatever. The centipede game example, among many others, suggests that deviations and biases may not be weeded out by evolutionary selection (a technical point, but for the curious, here we allow more robust forms of selection than just the replicator dynamic). But how general is this argument? Heifetz et al prove that it is quite general indeed.

Let each player get a payoff equal to their standard utility from the outcome of the game plus a disposition which is a function of the player’s own strategy, other players’ strategies and a parameter lying somewhere close to zero. For instance, if u(x,y) represents player 1’s utility from his own strategy x and opponent strategy y, while v(x,y) represents player 2’s utility, a payoff with disposition for player one might be p(x,y,e)=u(x,y)+ev(x,y). If e is greater than zero, then player 1 is altruistic. If e is less than zero, he is spiteful. If e equals zero, he has standard preferences. The conception of disposition here is broad enough to account for a variety of psychological tendencies, among other interpretations.

Let players both choose strategies from open subsets of finite-dimensional Euclidean space. Now consider a “generic” game, where a game is a set of payoffs (continuous in both players’ strategies) and a set of dispositions for each player, and where genericity has a standard measure-theoretic definition. For almost every set of payoffs and dispositions, can we find a parameter e which gives the player higher payoff than he would get if e=0? If so, then there is a disposition which will not disappear by evolutionary selection in that particular game. And it turns out that, for almost every such manifold, we can find an appropriate e. Why are these dispositions useful? We assumed payoff functions are in C^3, so in general, (pure) Nash equilibria will be locally unique. Since e is required to be close to 0, a minor disposition in one’s own strategy will directly have very little effect on payoffs. But it may have a large indirect effect by causing opponents to change their behavior as a result of the disposition. Think of the centipede game: a unilateral deviation to “In” in the first period is not very useful for increasing payoffs since rational opponents will play “Out” the next period anyway. But a disposition which causes you to play “In” often is useful when known to the opponent, because she will then also play “In” until the last period, even if she is a standard homo economicus.
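A concrete case may help with the direct versus indirect effect. The sketch below is my own illustration, not an example from the paper: a Cournot duopoly with inverse demand 1 − q1 − q2 and zero costs, where player 1 maximizes u + e·v while player 2 has standard preferences and knows e. Player 1's first-order condition gives the reply q1 = (1 − (1+e)q2)/2, and player 2's gives q2 = (1 − q1)/2.

```python
def reply_with_disposition(q_other, e):
    """Player 1's best reply when maximizing u + e*v (from the first-order condition)."""
    return (1.0 - (1.0 + e) * q_other) / 2.0

def reply_standard(q_other):
    """Player 2's best reply with standard preferences."""
    return (1.0 - q_other) / 2.0

def equilibrium(e, iterations=200):
    """Iterate the two replies to their fixed point (a contraction for small e)."""
    q1, q2 = 0.3, 0.3
    for _ in range(iterations):
        q1, q2 = reply_with_disposition(q2, e), reply_standard(q1)
    return q1, q2

def material_payoff(e):
    """Player 1's actual profit u, evaluated at the equilibrium induced by disposition e."""
    q1, q2 = equilibrium(e)
    return q1 * (1.0 - q1 - q2)

if __name__ == "__main__":
    for e in (-0.1, 0.0, 0.1):
        print(e, round(material_payoff(e), 5))
```

In this particular game a little spite (e slightly below zero) raises player 1's material profit: the known disposition makes him produce more aggressively, and the rational opponent backs off in response. That is the indirect effect doing the work, since holding the opponent's output fixed, distorting your own choice away from the profit-maximizing one could only hurt.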

Three final notes: First, the centipede game is an imperfect example because it has a finite strategy space. Technically, the “almost every” result applies only to games whose strategy spaces are open subsets of R^n (marginal analysis is used in the proof), but the intuition on why dispositions are useful remains the same. Second, there is a sense in which the results still hold if the strategy space is infinite dimensional (an infinitely repeated game, for instance). The basic problem there is that Nash equilibria are not usually locally unique because of various kinds of folk theorems. See the paper for details on this point. Third, the main proof has an interesting implication for delegation. If you can delegate game-playing to someone who is almost but not exactly like you, in almost every game there exists a delegate who would make it worth your while. Rarely, it turns out, is it best to play your own games yourself!

http://elsa.berkeley.edu/users/cshannon/wp/what.pdf (July 2004 Working Paper – final version in JET 2007)

## “On the Evolution of Attitudes Toward Risk in Winner-Take-All Games,” E. Dekel & S. Scotchmer (1999)

How about a couple of posts about evolution of preferences? Informal evolutionary arguments are everywhere in economics. People will do X in a market because if they didn’t, they would lose money and be forced out. Firms will profit maximize because if they don’t, they will be selected away. Many of these informal arguments are probably wrong: rare is the replicator dynamic with random matching that gives a trivial outcome! But they are important. If I have one heterodox crusade, it’s to get profit maximization by firms replaced by selection arguments: if you think firms in some sense luck into optimal pricing, or quantity setting, or marketing, rather than always minimizing costs, then you will be much more hesitant to support policies like patents that lead to monopoly power. I heard second-hand that a famed micro professor used to teach that he was more worried about the “big rectangles” of efficiency loss when monopolies don’t cost minimize than the “small triangles” of deadweight loss; the irony is that when I heard the story, the worried professor was Harberger of the Harberger Triangle himself!

But back to the Dekel and Scotchmer paper. The question here is whether, in a winner take all world, preference for risk will come to dominate. This is an informal argument both for what people will do in general situations (men, in particular, take a lot of risks and there are casual evobiology arguments that this is a result of winner-take-all mating in our distant past) and for what firms will survive situations like a patent race. This makes intuitive sense: if only the best of a group survives to the next generation, and we can choose the random variable that represents our skill, we should choose one with high variance. What could be wrong with that argument?

Quite a bit, it turns out. I use “men” from now on to mean whatever agent is being selected in winner take all contests each generation. Each man is genetically programmed to choose some lottery from a finite set. In each period, groups of size m meet. Each man realizes an outcome from his lottery, and the highest outcome “wins” and reproduces in the next period. Here’s the trick. If a distribution (call it F) FOSD another distribution, then it is “favored,” meaning that the measure of players using distribution F will be higher next period. But risk loving behavior has to do with second order stochastic dominance; distributions that are second order stochastically dominated are more risky. And here the ranking is much less straightforward. Consider groups of size 2. Let F give 1 with probability 1. Let G give 1/4 with probability 2/3 and give 2.5 with probability 1/3. F SOSD G (F and G have the same mean, while G in a specific sense has more “spread”), but F is also favored in evolution over G.
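The arithmetic behind that example is quick to check with exact fractions. The tie-splitting rule in the sketch is my own assumption (no ties actually occur between F and G), and reading “favored” as winning a head-to-head contest more than half the time is my shorthand for the group-of-two case.

```python
from fractions import Fraction
from itertools import product

# Each lottery maps an outcome to its probability.
F = {Fraction(1): Fraction(1)}
G = {Fraction(1, 4): Fraction(2, 3), Fraction(5, 2): Fraction(1, 3)}

def mean(lottery):
    return sum(value * prob for value, prob in lottery.items())

def variance(lottery):
    mu = mean(lottery)
    return sum(prob * (value - mu) ** 2 for value, prob in lottery.items())

def win_probability(first, second):
    """Chance that a draw from `first` beats an independent draw from `second` (ties split evenly)."""
    total = Fraction(0)
    for (v1, p1), (v2, p2) in product(first.items(), second.items()):
        if v1 > v2:
            total += p1 * p2
        elif v1 == v2:
            total += p1 * p2 / 2
    return total

if __name__ == "__main__":
    print(mean(F), mean(G))
    print(variance(F), variance(G))
    print(win_probability(F, G))
```

The safe lottery F wins two contests out of three against G even though G has the same mean and strictly more variance: G's extra spread is mostly spent on a low outcome that loses anyway and a high outcome that wins by more than it needs to.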

The intuition of that example is that increasing risk in a way that just expands the tails is not very useful: in a contest, winning by epsilon is just as good as winning by a million. So you might imagine that some condition on the possible tail distributions is necessary to make risk loving evolutionarily dominant. And indeed there is. This condition requires the group size to be sufficiently large, though, so if the contests are played in small groups, even restricting the possible lotteries may not be enough to make risk loving dominate over time.

What if everybody plays in a contest against everybody else? Without mutations, this game will end in one period (whichever type draws the highest number in the one period the game is played will make it to the next generation). Adding a small number of mutations in the normal way allows us to examine this scenario, though. And surprisingly, it’s even harder to get risk loving behavior to dominate than in the cases where contests were in small groups. The authors give an example where a distribution first order stochastically dominates and yet is still not successful. The exact condition needed for SOSD to be linked to evolutionary success when contests are played among the whole population turns out to be a strengthening of the tail condition described above.

I don’t know that there’s a moral about evolution here, but there certainly is a good warning against believing informal evolutionary arguments! More on this point in tomorrow’s post, on a new and related working paper.

http://socrates.berkeley.edu/~scotch/wta.pdf (Final JET version; big thumbs up to Suzanne Scotchmer for putting final, published versions of her papers online.)