I should warn you: this is a difficult paper. It's no more technically difficult than the average piece of micro theory, but it does presuppose quite a bit of knowledge about a field called psychological games. I'll try my best to give you the main point.
Psychological games come originally from a 1989 GEB paper by Geanakoplos, Pearce and Stacchetti (GPS). The basic problem is: how can we incorporate beliefs about strategies, rather than just the outcomes those strategies produce, into the utility function in games? I believe they use the example of surprise. Imagine you are buying your wife either flowers or chocolate; she likes both, but is especially happy when you surprise her. It's hard to surprise someone in a game, though: by the equilibrium assumption, she (correctly) guesses what you are going to do. This may lead you to randomize over flowers and chocolate. She may correctly infer that she will receive chocolate with probability .5, but when she actually receives chocolate, she is "surprised." I actually find this example totally unconvincing – "it's the thought that counts," after all, and your strategy is not surprising the wife at all – but it does point the way toward incorporating more sophisticated beliefs about strategies into utility functions.
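To see how a belief can enter the payoff at all, here is a toy sketch in Python; the functional form (surprise equals one minus the probability she assigned to the gift she actually got) is my own illustrative assumption, not the specification GPS use.

```python
# Toy illustration of a belief-dependent ("psychological") payoff.
# belief_chocolate: the probability the wife assigns to receiving chocolate.
def wife_utility(gift, belief_chocolate, gift_value=1.0, surprise_weight=1.0):
    p_assigned = belief_chocolate if gift == "chocolate" else 1 - belief_chocolate
    surprise = 1 - p_assigned   # the less expected the gift, the bigger the surprise
    return gift_value + surprise_weight * surprise

# If the husband randomizes 50/50 and she correctly believes this,
# she is "surprised" by exactly 0.5 whichever gift arrives.
for gift in ("flowers", "chocolate"):
    print(gift, wife_utility(gift, belief_chocolate=0.5))
```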
A series of authors, especially Battigalli at Bocconi and Dufwenberg at Arizona, have extended the GPS idea to dynamic games and more interesting motivations than surprise. Both of those guys are competent experimental economists and first-rate theorists, which is a rare combination indeed. Some dynamic belief-based motivations might be: I feel guilty (and hence get less utility) if my opponent in a Centipede Game would have continued in the next stage, or I feel shame if my opponent might believe I am not “cooperating”.
But here we have an epistemic problem. The use of Nash equilibrium, or rationalizability, or other solution concepts has in the past twenty-five years been justified by reference to epistemic conditions. For example, consider the Duopoly Entry game. Coca-Cola chooses either high or low quantity (H or L), then Pepsi chooses either high or low quantity (h or l). If one firm plays high and the other low, the high firm gets payoff 3 and the low firm 1. If both play high, payoffs are (0,0). If both play low, payoffs are (2,2). The unique subgame perfect equilibrium is for Coca-Cola to play high and then Pepsi to play low. This outcome can be epistemically justified by assuming rationality (I do what is in my best interest) and first-order knowledge of rationality (at the start of the game, I believe everyone else does what is in their best interest). With those assumptions, Coca-Cola knows that if it plays high, Pepsi will play low, and if it plays low, Pepsi will play high. Therefore, Coca-Cola knows for sure what Pepsi will do conditional on each of Coca-Cola's first-stage choices, and since Coca-Cola is itself rational, it will play high. Pepsi, then, being rational, will play low. So (H,l) is the unique epistemically justified equilibrium.
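If it helps to see the backward induction mechanically, here is a minimal Python sketch using exactly the payoffs above (my own illustration, not anything from the paper): solve Pepsi's problem at each second-stage node first, then let Coca-Cola optimize against those replies.

```python
# Payoffs (Coca-Cola, Pepsi): the "high" firm earns 3 against a "low" rival,
# both earn 0 if both go high, both earn 2 if both go low.
payoffs = {("H", "h"): (0, 0), ("H", "l"): (3, 1),
           ("L", "h"): (1, 3), ("L", "l"): (2, 2)}

# Stage 2: Pepsi best-responds to whatever Coca-Cola has already chosen.
def pepsi_reply(coke_move):
    return max("hl", key=lambda p: payoffs[(coke_move, p)][1])

# Stage 1: Coca-Cola, anticipating those replies, picks its own quantity.
coke_move = max("HL", key=lambda c: payoffs[(c, pepsi_reply(c))][0])

print({c: pepsi_reply(c) for c in "HL"})   # Pepsi: low after H, high after L
print(coke_move, pepsi_reply(coke_move))   # the subgame perfect outcome (H, l)
```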
This reasoning made a strong assumption, though: Coca-Cola knew what Pepsi would do conditional on each of Coca-Cola's possible first-stage actions. That is fine in standard game theory, since a strategy is defined – I believe by the late David Blackwell – as a complete, contingent plan that one could literally give to a machine at the start of the game. But consider psychological games. My payoff depends on what I believe you may do along paths we never reach, so epistemic justifications for a solution concept must take these beliefs, rather than strategies that imply commitment, as the relevant primitive. Battigalli and coauthors have therefore suggested that a better conception of a strategy is a player's own beliefs about what she (or someone else) will do conditional on reaching a given node after a given history.
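To put the distinction in very concrete terms, here is a toy encoding of my own (not notation from the paper): a classical strategy is a complete contingent plan, a map from histories to actions, while the belief-based primitive is a map from histories to probability distributions over actions, held in someone's head and carrying no commitment.

```python
# Classical strategy: a complete contingent plan Pepsi could hand to a machine.
pepsi_strategy = {"after H": "l", "after L": "h"}

# Belief-based primitive: Coca-Cola's conditional beliefs about Pepsi's play,
# one distribution per history, with no commitment attached to any of them.
coke_beliefs_about_pepsi = {
    "after H": {"h": 0.0, "l": 1.0},
    "after L": {"h": 1.0, "l": 0.0},
}
```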
What epistemic conditions justify the backward induction solution using beliefs, then? Consider a concept called "material rationality". This says, first, that each player's beliefs about what everyone will do are rational, in the sense that every "believable" behavior strategy actually solves that player's dynamic program given initial beliefs and Bayesian updating. Second, we need some link from beliefs to actions, so we require that, conditional on reaching some node after some history, players do not take actions they initially planned not to take. Now go back to the Duopoly Entry game, and consider the non-subgame-perfect Nash equilibrium in which Coca-Cola plays Low and Pepsi plans to play high no matter what Coca-Cola does. Does material rationality rule out such non-credible threats? No. Suppose Pepsi believes at the initial node that Coca-Cola will play Low with probability one, while Coca-Cola believes that Pepsi would play high whatever Coke did. Even though material rationality implies that Pepsi would play low if Coke actually played High (and high if Coke played Low), the node after High is never reached, so nobody ever takes an action that conflicts with material rationality: both players are materially rational while playing the imperfect equilibrium (Low, high). What's missing is persistence of belief in material rationality. That is, Coca-Cola needs to believe not only that Pepsi is materially rational at the initial node, but also that Pepsi will be materially rational at any node after any history, even those never actually reached.
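To check the claim concretely, here is a small sketch (again my own illustration, using the same payoffs) that enumerates Pepsi's four contingent plans, finds the Nash equilibria, and tests which of them have Pepsi replying optimally at both second-stage nodes. The profile in which Coca-Cola plays Low against Pepsi's "high no matter what" comes out as a Nash equilibrium whose threatening off-path part is simply never put to the test.

```python
from itertools import product

payoffs = {("H", "h"): (0, 0), ("H", "l"): (3, 1),
           ("L", "h"): (1, 3), ("L", "l"): (2, 2)}

coke_moves = ["H", "L"]
pepsi_plans = list(product("hl", repeat=2))   # (reply to H, reply to L)

def outcome(c, plan):
    reply = plan[0] if c == "H" else plan[1]
    return payoffs[(c, reply)]

def is_nash(c, plan):
    u_c, u_p = outcome(c, plan)
    coke_ok = all(u_c >= outcome(c2, plan)[0] for c2 in coke_moves)
    pepsi_ok = all(u_p >= outcome(c, p2)[1] for p2 in pepsi_plans)
    return coke_ok and pepsi_ok

def credible(plan):
    # Pepsi's reply must be optimal at BOTH nodes, reached or not.
    return (payoffs[("H", plan[0])][1] == max(payoffs[("H", p)][1] for p in "hl")
            and payoffs[("L", plan[1])][1] == max(payoffs[("L", p)][1] for p in "hl"))

for c, plan in product(coke_moves, pepsi_plans):
    if is_nash(c, plan):
        print(c, plan, "subgame perfect" if credible(plan) else "Nash only")
# Prints (H, (l, h)) as subgame perfect, and (L, (h, h)) -- the non-credible
# threat discussed above -- as "Nash only", along with (H, (l, l)), another
# Nash profile whose off-path reply is never tested.
```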
When dynamic games involve more than two stages, an even stronger condition is needed to ensure only the backward induction strategies are played: common strong belief in material rationality. It is roughly what it sounds like, but the proofs are too involved for a blog post, so if you’re interested, you should consult the original paper.
ftp://ftp.igier.unibocconi.it/wp/2011/375.pdf (Working paper, Jan. 2011)