Paul Samuelson’s Contributions to Welfare Economics, K. Arrow (1983)

I happened to come across a copy of a book entitled “Paul Samuelson and Modern Economic Theory” when browsing the library stacks recently. Clear evidence of his incredible breadth is in the section titles: Arrow writes about his work on social welfare, Houthakker on consumption theory, Patinkin on money, Tobin on fiscal policy, Merton on financial economics, and so on. Arrow’s chapter on welfare economics was particularly interesting. This book comes from the early 80s, which is roughly the end of social welfare as a major field of study in economics. I was never totally clear on the reason for this – is it simply that Arrow’s Possibility Theorem, Sen’s Liberal Paradox, and the Gibbard-Satterthwaite Theorem were so devastating to any hope of “general” social choice rules?

In any case, social welfare is today little studied, but Arrow mentions a number of interesting results which really ought to be better known. Bergson-Samuelson, conceived when the two were in graduate school together, is rightfully famous. After a long interlude of confused utilitarianism, Pareto had us all convinced that we should dismiss cardinal utility and interpersonal utility comparisons. This seems to suggest that all we can say about social welfare is that we should select a Pareto-optimal state. Bergson and Samuelson were unhappy with this – we suggest individuals should have preferences which represent an order (complete and transitive) over states, and the old utilitarians had a rule which imposed a real number for society’s value of any state (hence an order). Being able to order states from a social point of view seems necessary if we are to make decisions. Some attempts to extend Pareto did not give us an order. (Why is an order important? Arrow does not discuss this, but consider earlier attempts at extending Pareto like Kaldor-Hicks efficiency: going from state s to state s’ is KH-efficient if there exist ex-post transfers under which the change is Paretian. Let person a value the bundle (1,1)>(2,0)>(1,0)>all else, and person b value the bundle (1,1)>(0,2)>(0,1)>all else. In state s, person a is allocated (2,0) and person b (0,1). In state s’, person a is allocated (1,0) and person b is allocated (0,2). Note that going from s to s’ is a Kaldor-Hicks improvement, but going from s’ to s is also a Kaldor-Hicks improvement!)
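The Kaldor-Hicks reversal in that parenthetical is easy to check by brute force. Here is a minimal Python sketch, assuming goods are indivisible and that "transfers" means any reallocation of the aggregate bundle available in the new state; the function names and the numeric ranks are my own illustrative choices, not anything from Arrow's chapter.

```python
from itertools import product

# Ordinal rankings from the text: a higher number means more preferred.
# Any bundle not listed falls under "all else" and gets rank 0.
RANK_A = {(1, 1): 3, (2, 0): 2, (1, 0): 1}
RANK_B = {(1, 1): 3, (0, 2): 2, (0, 1): 1}


def rank(table, bundle):
    return table.get(bundle, 0)


def splits(total):
    """Yield every way to divide the aggregate bundle between persons a and b."""
    x, y = total
    for ax, ay in product(range(x + 1), range(y + 1)):
        yield (ax, ay), (x - ax, y - ay)


def kaldor_hicks_improvement(old, new):
    """True if the goods available in `new` can be reallocated so that nobody
    is worse off than in `old` and at least one person is strictly better off."""
    (old_a, old_b), (new_a, new_b) = old, new
    total = (new_a[0] + new_b[0], new_a[1] + new_b[1])
    base_a, base_b = rank(RANK_A, old_a), rank(RANK_B, old_b)
    for a, b in splits(total):
        ra, rb = rank(RANK_A, a), rank(RANK_B, b)
        if ra >= base_a and rb >= base_b and (ra > base_a or rb > base_b):
            return True
    return False


s = ((2, 0), (0, 1))        # (bundle of a, bundle of b) in state s
s_prime = ((1, 0), (0, 2))  # the same in state s'

print(kaldor_hicks_improvement(s, s_prime))
print(kaldor_hicks_improvement(s_prime, s))
```

Both calls report an improvement, so the Kaldor-Hicks relation cannot be an order on these two states.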

Bergson and Samuelson wanted to respect individual preferences – society can’t prefer s to s’ if s’ is a Pareto improvement on s in the individual preference relations. Take the relation RU. We will say that sRUs’ if all individuals weakly prefer s to s’. Note that though RU is not complete, it is transitive. Here’s the great, and non-obvious, trick. The Polish mathematician Szpilrajn has a great 1930 theorem which says that if R is a transitive relation, then there exists a complete relation R2 which extends R; that is, if sRs’ then sR2s’, plus we complete the relation by adding some more elements. This is not a terribly easy proof, it turns out. The upshot is that there exist social welfare orders which are entirely ordinal and which respect Pareto dominance. Of course, there may be lots of them, and which you pick is a problem of philosophy more than economics, but they exist nonetheless. Note why Arrow’s theorem doesn’t apply: we are starting with given sets of preferences and constructing a social preference, rather than attempting to find a rule that maps any individual preferences into a social rule. There have been many papers arguing that this difference doesn’t matter, so all I can say is that Arrow himself, in this very essay, accepts that difference completely. (One more sidenote here: if you wish to start with individual utility functions, we can still do everything in an ordinal way. It is not obvious that every indifference map can be mapped to a utility function, and not even true without some type of continuity assumption, especially if we want the utility functions to themselves be continuous. A nice proof of how we can do so using a trick from probability theory is in Neuefeind’s 1972 paper, which was followed up in more generality by Mount and Reiter here at MEDS then by Chichilnisky in a series of papers. Now just sum up these mapped individual utilities, and I have a Paretian social utility function which was constructed entirely in an ordinal fashion.)
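For a finite set of states, the "sum up the mapped utilities" construction at the end of that sidenote can be checked directly. The sketch below uses made-up utility numbers for three hypothetical states and two individuals; it only illustrates that the summed ranking is complete and never contradicts Pareto dominance, and says nothing about the continuity issues in the infinite case.

```python
# Hypothetical utility representations for three states and two individuals.
UTILITY = {
    "s1": (4, 1),
    "s2": (2, 2),
    "s3": (1, 1),
}


def pareto_weakly_prefers(s, t):
    """s RU t: every individual weakly prefers s to t."""
    return all(us >= ut for us, ut in zip(UTILITY[s], UTILITY[t]))


def social_value(s):
    """One Paretian social utility: the sum of the individual utility numbers."""
    return sum(UTILITY[s])


def respects_pareto():
    """Check that s RU t always implies social_value(s) >= social_value(t)."""
    return all(
        social_value(s) >= social_value(t)
        for s in UTILITY
        for t in UTILITY
        if pareto_weakly_prefers(s, t)
    )


print(respects_pareto())
print(sorted(UTILITY, key=social_value, reverse=True))
```

Here s1 and s2 are not comparable under RU, yet the summed ranking still places one above the other, which is exactly what "extending" the relation means.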

Now, this Bergson-Samuelson seems pretty unusable. What do we learn that we don’t know from a naive Pareto property? Here are two great insights. First, choose any social welfare function from the set we have constructed above. Let individuals have non-identical utility functions. In general, there is no social welfare function which is maximized by always keeping every individual’s income identical in all states of the world! The proof of this is very easy if we use Harsanyi’s extension of Bergson-Samuelson: if agents are Expected Utility maximizers, then any B-S social welfare function can be written as the weighted linear combination of individual utility functions. As relative prices or the social production possibilities frontier change, the weights are constant, but the individual marginal utilities are (generically) not. Hence if it was socially optimal to endow everybody with equal income before the relative price change, it (generically) is not later, no matter which Pareto-respecting measure of social welfare your society chooses to use! That is, I think, an astounding result for naive egalitarianism.

Here’s a second one. Surely any good economist knows policies should be evaluated according to cost-benefit analysis. If, for instance, the summed willingness-to-pay for a public good exceeds the cost of the public good, then society should buy it. When, however, does a B-S social welfare function allow us to make such an inference? Generically, such an inference is only possible if the distribution of income is itself socially optimal, since willingness-to-pay depends on the individual budget constraints. Indeed, even if demand estimation or survey evidence suggests that there is very little willingness-to-pay for a public good, society may wish to purchase the good. This is true even if the underlying basis for choosing the particular social welfare function we use has nothing at all to do with equity, and further since the B-S social welfare function respects individual preferences via the Paretian criterion, the reason we build the public good also has nothing to do with paternalism. Results of this type are just absolutely fundamental to policy analysis, and are not at all made irrelevant by the impossibility results which followed Arrow’s theorem.

This is a book chapter, so I’m afraid I don’t have an online version. The book is here. Arrow is amazingly still publishing at the age of 91; he had an interesting article with the underrated Partha Dasgupta in the EJ a couple years back. People claim that relative consumption a la Veblen matters in surveys. Yet it is hard to find such effects in the data. Why is this? Assume I wish to keep up with the Joneses when I move to a richer place. If I increase consumption today, I am decreasing savings, which decreases consumption even more tomorrow. How much I want to change consumption today when I have richer peers then depends on that dynamic tradeoff, which Arrow and Dasgupta completely characterize.

“Returns to Scale in Research & Development: What Does the Schumpeterian Hypothesis Imply?,” F. Fisher & P. Temin (1973)

Schumpeter famously argued for the economic importance of market power. Even though large firms cause static inefficiency, they have dynamic benefits in that large firms demand more invention since they can extract more revenue from each new product. Further, they supply more invention, Schumpeter hypothesized, since the rate of invention has increasing returns to scale in the number of inventors (Axiom A), and in the number of other employees at the firm (Axiom B). The second part of that statement may be for many reasons; for instance, if the output of a research project could be many potential products, a larger firm has the ability to capitalize on many of those new projects, whereas a small firm might have more limited complementary capabilities. Often, this hypothesis has been tested by checking whether larger firms are more research intensive, meaning that larger firms have a higher percentage of their workforce doing research (Hypothesis 1). Alternatively, a direct reading of Schumpeter is that a 1% increase in the non-research staff of a firm leads to a more than 1% increase in total R&D output of a firm, where output is just the number of research workers times each worker’s average output as a function of firm size (Hypothesis 2).

And here is where theory comes into play. Are axioms A and B necessary or sufficient for either hypothesis 1 or 2? If they don’t imply hypothesis 1, then the idea of testing the Schumpeterian axioms about increasing returns to scale by examining researcher employment is wrong-headed. If they don’t imply hypothesis 2, then Schumpeter’s qualitative argument is incomplete in the first place. Fisher and Temin (that’s Franklin Fisher and Peter Temin, two guys who, it goes without saying, have had quite some careers since they wrote this paper in the early 70s!) show that, in fact, for both hypotheses the axioms are neither necessary nor sufficient.

An even more basic problem wasn’t noticed by Fisher and Temin, but instead was pointed out by Carlos Rodriguez in a 1979 comment. If Axiom A holds, and the average product per researcher is increasing in the number of researchers, then marginal product always exceeds average product. If market equilibrium means I pay all research workers their marginal product, then I will be making a loss if I operate at the “optimal” quantity. Hence I will hire no research workers at all. So step one to interpreting Schumpeter, then, is to restate his two axioms. A weaker condition (Axiom C) might be that if the number of research and the number of nonresearch workers increase at the same rate, then average product per research worker is increasing. This is implied by Axioms A and B, but doesn’t rely on always-increasing average product per research worker. This is good for checking our two hypotheses, since anything that would have been implied by Axioms A and B is still implied by our more theoretically-grounded Axiom C.

So what does our axiom imply about the link between research staff size and firm size? Unsurprisingly, nothing at all! Surely the optimal quantity of research workers depends on the marginal product of more research workers as firm size grows, and not on the average product of those workers. Let’s prove it. Let F(R,S) be the average product per research worker as a function of R, the number of researchers, and S, the number of other employees at the firm. I hire research workers as long as their marginal product exceeds the researcher wage rate. The marginal product of total research output is the derivative of R*F(R,S) with respect to R, or F+R*dF/dR. As S increases, this marginal product goes up if and only if dF/dS+R*d^2F/dRdS>0. That is, I hire more research workers in equilibrium if my non-research staff is bigger according to a function that depends on the second derivative of the average output per researcher. But my axioms had only to do with the first derivative! Further, if dF/dS+R*d^2F/dRdS>0, then larger firms have a larger absolute number of scientists than smaller firms, but this implication is completely independent of the Schumpeterian axioms. What’s worse, even that stronger assumption involving the second derivative does not imply anything about the share of research workers on the staff.
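Written out, the chain of derivatives in that argument looks as follows. The symbol Q for total research output is my own shorthand, not notation from Fisher and Temin.

```latex
\begin{align*}
Q(R,S) &= R\,F(R,S) \\
\frac{\partial Q}{\partial R} &= F(R,S) + R\,\frac{\partial F}{\partial R} \\
\frac{\partial}{\partial S}\!\left(\frac{\partial Q}{\partial R}\right)
  &= \frac{\partial F}{\partial S} + R\,\frac{\partial^{2} F}{\partial R\,\partial S}
\end{align*}
```

The sign of the last line governs whether a larger non-research staff raises the marginal product of researchers, and it involves the cross-partial of F, which none of the axioms restrict.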

The moral is the same one you were probably taught on your first day of economics class: using reasoning about averages to talk about equilibrium behavior, so dependent on marginals, can lead you astray very quickly!

1971 working paper; the final version was published in JPE 1973 (IDEAS). Related to the comment by Rodriguez, Fisher and Temin point out here that the problem with increasing returns to scale does not ruin their general intuition, for the reasons I stated above. What about the empirics of Schumpeter’s prediction? Broadly, there is not much support for a link between firm size and research intensity, though the literature on this is quite contentious. Perhaps I will cover it in another post.

“The Meaning of Utility Measurement,” A. Alchian (1953)

Armen Alchian, one of the dons from UCLA’s glory days, passed away today at 98. His is, for me, a difficult legacy to interpret. On the one hand, Alchian-Demsetz 1972 is among the most famous economics papers ever written, and it can fairly be considered the precursor to mechanism design, the most important new idea in economics in the past 50 years. People produce more by working together. It is difficult to know who shirks when we work as a team. A firm gives us a residual claimant (an owner) who then has an incentive to monitor shirking, and as only one person needs to monitor the shirking, this is much less costly than a market where each member of the team production would need somehow to monitor whether other parts of the team shirk. Firms are deluded if they think that they can order their labor inputs to do whatever they want – agency problems exist both within and outside the firm. Such an agency theory of the firm is very modern indeed. That said, surely this can’t explain things like horizontally integrated firms, with different divisions producing wholly different products (or, really, any firm behavior where output is a separable function of each input in the firm).

Alchian’s other super famous work is his 1950 paper on evolution and the firm. As Friedman would later argue, Alchian suggested that we are justified treating firms as if they are profit maximizers when we do our analyses since the nature of competition means that non-profit maximizing firms will disappear in the long run. I am a Nelson/Winter fan, so of course I like the second half of the argument, but if I want to suggest that firms partially seek opportunities and partially are driven out by selection (one bit Lamarck, one bit Darwin), then why not just drop the profit maximization axiom altogether and try to write a parsimonious description of firm behavior which doesn’t rely on such maximization?

It turns out that if you do the math, profit maximization is not generally equivalent to selection. Using an example from Sandroni 2000, take two firms. There are two equally likely states of nature, Good and Bad. There are two things a firm can do, the risky one, which returns profit 3 in good states and 0 in bad states, and a risk-free one, which always returns 1. Maximizing expected profit means always investing all capital in the risky action, hence eventually going bankrupt. A firm that doesn’t profit maximize (say, it has incorrect beliefs and thinks we are always in the Bad state, hence always takes the risk-free action) can survive. This example is far too simple to be of much worth, but it does at least remind us of the lesson of the St. Petersburg paradox: expected value maximization and survival have very little to do with each other.
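To put numbers on that example, here is a small Python sketch. It assumes the payoffs are gross returns per unit of capital, that all proceeds are reinvested each period, and that starting capital is 1; those modelling choices and the function names are mine, added only to make the arithmetic concrete.

```python
from fractions import Fraction

P_GOOD = Fraction(1, 2)              # Good and Bad are equally likely
RISKY_GOOD, RISKY_BAD, SAFE = 3, 0, 1


def all_risky(periods):
    """Expected capital and survival probability after `periods` rounds of
    putting all capital into the risky action."""
    expected_growth = P_GOOD * RISKY_GOOD + (1 - P_GOOD) * RISKY_BAD  # 3/2
    expected_capital = expected_growth ** periods
    # One Bad state wipes out the capital, so survival needs Good every time.
    survival_probability = P_GOOD ** periods
    return expected_capital, survival_probability


def all_safe(periods):
    """The same two numbers for the firm that always takes the safe action."""
    return Fraction(SAFE) ** periods, Fraction(1)


for t in (1, 10, 50):
    print(t, all_risky(t), all_safe(t))
```

Expected capital of the risky firm grows like (3/2)^t while its survival probability shrinks like (1/2)^t, which is the whole point: the expectation is driven by an ever rarer lucky path.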

More interesting is the case with random profits, as in Radner and Dutta 2003. Firms invest their capital stock, choosing some mean-variance profits pair as a function of capital stock. The owner can, instead of reinvesting profits into the capital stock, pay out to herself or investors. If the marginal utility of a dollar of capital stock falls below a dollar, the profit-maximizing owner will not reinvest that money. But a run of (random) losses can drive the firm to bankruptcy, and does so eventually with certainty. A non-profit maximizing firm may just take the lowest variance earnings in every period, pay out to investors a fraction of the capital stock exactly equal to the minimum earnings that period, and hence live forever. But why would investors ever invest in such a firm? If investment demand is bounded, for example, and there are many non profit-maximizing firms from the start, it is not the highest rate of return but the marginal rate of return which determines the market interest rate paid to investors. A non profit-maximizer that can pay out to investors at least that much will survive, and all the profit maximizers will eventually fail.

The paper in the title of this post is much simpler: it is merely a very readable description of von Neumann expected utility, when utility can be associated with a number and when it cannot, and the possibility of interpersonal utility comparison. Alchian, it is said, was a very good teacher, and from this article, I believe it. What’s great is the timing: 1953. That’s one year before Savage’s theory, the most beautiful in all of economics. Given that Alchian was associated with RAND, where Savage was fairly often, I imagine he must have known at least some of the rudiments of Savage’s subjective theory, though nothing appears in this particular article. 1953 is also two years before Herbert Simon’s behavioral theory. When describing the vN-M axioms, Alchian gives situations which might contradict each, except for the first, a complete and transitive order over bundles of goods, an assumption which is consistent with all but “totally unreasonable behavior”!

1953 AER final version (No IDEAS version).

“The Oligopoly Solution Concept is Identified,” T. Bresnahan (1980)

Here’s a classic, super-simple paper. I think I can give you the idea in two paragraphs. I know price and quantity sold for some good. I know supply and demand must equate. I use whatever method I like to deal with simultaneity of the supply and demand functions (say, a cost shifter approach). How can I identify whether the industry is acting as if it had market power? That is, how can I separate collusive behavior from competitive behavior?

A numerical example will help. Let marginal costs and demand be linear. Let demand be P=11-Q. Shift demand and supply (meaning shift the intercept) however you like. The price-quantity bundle you see under monopoly with MC constant and equal to 1 will be identical to the price-quantity bundle you see under perfect competition with MC increasing and equal to 1+Q. For instance, price=6 and quantity=5 is found by letting P=MC for MC=1+Q or by letting MR=MC for MR=11-2Q. And demand shifters don’t help us! If demand shifts to P=R-Q, where R is any y-intercept, then under perfect competition with MC=1+Q, we have equilibrium quantity such that 1+Q=R-Q, or Q=(R-1)/2, and equilibrium quantity under monopoly with MC=1 such that MC=MR, or R-2Q=1, or Q=(R-1)/2. Supply shifters are equally unhelpful: for inverse demand P=11-Q, shifting the y-intercept for the cost curve of both the hypothetical monopolist and the hypothetical perfectly competitive market changes equilibrium quantity by exactly the same amount. So what to do? The simplest method is to assume, a priori, something about the nature of marginal costs in the industry; if they are constant, then the price patterns we saw in the numerical example can only be explained by monopoly/collusive behavior. But Bresnahan points out that we don’t even need to make this assumption. Just note that a rotation of the demand curve through some equilibrium point affects those with market power and those without differently. Since rotating the demand curve retains the P=MC equilibrium condition under perfect competition, such a rotation only affects equilibrium price and quantity if competition is not perfect. If I have, say, demand-side instruments, one of which only affects the y-intercept and one of which affects the slope (and perhaps also the intercept), then not only can I identify whether perfect competition exists, but I can even identify the degree to which behavior is monopolistic. Useful.
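The numerical example can be pushed one step further to see the rotation argument at work. This is a minimal Python sketch under the linear forms in the text (demand P = intercept - slope*Q, competitive MC = 1+Q, monopoly MC = 1); the function names and the particular rotated slope of 2 are illustrative assumptions of mine.

```python
from fractions import Fraction


def competitive_quantity(intercept, slope, mc_intercept=1):
    """Solve P = MC for demand P = intercept - slope*Q and MC = mc_intercept + Q."""
    return Fraction(intercept - mc_intercept) / (slope + 1)


def monopoly_quantity(intercept, slope, mc=1):
    """Solve MR = MC with MR = intercept - 2*slope*Q and constant MC = mc."""
    return Fraction(intercept - mc) / (2 * slope)


def rotated_intercept(q0, p0, new_slope):
    """Intercept of the demand line with slope `new_slope` through (q0, p0)."""
    return p0 + new_slope * q0


# Parallel demand shifts: both hypotheses predict the same quantity.
for r in (11, 15, 21):
    print(r, competitive_quantity(r, 1), monopoly_quantity(r, 1))

# Rotate demand through the observed point (Q=5, P=6) to slope 2.
r_rot = rotated_intercept(5, 6, 2)
print(competitive_quantity(r_rot, 2), monopoly_quantity(r_rot, 2))
```

Under the parallel shifts the two columns agree, so the data cannot tell the hypotheses apart. After the rotation, the competitive quantity stays at 5 while the monopoly quantity falls to 15/4, which is the variation that identifies conduct.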

Final version from Economics Letters 1982 (IDEAS)

“Das Unsicherheitsmoment in der Wertlehre,” K. Menger (1934)

Every economist surely knows the St. Petersburg Paradox described by Daniel Bernoulli in 1738 in a paper which can fairly claim to be the first piece of theoretical economics. Consider a casino offering a game of sequential coinflips that pays 2^(n-1) as a payoff if the first heads arrives on the nth flip of the coin. That is, if there is a heads on the first flip, you receive 1. If there is a tails on the first flip, and a heads on the second, you receive 2, and 4 if TTH, and 8 if TTTH, and so on. It is quite immediate that this game has expected payoff of infinity. Yet, Bernoulli points out, no one would pay anywhere near infinity for such a game. Why not? Perhaps they have what we would now call logarithmic utility, in which case I value the gamble at .5*ln(1)+.25*ln(2)+.125*ln(4)+…, a finite sum.
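Both halves of Bernoulli's observation are quick to verify numerically. The sketch below truncates the game after a fixed number of flips (the truncation and the function names are my own devices): the expected payoff grows without bound as the cap rises, while the expected log utility settles down.

```python
import math


def expected_payoff(max_flips):
    """Partial sum of (1/2)^n * 2^(n-1); every term equals 1/2."""
    return sum(0.5 ** n * 2 ** (n - 1) for n in range(1, max_flips + 1))


def expected_log_utility(max_flips):
    """Partial sum of (1/2)^n * ln(2^(n-1)) = (1/2)^n * (n-1) * ln 2."""
    return sum(0.5 ** n * (n - 1) * math.log(2) for n in range(1, max_flips + 1))


for cap in (10, 40, 200):
    print(cap, expected_payoff(cap), expected_log_utility(cap))
```

The log-utility series converges to ln 2, since the sum of (n-1)/2^n over n is 1, so an agent with log utility and no other wealth would value the gamble at a sure payment of about 2.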

Now, here’s the interesting bit. Karl Menger proved in 1927 that the standard response to the St. Petersburg paradox is insufficient (note that Karl with a K is the mathematically inclined son and mentor to Morgenstern, rather than the relatively qualitative father, Carl, who somewhat undeservingly joined Walras and Jevons on the Mt. Rushmore of Marginal Utility). For instance, if the casino pays out e^(2^(n-1)) rather than 2^(n-1), then even an agent with logarithmic utility has infinite expected utility from such a gamble. This, nearly 200 years after Bernoulli’s original paper! Indeed, such a construction is possible for any unbounded utility function; let the casino pay out U^-1(2^(n-1)) when the first heads arrives on the nth flip, where U^-1 is inverse utility.
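Menger's construction is just as easy to check, reading the modified payout as e^(2^(n-1)) so that it matches the general recipe U^-1(2^(n-1)) with U = ln. In this sketch (function name and truncation are mine) the log of the payout is 2^(n-1), so the exponential never needs to be computed.

```python
from fractions import Fraction


def menger_expected_log_utility(max_flips):
    """Expected log utility, truncated at `max_flips`, when the payout for a
    first heads on flip n is e^(2^(n-1)). Each term is (1/2)^n * 2^(n-1) = 1/2."""
    return sum(Fraction(1, 2 ** n) * 2 ** (n - 1) for n in range(1, max_flips + 1))


for cap in (10, 100, 1000):
    print(cap, menger_expected_log_utility(cap))
```

The partial sums are exactly half the cap, so expected utility diverges in the same way the expected payoff did in the original game.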

Things are worse, Menger points out. One can construct a thought experiment where, for any finite amount C and an arbitrarily small probability p, there is a bounded utility function where an agent will prefer the gamble to win some finite amount D with probability p to getting a sure thing of C. [Sentence edited as suggested in the comments.] So bounding the utility function does not kill off all paradoxes of this type.

The 1927 lecture and its response are discussed at length in Rob Leonard’s “Von Neumann, Morgenstern, and the Creation of Game Theory.” Apparently, Oskar Morgenstern was at the Vienna Kreis where Menger first presented this result, and was quite taken with it, a fact surely interesting given Morgenstern’s later development of expected utility theory. Indeed, one of Machina’s stated aims in his famous paper on EU without the Independence Axiom is providing a way around Menger’s result while salvaging EU analysis. If you are unfamiliar with Machina’s paper, one of the most cited in decision theory in the past 30 years, it may be worthwhile to read the New School HET description of the “fanning out” hypothesis which relates Machina to vN-M expected utility.

http://www.springerlink.com/content/m7q803520757q700/fulltext.pdf (Unfortunately, the paper above is both gated, and in German, as the original publication was in the formerly-famous journal Zeitschrift für Nationalökonomie. The first English translation is in Shubik’s festschrift for Morgenstern published in 1967, but I don’t see any online availability.)

“Epistemic Conditions for Nash Equilibrium,” R. Aumann & A. Brandenburger (1995)

This paper is a real classic that I noticed had yet to appear on this site. Every economist knows a Nash equilibrium attains when no player can improve his payoff by deviating from his strategy. There are two concerns. First, if the equilibrium prescribes a mixed strategy, then why would anyone randomize, by which we mean make important decisions on the basis of a coin flip? And second, how much need I know about the other player for Nash to obtain?

We can think of this in terms of common and mutual knowledge. I know something in some state of the world if I believe that event happens with probability 1 (in other works in this literature, I believe it happens with absolute certainty; don’t concern yourself with this difference). You and I, or the group of us, mutually know something if we all believe it with probability 1. You and I, or the group of us, commonly know something if we all believe it with prob. 1, we all believe with prob. 1 that we all believe it with prob. 1, and so on ad infinitum. It is tempting to believe, and many economists did basically until this paper was published, that common knowledge of the game form (who gets what payoffs from what actions?) and common knowledge of rationality (players maximize given their information) are required for Nash equilibrium to obtain. The logic is something like: I play mixed strategy X because I think you will play Y, which you play because you think I will play X because I think you will play Y, and so on.

This logic is not correct. Let a “conjecture” be a belief about what another person will do in a game. In a two player game, if the game form, rationality of the players, and each player’s conjecture are mutually known, the conjectures form a Nash equilibrium. There is no common knowledge at all! The proof is actually really simple. First, if conjectures are mutually known to be Q, then the conjectures actually are Q – this follows almost immediately from the definition of “known”. Next, note that if some action a(i) for player i is assigned positive probability by player j in his conjecture Q(j), then a(i) must be maximizing in the game form given i’s conjecture Q(i) about what j will do. What this sentence says is that j knows the following events “i is rational” and “i is using conjecture Q(i)” happen with probability 1 given the current state, and the event “i plays action a(i)” happens with positive probability given the current state. Since two of those events have probability 1 and the third has positive probability, their intersection has positive probability. In the state where that intersection attains, i is rationally playing a(i) given conjecture Q(i), hence a(i) is an optimal action given what i conjectures j will do. In the same manner, every action a(j) with positive probability is an optimal action for j given what j conjectures i will do. This is the definition of a Nash equilibrium.
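The last step of that proof, that every action given positive weight in a conjecture must be a best response to the other conjecture, is a condition one can check mechanically. Below is a minimal Python sketch for two-player games in matrix form; the function names, the payoff layout (own action first, opponent's action second), and the matching pennies example are my own illustrative choices rather than anything in Aumann and Brandenburger.

```python
def expected_payoffs(payoffs, conjecture_about_opponent):
    """Expected payoff of each own action against a conjecture over the
    opponent's actions. payoffs[a][b] is the payoff from own action a
    when the opponent plays b."""
    return [
        sum(u * q for u, q in zip(row, conjecture_about_opponent))
        for row in payoffs
    ]


def support_is_optimal(payoffs, conjecture_about_me, my_conjecture, tol=1e-12):
    """True if every action of mine that the opponent thinks possible is a
    best response to my own conjecture about the opponent."""
    values = expected_payoffs(payoffs, my_conjecture)
    best = max(values)
    return all(
        v >= best - tol
        for v, p in zip(values, conjecture_about_me)
        if p > 0
    )


def conjectures_form_nash(payoffs_1, payoffs_2, conj_about_1, conj_about_2):
    """conj_about_1 is player 2's conjecture over player 1's actions, and
    conj_about_2 is player 1's conjecture over player 2's actions."""
    return support_is_optimal(payoffs_1, conj_about_1, conj_about_2) and \
        support_is_optimal(payoffs_2, conj_about_2, conj_about_1)


# Matching pennies: player 1 wants to match, player 2 wants to mismatch.
PENNIES_1 = [[1, -1], [-1, 1]]
PENNIES_2 = [[-1, 1], [1, -1]]

print(conjectures_form_nash(PENNIES_1, PENNIES_2, (0.5, 0.5), (0.5, 0.5)))
print(conjectures_form_nash(PENNIES_1, PENNIES_2, (1.0, 0.0), (1.0, 0.0)))
```

The even-mixing conjectures pass, while the pair of pure conjectures fails because player 2's conjectured action is not a best response to what player 2 expects from player 1.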

This result is perhaps not too surprising. Why would Nash play require common knowledge of rationality, for instance? If you know what I conjecture you are going to do, and you know my payoff function, and you know that I am rational, then you know what “best respond” means for me, and you know that I will best respond given my conjecture about how you will play. Likewise, I know that you will best respond given your conjecture about how I will play. So if we think of mixed strategies as “conjectures held by our opponents in a game which are mutually known” rather than “conscious randomization devices”, we have a nice interpretation of Nash equilibrium.

With more than 2 players, the situation is a little trickier. We need a common prior about the state, and we need common knowledge of the conjectures. This common knowledge needs to be over the conjectures each player has about what all other players will do, not common knowledge about what each individual player will do; that is, common knowledge that player 1 conjectures that player 2 will mix evenly between A and B, and player 3 will mix evenly between a and b is not sufficient. We need common knowledge that player 1 conjectures, say, that Aa happens 20 percent of the time, Ab 30% of the time, Ba 30 percent of the time and Bb 20 percent of the time. Exact details about why these assumptions are needed can be found in the paper, but the important part here is that, again, common knowledge of rationality is not one of the required conditions for Nash equilibrium to obtain.

http://www.ratio.huji.ac.il/dp_files/dp57.pdf (Final Econometrica version – Aumann is quite good at putting final, ungated copies of his research online. He also appears, in the hallway of our department, in the best economist portrait of all time. Rather than staring blandly at the camera and smiling, Aumann appears like a Biblical character deep in thought while playing a particularly rigorous game of chess.)

“Mathematical Models in the Social Sciences,” K. Arrow (1951)

I have given Paul Samuelson the title of “greatest economist ever” many times on this site. If he is number one, though, Ken Arrow is surely second. And this essay, an early Cowles discussion paper, is an absolute must-read.

Right on the first page is an absolute destruction of every ridiculous statement you’ve ever heard about mathematical economics. Quoting the physicist Gibbs: “Mathematics is a language.” On whether quantitative methods are appropriate for studying human action: “Doubtless many branches of mathematics – especially those most familiar to the average individual, such as algebra and the calculus – are quantitative in nature. But the whole field of symbolic logic is purely qualitative. We can frame such questions as the following: Does the occurrence of one event imply the occurrence of another? Is it impossible that two events should both occur?” This is spot on. What is most surprising to me, wearing my theorist hat, is how little twentieth century mathematics occurs in economics vis-a-vis the pure sciences, not how much. The most prominent mathematics in economics are the theories of probability, various forms of mathematical logic, and existence theorems on wholly abstract spaces, meaning spaces that don’t have any obvious correspondence with the physical world. These techniques tell us little about numbers, but rather help us answer questions like “How does X relate to Y?” and “Is Z a logical possibility?” and “For some perhaps unknown sets of beliefs, how serious a problem can Q cause?” All of these statements look to me to be exactly equivalent to the types of a priori logical reasoning which appear everywhere in 18th and 19th century “nonmathematical” social science.

There is a common objection to mathematical theorizing, that mathematics is limited in nature compared to the directed intuition which a good social scientist can verbalize. This is particularly true compared to the pure sciences. We have very little intuition about atoms, but great intuition about the social world we inhabit. Arrow argues, however, that making valid logical implication is a difficult task indeed, particularly if we’re using any deductive reasoning beyond the simplest tools in Aristotle. Writing our verbal thoughts as mathematics allows the use of more complicated deductive tools. And the same is true of induction: mathematical model building allows for the use of (what was then very modern) statistical tools to identify relationships. Naive regression identifies correlations, but is rarely able to discuss any more complex relationship between data.

A final note: if you’re interested in history of thought, there are some interesting discussions of decision theory pre-Savage and game theory pre-Nash and pre-Harsanyi in Arrow’s article. A number of interpretations are given that seem somewhat strange given our current understanding, such as interpreting mixed strategies as “bluffing,” or writing down positive-sum n-person cooperative games as zero-sum n+1 player games where a “fictitious player” eats the negative outcome. Less strange, but still by no means mainstream, is Wald’s interpretation of statistical inference as a zero-sum game against nature, where the statistician with a known loss function chooses a decision function (perhaps mixed) and nature simultaneously chooses a realization in order to maximize the expected loss. There is an interesting discussion of something that looks an awful lot like evolutionary game theory, proposed by Nicolas Rashevsky in 1947; I hadn’t known these non-equilibrium linear ODE games existed that far before Maynard Smith. Arrow, and no doubt his contemporaries, also appear to have been quite optimistic about the possibility of a dynamic game theory that incorporated learning about opponents’ play axiomatically, but I would say that, in 2012, we have no such theory and for a variety of reasons, a suitable one may not be possible. Finally, Arrow notes an interesting discussion between Koopmans and his contemporaries about methodological individualism; Arrow endorses the idea that, would we have the data, society’s aggregate outcomes are necessarily determined wholly by the actions of individuals. There is no “societal organism”. Many economists, no doubt, agree completely with that statement, though there are broad groups in the social sciences who both think that the phrase “would we have the data” is a more serious concern than economists generally consider it, and that conceive of non-human social actors. It’s worthwhile to at least know these arguments are out there.

http://128.36.236.35/P/cp/p00a/p0048.pdf (Final version provided thanks to the Cowles Commission’s lovely open access policy)

“Note on the Theory of the Economy of Research,” C. S. Peirce (1879)

Though this site is devoted generally to new research, the essay discussed in this post, I trust, will be new enough to the vast majority of readers. Charles Sanders Peirce is a titan of analytic philosophy, and there is certainly a case to be made that he is the greatest American philosopher of all time. He also has had a fairly well-known indirect influence on economics: Peirce was in some ways rediscovered by the great mathematician Alfred Tarski, who then taught Kenneth Arrow, and in doing so may have introduced Peirce’s relational algebra to the field of economics. (You may be thinking, relational algebra, what is that? But you certainly know what it is: take a set, apply a perhaps partial, often binary ordering with certain properties, then prove results. This surely describes every modern introduction to the theory of preferences, does it not?) But Peirce also has an essay more directly on economics that is fascinating to see in retrospect. This Peirce essay is reprinted in Phil Mirowski’s book “Science Bought and Sold” along with notes on the essay by James Wible which I shall also draw from.

Two final things. First, I note, if only to myself, the following quote from Peirce to be used in a future research paper of my own: “Economical science is particularly profitable to science; and that of all the branches of economy, the economy of research is the most profitable.” Second, check out where this essay was published: the annual report of the U.S. government Coast Survey of 1879! No wonder it has been overlooked. If you know anything of the biography of Peirce, though, there is not much surprising in this odd location. Peirce was supposedly such a nut that, despite obvious brilliance, he was repeatedly blackballed from academic appointments by future colleagues around the country!

Wible claims, and I also know of no earlier such work, that this Peirce essay is the earliest mathematical work on the theory of invention. And given the intellectual history, this seems almost certain to be so. The essay was written right at the cusp of the marginal revolution and mathematical political economy, Peirce is known to have been familiar with the few scraps of earlier mathematical economics like Cournot’s famous 1838 essay, and Peirce is the father of a philosophical school for which selecting the best line of research to examine in order to learn inductively was a pressing concern. If you’ve ever read economics articles from the middle of the 19th century, this one will shock you: in style, I think it is essentially publishable today. It looks like 21st century economics. There are marginal tradeoffs. There is social science done by mathematical manipulation of heavily abstracted concepts. There is even a Marshallian diagram! It’s phenomenal. Since this looks like modern economics, let’s discuss it like modern economics; what does Peirce’s theory say?

As he introduced it, “I considered this problem. Somebody furnished a fund to be expended upon research without restrictions. What sort of researches should it be expended upon?” Essentially, there are some scientific problems which we understand only vaguely; you may think of this purely qualitatively, or as meaning something is measured to within some confidence interval. There are diminishing returns to science, so that while decreasing error can be done at linear cost, the utility gained from such reduction is concave (the inverse is quadratic in Peirce’s formulation). There is a total fixed research budget. What should be worked on first? Note that this paper was first written in 1876: there is no stochastic learning or any such thing, as the mathematics to discuss bandits and related objects was not yet developed. Learning is purely deterministic here.

Solving that constrained maximization problem gives the now-familiar, but then-nonexistent, result that we should compare ratios of MU/MC across different projects. Peirce called this ratio of marginal utility to marginal cost the “economic urgency” of a given line of research. He notes that, given his functional form assumptions, new research fields where we know very little are particularly worthwhile investments: the gains from increasing our knowledge are exponential in ignorance, whereas the cost is linear. As an example, an early chemist with simple vials is able to provide results with more social utility than a thousand chemists working in Peirce’s day with all sorts of modern equipment. Peirce also derives a result concerning sampling which is a bit opaque for modern readers given that it is couched in terms of “accidental probable error” rather than confidence intervals; nonetheless, it is very Wald-esque in that it explicitly argues that optimal sample size in experiments depends crucially on the budget, the costs of sampling and the utility of learning inferences from that sampling. Such considerations are absolutely ignored in a lot of research design even today!
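To make the MU/MC logic concrete, here is a minimal sketch in Python. It is not Peirce’s own formulation: the log utility, the constant unit costs, the three projects and the budget are all assumptions made up for illustration. The idea is simply to spend each small slice of a fixed budget on whichever project currently has the highest “economic urgency”; with concave utility and linear cost, this greedy rule ends up (approximately) equalizing MU/MC across the funded projects.

```python
import math

# Hypothetical projects: name -> (a, c). Utility from x units of research is
# assumed to be a * log(1 + x), and each unit of research costs c. These
# functional forms and numbers are illustrative assumptions, not Peirce's.
PROJECTS = {"A": (1.0, 1.0), "B": (2.0, 1.0), "C": (4.0, 2.0)}


def economic_urgency(a, c, x):
    """Marginal utility per unit of money at research level x (MU / MC)."""
    marginal_utility = a / (1.0 + x)  # derivative of a * log(1 + x)
    return marginal_utility / c


def allocate(budget, projects, step=0.01):
    """Spend the budget in small slices, each on the most urgent project."""
    effort = {name: 0.0 for name in projects}
    for _ in range(int(round(budget / step))):
        best = max(
            projects,
            key=lambda n: economic_urgency(*projects[n], effort[n]),
        )
        cost_per_unit = projects[best][1]
        effort[best] += step / cost_per_unit  # a slice of money buys step/c units
    return effort


effort = allocate(10.0, PROJECTS)
urgencies = {n: economic_urgency(*PROJECTS[n], effort[n]) for n in PROJECTS}
total_utility = sum(a * math.log(1.0 + effort[n]) for n, (a, _) in PROJECTS.items())

for name in PROJECTS:
    print(name, round(effort[name], 2), round(urgencies[name], 3))
```

Under these assumed numbers the first-order conditions can be solved by hand (set a/((1+x)c) equal to a common multiplier and use the budget constraint), which is a useful check on the greedy loop. Note also how the rule favors projects where little has been done so far: urgency is highest at x = 0 and falls as research accumulates.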

http://books.google.com/books?id=ux79s_IhpFYC (Both Peirce’s original essay and Wible’s commentary appear in “Science Bought and Sold,” edited by Mirowski and Sent. The Google Books Preview is generous enough here for you to read the entirety of both essays; I do not see any other ungated copies of either online.)

“On the Strategic Stability of Equilibria,” E. Kohlberg and J.-F. Mertens (1986)

(The following discussion also draws heavily on “Quantity Precommitment and Bertrand Competition Yield Cournot Outcomes,” by D. Kreps and J. Scheinkman, 1983. That paper shows that if duopolists first choose capacities (at a cost) then simultaneously choose prices after observing capacities, the unique Nash equilibrium gives Cournot prices and quantities. Proving this in general is very difficult, as the subgame has discontinuous payoffs: as in Bertrand, the low-price seller gets all the sales (up to capacity, of course), and in general discontinuous games may not have any equilibria at all. Uniqueness is striking, since it means the argument doesn’t even depend on subgame perfection, though the equilibrium strategy is subgame perfect. That is, there is more going on than the strategies “Both choose Cournot capacities in period 1, and if either does not, choose price 0 to punish the other player.” The exact details of how to construct this equilibrium are interesting indeed, but for the purposes of the following discussion, just know that Kreps and Scheinkman’s basic point is that the solution to duopoly games depends not only on the variables chosen or on the timing implicit in the game form, but on the combination of the two.)

Kohlberg and Mertens were writing in the heyday of the equilibrium refinement literature – theirs was one of the last well-known refinements of Nash. Consider constructing a “reasonable set” of equilibria to a game. What properties might you like such a set to have? Best would, of course, be to define a set of axioms on “reasonableness” then show what this implies about the equilibrium, but the authors say “we do not yet feel ready for such an approach; we think the discussion…will abundantly illustrate the difficulties involved.” More on this point in the last paragraph of this post.

What non-axiomatic properties, then, might you like? Backwards induction is fairly reasonable, as has been discussed ad nauseam in an earlier philosophy and decision theory literature. Admissibility is another good one, also with roots in decision theory: all equilibria in the players’ reasonable sets should be undominated. Three other properties have to do with the game form itself. Existence of at least one equilibrium, surely, is a property we want if we only want to restrict ourselves to “reasonable sets” of potential equilibria. Invariance of equilibria to the game form in the sense of some 1950s authors also seems reasonable: I don’t want to get different reasonable sets simply by changing the way I write down a game tree if such a change has no effect on the normal form of the game (there is a technical qualification here that I ignore); one argument here is that the equilibria of a game should not depend on whether I give instructions to a computer on how I should play in every situation before the game begins, or whether I play through the game tree myself. Finally, if I delete a dominated strategy from game G to create game G’, I don’t want any equilibria to disappear from the reasonable set.
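Admissibility is easy to see in a toy example. The sketch below is mine, not anything from Kohlberg and Mertens: the 2x2 payoff matrices are made up, and the dominance check only considers domination by other pure strategies (a full check would also allow domination by mixed strategies). It finds the pure Nash equilibria of a game and then keeps only those in which nobody plays a weakly dominated strategy.

```python
from itertools import product

# A made-up 2x2 game. ROW[i][j] and COL[i][j] are the row and column player's
# payoffs when row plays strategy i and column plays strategy j.
ROW = [[1, 0],
       [0, 0]]
COL = [[1, 0],
       [0, 0]]


def pure_nash(row, col):
    """All pure-strategy profiles where neither player gains by deviating."""
    n, m = len(row), len(row[0])
    found = []
    for i, j in product(range(n), range(m)):
        row_ok = all(row[i][j] >= row[k][j] for k in range(n))
        col_ok = all(col[i][j] >= col[i][l] for l in range(m))
        if row_ok and col_ok:
            found.append((i, j))
    return found


def weakly_dominated(payoffs):
    """Own strategies weakly dominated by another pure strategy.

    payoffs[s][t] is the player's own payoff from own strategy s against
    opponent strategy t. Domination by mixed strategies is ignored here.
    """
    n, m = len(payoffs), len(payoffs[0])
    dominated = set()
    for s, d in product(range(n), repeat=2):
        if s == d:
            continue
        never_worse = all(payoffs[d][t] >= payoffs[s][t] for t in range(m))
        sometimes_better = any(payoffs[d][t] > payoffs[s][t] for t in range(m))
        if never_worse and sometimes_better:
            dominated.add(s)
    return dominated


def transpose(matrix):
    """Re-index the column player's payoffs as [own strategy][opponent strategy]."""
    return [list(r) for r in zip(*matrix)]


equilibria = pure_nash(ROW, COL)
bad_row = weakly_dominated(ROW)
bad_col = weakly_dominated(transpose(COL))
admissible = [(i, j) for i, j in equilibria
              if i not in bad_row and j not in bad_col]

print("pure Nash equilibria:", equilibria)
print("admissible equilibria:", admissible)
```

In this game both players getting zero by playing their second strategies is a perfectly good Nash equilibrium, since no unilateral deviation helps, but each player’s second strategy is weakly dominated by the first, so admissibility throws that equilibrium out.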

Here Kohlberg and Mertens propose their well-known KM stable set. The proofs are of limited interest except to those of you really up on your differential geometry; I assume theorems like “p-bar is homotopic to a homeomorphism” are not of broad interest to readers of this site. In any case, a KM stable set is a closed set of Nash equilibria such that, for any set of completely mixed strategies for all players, if I perturb the strategies of each player by some small delta toward that set of completely mixed strategies, the perturbed game has equilibria epsilon close to the original equilibrium (close in terms of the strategy simplex, as usual). Every game has at least one equilibrium in a KM stable set, but there may be multiple such sets. This definition is hard to work with, of course, but it satisfies all of the desired properties except backward induction. Kohlberg and Mertens note in the conclusion that it would be great if someone could make a small modification such that backward induction were also captured (I believe Hillas did this, though I’ve not read his paper).

What is most interesting about this paper is how far afield the refinement literature got itself. I think the problem is evident in the list of non-axioms/properties listed by Kohlberg and Mertens. There are many properties you might think are reasonable for the solution to a game of strategic interaction. Some of them are decision-theoretic (Type 1). Some of them involve robustness to errors of logic, or limited reasoning capacity, or minor mistakes (Type 2). Some of them involve equilibrium definitions that can handle problems in the way the game form is written (Type 3), as the invariance property here attempts to do.

Attempting to do all of the above simultaneously is going to be problematic. We ought first agree on a canonical game form, and given the form, ought describe errors explicitly in the game form (Type 2), then deal axiomatically with the decision theoretic issues. Looking back from 2012, I think that Type 2 has been dealt with suitably, and negatively, by papers discussed previously on this site. Essentially, if there is a set of Nash equilibria, I can make every one of them a strict Nash equilibrium by altering super-high-order knowledge in a way that is fairly uncontroversial once you accept that reasoning with higher-order knowledge may be limited or that people make mistakes in applying such knowledge. That is, all NE would need to be part of any “reasonable set” if we are to leave the world of perfectly rational agents.

I actually think Type 3 robustness is not fully explored, though, which brings us back to Kreps and Scheinkman. I don’t think their conclusion – that the squabble about Cournot and Bertrand can be in some sense solved by suitably changing Bertrand such that information about other agents’ potential production is known when I choose price – is enough. Rather, there is a potential problem with the idea of how Nash Equilibrium treats payoffs, a problem made most clear in games with continuous action spaces where the payoffs depend on a system of variables, some of which can be solved for if we know the others. This is a bit opaque, but I hope to have more to say on that point in the near future.

http://www.dklevine.com/archive/refs4445.pdf (Final Econometrica version – big thumbs up to David Levine for his continuing acts of giving a different finger up to copyright maximalism.)

“The General Theory of Employment, Interest and Money,” J.M. Keynes (1936)

Keynes has returned, haven’t you heard? Or, at least, pieces of Keynesian analysis have returned, and many economists have gone back to their copies of the General Theory hoping to glean some insight. The problem is that the General Theory is not terribly well-written, unlike Keynes’ biography of Marshall, his Economic Possibilities for our Grandchildren, or his Economic Consequences of the Peace. Samuelson, in a memoriam (gated JSTOR copy, I’m afraid) where he calls Keynes’ book a work of genius accepted by every young economist of the day, also complains that it is a poorly written book, unsuitable for classroom use, resembling the confused notes of an author whose fame allowed him to bulldoze any potential editor; you simply must read this essay by Samuelson, both for the wonderful erudition and alacrity which appears in much of his writing, as well as for his frequent riffs, self-evident to me as well, on how much of a selfish jackass Keynes often appears to be!

Nonetheless, the General Theory is one of the most important pieces of economic writing ever put to paper, so let me give a few confused notes of my own on the subject. In particular, I’ll try to be true to what’s actually in Keynes, and not the intermediated version of “Keynesianism” which flowed from the pens of men like Hicks (who formalized some of the GT into the IS-LM model in this 1937 article, on the whole far more coherent than the GT itself). I’ll also try to discuss the less-often-quoted portions of the book – the Chapter 12 Keynesians are the Christmas-and-Easter-Catholics of the economics world, are they not? I’ve even got some red meat for the microeconomics-inclined reader, if you’ll bear with me.

1) This is a book filled with ideas: the importance of capital overhang, mass unemployment as an equilibrium outcome, liquidity traps, sticky wages, and the paradox of thrift all appear here. Not all are original to the GT, of course, but all of those were certainly outside the mainstream when Keynes wrote.

2) Keynes also recommends a number of policies that seem a bit ridiculous in retrospect. He supports “stamping” currency so that the government can force holders of currency to bear a negative nominal interest rate. He supports massive government involvement in investment markets. Most importantly, he places a huge importance on interest rates for long-term growth. Let’s discuss that more.

3) Perhaps Keynes is right on this point: if we are in a world where capital deepening is the primary reason for economic growth, then we can’t castigate Keynes for focusing on it any more than we can insult Malthus now for pointing out the importance of land constraints in an era when the Malthusian mechanism basically was in effect. But my reading of growth accounting is that even by the 1930s, innovation was more important than capital deepening for growth. It’s not often understood that, though the particular aspects of Keynesian theory apply only when we are below full employment, the text treats those periods as normal and periods of full employment as special, e.g., “the evidence indicates that full, or even approximately full, employment is of rare and short-lived occurrence.” That is, this is a book about growth and not a book of “depression economics,” as Hicks called it. The main point is not about monetary and fiscal policy in depressions, but about the use of those policies in normal times to ensure full employment in a non-inflationary way.

4) Keynes really hates on mathematical economics: perhaps this is the influence of Marshall at Cambridge. He claims at one point that he finds the math elides qualifications that everyday language can keep “in the back of the mind.” This is a bit odd considering how difficult it is to understand many of the statements Keynes makes; even worse, we know in hindsight that those who “translated” Keynes to a formal language found many mistakes in the text. I wonder if his antipathy results from simply being unable to follow cutting edge mathematical economics as he got older. The Samuelson essay on Keynes mentioned above gives some examples of utterly trivial mathematical errors Keynes made late in his career.

5) I find it entertaining that, at the end of Chapter 20, you find precisely the neoclassical point that confusion about whether price changes are relative or absolute can have major effects.

6) Liquidity definitely mattered in the recent financial crises, but I don’t think it mattered in the way Keynes uses the term. For Keynes, liquidity preference results, in general, from Knightian uncertainty about the future. I use liquid assets to deal with such uncertainty. But the precise financial market issue where liquidity seems most important today – collateral – is hardly discussed at all. That said, the discussion of liquidity, Chapter 17, is the best part of the General Theory. Keynes is a good marginalist, pointing out, among other things, that greater demand for liquidity will not lead to greater hoarding of money as the money supply is set by the government and not the potential hoarders. Rather, greater demand for liquidity simply leads to higher interest rates for potential borrowers.

So what of the legacy? For the standard counterpoint, see the wonderful 1977 essay “Is Keynesian Economics a Dead End?” by Tom Sargent, this year’s Nobel winner. What is great is that this essay very succinctly covers both what is wrong with large-scale Keynesian models of the economy, as well as an early response to the “counter-counter-revolutionaries” who defended themselves against Lucas, Sims, Sargent and the rest of the Young Turks. The basics of neoclassicism you surely know: expectations enter economics through a complicated feedback loop which was not captured by the identification restrictions of large macro models in the 1970s, a reasonable way to handle expectations is “rationally,” where agents optimize conditional on not having perfect information, and simple thought experiments incorporating that idea show that within the range of policy options available, such expectations can give massively different predictions from orthodox Keynesianism, particularly that high inflation and high unemployment are possible. Further, though the driver in such models is agents’ misunderstanding about whether price changes are relative or absolute in the short term, such minor impulses can lead to large effects in a more interesting way than just that “the Great Depression was a period of contagious laziness,” as critics were putting it even back in the 1970s. This all seems totally reasonable to me. The New Keynesians (see, for instance, this nice 1990 review by Bob Gordon) restored Keynes a bit by focusing on non-informational reasons why prices might be sticky.

Of course, most modern macroeconomists believe both that the expectations process used to identify macro models must be well-founded and that price or wage stickiness plays an important role. More importantly, both schools more or less accept the nonneutrality of money in the short run, the neutrality of money in the long run, and the basic tenet of optimizing agents subject to constraints, of which only the first appears in Keynes (or, at least, the Keynes of the General Theory, since something like rational expectations appears in his earlier Treatise on Money); further, that first item is relatively unimportant to the General Theory, as noted in the earlier discussion. Keynes may be important to read, but he is certainly not “back,” financial crisis or not.

Full version online at http://www.marxists.org/reference/subject/economics/keynes/general-theory/, though the nice Harcourt edition is cheap enough to buy in print. Marxists.org actually has a great online library of economics classics, from Petty and Locke through to Taylorism and Keynes. Also see Keynes’ famous followup article in a 1937 QJE, though I would argue that this article has a totally different emphasis from the book.
