Category Archives: Philosophy of Science

Angus Deaton, 2015 Nobel Winner: A Prize for Structural Analysis?

Angus Deaton, the Scottish-born, Cambridge-trained Princeton economist best known for his careful work on measuring changes in the wellbeing of the world’s poor, has won the 2015 Nobel Prize in economics. His data-collection work is fairly easy to understand, so I will leave the larger discussion of exactly what he has found to the general news media; Deaton’s book “The Great Escape” provides a very nice summary as well. A fair reading of his development preferences is that he much prefers the currently en vogue idea of just giving cash to the poor and letting them spend it as they wish.

Essentially, when one carefully measures consumption, health, or other general characteristics of wellbeing, there has been tremendous improvement indeed in the state of the world’s poor. National statistics do not measure these ideas well, because developing countries do not tend to track data at the level of the individual. Indeed, even in the United States, we have only recently begun work on localized measures of the price level and hence the poverty rate. Deaton claims, as in his 2010 AEA Presidential Address (previously discussed briefly on two occasions on AFT), that many of the measures of global inequality and poverty used by the press are fundamentally flawed, largely because of the weak theoretical justification for how they link prices across regions and countries. Careful non-aggregate measures of consumption, health, and wellbeing, like those generated by Deaton, Tony Atkinson, Alwyn Young, Thomas Piketty and Emmanuel Saez, are essential for understanding how human welfare has changed over time and space, and this work is a deserving rationale for a Nobel.

The surprising thing about Deaton, however, is that despite his great data-collection work and his interest in development, he is famously hostile to the “randomista” trend which proposes that randomized control trials (RCT) or other suitable tools for internally valid causal inference are the best way of learning how to improve the lives of the world’s poor. This mode is most closely associated with the enormously influential J-PAL lab at MIT, and there is no field in economics where you are less likely to see traditional price theoretic ideas than modern studies of development. Deaton is very clear on his opinion: “Randomized controlled trials cannot automatically trump other evidence, they do not occupy any special place in some hierarchy of evidence, nor does it make sense to refer to them as “hard” while other methods are “soft”… [T]he analysis of projects needs to be refocused towards the investigation of potentially generalizable mechanisms that explain why and in what contexts projects can be expected to work.” I would argue that Deaton’s work is much closer to more traditional economic studies of development than to RCTs.

To understand this point of view, we need to go back to Deaton’s earliest work. Among Deaton’s most famous early papers was his well-known development of the Almost Ideal Demand System (AIDS) in 1980 with Muellbauer, a paper chosen as one of the 20 best published in the first 100 years of the AER. It has long been known that individual demand equations which come from utility maximization must satisfy certain properties. For example, a rational consumer’s demand for food should not depend on whether the consumer’s equivalent real salary is paid in American or Canadian dollars. These restrictions turn out to be useful in that if you want to know how demand for various products depends on changes in income, among many other questions, the restrictions of utility theory simplify estimation greatly by reducing the number of free parameters. The problem is in specifying a form for aggregate demand, such as how demand for cars depends on the incomes of all consumers and prices of other goods. It turns out that, in general, aggregate demand generated by utility-maximizing households does not satisfy the same restrictions as individual demand; you can’t simply assume that there is a “representative consumer” with some utility function whose demand function mirrors that of the individual agents. What form should we write for aggregate demand, and how congruent is that form with economic theory? Surely an important question if we want to estimate how a shift in taxes on some commodity, or a policy of giving some agricultural input to some farmers, is going to affect demand for output, its price, and hence welfare!

Let q(j)=D(p,c,e) say that the quantity of j consumed in aggregate is a function of the prices of all goods p and total consumption (or average consumption) c, plus perhaps some random error e. This can be tough to estimate: if D(p,c,e)=Ap+e, where demand is just a linear function of relative prices, then we have a k-by-k matrix A to estimate, where k is the number of goods. Worse, that demand function also imposes an enormous restriction on what individual demand functions, and hence utility functions, look like, in a way that theory does not necessarily support. The AIDS of Deaton and Muellbauer combines two facts – that Taylor expansions approximately linearize nonlinear functions, and that individual demand can be aggregated even when heterogeneous across individuals if the restrictions of Muellbauer’s PIGLOG papers are satisfied – to derive a functional form for aggregate demand D which is consistent with aggregated individual rational behavior and which can sometimes be estimated via OLS. They use British data to argue that aggregate demand violates testable assumptions of the model, and hence that factors like credit constraints or price expectations are fundamental in explaining aggregate consumption.
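To make the estimation idea concrete, here is a toy sketch of the widely used “linear approximate” version of the AIDS budget-share equation, in which a Stone price index stands in for the nonlinear price aggregator so the share equation becomes estimable by OLS. Every parameter value is invented, and the fixed-weight Stone index is a simplification (in practice the weights are the observed budget shares):

```python
import numpy as np

rng = np.random.default_rng(1)
T, k = 500, 3                               # time periods, goods

log_p = rng.normal(size=(T, k))             # log prices of the k goods
log_x = rng.normal(loc=5.0, size=T)         # log total expenditure

# Invented "true" parameters for one good's budget-share equation:
#   w = alpha + sum_j gamma_j * log p_j + beta * log(x / P)
alpha, beta = 0.4, -0.05
gamma = np.array([0.03, -0.02, -0.01])      # homogeneity: gammas sum to 0

# Stone price index approximation (equal weights here, a toy simplification)
stone_P = log_p.mean(axis=1)
w = (alpha + log_p @ gamma + beta * (log_x - stone_P)
     + 0.01 * rng.normal(size=T))           # small measurement error

# The linearized share equation is then estimable by ordinary least squares
X = np.column_stack([np.ones(T), log_p, log_x - stone_P])
coef, *_ = np.linalg.lstsq(X, w, rcond=None)
# coef recovers (alpha, gamma_1..gamma_3, beta) up to sampling noise
```

The point of the exercise is only that, once the theory has delivered a functional form, estimation reduces to a regression whose coefficients are interpretable parameters of behavior.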

This exercise brings up a number of first-order questions for a development economist. First, it shows clearly the problem with estimating aggregate demand as a purely linear function of prices and income, as if society were a single consumer. Second, it shows the importance of how we measure the overall price level in figuring out the effects of taxes and other policies. Third, it combines theory and data to convincingly suggest that models which estimate demand solely as a function of current prices and current income are necessarily going to give misleading results, even when demand is allowed to take on very general forms as in the AIDS model. A huge body of research since 1980 has investigated how we can better model demand in order to credibly evaluate demand-affecting policy. All of this is very different from how a certain strand of development economist today might investigate something like a subsidy. Rather than taking observational data, these economists might look for a random or quasirandom experiment where such a subsidy was introduced, and estimate the “effect” of that subsidy directly on some quantity of interest, without concern for how exactly that subsidy generated the effect.

To see the difference between randomization and more structural approaches like AIDS, consider the following example from Deaton. You are asked to evaluate whether China should invest more in building railway stations if they wish to reduce poverty. Many economists trained in a manner influenced by the randomization movement would say, well, we can’t just regress the existence of a railway on a measure of city-by-city poverty. The existence of a railway station depends on both things we can control for (the population of a given city) and things we can’t control for (subjective belief that a town is “growing” when the railway is plopped there). Let’s find something that is correlated with rail station building but uncorrelated with the random component of how rail station building affects poverty: for instance, a city may lie on a geographically-accepted path between two large cities. If certain assumptions hold, it turns out that a two-stage “instrumental variable” approach can use that “quasi-experiment” to generate the LATE, or local average treatment effect. This effect is the average benefit of a railway station on poverty reduction, at the local margin of cities which are just induced by the instrument to build a railway station. Similar techniques, like difference-in-differences and randomized control trials, under slightly different assumptions can generate credible LATEs. In development work today, it is very common to see a paper where large portions are devoted to showing that the assumptions (often untestable) of a given causal inference model are likely to hold in a given setting, then finally claiming that the treatment effect of X on Y is Z. That LATEs can be identified outside of purely randomized contexts is incredibly important and valuable, and the economists and statisticians who did the heavy statistical lifting on this so-called Rubin model will absolutely and justly win an Economics Nobel sometime soon.
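To fix ideas, here is a toy simulation of the Wald/IV logic in the railway example. Every number is invented, and a coin flip stands in for the “lies on a natural route” instrument; the point is only that the instrument-induced variation recovers the causal effect while naive OLS does not:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Unobserved confounder: the belief that a town is "growing" both raises the
# chance of getting a station and (in this toy) lowers measured poverty
growth_belief = rng.normal(size=n)
# Instrument: town happens to lie on a natural route between two big cities
on_route = rng.binomial(1, 0.5, size=n).astype(float)
station = (0.8 * on_route + 0.5 * growth_belief
           + rng.normal(size=n) > 0.5).astype(float)
# True causal effect of a station on the poverty measure: -1.0
poverty = -1.0 * station - 0.7 * growth_belief + rng.normal(size=n)

# Naive OLS is contaminated by the confounder...
ols = np.cov(station, poverty)[0, 1] / np.var(station)
# ...while the Wald/2SLS ratio uses only instrument-induced variation
iv = np.cov(on_route, poverty)[0, 1] / np.cov(on_route, station)[0, 1]
# iv lands near -1.0; ols overstates the benefit of a station
```

Note what the simulation does and does not give you: a credible estimate of the effect at the instrument-induced margin, and no account at all of the mechanism behind it.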

However, this use of instrumental variables would surely seem strange to the old Cowles Commission folks: Deaton is correct that “econometric analysis has changed its focus over the years, away from the analysis of models derived from theory towards much looser specifications that are statistical representations of program evaluation. With this shift, instrumental variables have moved from being solutions to a well-defined problem of inference to being devices that induce quasi-randomization.” The traditional use of instrumental variables was that after writing down a theoretically justified model of behavior or aggregates, certain parameters – not treatment effects, but parameters of a model – are not identified. For instance, price and quantity transacted are determined by the intersection of aggregate supply and aggregate demand. Knowing, say, that price and quantity were (a,b) today, and are (c,d) tomorrow, does not let me figure out the shape of either the supply or demand curve. If price and quantity both rise, it may be that demand alone has increased, pushing the demand curve to the right, or that demand has increased while the supply curve has also shifted to the right a small amount, or many other outcomes. An instrument that increases supply without changing demand, or vice versa, can be used to “identify” the supply and demand curves: an exogenous change in the price of oil will affect the price of gasoline without much of an effect on the demand curve, and hence we can examine price and quantity transacted before and after the oil supply shock to find the slope of supply and demand.
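The classic Cowles-style use of an instrument can likewise be put in a few lines of simulation: an oil-price shock shifts supply only, which traces out the slope of the demand curve even though OLS on equilibrium price and quantity is biased by simultaneity. All parameter values below are invented:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
b_demand, d_supply = 1.5, 1.0            # invented structural slopes

oil = rng.normal(size=n)                 # oil-price shock: shifts supply only
u_d = rng.normal(size=n)                 # unobserved demand shifter
u_s = rng.normal(size=n)                 # unobserved supply shifter

# Demand: q = 10 - b*p + u_d   Supply: q = 2 + d*p - 2*oil + u_s
# Solving the two equations for the market-clearing price and quantity:
p = (10 - 2 + 2 * oil + u_d - u_s) / (b_demand + d_supply)
q = 10 - b_demand * p + u_d

# OLS on equilibrium (p, q) is biased by simultaneity...
ols = np.cov(p, q)[0, 1] / np.var(p)
# ...but the supply-only instrument identifies the demand slope, iv near -1.5
iv = np.cov(oil, q)[0, 1] / np.cov(oil, p)[0, 1]
```

Here the instrument answers a well-posed identification problem: the recovered slope is a structural parameter, which can then be used for counterfactual and welfare analysis, not merely a treatment effect.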

Note the difference between the supply and demand equation and the treatment effects use of instrumental variables. In the former case, we have a well-specified system of supply and demand, based on economic theory. Once the supply and demand curves are estimated, we can then perform all sorts of counterfactual and welfare analysis. In the latter case, we generate a treatment effect (really, a LATE), but we do not really know why we got the treatment effect we got. Are rail stations useful because they reduce price variance across cities, because they allow increasing returns to scale in industry to be utilized, or for some other reason? Once we know the “why”, we can ask questions like, is there a cheaper way to generate the same benefit? Is heterogeneity in the benefit important? Ought I expect the results from my quasiexperiment in place A and time B to still operate in place C and time D (a famous example being the drug Opren, which was very successful in RCTs but turned out to be particularly deadly when used widely by the elderly)? Worse, the whole idea of LATE is backwards. We traditionally choose a parameter of interest, which may or may not be a treatment effect, and then choose an estimation technique that can credibly estimate that parameter. Quasirandom techniques instead start by specifying the estimation technique and then hunt for a quasirandom setting, or randomize appropriately by “dosing” some subjects and not others, in order to fit the assumptions necessary to generate a LATE. It is often the case that even policymakers do not care principally about the LATE; rather, they care about some measure of welfare impact, which is rarely immediately interpretable even if the LATE is credibly known!

Given these problems, why are random and quasirandom techniques so heavily endorsed by the dominant branch of development? Again, let’s turn to Deaton: “There has also been frustration with the World Bank’s apparent failure to learn from its own projects, and its inability to provide a convincing argument that its past activities have enhanced economic growth and poverty reduction. Past development practice is seen as a succession of fads, with one supposed magic bullet replacing another—from planning to infrastructure to human capital to structural adjustment to health and social capital to the environment and back to infrastructure—a process that seems not to be guided by progressive learning.” This is to say, the conditions necessary to estimate theoretical models are so stringent that development economists have been writing noncredible models, estimating them, generating some fad of programs that is used in development for a few years until it turns out not to be a silver bullet, then abandoning the fad for some new technique. Better, the randomistas argue, to forget about external validity for now, and instead just evaluate the LATEs on a program-by-program basis, iterating what types of programs we evaluate until we have a suitable list of interventions that we feel confident work. That is, development should operate like medicine.

We have something of an impasse here. Everyone agrees that on many questions theory is ambiguous in the absence of particular types of data, hence more and better data collection is important. Everyone agrees that many parameters of interest for policymaking require certain assumptions, some more justifiable than others. Deaton’s position is that the parameters of interest to economists by and large are not LATEs, and cannot be generated in a straightforward way from LATEs. Thus, following Nancy Cartwright’s delightful phrasing, if we are to “use” causes rather than just “hunt” for what they are, we have no choice but to specify the minimal economic model which is able to generate the parameters we care about from the data. Glen Weyl’s program to rehabilitate price theory and Raj Chetty’s sufficient-statistics approach are both attempts to combine the credibility of random and quasirandom inference with the benefits of external validity and counterfactual analysis that model-based structural designs permit.

One way to read Deaton’s prize, then, is as an award for the idea that effective development requires theory if we even hope to compare welfare across space and time or to understand why policies like infrastructure improvements matter for welfare, and hence whether their beneficial effects will remain when moved to a new context. It is a prize which argues against the idea that all theory does is propose hypotheses. For Deaton, going all the way back to his work with AIDS, theory serves three roles: proposing hypotheses, suggesting which data is worthwhile to collect, and permitting inference on the basis of that data. A secondary implication, very clear in Deaton’s writing, is that even though the “great escape” from poverty and want is real and continuing, that escape is almost entirely driven by effects which are unrelated to aid and which are uninfluenced by the type of small-bore, partial-equilibrium policies for which randomization is generally suitable. And, indeed, the best development economists very much understand this point. The problem is that the media, and less technically capable young economists, still hold the mistaken belief that they can infer everything they want to infer about “what works” solely using the “scientific” methods of random- and quasirandomization. For Deaton, results that are easy to understand and communicate, like the “dollar-a-day” poverty standard or an average treatment effect, are less virtuous than results which carefully situate numbers in the role most amenable to answering an exact policy question.

Let me leave you three side notes and some links to Deaton’s work. First, I can’t help but laugh at Deaton’s description of his early career in one of his famous “Notes from America”. Deaton, despite being a student of the 1984 Nobel laureate Richard Stone, graduated from Cambridge essentially unaware of how one ought publish in the big “American” journals like Econometrica and the AER. Cambridge had gone from being the absolute center of economic thought to something of a disconnected backwater, and Deaton, despite writing a paper that would win a prize as one of the best papers in Econometrica published in the late 1970s, had essentially no understanding of the norms of publishing in such a journal! When the history of modern economics is written, the rise of a handful of European programs and their role in reintegrating economics on both sides of the Atlantic will be fundamental. Second, Deaton’s prize should be seen as something of a callback to the ’84 prize to Stone and the ’77 prize to Meade, two of the least known Nobel laureates. I don’t think it is an exaggeration to say that the majority of new PhDs from even the very best programs will have no idea who those two men are, or what they did. But as Deaton mentions, Stone in particular was one of the early “structural modelers” in that he was interested in estimating the so-called “deep” or behavioral parameters of economic models in a way that is absolutely universal today, as well as being a pioneer in the creation and collection of novel economic statistics whose value was proposed on the basis of economic theory. Quite a modern research program! Third, of the 19 papers in the AER “Top 20 of all time” whose authors were alive during the era of the economics Nobel, 14 have had at least one author win the prize. Should this be a cause for hope for the living outliers, Anne Krueger, Harold Demsetz, Stephen Ross, John Harris, Michael Todaro and Dale Jorgenson?

For those interested in Deaton’s work beyond this short essay, his methodological essay, quoted often in this post, is here. The Nobel Prize technical summary, always a great and well-written read, can be found here.


“Minimal Model Explanations,” R.W. Batterman & C.C. Rice (2014)

I unfortunately was overseas and wasn’t able to attend the recent Stanford conference on Causality in the Social Sciences; a friend organized the event and was able to put together a really incredible set of speakers: Nancy Cartwright, Chuck Manski, Joshua Angrist, Garth Saloner and many others. Coincidentally, a recent issue of the journal Philosophy of Science had an interesting article quite relevant to economists interested in methodology: how is it that we learn anything about the world when we use a model that is based on false assumptions?

You might think of there being five classes which make up nearly every paper published in the best economics journals. First are pure theoretical exercises, or “tool building”, such as investigations of the properties of equilibria or the development of a new econometric technique. Second are abstract models which are meant to speak to an applied problem. Third are empirical papers whose primary quantities of interest are the parameters of an economic model (broadly, “structural papers”, although this isn’t quite the historic use of the term). Fourth are empirical papers whose primary quantities of interest are causal treatment effects (broadly, “reduced form papers”, although again this is not the historic meaning of that term). Fifth are works of description or historical summary. Lab and field experiments, and old-fashioned correlation analysis, all fit into that framework fairly naturally as well. It is the second and third classes which seem very strange to many non-economists. We write a model which is deliberately abstract and which is based on counterfactual assumptions about human or firm behavior, but nonetheless we feel that these types of models are “useful” or “explanatory” in some sense. Why?

Let’s say that in the actual world, conditions A imply outcome B via implication C (perhaps causal, perhaps as part of a simultaneous equilibrium, or whatever). The old Friedman 1953 idea is that a good model predicts B well across all questions with which we are concerned, and the unreality of the assumptions (or implicitly of the logical process C) are unimportant. Earlier literature in the philosophy of science has suggested that “minimal models” explain because A’, a subset of A, are sufficient to drive B via C; that is, the abstraction merely strips away any assumptions that are not what the philosopher Weisberg calls “explanatorily privileged causal factors.” Pincock, another philosopher, suggests that models track causes, yes, but also isolate factors and connect phenomena via mathematical similarity. That is, the model focuses on causes A’, subset of A, and on implications C’, subset of C, which are of special interest because they help us see how the particular situation we are analyzing is similar to ones we have analyzed before.

Batterman and Rice argue that these reasons are not why minimal models “work”. For instance, if we are to say that a model explains because it abstracts only to the relevant causal factors, the question is how we know what those factors are in advance of examining them. Consider Fisher’s sex ratio model: why do we so frequently see 1:1 sex ratios in nature? Fisher argues that there is a fitness advantage for those whose offspring tend toward the less common sex, since they find it easier to procreate. In the model, parents choose the sex of offspring, reproduction is asexual (does not involve matching), no genetic recombination occurs, there are no random changes to genes, etc: many of the assumptions are completely contrary to reality. Why, then, do we think the model explains? It explains because there is a story about why the omitted factors are irrelevant to the behavior being explained. That is, in the model assumptions D generate E via causal explanation C, and there is a story about why D->E via C and A->B via C operate in similar ways. Instead of simply assuming that certain factors are “explanatorily privileged”, we show that the model’s factors affect outcomes in ways similar to how the more complicated real-world factors operate.

Interesting, but I feel that this still isn’t what’s going on in economics. Itzhak Gilboa, the theorist, in a review of Mary Morgan’s delightful book The World in the Model, writes that “being an economic theorist, I have been conditioned to prefer elegance over accuracy, insight over detail.” I take that to mean that what economic theorists care about are explanatory factors or implications C’, subset of C. That is, the deduction is the theory. Think of Arrow’s possibility theorem. There is nothing “testable” about it; certainly the theory does not make any claim about real world outcomes. It merely shows the impossibility of preference aggregation satisfying certain axioms, full stop. How is this “useful”? Well, the usefulness of this type of abstract model depends entirely on the user. Some readers may find such insight trivial, or uninteresting, or whatever, whereas others may find such an exploration of theoretical space helps clarify their thinking about some real world phenomenon. The whole question of “Why do minimal models explain/work/predict?” is less interesting to me than the question “Why do minimal models prove useful for a given reader?”

The closest philosophical position to this idea is some form of Peirce-style pragmatism – he actually uses a minimal model himself in exactly this way in his Note on the Economy of the Theory of Research! I also find it useful to think about the usefulness of abstract models via Economic Models as Analogies, an idea pushed by Gilboa and three other well-known theorists. Essentially, a model is a case fully examined. Examining a number of cases in the theoretical world, and thinking formally through those cases, can prove useful when critiquing new policy ideas or historical explanations about the world. The theory is not a rule – and how could it be, given the abstractness of the model – but an element in your mental toolkit. In physics, for example, if your engineer proposes spending money building a machine that implies perpetual motion, you have models of the physical world in your toolkit which, while not being about exactly that machine, are useful when analyzing how such a machine would or would not work. Likewise, if Russia wants to think about how it should respond to a “sudden stop” in investment and a currency outflow, the logical consequences of any real world policy are so complex that it is useful to have thought through the equilibrium implications of policies within the context of toy models, even if such models are only qualitatively useful or only useful in certain cases. When students complain, “but the assumptions are so unrealistic” or “but the model can’t predict anything”, you ought respond that the model can predict perfectly within the context of the model, and it is your job as the student, as the reader, to consider how understanding the mechanisms in the model helps you think more clearly about related problems in the real world.

Final version in Philosophy of Science, which is gated, I’m afraid; I couldn’t find an ungated draft. Of related interest in the philosophy journals recently is Kevin Davey’s Can Good Science Be Logically Inconsistent? in Synthese. Note that economists use logically inconsistent reasoning all the time, in that we use a model with assumption A in context B, and a model with assumption Not A in context C. If “accepting a model” means thinking of the model as “justified belief”, then Davey provides very good reasons to think that science cannot be logically inconsistent. If, however, “accepting a model” means “finding it useful as a case” or “finding the deduction in the model of inherent interest”, then of course logically inconsistent models can still prove useful. So here’s to inconsistent economics!

Some Results Related to Arrow’s Theorem

Arrow’s (Im)possibility Theorem is, and I think this is universally acknowledged, one of the great social science theorems of all time. I particularly love it because of its value when arguing with Popperians and other anti-theory types: the theorem is “untestable” in that it quite literally does not make any predictions, yet surely all would consider it a valuable scientific insight.

In this post, I want to talk about a couple of new papers using Arrow’s result in unusual ways. First, a philosopher has shown exactly how Arrow’s result is related to the general philosophical problem of choosing which scientific theory to accept. Second, a pair of computer scientists have used AI techniques to generate an interesting new method for proving Arrow.

The philosophic problem is the following. A good theory should satisfy a number of criteria; for Kuhn, these included accuracy, consistency, breadth, simplicity and fruitfulness. Imagine now that there is a group of theories (about, e.g., how galaxies form, or why birds have wings) and that we rank them ordinally on each of these criteria, with all of us agreeing on the rankings. Which theory ought we accept? Arrow applied to theory choice gives us the worrying result that not only is there no unique method of choosing among theories, but there may not exist any such method at all, at least if we want to satisfy unanimity, non-dictatorship and independence of irrelevant alternatives. That is, even if you and I agree about how each theory ranks according to different desirability criteria, we still don’t have a good, general method of aggregating among criteria.

So what to do? Davide Rizza, in a new paper in Synthese (gated, I’m afraid), discusses a number of solutions. Of course, if we have more than just ordinal information about each criterion, then we can construct aggregated orders. For instance, if we assigned a number to the relative rankings on each criterion, we could just add these up for each theory and hence have an order. Note that this theory choice rule can be applied even if we just have ordinal data: if there are N theories, then on criterion C, give the best theory in that criterion N points, the second best N-1, and so on, then add up the scores. This is the famous Borda Count.
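In code, the Borda count is a short aggregation; here is a minimal sketch (the theory names are invented) that turns ordinal rankings on each criterion into an aggregate order:

```python
def borda_order(rankings):
    """Aggregate ordinal rankings (best first) into one order by Borda count."""
    n = len(rankings[0])
    scores = {theory: 0 for theory in rankings[0]}
    for ranking in rankings:
        for position, theory in enumerate(ranking):
            scores[theory] += n - position   # best gets N points, next N-1, ...
    return sorted(scores, key=scores.get, reverse=True)

# Three criteria (say accuracy, simplicity, fruitfulness) rank three theories:
criteria = [['T1', 'T2', 'T3'],
            ['T2', 'T1', 'T3'],
            ['T2', 'T3', 'T1']]
# borda_order(criteria) -> ['T2', 'T1', 'T3']
```

T2 wins here despite not being best on the first criterion, because it is never ranked last: exactly the kind of compromise Borda was designed to find.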

Why can’t we choose theories by the Borda Count or similar, then? Well, Borda (and any other rule that could construct an aggregate order while satisfying unanimity and non-dictatorship) must violate the IIA assumption in Arrow. Unanimity, which insists a rule accept a theory if it is considered best along every criterion, and non-dictatorship, under which more than one criterion can at least matter in principle, seem totally unobjectionable. So maybe we ought just toss IIA from our theory choice rule, as perhaps Donald Saari would wish us to do. And IIA is a bit strange indeed. If I rank A>B>C, and if you require me to have transitive preferences, then just knowing the binary rankings A>B and B>C is enough to tell you that I prefer A>C, even if that particular binary comparison was never elicited. In this case, adding B isn’t “irrelevant”; there is information in the binary pairs generated by transitivity which IIA does not allow me to take advantage of. Some people call the IIA assumption “binary independence” since it aggregates using only binary relations, an odd thing given that the individual orders contain, by virtue of being orders, more than just binary relations. It turns out that there are aggregation rules which generate an order if we loosen IIA to an alternative restriction on how to use information in sequences. IIA, rather than ordinal rankings across criteria, is where Arrow poses a problem for theory choice. Now, Rizza points out that these aggregation rules needn’t be unique, so we still can have situations where we all agree about how different theories rank according to each criterion, and agree on the axiomatic properties we want in an aggregation rule, yet nonetheless disagree about which theory to accept. Still worrying, though not for Kuhn, and certainly not for us crazier Feyerabend and Latour fans!
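It is easy to exhibit Borda’s violation of binary independence directly. In the two invented profiles below, each criterion ranks A versus B identically across profiles; only the position of the “irrelevant” alternative C changes, yet the aggregate A-versus-B verdict flips:

```python
def borda_scores(rankings):
    """Borda scores from ordinal rankings (best first)."""
    n = len(rankings[0])
    scores = {x: 0 for x in rankings[0]}
    for ranking in rankings:
        for position, x in enumerate(ranking):
            scores[x] += n - position
    return scores

# In both profiles, criterion 1 has A>B and criterion 2 has B>A;
# only C's position differs between the two profiles.
profile1 = [['A', 'B', 'C'], ['B', 'C', 'A']]
profile2 = [['A', 'C', 'B'], ['B', 'A', 'C']]
s1, s2 = borda_scores(profile1), borda_scores(profile2)
# B beats A in profile1, but A beats B in profile2:
flips = (s1['A'] < s1['B']) and (s2['A'] > s2['B'])
```

This is precisely the sense in which Borda uses the extra-binary information in the orders: where C sits changes how many points A and B collect.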

(A quick aside: How strange it is that Arrow’s Theorem is so heavily associated with voting! That every voting rule is subject to tactical behavior is Gibbard-Satterthwaite, not Arrow, and this result about strategic voting imposes nothing like an IIA assumption. Arrow’s result is about the far more general problem of aggregating orders, a problem which fundamentally has nothing to do with individual behavior. Indeed, I seem to recall that Arrow came up with his theorem while working one summer as a grad student at RAND on the problem of what, if anything, it could mean for a country to have preferences when voting on behalf of its citizens in bodies like the UN. The story also goes that when he showed his advisor – perhaps Hotelling? – what he had been working on over the summer, he was basically told the result was so good that he might as well just graduate right away!)

The second paper today comes from two computer scientists. There are lots of proofs of Arrow’s theorem – the original proof in Arrow’s 1951 book is actually incorrect! – but the CS guys use a technique I hadn’t seen before. Essentially, they first prove with a simple induction that a social welfare function satisfying the Arrow axioms exists for some N>=2 voters and M>=3 options if and only if one exists in the base case of 2 voters and 3 options. This doesn’t actually narrow the problem a great deal: there are still 3!=6 ways to order 3 options, hence 6^2=36 permutations of the joint vote of the 2 voters, hence 6^36 functions mapping the voter orders to a social order. Nonetheless, the problem is small enough to be tackled by a constraint satisfaction algorithm which checks IIA and unanimity and finds only two social welfare functions not violating one of those constraints – precisely the cases where Agent 1 or Agent 2 is a dictator. Their algorithm took one second to run on a standard computer (clearly they are better algorithm writers than the average economist!). Sen’s theorem and Muller-Satterthwaite can also be proven using a similar restriction to the base case followed by algorithmic search.
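The base case is in fact small enough to check by brute force in a few lines. The sketch below is not the authors’ constraint-satisfaction algorithm, just a naive enumeration exploiting the same structure: IIA factorizes a social welfare function into three pairwise rules, unanimity pins down the inputs on which the voters agree, and we then check which of the 64 remaining candidates always produce a transitive social order:

```python
from itertools import permutations, product

alts = ('a', 'b', 'c')
pairs = [('a', 'b'), ('a', 'c'), ('b', 'c')]
orders = list(permutations(alts))            # the 6 strict orders
profiles = list(product(orders, orders))     # the 36 two-voter profiles

def prefers(order, x, y):
    return order.index(x) < order.index(y)

# Under IIA the social ranking of a pair {x,y} depends only on the voters'
# rankings of that pair; unanimity fixes the agreeing inputs, leaving two
# free "split" inputs per pair, i.e. 2^6 = 64 candidate rules.
count = 0
for bits in product((True, False), repeat=6):
    rule = {}
    for i, (x, y) in enumerate(pairs):
        rule[(x, y)] = {(True, True): True, (False, False): False,
                        (True, False): bits[2 * i],
                        (False, True): bits[2 * i + 1]}
    # a candidate survives only if every profile induces a transitive order
    survives = True
    for o1, o2 in profiles:
        ab, ac, bc = (rule[(x, y)][(prefers(o1, x, y), prefers(o2, x, y))]
                      for (x, y) in pairs)
        # the only two cyclic patterns on three alternatives:
        if (ab and bc and not ac) or (ac and not ab and not bc):
            survives = False
            break
    count += survives

# Arrow's theorem in miniature: only the two dictatorships survive
```

Running this yields count == 2, the two dictatorial rules, which is exactly Arrow’s conclusion for the base case.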

Of course, algorithmic proofs tend to lack the insight and elegance of standard proofs. But they have benefits as well. Just as you can show that only 2 social welfare functions with N=2 voters and M=3 options satisfy IIA and unanimity, you can also show that only 94 (out of 6^36!) satisfy IIA. That is, it is IIA rather than other assumptions which is doing most of the work in Arrow. Inspecting those 94 remaining social welfare functions by hand can help elucidate alternative sets of axioms which also generate aggregation possibility or impossibility.

(And a third paper, just for fun: it turns out that Kiribati and Nauru actually use Borda counts in their elections, and that there does appear to be strategic candidate nomination behavior designed to take advantage of the non-IIA nature of Borda! IIA looks in many ways like a restriction on tactical behavior by candidates or those nominating issues, rather than a restriction on tactical behavior by voters. If you happen to teach Borda counts, this is a great case to give students.)
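The nomination channel is easy to see in a toy calculation (my own illustration, with made-up ballots): without any voter changing their ranking of a versus b, nominating extra candidates flips the Borda winner.

```python
def borda(ballots, candidates):
    """Borda count: with m candidates, a ballot gives m-1 points to its
    top choice, m-2 to the next, and so on down to 0 for its last choice."""
    scores = {c: 0 for c in candidates}
    for ballot in ballots:
        for position, c in enumerate(ballot):
            scores[c] += len(candidates) - 1 - position
    return scores

# Head to head, a beats b two votes to one.
print(borda([("a", "b"), ("a", "b"), ("b", "a")], "ab"))
# {'a': 2, 'b': 1}

# Nominate two filler candidates that b's supporter slots just below b and
# a's supporters rank last: no voter's a-versus-b ranking changed, but b wins.
print(borda([("a", "b", "c", "d"), ("a", "b", "c", "d"),
             ("b", "c", "d", "a")], "abcd"))
# {'a': 6, 'b': 7, 'c': 4, 'd': 1}
```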

Laboratory Life, B. Latour & S. Woolgar (1979)

Let’s do one more post on the economics of science; if you haven’t heard of Latour and the book that made him famous, all I can say is that it is 30% completely crazy (the author is a French philosopher, after all!), 70% incredibly insightful, and overall a must read for anyone trying to understand how science proceeds or how scientists are motivated.

Latour is best known for two ideas: that facts are socially constructed (and hence science really isn’t that different from other human pursuits) and that objects/ideas/networks have agency. He rose to prominence with Laboratory Life, based on two years spent observing the lab of future Nobel winner Roger Guillemin at the Salk Institute in La Jolla.

What he notes is that science is really strange if you observe it proceeding without any priors. Basically, a big group of people use a bunch of animals and chemicals and technical devices to produce beakers of fluids and points on curves and colored tabs. Somehow, after a great amount of informal discussion, all of these outputs are synthesized into a written article a few pages long. Perhaps, many years later, modalities about what had been written will be dropped; “X is a valid test for Y” rather than “W and Z (1967) claim that X is a valid test for Y” or even “It has been conjectured that X may be a valid test for Y”. Often, the printed literature will later change its mind; “X was once considered a valid test for Y, but that result is no longer considered convincing.”

Surely no one denies that the last paragraph accurately describes how science proceeds. But recall the schoolboy description, in which there are facts in the world, and then scientists do some work and run some tests, after which a fact has been “discovered”. Whoa! Look at all that is left out! How did we decide what to test, or what particulars constitute distinct things? How did we synthesize all of the experimental data into a few pages of formal writeup? Through what process did statements begin to be taken for granted, losing their modalities? If scientists actually discover facts, then how can a “fact” be overturned in the future? Latour argues, and gives tons of anecdotal evidence from his time at Salk, that providing answers to those questions basically constitutes the majority of what scientists actually do. That is, it is not that the fact is out there in nature waiting to be discovered, but that the fact is constructed by scientists over time.

That statement can be misconstrued, of course. That something is constructed does not mean that it isn’t real; the English language is both real and it is uncontroversial to point out that it is socially constructed. Latour and Woolgar: “To say that [a particular hormone] is constructed is not to deny its solidity as a fact. Rather, it is to emphasize how, where and why it was created.” Or later, “We do not wish to say that facts do not exist nor that there is no such thing as reality. In this simple sense we are not relativist. Our point is that ‘out-there-ness’ is the consequence of scientific work rather than its cause.” Putting their idea another way, the exact same object or evidence can at one point be considered up for debate or perhaps just a statistical artefact, yet later is considered a “settled fact” and yet later still will occasionally revert again. That is, the “realness” of the scientific evidence is not a property of the evidence itself, which does not change, but a property of the social process by which science reifies that evidence into an object of significance.

Latour and Woolgar also have an interesting discussion of why scientists care about credit. The story of credit as a reward, or credit-giving as some sort of gift exchange is hard to square with certain facts about why people do or do not cite. Rather, credit can be seen as a sort of capital. If you are credited with a certain breakthrough, you can use that capital to get a better position, more equipment and lab space, etc. Without further breakthroughs for which you are credited, you will eventually run out of such capital. This is an interesting way to think about why and when scientists care about who is credited with particular work.

Amazon link. This is a book without a nice summary article, I’m afraid, so you’ll have to stop by your library.

“Finite Additivity, Another Lottery Paradox, and Conditionalisation,” C. Howson (2014)

If you know the probability theorist Bruno de Finetti, you know him either for his work on exchangeable processes, or for his legendary defense of finite additivity. Finite additivity essentially replaces the Kolmogorov assumption of countable additivity of probabilities. Under either axiom, if events i=1 to N are pairwise disjoint with probabilities Pr(i), then the probability of their union is just the sum of the individual probabilities; countable additivity requires that property to hold for countably infinite collections of disjoint events as well.

What is objectionable about countable additivity? There are three classic problems. First, countable additivity restricts me from some very reasonable subjective beliefs. For instance, I might imagine that a Devil is going to pick one of the integers, and that he is equally likely to pick any given number. That is, my prior is uniform over the integers. Countable additivity does not allow this: if the probability of any given number being picked is greater than zero, then the sum diverges, and if the probability that any given number is picked is zero, then by countable additivity the probability of the grand set is also zero, violating the usual axiom that the grand set has probability 1. The second problem, loosely related to the first, is that I literally cannot assign probabilities to some objects, such as a nonmeasurable set.

The third problem, though, is the really worrying one. To the extent that a theory of probability has epistemological meaning and is not simply a mathematical abstraction, we might want to require that it not contradict well-known philosophical premises. Imagine that every day, nature selects either 0 or 1. Let us observe 1 every day until the present (call this day N). Let H be the hypothesis that nature will select 1 every day from now until infinity. It is straightforward to show that countable additivity requires that as N grows large, continued observation of 1 implies that Pr(H)->1. But this is just saying that induction works! And if there is any great philosophical advance in the modern era, it is Hume’s (and Goodman’s, among others) demolition of the idea that induction is sensible. My own introduction to finite additivity comes from a friend’s work on consensus formation and belief updating in economics: we certainly don’t want to bake in ridiculous conclusions about beliefs that rely entirely on countable additivity, given how strongly that assumption militates for induction. Aumann was always very careful on this point.
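To see how countable additivity bakes induction in, consider a toy prior of my own (a simplification, not Howson’s or anyone else’s setup): hypothesis H (“nature picks 1 forever”) gets prior mass q, and with mass 1-q nature flips i.i.d. fair coins. After N straight 1s, Bayes drives the posterior on H toward 1.

```python
def posterior_H(q, N):
    """Posterior on H ("all 1s forever") after observing N straight 1s,
    when the only alternative is i.i.d. fair coin flips with prior 1-q."""
    return q / (q + (1 - q) * 0.5 ** N)

for N in (1, 10, 50):
    print(N, posterior_H(0.01, N))
# Even a 1% prior on H is driven toward certainty: past 1s make us
# near-certain of all future 1s, which is exactly Hume's target.
```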

It turns out that if you simply replace countable additivity with finite additivity, all of these problems (among others) go away. Howson, in a paper in the newest issue of Synthese, asks why, given that clear benefit, anyone still finds countable additivity justifiable. Surely there are lots of pretty theorems, from Radon-Nikodym on down, that require countable additivity, but if a theorem critically hinges on an unjustifiable assumption, then what exactly are we to infer about the justifiability of the theorem itself?

Two serious objections are tougher to deal with for de Finetti acolytes: coherence and conditionalization. Coherence, a principle closely associated with de Finetti himself, says that there should not be “fair bets” given your beliefs where you are guaranteed to lose money. It is sometimes claimed that a uniform prior over the naturals is not coherent: you are willing to take a bet that any given natural number will not be drawn, but the conjunction of such bets for all natural numbers means you will lose money with certainty. This isn’t too worrying, though; if we reject countable additivity, then why should we define coherence to apply to non-finite conjunctions of bets?

Conditionalization is more problematic. It means that given prior P(i), your posterior P(f) of event S after observing event E must be such that P(f)(S)=P(i)(S|E). This is just “Bayesian updating” off of a prior. Lester Dubins pointed out the following. Let A and B be two mutually exclusive hypotheses, such that P(A)=P(B)=.5. Let the random quantity X take positive integer values such that P(X=n|A)=0 for every n (you have a uniform prior over the naturals conditional on A obtaining, which finite additivity allows), and P(X=n|B)=2^(-n). By the law of total probability, for all n, P(X=n)>0, and therefore by Bayes’ Theorem, P(B|X=n)=1 and P(A|X=n)=0, no matter which n obtains! Something is odd here. Before seeing the resolution of n, you would take a fair bet on A obtaining. But once n obtains (no matter which n!), you are guaranteed to lose money by betting on A.
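A truncated version of the example makes the posterior behavior concrete (an approximation of mine: call the hypothesis carrying the uniform prior over the naturals U and the geometric one G, and replace the finitely additive uniform with an ordinary uniform on {1,...,N}). For any fixed n, the posterior on U collapses to 0 as N grows, even though the prior stays at one half.

```python
def posterior_U(n, N):
    """P(U | X=n) when P(U)=P(G)=1/2, X|U is uniform on {1,...,N}
    (a countably additive stand-in for the finitely additive uniform
    prior on the naturals), and P(X=n|G)=2^-n."""
    like_U = 1 / N          # P(X=n | U)
    like_G = 2.0 ** (-n)    # P(X=n | G)
    return like_U / (like_U + like_G)

for N in (10, 1000, 10**6):
    print(posterior_U(5, N))
# As N grows, the posterior on the uniform hypothesis goes to 0 for every
# fixed n, matching the finitely additive conclusion that observing any n
# rules it out with certainty.
```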

Here is where Howson tries to save de Finetti with an unexpected tack. The problem in Dubins’ example is not finite additivity, but conditionalization – Bayesian updating from priors – itself! Here’s why. By a principle called “reflection”, if using a suitable updating rule, your future probability of event A is p with certainty, then your current probability of event A must also be p. By Dubins’ argument, then, P(A)=0 must hold before X realizes. But that means your prior must be 0, which means that whatever independent reasons you had for the prior being .5 must be rejected. If we are to give up one of Reflection, Finite Additivity, Conditionalization, Bayes’ Theorem or the Existence of Priors, Howson says we ought give up conditionalization. Now, there are lots of good reasons why conditionalization is sensible within a utility framework, so at this point, I will simply point you toward the full paper and let you decide for yourself whether Howson’s conclusion is sensible. In any case, the problems with countable additivity should be better known by economists.

Final version in Synthese, March 2014 [gated]. Incidentally, de Finetti was very tightly linked to the early econometricians. His philosophy – that probability is a form of logic and hence non-ampliative (“That which is logical is exact, but tells us nothing”) – simply oozes out of Savage/Aumann/Selten methods of dealing with reasoning under uncertainty. Read, for example, what Keynes had to say about what a probability is, and you will see just how radical de Finetti really was.

“Between Proof and Truth,” J. Boyer & G. Sandu (2012)

In the previous post, I promised that game theory has applications in pure philosophy. Some of these applications are economic in nature – a colleague has written a paper using a branch of mechanism design to link the seemingly disjoint methodologies of induction and falsification – but others really are pure. In particular, there is a branch of zero-sum games which deals with the fundamental nature of truth itself!

Verificationist theories of truth like those proposed by Michael Dummett suggest that a statement is true if we can prove it to be true. This may seem trivial, but it absolutely is not: for one, to the extent that there are unprovable statements as in some mathematical systems, we ought say “that statement is neither true nor false” not simply “we cannot prove that statement true or false”. Another school of thought, beginning with Tarski’s formal logics and expanded by Hintikka, says that truth is a property held by sentences. The statement “Snow is white” is true if and only if there is a thing called snow, and it is in fact white. The statement “A or B” is true if and only if the thing denoted by “A” is true or the thing denoted by “B” is true.

These may seem very different conceptions. The first is a property requiring action – something is true if someone can verify it. The second seems more in the air – something is true if its sentence has certain properties. But take any sentence in formal logic, like “There exists A such that for all B either C or D is true”. We can play a game between Verifier and Falsifier. Move left to right across the sentence, letting the Verifier choose at all existentials and “or” statements, and the Falsifier choose at all universals and “and” statements. Verifier wins if he can get to the end of the sentence with the sentence remaining true. That is, semantic games take Tarski truth and make it playable by someone with agency, at least in principle.
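Such a game is simple to implement over a finite domain. The sketch below is my own (the tuple-based formula syntax is a hypothetical mini-language, and it only handles finite domains, where backward induction is trivial): Verifier’s `any` chooses at existentials and disjunctions, Falsifier’s `all` chooses at universals and conjunctions, and Verifier has a winning strategy exactly on the Tarski-true sentences.

```python
def verifier_wins(formula, domain, env=None):
    """Game-theoretic semantics over a finite domain.

    Verifier moves at ("exists", var, body) and ("or", f, g, ...);
    Falsifier moves at ("forall", var, body) and ("and", f, g, ...);
    ("atom", pred) ends the game, Verifier winning iff pred(env) holds.
    """
    env = env or {}
    op = formula[0]
    if op == "atom":
        return formula[1](env)
    if op == "or":                      # Verifier picks a disjunct
        return any(verifier_wins(f, domain, env) for f in formula[1:])
    if op == "and":                     # Falsifier picks a conjunct
        return all(verifier_wins(f, domain, env) for f in formula[1:])
    if op == "exists":                  # Verifier picks a witness
        _, var, body = formula
        return any(verifier_wins(body, domain, {**env, var: d}) for d in domain)
    if op == "forall":                  # Falsifier picks a challenge
        _, var, body = formula
        return all(verifier_wins(body, domain, {**env, var: d}) for d in domain)
    raise ValueError(f"unknown connective {op!r}")

# "There exists x such that for all y, x <= y" over {0,1,2}:
# Verifier's winning strategy is to pick x = 0.
f = ("exists", "x", ("forall", "y", ("atom", lambda e: e["x"] <= e["y"])))
print(verifier_wins(f, range(3)))  # True
```

Over recursive but infinite structures none of this is decidable in general, which is exactly where the Boyer-Sandu questions below get their bite.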

The paper by Boyer and Sandu takes this as a starting point, and discusses when Dummett’s truth coincides with Tarski and Hintikka’s truth, restricting ourselves to semantic games played on recursive structures (nonconstructive winning strategies in the semantic game seem problematic if we want to relate truth in semantic games to verificationist truth!). Take statements in Peano arithmetic where all objects chosen are natural numbers (it happens to be true that every recursive model of PA is isomorphic to the natural numbers). Is every statement I can prove also true in the sense of a winning strategy in the recursive semantic game? Conversely, can the semantic game truth of a sentence always be given by a proof? The answer to both is negative. For the first, consider the sentence stating that for all programs x1 and inputs x2, there exists a number of steps y such that the system either halts within y steps or never halts at all. This is the halting problem. It is not decidable, hence there is no recursive winning strategy for Verifier, but the sentence is trivially provable in Peano arithmetic by the law of the excluded middle.

Boyer and Sandu note (as known in an earlier literature) that we can relate the two types of truth by extending the semantic game to allow backward moves. That is, at any node, or at the end of the game, Verifier can go back to any node she played and change her action. Verifier wins if she has a finite winning strategy. It turns out that Verifier can win in the game with backward moves if and only if she can win in the standard game. Further, if a statement can be proven, Verifier can win in the game with backward moves using a recursive strategy. This has some interesting implications for Godel sentences (“This sentence is not provable within the current system.”) which I don’t wish to discuss here.

Note that all of this is just the use of game theory in “games against nature”. We usually think of game theory as being a tool for the analysis of situations with strategic interaction, but the condition that players are rational perfect optimizers means that, in zero sum games, checking whether something is possible for some player just involves checking whether a player called Nature has a winning strategy against him. This technique is so broadly applicable, in economics and otherwise, that we ought really be careful about defining game theory as solely a tool for analyzing the strategies of multiple “actual” agents; e.g., Wikipedia quotes Myerson’s definition that game theory is “the study of mathematical models of conflict and cooperation between intelligent rational decision-makers”. This is too limiting.

Final copy. This article appeared in Synthese 187/March 2012. Philosophers seem to rarely put their working papers online, but Springer has taken down the paywall on Synthese throughout December, so you can read the above link even without a subscription.

“737-Cabriolet: The Limits of Knowledge and the Sociology of Inevitable Failure,” J. Downer (2011)

Things go wrong. Nuclear power plants melt down. Airplanes fall from the sky. Wars break out even when both parties mean only to bluff. Financial shocks propagate in unexpected ways. There are two traditional ways of thinking about these events. First, we might look for the cause and apportion blame for such an unusual event. Company X used cheap, low-quality glue. Bureaucrat Y was poorly trained and made an obviously-incorrect decision. In these cases, we learn from our mistakes, and the mistakes are often not simply problems of engineering, but sociological problems: Why did the social setup of a group fail to catch the mistake? The second type of accident, the “normal accident” described famously by Charles Perrow, offers no lessons and is uncatchable in hindsight because it is too regular. That is, if a system is suitably complex, and if minor effects all occur roughly simultaneously, then the one-in-a-billion combination of minor effects can cause a serious problem. Another way to put this is that even if disasters are one-in-a-billion events, a system which throws out billions of possible disasters of this type is likely to produce one. The most famous case here is Three Mile Island, where among the many failsafes which simultaneously went awry was an indicator light that happened, on the fateful day, to have been blocked by a Post-It note.

John Downer proposes a third category, the “epistemic accident,” which is perhaps well-understood by engineers and scientists, but not by policymakers. An epistemic accident is when a problem occurs due to an error or a gap in our understanding of the world when we designed the system. Epistemic accidents are not normal, since once they happen we can correct them in the future, and since they do not depend on a rare concordance of events. But they also do not lend themselves to blame, since at the time they happen, the scientific knowledge necessary to prevent them was not yet known. This is a fundamentally constructivist way of viewing the world. Constructivism says, roughly, that there is no Platonic Ideal for science to reach. Experiments are theory-laden and models are necessarily abstract. This does not mean science is totally relative or pointless, but rather that it is limited, and we will always be, on occasion, surprised by how our models (and this is true in social science as well!) perform in the “real world”. Being cognizant of the limits of scientific knowledge is important for evaluating accidents: particularly innovative systems will be more prone to epistemic accidents, for one.

Downer’s example is the famous Aloha Airlines 243 accident in 1988. On a routine flight from Hilo to Honolulu, the fuselage ripped right off of a 737, exposing a huge chunk of the passenger cabin while the plane was traveling at full speed. Luckily, the plane was not far from Maui, and managed to land with only one death – passengers had to, while themselves strapped in, lean over and hold down a stewardess who was lying down in the aisle in order to keep her from flying out of the plane. This was shocking since the 737 was built with multiple failsafes to ensure that such a rupture did not happen; roughly, the rupture would only happen, it was believed, if a crack many feet long developed on the airplane skin, and this would have been caught at a much smaller stage by regular maintenance.

It turns out that testing of the plane was missing two things. First, a combination of the glue being used with salt-heavy air made cracks more likely, and second, the way the rivets were lined up happened to make metal fatigue compound as minor cracks near each rivet connected with each other. And indeed, even in the minor world of massive airplane decompression, this was not the first “epistemic accident”. The reason airplane windows are oval and not square is to avoid almost exactly the same problem: some British-made Comets in the 50s crashed, and the interaction of metal fatigue with their square windows was found to be the culprit.

What does this mean for economics? I think it means quite a bit for policy. Complicated systems will always have problems that are beyond the bounds of designers to understand, at least until the problem arises. New systems, rather than existing systems, will tend to see these problems, as we learn what is important to include in our models and tests, and what is not. That is, the “flash crash” looks a lot like a “normal accident”, whereas the financial crisis has many aspects that look like epistemic accidents. New and complicated systems, such as those introduced in the financial world, should be handled in a fundamentally conservative way by policymakers in order to deal with the uncertainty in our models. And it’s not just finance: we know, for instance, of many unforeseen methods of collusion that have stymied even well-designed auctions constructed by our best mechanism designers. This is not strange, or a failure, but rather part of science, and we ought be upfront about it.

Google Docs Link (The only ungated version I can find is the Google Docs Quick View above which happens to sneak around a gate. Sociologists, my friends, you’ve got to tell your publishers that it’s no longer acceptable in 2012 to not have ungated working papers! If you have JSTOR access, and in case the link above goes dead, the final version in the November 2011 AJS is here)

“Fact, Fiction and Forecast,” N. Goodman (1954)

Fact, Fiction and Forecast is one of the seminal texts of 20th century philosophy: you may know it from the famous “grue/bleen” example. The text deals principally with two problems, the meaning of counterfactuals and a “new riddle” of induction, where the first is essential for any social scientist to understand, and the second has, I think, some interesting implications for decision theory. I will discuss each in turn. My notes are from the 4th edition, including the foreword by the legendary Hilary Putnam.

The first involves counterfactual conditionals, or sentences of the type “If X were true, then Y would obtain” along with the fact that X is not actually true. Counterfactual conditionals are the focus of a huge number of economics papers (“If the Fed had done X, then GDP would have done Y”, “If deworming had been expanded to 100% in this village, school attendance would have been Y”, etc.). Counterfactuals are also, I would argue, the concept which has been foremost in the minds of the world’s leading philosophers over the past 60 years.

When economists use counterfactuals, I think they are naively trying to say something like “If the world is precisely the same, except that also X is true, then Y would hold.” There are a ton of problems with this. First, if everything in the world is precisely the same, then Not X is true, and since X and Not X are both true, by the principle of explosion, everything is true, including Not Y. So we must mean that everything in the world is precisely the same, except that X holds and Not X does not. Call the counterfactual set of true statements S’. But here we have more problems: S’ may contain a logical inconsistency, in that X may deductively imply some statement Z which is logically incompatible with something else in S’. Getting around that problem presents even more difficulties; David Lewis has the most famous resolution with his possible worlds logic, but even that is far from unproblematic.

Ignoring this basic problem of what is meant by a counterfactual, it is not well-known among social scientists that counterfactual conditionals are absolutely not strictly defined by their logical content, in the way that standard deductive logic is. That is, consider the statement If A then B, where A is a counterfactual. Let A’ be logically equivalent to A. It is easy to construct an example where you intuitively accept that A implies B, but not that A’ implies B. For instance, let A be “Bill Clinton were the same person as Julius Caesar,” A’ be “Julius Caesar were the same person as Bill Clinton” and B be “Bill Clinton would be emperor of Rome.” Given the importance of counterfactual logic to economics, there is a lot to be gained for our science from a better understanding of the philosophic issues here.

The more interesting point in Goodman for the decision theorist concerns induction. Hume showed in the 18th century why induction is invalid; the validity of induction involves assuming some sort of continuity of nature, and such an assumption is an induction itself. Even probabilistic induction – “The sun has risen every day, so I think it probable the sun will rise tomorrow” – is invalid for the same reason. There are many arguments contra Hume, but I hope you’ll take my word that they have all failed, and that the validity of induction is no longer an open question. That said, the wisdom of induction certainly is. Though we know induction is invalid reasoning, we nonetheless rely on it trivially every day (I get on a bus going north to my office, and not south, on the inductive assumption that my office is still north of my apartment) and less trivially on important policy issues (acceptance of “science” as a valid method for learning truth, rather than reading sacred books, is implicitly an acceptance of the wisdom of induction). What exactly do we mean when we say induction is wise? We mean that there exist regularities for which the past existence of the regularity is evidence that we should expect the regularity in the future.

What Goodman points out is that the interesting question is not whether induction is valid – it isn’t – but rather what we mean by a “regularity” anyway. This problem of induction is precisely the same as a problem in counterfactuals. Consider the regularity that every object in my pocket is a coin made of metal. I have investigated this many times, and every object I check is a metal coin. Consider the counterfactual “If I were to put a piece of chocolate in my pocket, it would be a metal coin”, or the induction on objects in my pocket on a day when the only thing in my pocket is a chocolate. Surely we don’t think we should induct that the chocolate will be a metal coin when I take it from my pocket. Alternatively, consider the regularity that all metal coins conduct electricity. I have investigated this many times also, and every metal coin I check conducts. If I check another coin, I do believe it will conduct. What is the difference between the chocolate example and the coin example? It is that I trust induction when I believe a law holds for some regularity, and do not trust induction when I believe past draws are simply random. The “grue/bleen” example, if you know it, is even stronger: I interpret it to mean that whatever rationale we use to delineate coincidences from regularities depends on more than how we selected instances in the past, or on the type of the property (say, color, or conductivity) we are examining. Goodman proposes some thoughts on how we know what histories are evidence of laws and what aren’t, but the exact delineation remains controversial.

So what does this mean for decision theory? Decision theory is heavily influenced by de Finetti and Savage, and somewhat by Carnap, and less so by other massive philosophy figures in this literature like Ayer, Goodman, Putnam, and Quine. That is, we conceive of the world as having states over which agents have a prior, and evidence changing that prior according to Bayes’ rule. Let Ω be the state space, where states are a countably infinite product space of potential observations. Let a “lawlike” set of hypotheses be a set of (infinite-length) observations that are compatible with some law, where the nature of possible laws is given exogenously. For instance, a lawlike set might be “all metals conduct” and the state space simply made up of tests of conductivity of various metals in each period plus a draw from the set {0,1}. The nature of the set of possible laws in the prior is that either all metals conduct, or the conductivity properties of various metals is not linked. Imagine in periods 1 and 2 that all metals conduct and we draw a 0 each time, and that in a second possible world, in periods 1 and 2 all metals conduct except copper in period 2, and we draw a 0 each time. What can we conclude as a Savage-style Bayesian? Think about what conditions on the prior are imposed.

There is one further worry for the standard econ model. How we induct in Goodman depends on what predicates we have as potential sources of laws: how ought we set up the state space? If we, say, put 0 prior on the world where all emeralds are grue, and positive prior on the world where all emeralds are green – and the standard model of state space means that we must include both possibilities as states – then we are violating Carnap’s “principle of total evidence” since we rule out grue before even seeing any evidence, and we are violating any of the standard rationales for putting positive probability on all possible states in the prior. (The Google Books preview contains the entire introduction plus the foreword by Putnam, which should give a good taste of the content. Among economists, Itzhak Gilboa seems to have done the most work on expanding Goodman-style ideas to decision theory.)
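The point about the prior can be made mechanical. In the toy model below (my own, not Goodman’s), every emerald observed before the switchover time is consistent with both “all green” and “all grue”, so the two hypotheses assign identical likelihoods to every observation, and whatever ratio the prior assigns between them survives any amount of evidence; in particular, a zero prior on grue can never be revived.

```python
def update(prior, likelihoods):
    """One step of Bayes' rule over a finite list of hypotheses."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypotheses: ["all emeralds are green", "all emeralds are grue"].
# Before the switchover time, a green emerald has likelihood 1 under both.
beliefs = [0.999, 0.001]
for _ in range(10_000):             # ten thousand green emeralds
    beliefs = update(beliefs, [1.0, 1.0])
print(beliefs)   # the ratio of beliefs is unchanged by all that data

# And a strict zero prior on grue stays at zero forever:
print(update([1.0, 0.0], [1.0, 1.0]))  # [1.0, 0.0]
```

The evidence does all of nothing here: which hypothesis wins is decided entirely by how the state space and prior were set up, which is exactly Goodman’s worry.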

“Some Hard Questions for Critical Rationalism,” D. Miller (2009)

Dov Samet was here presenting a new paper of his recently, and he happened to mention the Miller-Popper refutation of learning by induction. Everyone knows Hume’s two refutations: that induction of the form “A has always happened before, hence A will be true in the future/in the larger sample” is invalid because the necessary uniformity of nature assumption requires an inductive step itself, and that statements like “A has always happened before, hence A is likely to happen in the future” suffer a near identical flaw. But what of Bayesians? A Bayesian takes his prior, gathers evidence, and updates inductively. Can induction provide probabilistic support for a theory? Miller-Popper says no, and the proof is easy.

Let h be a theory and e be evidence. Let p(h) be the prior belief of the validity of h, and p(h|e) the conditional belief. Let the Bayesian support for h by e be defined as s(h|e)=p(h|e)-p(h); if this is positive, then e supports h. Note that any proposition h is identical in truth value, for all propositions e, to (h or e)&(h or [not e]). Replace h in the support function with that statement, and you get

s(h|e) = p([h or e]&[h or not e]|e) - p([h or e]&[h or not e])

= p([h or e]|e) + p([h or not e]|e) - p([h or e]) - p([h or not e])

= s([h or e]|e) + s([h or not e]|e)

with the expansion of the conjunctions into sums holding because the disjunction of [h or e] and [h or not e] is a tautology: the probability of their conjunction equals the sum of their probabilities minus one, both unconditionally and conditionally on e, and the ones cancel in the difference. So what does this mean? It means that the support of evidence e for theory h is just the sum of two types of support: that given to the proposition “h or e” and that given to the proposition “h or not e”. The first is always nonnegative, since e deductively entails (h or e) and hence p(h or e|e)=1. The second is always nonpositive: a short computation shows it equals (1-p(h|e))(p(e)-1). So only the first term can be said to be providing positive support for h from the evidence. But the truth of (h or e) follows deductively from the assumed truth of e. Every part of the support for h from e that does not deductively follow from e is the second term, but that term is nonpositive! Induction does not work. Another way to see this is with a slight restatement of the above proof: induction only provides probabilistic support for a theory if p(if e then h|e) is greater than p(if e then h). The above math shows that such a statement can never be true, for any h and any e. (There is a huge literature dealing with whether this logic is flawed or not – Popper and Miller provide the fullest explanation of their theorem in this 1987 article).
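The decomposition is easy to check numerically. Below is a finite toy space of my own (the four points, their weights, and the events h and e are arbitrary choices for illustration); exact rational arithmetic confirms s(h|e) = s(h or e|e) + s(h or not e|e), with the deductive term nonnegative and the remainder nonpositive.

```python
from fractions import Fraction

# A four-point probability space with arbitrary weights; h and e are events.
space = {0: Fraction(2, 8), 1: Fraction(3, 8), 2: Fraction(2, 8), 3: Fraction(1, 8)}
h, e = {0, 1}, {1, 2}
not_e = set(space) - e

def p(event):
    """Probability of an event (a set of sample points)."""
    return sum(space[w] for w in event)

def s(a, b):
    """Bayesian support for a by b: p(a|b) - p(a)."""
    return p(a & b) / p(b) - p(a)

lhs = s(h, e)                       # total support for h by e
rhs = s(h | e, e) + s(h | not_e, e) # Popper-Miller decomposition
print(lhs, rhs)               # -1/40 -1/40
print(s(h | e, e) >= 0)       # the deductive part: True
print(s(h | not_e, e) <= 0)   # the would-be inductive part: True
```

Any other weights and events give the same signs, which is the content of the theorem; the toy space just makes the cancellation visible.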

So an interesting proof. And it segues quite nicely into the main paper of this post. David Miller, still writing today, is one of the few prominent members of a Popper-style school of thought called critical rationalism. I (and hopefully you!) generally consider Popper-style falsification an essentially defunct school of thought when it comes to philosophy of science. There are many well-known reasons: Quine told us that “falsifying one theory” isn’t really possible given auxiliary assumptions, Lakatos worried about probabilistic evidence, Kuhn pointed out that no one thinks we should throw out a theory after one counterexample, since we ought instead just assume there was a mistake in the lab, etc. And as far as guiding our everyday work as social scientists, “learn empirical truth as a disinterested body” is neither realistic (scientists cheat, they lie, they have biases) nor even the most important question in philosophy of science, which is instead about asking interesting questions. Surely it is agreed that even if a philosophy of science provided an entirely valid way of learning truth about the world, it would still miss an important component: the method for deciding which truths are worth learning with our finite quantity of research effort. There are many other problems with Popper-influenced science, of course.

That’s what makes this Miller paper so interesting. He first notes that Popper is often misunderstood: if you think of falsification from the standpoint of a logician, the question is not “What demarcates science, where science is in some way linked to truth?” but rather “What research programs are valid ways of learning from empirical observation?” And since induction is invalid, justificationist theories (“we have good evidence for X because of Y and Z”) are also invalid, whereas falsification arguments (“of the extant theories that swans are multicolored or always white, we can reject the second since we have seen a black swan”) are not ruled out by Hume. This is an interesting perspective on Popper which I hadn’t come across before.

But Miller also lists six areas where he thinks critical rationalism has done a poor job providing answers thus far, treating these problems through the lens of one who is very sympathetic to Popper. It’s worth reading through to see how he suggests dealing with questions like model selection, “approximate” truth, the meaning of progress in a world of falsification, and other worries. (December 2009 working paper)

“On the Creative Role of Axiomatics,” D. Schlimm (2011)

The mathematician Felix Klein: “The abstract formulation is excellently suited for the elaboration of proofs, but it is clearly not suited for finding new ideas and methods; rather, it constitutes the end of a previous development.” Such a view, Dirk Schlimm argues, is common among philosophers of science as well as mathematicians and other practitioners of axiomatic science (like economic theory). But is axiomatics limited to formalization and consolidation, or can the axiomatic method be a creative act, one that opens up new avenues and suggests new ideas? Given this site’s emphasis on the explanatory value of theory, it will come as no surprise that I see axiomatics as fundamentally creative. The author of the present paper agrees, tracing the interesting history of the mathematical idea of a lattice.

Lattices are wholly familiar to economists at this stage, but it is worth recapping that they can be formulated in two equivalent ways: either as a set of elements plus two operations (meet and join) satisfying commutative, associative and absorption laws, which together ensure the set of elements is a partially ordered set (the standard “axiomatic” definition), or else as a partially ordered set in which each pair of elements has a well-defined infimum and supremum, from which the meet and join operators can be defined and shown to satisfy the laws just mentioned. We use lattices all the time in economic theory: proofs involving preferences, generally a poset, are an obvious example, but so are results using monotone comparative statics, among many others. In mathematics more generally, proofs using lattices unify results in a huge number of fields: number theory, projective geometry, abstract algebra and group theory, logic, and many more.
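As a concrete illustration of the equivalence (a sketch of my own, not an example from Schlimm’s paper), take the divisors of 12 ordered by divisibility, with meet = gcd and join = lcm. A brute-force check confirms both the axiomatic laws and the recovery of the partial order from the meet operation:

```python
# The divisibility lattice: meet = gcd, join = lcm. A brute-force check
# that the axiomatic lattice laws hold on the divisors of 12, and that
# the partial order ("a divides b") is recoverable as gcd(a, b) == a.
from math import gcd
from itertools import product

def lcm(a, b):
    return a * b // gcd(a, b)

elems = [1, 2, 3, 4, 6, 12]  # divisors of 12: a small lattice

for a, b in product(elems, repeat=2):
    assert gcd(a, b) == gcd(b, a) and lcm(a, b) == lcm(b, a)  # commutativity
    assert gcd(a, lcm(a, b)) == a and lcm(a, gcd(a, b)) == a  # absorption
    assert (gcd(a, b) == a) == (b % a == 0)                   # order from meet

for a, b, c in product(elems, repeat=3):
    assert gcd(a, gcd(b, c)) == gcd(gcd(a, b), c)             # associativity
    assert lcm(a, lcm(b, c)) == lcm(lcm(a, b), c)

print("divisors of 12 form a lattice under gcd/lcm")
```

The same check works starting from either definition: one can take gcd and lcm as primitive operations and verify the laws, or take divisibility as the primitive order and compute gcd and lcm as infimum and supremum.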

With all these great uses of lattice theory, you might imagine early results proved these important connections between fields, and that the axiomatic definition merely consolidated precisely what was assumed about lattices, ensuring we know the minimum number of things we need to assume. This is not the case at all.

Ernst Schröder, in the late 19th century, noted a mistake in a claim by C.S. Peirce concerning the axioms of Boolean algebra (algebra with 0 and 1 only). In particular, one of the two distributive laws – say, a+bc=(a+b)(a+c) – turns out to be completely independent of the other standard axioms. In other interesting areas of group theory, Schröder noticed that the distributive axiom was not satisfied, though the other axioms of Boolean algebra were. This led him to list what would become the axioms of lattices as something interesting in their own right. That is, work on axiomatizing one area, Boolean algebra, led to an interesting subset of axioms describing another area, with the second axiomatization being fundamentally creative.
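The independence of distributivity from the other lattice axioms is easy to confirm by brute force on the standard five-element “diamond” lattice M3 (my example, not one from the paper): every lattice law holds, yet the distributive law fails.

```python
# The diamond lattice M3: bottom 0, top 1, and three incomparable
# middle elements a, b, c. Meets and joins are computed from the order.
from itertools import product

elems = ["0", "a", "b", "c", "1"]

def leq(x, y):
    return x == y or x == "0" or y == "1"

def meet(x, y):  # greatest lower bound, by brute force
    lower = [z for z in elems if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))

def join(x, y):  # least upper bound, by brute force
    upper = [z for z in elems if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))

# The lattice laws all hold...
for x, y, z in product(elems, repeat=3):
    assert meet(x, y) == meet(y, x) and join(x, y) == join(y, x)
    assert meet(x, meet(y, z)) == meet(meet(x, y), z)
    assert join(x, join(y, z)) == join(join(x, y), z)
    assert meet(x, join(x, y)) == x and join(x, meet(x, y)) == x

# ...but distributivity fails: a meet (b join c) = a,
# while (a meet b) join (a meet c) = 0
assert meet("a", join("b", "c")) == "a"
assert join(meet("a", "b"), meet("a", "c")) == "0"
```

So a structure can satisfy every axiom in the smaller list while violating distributivity, which is exactly what makes the smaller list worth axiomatizing on its own.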

Dedekind (of the famous cuts), around the same time, also wrote down the axioms for a lattice while considering properties of least common multiples and greatest common divisors in number theory. He listed a set of properties held by lcms and gcds, and noted that distributive laws did not hold for those operations. He then noted a number of other interesting mathematical structures which are described by those properties if taken as axioms: ideals, fields, points in n-dimensional space, etc. Again, this is creativity stemming from axiomatization. Dedekind was unable to find much further use for this line of reasoning in his own field, algebraic number theory, however.

Little was done on lattices until the 1930s; perhaps this is not surprising, as the set theory revolution hit math after the turn of the century, and modern uses of lattices are most common when we deal with ordered sets. Karl Menger (son of the economist, I believe) wrote a common axiomatization of projective and affine geometries, mentioning that only the sixth axiom separates the two and noting that further modification of that axiom might yield interesting new geometries – a creative insight not available without axiomatization. Albert Bennett, unaware of the earlier work, rediscovered the axioms of the lattice, and more interestingly listed dozens of novel connections and uses for the idea that are made clear from the axioms. Øystein Ore in the 1930s showed that the axiomatization of a lattice is equivalent to a partial order relation, and showed that it is in a sense as useful a generalization of algebraic structure as you might get. (Interesting for Paul Samuelson hagiographers: the preference relation foundation of utility theory was really cutting-edge math in the late 1930s! Mathematical tools to deal with utility in such a modern way literally did not exist before Samuelson’s era.)

I skip many other interesting mathematicians who helped develop the theory, of which much more detail is available in the linked paper. The examples above, Schlimm claims, essentially filter down to three creative purposes served by axiomatics. First, axioms analogize, suggesting the similarity of different domains, leading to a more general set of axioms encompassing those smaller sets, leading to investigation of the resulting larger domain – Aristotle in Analytica Posteriora 1.5 makes precisely this argument. Second, axioms guide the discovery of similar domains that were not, without axiomatization, thought to be similar. Third, axioms suggest modification of an axiom or two, leading to a newly defined domain from the modified axioms which might also be of interest. I can see all three of these creative acts in economic areas like decision theory. Certainly for the theorist working in axiomatic systems, it is worth keeping an open mind for creative, rather than summary, uses of such a tool. (2009 working paper – final version in Synthese 183)
