Bayesian statistics is so closely linked with induction that one often hears it called “Bayesian induction.” What could be more inductive than taking a prior, gathering data, updating the prior via Bayes’ rule, and converging to the true distribution of some parameter?
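That inductive picture — prior, data, update, convergence — is easy to see in a toy conjugate model. The following is my own minimal sketch (not from the paper), using a Beta(1, 1) prior on a coin’s bias; the true bias 0.7 is an assumption chosen for illustration:

```python
import random

random.seed(0)

# Beta(1, 1) prior on a coin's bias theta, updated with Bernoulli data.
# With a conjugate prior, Bayes' rule gives a Beta(1 + heads, 1 + tails)
# posterior, whose mean concentrates on the true theta as data accumulate.
true_theta = 0.7  # hypothetical "true" parameter for this illustration
heads = tails = 0
for _ in range(10_000):
    if random.random() < true_theta:
        heads += 1
    else:
        tails += 1

# Posterior mean of Beta(1 + heads, 1 + tails): close to the true 0.7.
posterior_mean = (1 + heads) / (2 + heads + tails)
print(round(posterior_mean, 2))
```

The catch, as the paper stresses, is that this convergence story assumes the true data-generating process lives inside the family we wrote down in the first place.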
Gelman (of the popular statistics blog) and Shalizi point out that, in practice, Bayesian statistics should actually be seen as Popper-style hypothesis-based deduction. The problem is intimately linked to the “taking a prior” step above. In practice, the parameter space, or the domain used in estimation, is never broad enough to encompass every possible theory (even nonparametric methods make assumptions about conditional independence). This is so for all the standard model-building reasons: tractability, parsimony, etc. Given that this is the case, the authors argue that computing the posterior is only the first step of good Bayesian data work. Once the posterior is estimated, the predictions of the posterior model should be compared with real-world data – not with alternative hypotheses as in classical estimation – and if the replicated data do not fit, the estimation should be redone on a different domain. Replacing the domain is a form of model rejection à la Popper.
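The “compare replicated data with real data” step is what Gelman and Shalizi call a posterior predictive check. A crude sketch of the idea, assuming a deliberately misspecified model (exponential data fit with a normal, and plug-in estimates standing in for full posterior draws, for brevity):

```python
import random
import statistics

random.seed(1)

# Observed data are exponential, but we fit a normal model, then ask whether
# data replicated from the fitted model reproduce the observed minimum
# (exponential data are bounded below by 0; normal data are not).
observed = [random.expovariate(1.0) for _ in range(200)]
mu, sigma = statistics.mean(observed), statistics.stdev(observed)
obs_min = min(observed)

# Draw replicated datasets from the fitted normal and record their minima.
rep_minima = [
    min(random.gauss(mu, sigma) for _ in range(200))
    for _ in range(500)
]

# Posterior predictive p-value: how often a replicated minimum is at least
# as small as the observed one. A value near 0 or 1 flags misfit.
p = sum(m <= obs_min for m in rep_minima) / len(rep_minima)
print(p)
```

Here the replicated minima are far more negative than anything in the observed data, so the check flags the normal model as a bad domain — the cue, in the authors’ telling, to replace it rather than to keep updating within it.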
(The authors also note the “statistical folklore” that classical statistical tests are in many ways a measure of sample size and nothing more. As social scientists, we know that all of our models, strictly speaking, are wrong: otherwise they would not be models. That being the case, any model in social science will be rejected given enough evidence. I think this point is often misunderstood by economists.)
C. Shalizi also has a blog, albeit an infrequently updated one.
the predictions of the posterior model should be compared with real world data – not with alternative hypotheses as in classical estimation.
I’d like to quibble. Remember the correspondence between testing and confidence regions in classical estimation: the classical confidence set consists of all the points which the data do not reject at the specified size. This was part of why Haavelmo thought set-valued estimation was much more appropriate for economics than point estimation.
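The testing/confidence-region duality in that quibble can be made concrete. A small illustration of my own (normal mean with known standard deviation, grid values chosen arbitrarily): the 95% confidence set is exactly the set of candidate means that a size-0.05 z-test fails to reject.

```python
import math
import random

random.seed(3)

# Invert a z-test for a normal mean with known sd: keep every candidate
# mu0 that the data do not reject at size 0.05.
n, true_mu, sd = 50, 2.0, 1.0
data = [random.gauss(true_mu, sd) for _ in range(n)]
xbar = sum(data) / n

def rejects(mu0, alpha=0.05):
    z = (xbar - mu0) * math.sqrt(n) / sd
    return math.erfc(abs(z) / math.sqrt(2)) < alpha

# Scan a grid of candidate means and collect those not rejected.
grid = [i / 1000 for i in range(0, 4000)]
accepted = [mu0 for mu0 in grid if not rejects(mu0)]

# The accepted set is (up to grid resolution) the textbook interval
# xbar +/- 1.96 * sd / sqrt(n).
lo, hi = min(accepted), max(accepted)
print(round(lo, 2), round(hi, 2))
```

The inversion returns an interval here only because the test and model are so well behaved; in general the non-rejected set can be a more awkward region, which is part of Haavelmo’s point about sets versus points.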
Thanks for the cite, David.
Point taken, Cosma. I am completely with Haavelmo here, but my understanding was that classical statistics was not. That is, wouldn’t a statistician in, say, 1920 have believed that testing H0 against H1 in the point-estimate sense was what a researcher should do…rather than examining the confidence region in general? My knowledge of the history here is pretty weak, so I’m open to interesting citations on this issue.
Historically, what have come to be called “pure tests of significance” actually come before H0 vs. H1 tests, as in Karl Pearson and Fisher. And the whole idea of confidence regions comes from Neyman (1937), which is not only as deep into classical frequentist statistics as it gets, but explicitly grows out of his work with Egon Pearson on hypothesis testing.
For that matter, as an issue of pure frequentist theory, nothing says that a test of a composite H0 against a composite H1 must depend only on point estimates within each model, though that certainly makes life easier! (Unless maybe the two point estimates are both sufficient statistics for their models.)
(Looking at the paper again to check the reference, I see it was communicated by Harold Jeffreys, even though the paper explicitly rejects Jeffreys’s ideas on Bayesian estimation by name; this shows class on Jeffreys’s part.)