Sunday, June 3, 2012

Will Bayesian statistics become too easy?

A Bayesian approach to statistical inference has become increasingly popular since the advent of increased desktop computing power and the development of tailored software. This is a very good thing. However, I am concerned that it may, in the not very distant future, become too easy, and too much like frequentist methods as they are currently learned and used by life science undergraduate and graduate students. I am concerned that, in order to make Bayesian methods more accessible, they will be dumbed down (made too easy) and their value lost.
Part of the benefit of a Bayesian approach is that it more accurately reflects how science is done. In a nutshell, the Bayesian approach consists of
  1. Prior beliefs: ideas, knowledge, and explicit assumptions about our system.
  2. Collection of new data.
  3. Use of the new data to update our beliefs.
The result of a Bayesian analysis is not a simple yes-no, significant-not significant kind of answer, but rather a probability distribution that reflects our most informed guesses about our variable of interest.
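
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quantity of interest is taken to be a survival probability, the prior is a Beta distribution (chosen because it updates in closed form with binomial data), and all numbers and function names are hypothetical rather than taken from any real study.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Step 1: prior belief (hypothetical). Survival is thought to be near 0.7,
# held with about as much weight as 10 earlier observations.
prior = (7.0, 3.0)

# Step 2: new data (hypothetical): 12 of 20 marked individuals survived.
survived, died = 12, 8

# Step 3: update the prior with the data to obtain the posterior.
posterior = update_beta(prior[0], prior[1], survived, died)

print("prior mean:    ", beta_mean(*prior))
print("posterior:      Beta", posterior)
print("posterior mean:", beta_mean(*posterior))
```

The posterior here is a whole distribution, Beta(19, 11), not a single verdict; its mean is just one of many possible summaries.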

I believe that there are two potential pitfalls in the oversimplification of a Bayesian analysis. The less serious of these concerns the results: the posterior distribution of each model parameter. In practice, each of these distributions is represented by a massive collection of guesses at the parameter of interest, given all of our assumptions and the newly collected data. Thus the result is not "an answer" but rather thousands of answers, with some answers more likely than others. In our efforts to satisfy ourselves, editors, and readers, we may try too hard to simplify our results, for example by collapsing the whole distribution into a single number or a yes-no statement.
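
To make "thousands of answers" concrete, here is a minimal Python sketch that summarizes a set of posterior draws without reducing them to a single verdict. The draws are simulated directly from a hypothetical Beta(19, 11) posterior using the standard library; in a real analysis they would come from the output of whatever sampler was used, and the threshold of 0.5 is an arbitrary choice for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical posterior draws for a survival probability.
# Real draws would come from a sampler's output instead.
draws = [random.betavariate(19, 11) for _ in range(10_000)]

# A central summary, reported alongside (not instead of) the spread.
mean = statistics.fmean(draws)

# quantiles(n=40) returns 39 cut points spaced 2.5% apart, so the first
# and last cut points bound the central 95% of the draws.
cuts = statistics.quantiles(draws, n=40)
lower, upper = cuts[0], cuts[-1]

# Because the posterior is a distribution, direct probability statements
# are possible, e.g. the share of draws above 0.5.
prob_above_half = sum(d > 0.5 for d in draws) / len(draws)

print(f"posterior mean: {mean:.3f}")
print(f"95% interval:   {lower:.3f} to {upper:.3f}")
print(f"P(survival > 0.5): {prob_above_half:.3f}")
```

Reporting an interval and a probability statement keeps more of the posterior than a point estimate alone, which is one modest guard against oversimplifying the results.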

Although we may try too hard to simplify our results, I think there is a greater danger that we will try to simplify the prior knowledge and the assumptions that we start with. In my limited experience, ecologists and statisticians are very quick to fall back on the "uninformative prior," as if this were somehow "unbiased." Statisticians recognize that all priors come with a point of view, so there is no such thing as an objective uninformative prior; such a prior is sometimes more accurately called a reference prior. However, I see us taking the lazy route too often and using a supposedly unbiased reference prior, which makes us less inclined to take seriously the literature we read. Lots of data will overwhelm a weak prior. However, it is my experience that priors derive their weakness from our tendency not to take seriously the quantitative nature of our literature.
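
How much the choice of prior matters depends on how much data there is, and a small calculation shows this. The sketch below is illustrative only: it assumes a binomial outcome with a Beta prior, compares a flat Beta(1, 1) prior with a hypothetical literature-based Beta(14, 6) prior (mean 0.7), and uses made-up data sets of 10 and 1000 trials.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a proportion under a Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


flat = (1.0, 1.0)       # the usual "uninformative" choice
informed = (14.0, 6.0)  # hypothetical prior built from earlier studies

# Two made-up data sets with the same observed rate (0.4) but different sizes.
datasets = [(4, 10), (400, 1000)]

gaps = []
for successes, trials in datasets:
    m_flat = posterior_mean(flat[0], flat[1], successes, trials)
    m_informed = posterior_mean(informed[0], informed[1], successes, trials)
    gaps.append(abs(m_informed - m_flat))
    print(f"n={trials:5d}  flat prior: {m_flat:.3f}  "
          f"informed prior: {m_informed:.3f}")
```

With 1000 trials the two priors give nearly the same posterior mean; with 10 trials they differ substantially. Small data sets are exactly where quantitative knowledge from the literature has the most to contribute.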

As evidence that Bayesian analyses can be made easy, I can point to the numerous specialized programs for population genetics and phylogenetics that are based upon Bayesian approaches. I have seen many students use these with very little notion of what they are doing.

As learning in general is essentially a Bayesian process, my fears are not too serious. Nonetheless, ecologists need to take their priors seriously. Statisticians can help by encouraging us to make our beliefs both informed and explicit. In the end, doing so will only strengthen our science.
