Properties versus Principles in Defining "Good Statistics"

21 Jun 2011 19:51

Now that I'm teaching in a statistics department, I find myself even more apt than before to get into (good-natured!) arguments about the proper way of doing statistics. I pretty much accept the error-statistical ideas of Deborah Mayo, that good statistical procedures are reliable ways of learning from data, ones which are unlikely to lead us into errors. My Bayesian friends sometimes respond by saying things like (paraphrasing from memory): "That doesn't give me any guidance in designing a statistical procedure in a given case, whereas we have a straightforward set of principles which do. What are the error statistician's principles?" I wonder if this isn't the wrong way to go about it, though.

In his book The Theory of Literary Criticism, John Ellis argues that it's a mistake to try to define many categories in terms of criteria that apply to the members of the category in themselves; they are rather defined (in large part) by their relations to us and to our purposes. The really persuasive (to me) example is "weed": a weed is simply an obnoxious plant. Plants may be obnoxious because they are fast-growing, hardy, perennial, etc., etc., but none of these properties, or any Boolean combination thereof, defines weeds; their relation to our purposes in gardening does. It's perfectly sensible to say "kudzu is a weed, and one of the reasons why is that it grows so fast", but fast growth doesn't define weeds. The quest for criteria or defining principles of weed-hood is (if I may put it this way) fruitless. (Ellis tries to define "literature" as "texts a community of readers uses in a certain way", which I like, but his attempt to elucidate that way is complicated, debatable, and not really relevant here.)

I wonder if one can't say something similar about good statistics? What makes something a good method of statistical inference is that it gives us a reliable, low-error way of drawing conclusions from data. The reasons why a given procedure is reliable, and the ways we find such procedures, are many and various. In the case of the Neyman-Pearson lemma, we directly minimize one error probability subject to a bound on the other; but sometimes we maximize likelihood, sometimes we use conditioning to update prior probability distributions, etc. None of these, particularly the last, defines a reliable way of learning from data.
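
As a concrete illustration of judging a procedure by its error probabilities rather than by the principle that produced it, here is a minimal Python sketch. Everything in it is an assumption made for illustration and not part of the argument above: the two simple hypotheses (Gaussian data with unit variance and mean 0 versus mean 1), the sample size of 9, and the helper names `np_threshold`, `sample_mean` and `error_rates`. For this particular pair of hypotheses the likelihood ratio is increasing in the sample mean, so thresholding the sample mean is the Neyman-Pearson test; the simulation simply counts how often that test errs in each direction.

```python
import math
import random
from statistics import NormalDist


def np_threshold(alpha, n, mu0=0.0, sigma=1.0):
    """Cutoff on the sample mean with false-alarm probability alpha under H0.

    Assumes i.i.d. Gaussian data with known sigma (an illustrative choice).
    """
    return mu0 + NormalDist().inv_cdf(1.0 - alpha) * sigma / math.sqrt(n)


def sample_mean(mu, n, rng, sigma=1.0):
    """Mean of n independent Gaussian draws with mean mu."""
    return sum(rng.gauss(mu, sigma) for _ in range(n)) / n


def error_rates(threshold, n=9, mu0=0.0, mu1=1.0, trials=20000, seed=0):
    """Monte Carlo estimates of both error probabilities.

    The test rejects H0 (mean mu0) in favor of H1 (mean mu1) when the
    sample mean exceeds `threshold`. Returns (false-alarm rate, miss rate).
    """
    rng = random.Random(seed)
    false_alarms = sum(sample_mean(mu0, n, rng) > threshold for _ in range(trials))
    misses = sum(sample_mean(mu1, n, rng) <= threshold for _ in range(trials))
    return false_alarms / trials, misses / trials


if __name__ == "__main__":
    cut = np_threshold(alpha=0.05, n=9)
    print(cut, error_rates(cut))
```

Nothing in `error_rates` cares where the cutoff came from. The same check could be run on a rule derived by maximizing likelihood or by thresholding a posterior probability, which is the sense in which reliability is a property of the procedure and not of the principle used to design it.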