<?xml version="1.0"?>
<!-- name="generator" content="blosxom/2.0" -->
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">
  <channel>
    <title>Notebooks</title>
    <link>http://bactra.org/notebooks</link>
    <description>Cosma's Notebooks</description>
    <language>en</language>

  <item>
    <title>Properties versus Principles in Defining &quot;Good Statistics&quot;</title>
    <link>http://bactra.org/notebooks/2008/06/04#properties-vs-principles-for-statistics</link>
    <description>
&lt;P&gt;Now that I'm teaching in a &lt;a href=&quot;http://www.stat.cmu.edu/&quot;&gt;statistics
department&lt;/a&gt;, I find myself even more apt than before to get into
(good-natured!) arguments about the proper way of doing
&lt;a href=&quot;statistics.html&quot;&gt;statistics&lt;/a&gt;.  I pretty much accept
the &lt;a href=&quot;../reviews/error/&quot;&gt;error-statistical ideas of Deborah Mayo&lt;/a&gt;,
that good statistical procedures are reliable ways of learning from data, ones
which are unlikely to lead us into errors.  My Bayesian friends sometimes
respond by saying things like (paraphrasing from memory): &quot;That doesn't give me
any guidance in designing a statistical procedure in a given case, whereas we
have a straightforward set of principles which do.  What are the error
statistician's principles?&quot;  I wonder if this isn't the wrong way to go about
it, though.

&lt;P&gt;In his book &lt;cite&gt;The Theory of Literary Criticism&lt;/cite&gt;, John Ellis argues
that it's a mistake to try to define many categories in terms of criteria which
are applicable to the objects of the categories in themselves; they are rather
defined (in large part) by their relations to us and to our purposes.  The
really persuasive (to me) example is &quot;weed&quot;: a weed is simply an obnoxious
plant.  Plants may be obnoxious because they are fast-growing, hardy,
perennial, etc., etc., but none of these properties or their Boolean
combinations &lt;em&gt;defines&lt;/em&gt; weeds; their relation to our purposes in
gardening does.  It's perfectly sensible to say &quot;kudzu is a weed, and one of
the reasons why is that it grows so fast&quot;, but fast growth doesn't define
weeds.  The quest for criteria or defining principles of weed-hood is (if I may
put it this way) fruitless.  (He tries to define &quot;literature&quot; as &quot;texts a
community of readers uses in a certain way&quot;, which I like, but his attempt to
elucidate that way is complicated, debatable, and not really relevant here.)

&lt;P&gt;I wonder if one can't say something similar about good statistics?  What
makes something a good method of statistical inference is that it gives us a
reliable, low-error way of drawing conclusions from data (etc., etc., through
the content of Mayo's books and papers).  The reasons why a given procedure is
reliable, and the ways we find such procedures, are many and various.  In the case of the
Neyman-Pearson lemma, we directly minimize error probabilities; but sometimes
we maximize likelihood, sometimes we use conditioning to update prior
probability distributions, etc.  None of these --- particularly the last ---
&lt;em&gt;defines&lt;/em&gt; a reliable way of learning from data.

&lt;ul&gt;Recommended:
	&lt;li&gt;Lucien Le Cam, &quot;Maximum Likelihood: An Introduction&quot;
[&lt;a
href=&quot;http://stat-www.berkeley.edu/users/rice/LeCam/papers/tech168.pdf&quot;&gt;PDF&lt;/a&gt;]&lt;/li&gt;
	&lt;/ul&gt;
</description>
  </item>
  </channel>
</rss>