## Gygax Tests, Gygax Confidence Sets

*21 Sep 2022 16:47*

A **Gygax test** of a statistical hypothesis is one which
rejects the null hypothesis with the specified false-positive rate \( \alpha
\), but rejects in a completely random manner, independent of the data, and so
of whether the hypothesis is true or false. If \( \alpha = 0.05 \), the
conventional level in many fields, this is like saying we reject the null when
we roll a 1 on a twenty-sided die, hence the name.
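
To make the definition concrete, here is a minimal sketch in Python. The function names, the Gaussian toy data, and the sample sizes are all my own assumptions for illustration; the only essential feature is that the test never looks at its `data` argument.

```python
import random

def gygax_test(data, alpha=0.05, rng=random):
    """Reject (return True) with probability alpha, never looking at `data`."""
    return rng.random() < alpha

def rejection_rate(true_mean, alpha=0.05, trials=20_000, seed=0):
    """Monte Carlo rejection frequency when the data are N(true_mean, 1).

    The null hypothesis in this toy setup (an assumption) is "mean = 0".
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        data = [rng.gauss(true_mean, 1.0) for _ in range(5)]
        rejections += gygax_test(data, alpha, rng)
    return rejections / trials

size = rejection_rate(0.0)            # the null is true
power = rejection_rate(3.0, seed=1)   # the null is badly false
print(f"size ~ {size:.3f}, power ~ {power:.3f}")
```

Both estimated rates should land near \( \alpha \), whether or not the null is true.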

Given a Gygax test, it's easy to construct a Gygax confidence set. If the parameter space is countable, perform a separate Gygax test for each possible parameter value. (Roll a d20 for every parameter value.) Continuous parameter spaces are slightly more complicated, but we can nonetheless construct Gygax confidence sets (if not necessarily confidence intervals), as I shall explain in parentheses below.
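
The countable case can be sketched as follows. This is only an illustration under stated assumptions: I truncate the parameter space to the integers 0 through 99 so that it can be enumerated, and `true_theta`, the seed, and the number of trials are arbitrary choices of mine.

```python
import random

def gygax_confidence_set(candidates, alpha=0.05, rng=random):
    """Keep each candidate unless its own independent Gygax test rejects it."""
    return {theta for theta in candidates if rng.random() >= alpha}

rng = random.Random(42)
candidates = range(100)   # finite stand-in for a countable parameter space
true_theta = 7            # hypothetical true value; the procedure never uses it
trials = 5_000

covered = 0
total_size = 0
for _ in range(trials):
    conf_set = gygax_confidence_set(candidates, 0.05, rng)
    covered += true_theta in conf_set
    total_size += len(conf_set)

coverage = covered / trials      # should be close to 1 - alpha
mean_size = total_size / trials  # should be close to (1 - alpha) * 100
print(f"coverage ~ {coverage:.3f}, mean set size ~ {mean_size:.1f}")
```

Note that no data appear anywhere in the construction, which is the whole point.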

(If the parameter space is the real line, we need a
continuous-time Markov chain where the two states are
"reject" and "accept", and where the invariant distribution puts probability \(
\alpha \) on "reject". Pick one point on the line, arbitrarily, as the origin,
and draw its state from the invariant distribution. Then, conditional on
that initial state, move to the right, following the chain and marking out
alternating regions of acceptance and rejection. Similarly, go to the left,
starting from the same initial state but otherwise *independently* of what we
do to the right of the origin. Because the chain starts in its invariant
distribution, we have thus ensured that every parameter value on the
line, including the true value, is rejected with probability \( \alpha \).
Extending the construction to higher-dimensional parameter spaces is left as an
exercise in random fields.)
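
The parenthetical construction can be sketched in code, restricted to a bounded window around the origin. The specific rates are my assumption: leaving "reject" at rate `rate * (1 - alpha)` and leaving "accept" at rate `rate * alpha` is one choice among many which gives an invariant probability of \( \alpha \) for "reject". The names `rate`, `horizon`, and the helper functions are hypothetical.

```python
import bisect
import random

def sample_switch_points(start_rejecting, alpha, rate, horizon, rng):
    """Distances from the origin, in (0, horizon), at which the chain flips."""
    points = []
    position = 0.0
    rejecting = start_rejecting
    while True:
        # Exponential holding time with the leaving rate of the current state.
        leave_rate = rate * (1 - alpha) if rejecting else rate * alpha
        position += rng.expovariate(leave_rate)
        if position >= horizon:
            return points
        points.append(position)
        rejecting = not rejecting

def gygax_region(alpha=0.05, rate=1.0, horizon=10.0, rng=random):
    """Return is_rejected(theta), valid for |theta| < horizon (0 < alpha < 1)."""
    origin_rejecting = rng.random() < alpha  # draw from the invariant law
    right = sample_switch_points(origin_rejecting, alpha, rate, horizon, rng)
    left = sample_switch_points(origin_rejecting, alpha, rate, horizon, rng)

    def is_rejected(theta):
        points = right if theta >= 0 else left
        flips = bisect.bisect_right(points, abs(theta))
        return origin_rejecting ^ (flips % 2 == 1)

    return is_rejected

def rejection_frequency(theta, trials=20_000, seed=0):
    """How often a fixed theta falls in the rejection region, over fresh draws."""
    rng = random.Random(seed)
    return sum(gygax_region(rng=rng)(theta) for _ in range(trials)) / trials

print(rejection_frequency(3.7), rejection_frequency(-2.2, seed=1))
```

Any fixed point in the window should be rejected in roughly a fraction \( \alpha \) of the draws, on either side of the origin.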

Notice that a Gygax test has exactly the promised "size", or probability of
falsely rejecting the null, viz., \( \alpha \), but the test is also completely
uninformative. (The "power", or probability of correctly detecting the
alternative, is also \( \alpha \).) Similarly, a Gygax confidence set will
contain the true value of the parameter with probability (exactly) \( 1-\alpha
\), i.e., it has correct coverage. Notice also, however, that this confidence
set will not shrink as we get more data --- it's
not consistent. This, I think, tells us
something interesting about the relative importance of a statistical
procedure's getting the error probabilities right versus its converging to the
truth. (Cf.) It
also tells us how little we've accomplished when we've *merely* shown
that our test has the right size, or that our confidence set has the right
coverage.
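
To spell out the arithmetic behind these claims (the notation is mine, not standard): write \( R \) for the event that the test rejects, \( C_n \) for the Gygax confidence set built after seeing \( n \) data points, and \( \Theta \) for the parameter space. Since \( R \) is independent of the data, for every \( \theta \in \Theta \),

\[
\Pr_{\theta}(R) = \alpha ,
\]

so the size (taking \( \theta \) in the null) and the power (taking \( \theta \) in the alternative) are both \( \alpha \). Likewise

\[
\Pr_{\theta}\left( \theta \in C_n \right) = 1 - \alpha \quad \text{for every } n ,
\]

and, if \( \Theta \) is finite with \( N \) elements,

\[
\mathbb{E}\left[ \left| C_n \right| \right] = (1 - \alpha) N \quad \text{for every } n ,
\]

which has no dependence on \( n \) at all. For the construction on the real line, the same reasoning gives an expected length of \( (1-\alpha)(b-a) \) for the part of \( C_n \) inside any interval \( [a, b] \).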

A word on the name, which is really why I wrote this up. I have been using the expression "Gygax test" in my teaching for many years, but was sure I'd borrowed it from someone, probably some teacher or other, and forgotten who in my usual way. But I cannot find any appearance of it before my 2012 comment on VanderWeele et al. This raises the uncomfortable possibility that I just made up the name. If anyone can point me to an earlier source, so that I can give credit, I would very much appreciate it. If not, I am prepared to take responsibility.