Pete Dunkelberg wrote to tell me that William Dembski, senior fellow at the Discovery Institute, the Mathematical Great White Hope of the "Intelligent Design" school of creationism, had a new pre-print out on information theory. So, for my sins, I downloaded it.
Having now read this production in both the original (7 July 2004) and lightly revised (23 July 2004) version, my considered judgment is the same as my first reaction: Sweet suffering Jesus.
First, two points of style, and then the substance.
Now, this is a perfectly respectable generalization of the regular Shannon information, and in fact one with many interesting properties; it has proved very useful in connection with coding theory, hypothesis testing, and the study of dynamical systems. I can say this with complete confidence because this functional is in fact one of the Rényi informations, introduced by Alfred Rényi in a famous 1960 paper, "On Measures of Entropy and Information", in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. I, pp. 547--561. (Was Dembski even born in 1960?) In Dembski's notation, the Rényi information of order $q$, for non-negative real $q \neq 1$, is \[ I_q(\mu_1|\mu_2) = \frac{1}{q-1}\log_2{\int_{\Omega}{{\left(\frac{d\mu_1}{d\mu_2}\right)}^q d\mu_2}} \] which approaches the ordinary Kullback-Leibler divergence, i.e., the relative Shannon information, in the limit as $q$ goes to 1. Dembski's "variational information" is clearly just the special case $q = 2$, where the prefactor equals 1. Dembski correctly derives some of the more basic properties of this quantity, which Rényi established for arbitrary $q$ in his original paper. There does not seem to be any new mathematics in this section whatsoever. (Compare this part of his paper with, e.g., Imre Varga and János Pipek, "Rényi entropies characterizing the shape and the extension of the phase space representation of quantum wave functions in disordered systems", Physical Review E 68 (2003): 026202 [link].)
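To see how little is at stake here, a minimal numerical sketch, entirely my own and not anything from Dembski's manuscript: for discrete distributions the Radon-Nikodym derivative $d\mu_1/d\mu_2$ reduces to a ratio of probability mass functions, and the helper name renyi_divergence is of course hypothetical.

```python
import numpy as np

def renyi_divergence(p, m, q):
    """Renyi divergence of order q, in bits, between discrete
    distributions p and m; assumes m > 0 wherever p > 0."""
    p, m = np.asarray(p, dtype=float), np.asarray(m, dtype=float)
    if np.isclose(q, 1.0):
        # The q -> 1 limit: the Kullback-Leibler divergence.
        nz = p > 0
        return np.sum(p[nz] * np.log2(p[nz] / m[nz]))
    return np.log2(np.sum(p ** q * m ** (1.0 - q))) / (q - 1.0)

mu1 = np.array([0.5, 0.3, 0.2])    # the "given" distribution
mu2 = np.array([0.25, 0.25, 0.5])  # the reference measure

# q = 2 recovers Dembski's "variational information":
print(renyi_divergence(mu1, mu2, 2.0))          # ~0.526 bits
# and the values slide toward the Kullback-Leibler figure as q -> 1:
for q in (2.0, 1.5, 1.1, 1.001):
    print(q, renyi_divergence(mu1, mu2, q))
print("KL:", renyi_divergence(mu1, mu2, 1.0))   # ~0.315 bits
```

For these toy distributions the order-2 value comes out at about 0.53 bits, decreasing smoothly to the Kullback-Leibler value of about 0.31 bits as $q$ approaches 1, exactly as the general theory says it must.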
One of the best reasons to study these information measures goes roughly as follows. In 1953, the great Soviet probabilist A. I. Khinchin published a list of four reasonable-looking axioms for a measure of information, and proved that the Shannon information was the unique functional satisfying them (up to an over-all multiplicative constant). (I) The information is a functional of the probability distribution (and not of other properties of the ensemble). (II) The information is maximal for the distribution where all events are equally probable. (III) The information is unchanged by enlarging the probability space with events of zero probability. The trickiest one is (IV): if the probability space is divided into two sub-spaces, A and B, the total information is equal to the information content of the marginal distribution of one sub-space, plus the mean information of the conditional distribution of the other sub-space: $I(A,B) = I(A) + \mathbf{E}[I(B|A)]$. (The paper is reprinted in his book Mathematical Foundations of Information Theory.) If we relax axiom (IV) to require only that $I(A,B) = I(A) + I(B)$ when A and B are statistically independent, then we get a continuous family of solutions, namely the Rényi informations; the little numerical check below bears this out.

This, along with their many applications, has led to a great deal of attention being paid to the Rényi informations in the information-theory literature. A quite crude search of the abstracts of IEEE Transactions on Information Theory reveals an average of at least five papers a year over the last ten years. The Rényi information is even introduced, though briefly, in Cover and Thomas's textbook (p. 499). Of particular note is the well-established use of Rényi information in establishing results on the error rates of hypothesis tests, a problem on which Dembski, notoriously, claims to be an expert. (The locus classicus here is Imre Csiszár, "Generalized cutoff rates and Rényi's information measures", IEEE Transactions on Information Theory 41 (1995): 26--34.) In nonlinear dynamics and statistical physics, the Rényi informations play crucial roles in the so-called "thermodynamic formalism", one of the essential tools of the rigorous study of complex systems. See, in particular, the excellent and standard book by Remo Badii and Antonio Politi, Complexity: Hierarchical Structures and Scaling in Physics (reviewed here). Naturally enough, Dembski also claims to be an expert on the measurement of complexity.
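As promised, the numerical check of the weakened axiom. This is again my own sketch, with a hypothetical helper name; it uses the standard discrete form of the Rényi entropy, $H_q = \frac{1}{1-q}\log_2{\sum_i{p_i^q}}$, and confirms that for independent A and B the entropies add at every order $q$:

```python
import numpy as np

def renyi_entropy(p, q):
    """Renyi entropy of order q, in bits, of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    if np.isclose(q, 1.0):
        nz = p > 0
        return -np.sum(p[nz] * np.log2(p[nz]))  # Shannon entropy at q = 1
    return np.log2(np.sum(p ** q)) / (1.0 - q)

a = np.array([0.7, 0.3])
b = np.array([0.4, 0.4, 0.2])
ab = np.outer(a, b).ravel()  # joint distribution of independent A and B

# Additivity I(A,B) = I(A) + I(B) holds for every order q:
for q in (0.5, 1.0, 2.0, 5.0):
    lhs = renyi_entropy(ab, q)
    rhs = renyi_entropy(a, q) + renyi_entropy(b, q)
    print(q, round(lhs, 6), round(rhs, 6), np.isclose(lhs, rhs))
```

It is the full chain-rule version of axiom (IV), with the conditional expectation, that singles out the Shannon case $q = 1$; plain additivity under independence does not.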
Dembski's paper seriously misrepresents the nature and use of information theory in a wide range of fields. What he puts forward as a new construction is in fact a particular case of a far more general idea, which was published forty-four years ago. That construction is extremely well known and widely used in a number of fields in which Dembski purports to be an expert, namely information theory, hypothesis testing, and the measurement of complexity. The manuscript contains exactly no new mathematics. Such is the work of a man described on one of his book jackets as "the Isaac Newton of information theory". His home page says this is the first in a seven-part series on the "mathematical foundations of intelligent design"; I can't wait. Or rather, I can.
Update, 29 March 2015: Replaced ugly images of mathematical equations with MathJax; added a link to Rényi's paper, which is now online.
Posted at August 10, 2004 16:45 | permanent link