August 31, 2020

Books to Read While the Algae Grow in Your Fur, August 2020

Attention conservation notice: I have no taste, and no qualifications to opine on the economics of the Internet or its implications.

I omit a number of books I read this month and didn't care for, but where my attempts at critique seem mean-spirited even to myself, and, worse, unlikely to inform anyone else. \[ \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \DeclareMathOperator*{\argmin}{argmin} \]

Christopher C. Heyde, Quasi-Likelihood and Its Application: A General Approach to Optimal Parameter Estimation [SpringerLink]
The most basic sort of quasi-likelihood estimator is for regression problems. It requires us to know the model's prediction for the conditional mean of \( Y \) given \( X=x \), say \( m(x;\theta) \), and the conditional variance of \( Y \), say \( v(x;\theta) \). It then enjoins us to minimize variance-weighted squared errors: \[ \hat{\theta} = \argmin_{\theta}{\frac{1}{n}\sum_{i=1}^{n}{\frac{\left(y_i - m(x_i;\theta)\right)^2}{v(x_i;\theta)}}} \] Equivalently (differentiating, and treating the variance weights \( 1/v(x_i;\theta) \) as fixed rather than as functions of \( \theta \)), we solve the estimating equation \[ \frac{1}{n}\sum_{i=1}^{n}{\frac{\nabla_{\theta} m(x_i;\theta)}{v(x_i;\theta)} \left(y_i - m(x_i;\theta)\right)} = 0 \] (I've left in the \( \frac{1}{n} \) on the left-hand side to make it more evident that this last expression ought to converge to zero at, but only at, the right value of \( \theta \).)
This is what we'd do if we thought \( Y|X=x \) had a Gaussian distribution \( \mathcal{N}(m(x;\theta), v(x;\theta)) \); the objective function above would then be (proportional to) the log-likelihood. But there are many situations where a quasi-likelihood estimate works well, even if the real distribution isn't Gaussian. If we're dealing with linear regression functions, for instance, the Gauss-Markov theorem tells us that weighted least squares is the minimum-variance linear unbiased estimator, Gaussian distribution or no Gaussian distribution.
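To make the regression version concrete, here is a minimal numerical sketch in Python; it is my illustration, not anything from Heyde, and all the model choices and numbers are made up. The conditional mean is exponential, the assumed variance is Poisson-type (equal to the mean), and the data are deliberately drawn from a negative binomial, so the true distribution is not the one the quasi-likelihood pretends to, but the mean model is correct.

import numpy as np
from scipy.optimize import fsolve

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0, 2, size=n)
theta_true = np.array([0.5, 1.0])

def m(x, theta):    # conditional mean model: exp(theta_0 + theta_1 * x)
    return np.exp(theta[0] + theta[1] * x)

def v(x, theta):    # assumed conditional variance: Poisson-type, equal to the mean
    return m(x, theta)

# Over-dispersed counts with the right mean but the "wrong" distribution
mu = m(x, theta_true)
y = rng.negative_binomial(5, 5.0 / (5.0 + mu))

def quasi_score(theta):
    # (1/n) sum_i grad_theta m(x_i; theta) (y_i - m(x_i; theta)) / v(x_i; theta)
    resid = y - m(x, theta)
    grad = np.column_stack([m(x, theta), x * m(x, theta)])
    return (grad * (resid / v(x, theta))[:, None]).mean(axis=0)

theta_hat = fsolve(quasi_score, np.zeros(2))
print(theta_hat)    # should land near theta_true = (0.5, 1.0)

Because the weights \( 1/v(x_i;\theta) \) are re-evaluated at every trial value of \( \theta \), this solves the estimating equation directly rather than minimizing the weighted sum of squares.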
Heyde's book is about a broad family of quasi-likelihood estimators for lots of different stochastic processes. The basic idea is to find functionals of these processes which involve both the data and the parameters, which will have expected value zero at, but only at, the right parameters. (As with \( Y_i - m(X_i;\theta) \) in the regression example.) More exactly, Heyde enjoins us to look for functionals which will be martingale difference sequences. We then form linear combinations of these functionals and solve for the parameter which sets the combination to zero. (That is, we solve the estimating equation.) The best weights in this linear combination (generally) reflect the variation in the martingale increments. This is an extremely flexible set-up which nonetheless lets Heyde prove some pretty useful results about the properties of his estimators, for a wide range of parametric and non-parametric problems involving stochastic processes.
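A toy version of the martingale idea, again my own illustration rather than an example from the book: for an AR(1) process \( X_t = \theta X_{t-1} + \epsilon_t \), the increments \( X_t - \theta X_{t-1} \) are martingale differences at the true \( \theta \), and weighting them by \( X_{t-1} \) gives an estimating equation with a closed-form solution.

import numpy as np

rng = np.random.default_rng(1)
theta_true = 0.7
T = 2000
X = np.zeros(T)
for t in range(1, T):
    X[t] = theta_true * X[t - 1] + rng.standard_normal()

# Estimating equation: (1/T) sum_t X_{t-1} (X_t - theta X_{t-1}) = 0,
# which solves to theta_hat = sum(X_{t-1} X_t) / sum(X_{t-1}^2)
theta_hat = np.sum(X[:-1] * X[1:]) / np.sum(X[:-1] ** 2)
print(theta_hat)    # should land near theta_true = 0.7

Here the best weights are simply proportional to \( X_{t-1} \), because the increments all have the same conditional variance; in Heyde's general set-up the weights also have to adjust for changing conditional variances of the increments.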
Recommended, then, only for readers who have both a sound grasp of likelihood-based estimation theory for IID data and some knowledge of stochastic differential equations and martingale theory; but good for those of us with those peculiar deformations. §
(I have been reading this, off and on, since 2002, but I have a rule about not recommending books until I've read them completely...)
Mike Carey and Elena Casagrande, Suicide Risk (vols. 1, 2, 3, 4, 5, 6)
High quality comic book mind candy. It's somewhat reminiscent, in style and theme, of classic Roger Zelazny (especially Creatures of Light and Darkness and Lord of Light), to the point where I'd be surprised if there wasn't direct influence. This is, to be clear, a good thing. §
Nowhere Men
Forgotten Home
Lady Killer
Hobo Mom
Further comic book mind candy, assorted (no particular order). I'd read sequels to the first two.
John McWhorter, Language Families of the World
McWhorter is always at his best as a spirited and engaging expositor of linguistics to non-linguists.
Matthew Hindman, The Internet Trap: How the Digital Economy Builds Monopolies and Undermines Democracy [JSTOR]
This is rather convincing, especially when it comes to the extreme inequality of attention online (*), how little of that attention goes to local news, and why that does, indeed, undermine democracy. One aspect of his explanation for all this is that using the Internet is cheap for users only because massive sums go into building it. That is, providing a service like a search engine, one that covers as much as users have come to expect, returns results of competitive quality, and handles huge numbers of users at high speed, requires massive investments in computing hardware, interface design, back-end software design, data acquisition, data storage, etc. This creates very substantial barriers to entry, which are not going to go away. (Even Microsoft, throwing billions of dollars and incredible other resources at the problem, is barely competitive with Google in search.) Having assembled all this infrastructure, of course, the marginal cost of running one extra search on it is trivial, but the cost of getting to the first query is formidable. And similarly for streaming video, or doing just about anything else at "Internet scale".
I found this argument completely convincing, but then I learned my Brian Arthur at my father's knee.
There are also some economic models of why this inequality is intrinsic to the Internet, adapted from the New Economic Geography models of Krugman et al. Now, I have to say I do not think very much of these. The models presume that consumers have a taste for quality and a taste for rapidity of updates from content providers, and that quality and the number of updates simply multiply together to give consumer utility. But this plainly implies that our ideal is to be alerted continually to content we can just barely stand, and this only sounds like an accurate description of Twitter. The equilibria would change drastically if he allowed for a decreasing marginal utility of updating frequency. I don't think these models add all that much to the more empirical parts of the book, or to the basic points about increasing returns. §
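To make my complaint concrete (the numbers here are mine, not Hindman's): if utility is the product \( U = q f \) of quality \( q \) and update frequency \( f \), then a barely-tolerable site with \( q = 1 \) updating 100 times a day (\( U = 100 \)) beats an excellent site with \( q = 10 \) updating 5 times a day (\( U = 50 \)), and without diminishing returns in \( f \), cranking up the update rate always wins. Give updates decreasing marginal utility, say \( U = q \log(1+f) \), and the ranking flips: \( 1 \times \log 101 \approx 4.6 \) against \( 10 \times \log 6 \approx 17.9 \).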
*: Portions of it are devoted to arguing that online attention follows a strict power law. I don't think this is really necessary to the argument --- Hindman uses pure resampling in his simulations, so evidently he doesn't think so either --- but I do wish he'd not cite our paper and then go on to use methods we demonstrated are just bad. But this is me taking one of my hobby-horses for a turn around the ring, rather than a serious complaint. ^

Books to Read While the Algae Grow in Your Fur; Commit a Social Science; Pleasures of Detection, Portraits of Crime; Scientifiction and Fantastica; Enigmas of Chance; The Dismal Science; Actually, "Dr. Internet" Is the Name of the Monsters' Creator; Linkage; Power Laws

Posted at August 31, 2020 23:59 | permanent link
