Books to Read While the Algae Grow in Your Fur, August 2020
Attention conservation notice: I have no taste, and no qualifications to
opine on the economics of the Internet or its implications.
I omit a number of books I read this month and didn't care for, but where my
attempts at critique seem mean-spirited even to myself, and, worse, unlikely to
inform anyone else.
\[
\newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]}
\DeclareMathOperator*{\argmin}{argmin}
\]
- Christopher C. Heyde, Quasi-Likelihood and Its Application: A General Approach to Optimal Parameter Estimation [SpringerLink]
- The most basic sort of quasi-likelihood estimator is for regression problems. It requires us to know the model's prediction for the conditional mean of \( Y \) given \( X=x \), say \( m(x;\theta) \), and the conditional variance of \( Y \) given \( X=x \), say \( v(x;\theta) \). It then enjoins us to minimize variance-weighted squared errors:
\[
\hat{\theta} = \argmin_{\theta}{\frac{1}{n}\sum_{i=1}^{n}{\frac{\left(y_i - m(x_i;\theta)\right)^2}{v(x_i;\theta)}}}
\]
Equivalently, we solve the estimating equation
\[
\frac{1}{n}\sum_{i=1}^{n}{\frac{\nabla_{\theta} m(x_i;\theta)}{v(x_i;\theta)} \left(y_i - m(x_i;\theta)\right)} = 0
\]
(I've left in the \( \frac{1}{n} \) on the left-hand side to make it more evident that this last expression ought to converge to zero at, but only at, the right value of \( \theta \).)
- This is what we'd do if we thought \( Y|X=x \) had a Gaussian distribution
\( \mathcal{N}(m(x;\theta), v(x;\theta)) \); the objective function above would
then be (proportional to) the log-likelihood. But there are many situations
where a quasi-likelihood estimate works well, even if the real distribution
isn't Gaussian. If we're dealing with linear regression functions, for
instance, the Gauss-Markov theorem tells us that weighted least squares is the
minimum-variance linear estimator, Gaussian distribution or no Gaussian
distribution.
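- As a minimal sketch of the regression case (my illustration, not anything from the book): for a linear mean model \( m(x;\theta) = \theta x \), the estimating equation above is linear in \( \theta \) and so has a closed-form weighted-least-squares root. The simulated noise here is deliberately uniform rather than Gaussian, to make the point that only the mean and variance functions matter.

```python
import random

def quasi_likelihood_linear(xs, ys, v):
    """Solve the estimating equation for the linear mean model m(x; theta) = theta * x.

    The equation (1/n) * sum_i x_i * (y_i - theta * x_i) / v(x_i) = 0
    is linear in theta, so it has the closed-form (weighted least
    squares) solution returned below.
    """
    num = sum(x * y / v(x) for x, y in zip(xs, ys))
    den = sum(x * x / v(x) for x in xs)
    return num / den

random.seed(1)
theta_true = 2.0
xs = [random.uniform(1.0, 5.0) for _ in range(5000)]
# Deliberately non-Gaussian, heteroskedastic noise: uniform, with spread
# proportional to x, so the variance function is proportional to x**2.
ys = [theta_true * x + x * random.uniform(-1.0, 1.0) for x in xs]
theta_hat = quasi_likelihood_linear(xs, ys, v=lambda x: x * x)
print(theta_hat)  # close to theta_true = 2.0
```

(Scaling \( v \) by a constant leaves the estimate unchanged, which is why only the shape of the variance function needs to be right.)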
- Heyde's book is about a broad family of quasi-likelihood estimators for
lots of different stochastic processes. The basic idea is to find functionals
of these processes, involving both the data and the parameters, whose expected
value is zero at, but only at, the right parameters. (As with \(
Y_i - m(X_i;\theta) \) in the regression example.) More exactly, Heyde enjoins
us to look for functionals which will
be martingale
difference sequences. We then form linear combinations of these
functionals and solve for the parameter which sets the combination to zero.
(That is, we solve the estimating equation.) The best weights in this linear
combination (generally) reflect the variation in the martingale increments.
This is an extremely flexible set-up which nonetheless lets Heyde prove some
pretty useful results about the properties of his estimators, for a wide range
of parametric and non-parametric problems involving stochastic processes.
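- To make the martingale idea concrete, here is a sketch (my illustration, not an example worked in the book) for an AR(1) process \( X_t = \theta X_{t-1} + \epsilon_t \): the terms \( X_{t-1}(X_t - \theta X_{t-1}) \) are martingale differences at the true \( \theta \), and setting their sum to zero gives a closed-form estimator.

```python
import random

def ar1_estimate(x):
    """Root of the martingale estimating equation for X_t = theta * X_{t-1} + eps_t.

    At the true theta, each increment X_{t-1} * (X_t - theta * X_{t-1})
    has conditional mean zero given the past (a martingale difference).
    The sum of the increments is linear in theta, so the estimating
    equation has the closed-form root returned below.
    """
    num = sum(x[t - 1] * x[t] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den

random.seed(42)
theta_true = 0.6
x = [0.0]
for _ in range(20000):
    # Uniform (non-Gaussian) innovations: only the martingale-difference
    # structure of the estimating function is being used here.
    x.append(theta_true * x[-1] + random.uniform(-1.0, 1.0))
theta_hat = ar1_estimate(x)
print(theta_hat)  # close to theta_true = 0.6
```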
- Recommended for readers who have both a sound grasp on likelihood-based
estimation theory for IID data, and some knowledge of stochastic
differential equations and martingale theory. But good for those of us with
those peculiar deformations. §
- (I have been reading this, off and on, since 2002, but I have a rule about
not recommending books until I've read them completely...)
- Mike Carey and Elena Casagrande, Suicide Risk (vols. 1, 2, 3, 4, 5, 6)
- High quality comic book mind candy. It's somewhat reminiscent, in style
and theme, of classic Roger Zelazny (especially Creatures of Light and
Darkness and Lord of Light), to the point where I'd be
surprised if there wasn't direct influence. This is, to be clear, a good thing. §
- Nowhere Men
- Forgotten Home
- Lady Killer
- Hobo Mom
- Further comic book mind candy, assorted (no particular order). I'd read sequels to the first two.
- John McWhorter, Language Families of the World
- McWhorter is, as always, a spirited and engaging expositor of
linguistics to non-linguists.
- Matthew Hindman, The Internet Trap: How the Digital Economy Builds Monopolies and Undermines Democracy [JSTOR]
- This is rather convincing, especially when it comes to the extreme
inequality of attention online (*), how little of that goes to local news, and
why that does, indeed, undermine democracy. One aspect of his explanation for
all this is that using the Internet is very cheap for users precisely because
massive fixed sums go into building it. That is, providing a service like a
search engine, that covers as much as users have come to expect, with
results of competitive quality, and which handles huge numbers of users at high
speed, requires massive investments in computing hardware, interface design,
back-end software design, data acquisition, data storage, etc. This creates
very substantial barriers to entry, which are not going to go away. (Even
Microsoft, throwing billions of dollars and incredible other resources at the
problem, is barely competitive in search with Google.) Having assembled all
this infrastructure, of course, the marginal cost of running one extra search
on it is trivial, but the cost of getting to the first query is formidable.
And similarly for streaming video, or doing just about anything else at
"Internet scale".
- I found this argument completely convincing, but then I learned
my Brian Arthur
at my father's knee.
- There are also some economic models of why he thinks this inequality is
intrinsic to the Internet, which are adaptations of the New Economic Geography
models of Krugman et al. Now, I have to say I do not think very much of these.
The models presume that consumers have a taste for quality and a taste for
rapidity of updates from content providers, and that the quality and the number
of updates simply multiply together to give consumer utility. But this plainly
implies that our ideal is to be alerted continually to content we can just
barely stand, and
this only sounds
like an accurate description of Twitter. The equilibria would change
drastically if he allowed for a decreasing marginal utility of updating
frequency. I don't think these models add all that much to the more empirical
parts of the book, or the basic points about increasing returns. §
- *: Portions of it are devoted to arguing that online attention follows a
strict power law. I don't think this is really necessary to the argument ---
Hindman uses pure resampling in his simulations, so evidently he
doesn't think so either --- but I do wish he'd not cite our paper and then
go on to use methods we demonstrated are just bad. But this is me taking one
of my hobby-horses for a turn around the ring, rather than a serious complaint.
Books to Read While the Algae Grow in Your Fur;
Commit a Social Science;
Pleasures of Detection, Portraits of Crime;
Scientifiction and Fantastica;
Enigmas of Chance;
The Dismal Science;
Actually, "Dr. Internet" Is the Name of the Monsters' Creator;
Linkage;
Power Laws
Posted at August 31, 2020 23:59 | permanent link