## August 07, 2021

### CLeaR 2022: Call for Papers

Attention conservation notice: An invitation to put a lot of effort into writing about a recondite academic topic, only to have it misunderstood by anonymous strangers.

Having agreed to be an area chair (area TBD), I ought to publicize the call for papers for the first Conference on Causal Learning and Reasoning (CLeaR 2022):

Causality is a fundamental notion in science and engineering. In the past few decades, some of the most influential developments in the study of causal discovery, causal inference, and the causal treatment of machine learning have resulted from cross-disciplinary efforts. In particular, a number of machine learning and statistical analysis techniques have been developed to tackle classical causal discovery and inference problems. On the other hand, the causal view has been shown to facilitate formulating, understanding, and tackling a broad range of problems, including domain generalization, robustness, trustworthiness, and fairness across machine learning, reinforcement learning, and statistics.

We invite papers that describe new theory, methodology and/or applications relevant to any aspect of causal learning and reasoning in the fields of artificial intelligence and statistics. Submitted papers will be evaluated based on their novelty, technical quality, and potential impact. Experimental methods and results are expected to be reproducible, and authors are strongly encouraged to make code and data available. We also encourage submissions of proof-of-concept research that puts forward novel ideas and demonstrates potential for addressing problems at the intersection of causality and machine learning.

The proceedings track is the standard CLeaR paper submission track. Papers will be selected via a rigorous double-blind peer-review process. All accepted papers will be presented at the Conference as contributed talks or as posters and will be published in the Proceedings.

Topics of submission may include, but are not limited to:

• Machine learning building on causal principles
• Causal discovery in complex environments
• Efficient causal discovery in large-scale datasets
• Causal effect identification and estimation
• Causal generative models for machine learning
• Unsupervised and semi-supervised deep learning connected to causality
• Machine learning with heterogeneous data sources
• Benchmark for causal discovery and causal reasoning
• Reinforcement learning
• Fairness, accountability, transparency, explainability, trustworthiness, and recourse
• Applications of any of the above to real-world problems

The deadline is 22 October 2021; further details are available at the conference website.

(I should write up my "Apology for Causal Discovery" as a proper paper or at least essay, rather than a pair of slide decks and a video which [like all recordings of me] I can't stand to watch, but that's so far back in the queue I could cry.)

Posted at August 07, 2021 15:45 | permanent link

## July 31, 2021

### Books to Read While the Algae Grow in Your Fur, July 2021

Attention conservation notice: I have no taste, and no qualifications to opine on culture-bound syndromes and contagious hysterias, the history and economics of socialist planning, economic inequality, or Islamic theology.

Elaine Showalter, Hystories: Hysterical Epidemics and Modern Culture (Columbia University Press, 1997)
Showalter's theory is, roughly, as follows. Modern life produces lots of seriously unhappy, even traumatized, people. Some, at least, of those people are apt to act out their unhappiness in various bodily symptoms and behaviors. This acting out is more or less unconscious, usually more rather than less. There is a certain amount of random flailing around (as it were) when it comes to these symptoms, but people tend to be attracted to patterns of behavior which have some sort of authoritative imprimatur among those around them as reflecting real distress. There is thus a symbiosis between clinicians who recognize syndromes-of-distress and patients who enact those syndromes. Showalter calls the syndromes forms of "hysteria", and the associated narratives "hystories". To really make the symbiosis work, however, one needs a mass medium to widely disseminate the scripts or schemata for the syndrome, perhaps as elements in popular fiction.
Showalter applies this theory to the original "classical hysteria" of Charcot et al. in the late 1800s, and, in the 1980s and 1990s when she was writing, to alien abduction, chronic fatigue syndrome, Satanic ritual abuse, recovered memory, Gulf War syndrome, and multiple personality disorder. The late-20th-century cases are distinguished from the late-19th-century ones by the fact that they all involve conspiracy theories; Showalter is very firm, and correct, about this development, but doesn't really try to explain it. (It's not as though the 19th century had any shortage of conspiracy theories, and it'd need little more than search-and-replace to turn The Awful Disclosures of Maria Monk into a tale of Satanic ritual abuse.) I want to single out the chapters on recovered memory, multiple personality disorder, Satanic ritual abuse and alien abduction for how carefully, and convincingly, Showalter shows they follow her model.
A quarter-century later, some of these syndromes have all but vanished, but there's no shortage of replacements. (Listing them is left as an exercise for the reader.) Why we should be so productive of "hystories" is not really something Showalter adequately explains, beyond gesturing at millennial anxiety and/or modern telecommunications.
At this point I'd like to make one complaint, two anthropological connections, and one mathematical aside.
• Showalter does not give enough weight to the possibility that something which looks like a hysteria with physical symptoms might in fact be a conventional illness. (That is, she doesn't consider how to distinguish social from biological contagion*.) I think in many ways this would have been a much stronger book if it had had a chapter on Lyme disease (which we now know is a bacterial illness transmitted by ticks) and the supposed chronic Lyme disease (which fits Showalter's ideas to a T). It wouldn't surprise me if some of the people who suffer from chronic fatigue syndrome are in fact dealing with currently-unrecognized organic conditions; it would surprise me very much if alien abductees were. (Cf. this contemporary review from Carol Tavris.)
• A lot of Showalter's ideas are close to those put forward by the anthropologist I. M. Lewis in Ecstatic Religion: A Study of Shamanism and Spirit Possession (first ed. 1971). What might be distinctly modern about Showalter's syndromes, as opposed to Lewis's, is the role of mass media in their spread and institutionalization.
• Dan Sperber would have a field day with this. In particular, Showalter's ideas seem extremely compatible with Sperber's about how the "epidemiology of representations" needs to combine transmission and "attraction".
• I'm tempted to model the growth of "hystories" using the classic Simon (1955) process: with some probability each unhappy person spawns a new form of hysteria, otherwise they attach themselves to an existing one with a probability proportional to its current size. (That is, preferential attachment to hysterias.) This will, of course, lead to a heavy-tailed distribution of hysterias. The flaw here is that this model wouldn't explain the disappearance of forms of hysteria; there might need to be some sort of recency effect.
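The preferential-attachment model in the last bullet can be simulated in a few lines. This is only a minimal sketch: the function name `simon_process`, the parameter `p_new`, and the population size are illustrative assumptions, not anything taken from Showalter or from Simon's paper, and the model deliberately omits the recency effect mentioned above.

```python
import random

def simon_process(n_people, p_new, seed=0):
    """Simulate a Simon (1955)-style process.

    Each arriving person founds a new hysteria with probability p_new;
    otherwise they join an existing one, chosen with probability
    proportional to its current size. Returns one size per hysteria.
    """
    rng = random.Random(seed)
    sizes = [1]      # the first person necessarily founds the first hysteria
    members = [0]    # members[k] = index of the hysteria person k belongs to
    for _ in range(n_people - 1):
        if rng.random() < p_new:
            # found a brand-new hysteria
            sizes.append(1)
            members.append(len(sizes) - 1)
        else:
            # copying the choice of a uniformly random earlier person is
            # the same as choosing a hysteria proportionally to its size
            h = members[rng.randrange(len(members))]
            sizes[h] += 1
            members.append(h)
    return sizes

# Hypothetical parameter values, chosen only for illustration
sizes = simon_process(n_people=20000, p_new=0.05)
largest = sorted(sizes, reverse=True)[:5]
median_size = sorted(sizes)[len(sizes) // 2]
print("number of hysterias:", len(sizes))
print("five largest:", largest)
print("median size:", median_size)
```

A few hysterias end up far larger than the median one, which is the heavy tail. Because sizes only ever grow here, nothing ever dies out, which is exactly the flaw noted above.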
I was alerted to this book, but put off from reading it, by a contemporary review in Nature; it now seems to me that the reviewer was unfair about the quality of Showalter's writing. (Perhaps my taste has been degraded by a quarter century of reading academic prose.) There are ways in which I'd re-write this book (it's still too Freudian, and in places too cutesy [e.g., the coinage "hystories" itself]), and, inevitably, parts are dated. I would really like to read Showalter giving the same treatment to the last quarter century, but, given her experiences after publishing this, I understand why she'd decline, to the public's loss. I urge the book on any reader with a serious interest in social contagion, or in the weirder reaches of modern culture.
*: My former student Dena Asta did some nice research, back in 2012--2013, based on the idea that a social contagion will spread through "communities" or "modules" defined by the social network, while a biological contagion will need physical proximity. To the extent that network modularity and geographic propinquity cut across each other, we can get some handle on what form of contagion we're dealing with, assuming it's contagion at all. This, however, is taking us very far from Showalter's concerns.
Michael Ellman, Socialist Planning (3rd edition, 2014)
This is a very complete revision of a book whose first (1979) edition I reviewed earlier. The revision brings the story up to the early 2010s (in the case of China), and makes extensive use of sources and studies which have only become available since the collapse of the Soviet Union.
Geographically, coverage remains focused on the Soviet Union, but there are also extensive discussions of the Chinese experience, and a great deal more than I remember from the first edition about Yugoslavia, Poland, Hungary, and East Germany. Other eastern-European countries and Vietnam are mentioned sporadically, Cuba even less often, North Korea just a few times in passing. There is extensive information about how plans were drawn up, how the authorities attempted to implement them, what actually happened instead, etc. Coverage of the military sector, and the way preparation for another WWII-style conflict influenced every aspect of Soviet economic planning, is drastically expanded. (According to Ellman, much of the output of the aluminum and fertilizer industries was simply wasted year after year, because factories ran at levels suitable for producing vast numbers of warplanes and munitions, not actual needs.) The general tone is of trying to describe, and evaluate, a phenomenon which has passed and will never recur. To sum up Ellman's judgment: socialist planning was an attempt at modernization from above, driven by the imperative of being militarily competitive with industrialized European powers. In that goal, it succeeded, at least up through the 1950s. As a fulfillment of the ethical aims of socialism, it failed and was doomed to fail.
I find it hard to imagine that a better overview of socialist planning, as it actually existed, will be available any time soon.
Michele Alacevich and Anna Soci, Inequality: A Short History [JSTOR]
This isn't so much a history of inequality as of economists' ideas about inequality. Indeed, much of it takes the form of rehashing famous recent work. (E.g., chapter 4, "Inequality and Globalization", is largely about Branko Milanovic's Global Inequality. [It's a good book.]) I found it interesting to see them point out that both classical and neo-classical economics focused on the distribution of income across factors of production, rather than the distribution of income (or wealth) across persons or households. But the point is somewhat undercut by the fact that the statistical study of income and wealth distributions owes so much to Pareto, who was also one of the founders of neo-classical economics! I found some of the history in ch. 3, "The Statistical Drift of Inequality Studies", to be interesting, though I think a bit unfair to Pareto (regular readers will understand what such a statement costs me). I also found Alacevich and Soci's repeated slagging on economists for merely doing empirical studies of income distribution a bit unfortunate --- surely before coming up with a theoretical explanation, it's important to know what the phenomena to be explained actually are!
Over-all, if you have read any two of Milanovic, Piketty and Bartels, you will not find much new here. I might assign some of the history-of-statistics portions in my class.
Karin Slaughter, The Last Widow and The Silent Wife
Mind candy, mystery/thriller division. Umpteenth volumes in Slaughter's long-running series, which I enjoy very much. The Last Widow is a 2019 publication which involves (not to spoil anything) biological terrorism, the CDC, and a right-wing attack on a seat of government. Looking back from mid-2021, therefore, I am very relieved that The Silent Wife is merely about personal betrayal and serial killing. Both are very well-written and enjoyable, if full of squicky parts.
(I think it is, however, a defect in construction that the dramatic, newsworthy, and emotionally-scarring events of Last Widow are basically not mentioned in Silent Wife, despite its taking place a mere six weeks later. It's also atypical of Slaughter, since one of the things I enjoy about her series is that there are consequences.)
John Renard (ed.), Islamic Theological Themes: A Primary Source Reader
Does what it says. I'm impressed by the range of texts --- ideologically, geographically, chronologically --- but utterly incompetent to evaluate it.
S. A. Chakraborty, The Kingdom of Copper
Mind candy fantasy: sequel to City of Brass. I found the continuing story enjoyable, but the language is, to borrow a phrase from Le Guin, very much that of Poughkeepsie rather than Elfland, despite being almost entirely set in Elfland (or, more precisely, Jinnistan). Still, I immediately got the sequel after finishing this.
Anna Lee Huber, A Wicked Conceit
Mind candy mystery. I think it's probably just as good as the earlier books, but that series fatigue has set in for me after nine volumes. They will, however, lose little from being read out of order.

Posted at July 31, 2021 23:59 | permanent link

## June 30, 2021

### Books to Read While the Algae Grow in Your Fur, June 2021

Attention conservation notice: I have no taste, and no qualifications to opine on cryptozoology, folklore, economics, or humanistic geography.

Anne Perry, The Cater Street Hangman
Mind candy historical mystery. Enjoyable, but I fail to see why this should have sparked a series of dozens of books over decades.
Benjamin Radford and Joe Nickell, Lake Monster Mysteries: Investigating the World's Most Elusive Creatures
Shorter: There are no lake monsters, just logs, otters, and stories about lake monsters.
Longer: Mostly this is an account of the authors' travels to various lakes which are claimed to have monsters, and the authors' (very tame) adventures debunking the stories, i.e., providing mundane accounts of what could have caused sightings or what's really in particular photographs. They are very fond of invoking logs, tree stumps, and otters. (I am persuaded about the timber and open-minded about the otters.) This is pretty standard fare, of the kind I have enjoyed since I was a boy and my mother would buy me issues of Skeptical Inquirer.
There is also a not-quite-fully-articulated theory of lake monsters hinted at here. If I try to draw this out explicitly, it'd be something like this: lake monsters are a modern myth, originating with Loch Ness in the 1930s, with the idea being that lakes are inhabited by surviving plesiosaurs, or something near enough. (One ancestor of the myth is thus the genre of "lost world" adventure stories.) Pre-modern stories about strange creatures in lakes get invoked by the myth as "evidence", regardless of their content or context; occasionally accounts of pre-modern stories are fabricated as needed. When people who know the myth see strange things on lakes, which is common enough, knowledge of the myth provides an interpretation for an ambiguous experience, and an opportunity for recounting the myth with an additional report attached. (It is enough for these purposes that the people be able to say "I don't know what I saw, but I saw something".) The myth spreads from lake to lake, partly through natural diffusion, and partly through the efforts of local chambers of commerce to drum up tourism.
As I said, the theory of lake monsters in the previous paragraph is me trying to articulate Radford and Nickell's hints by stringing their scattered remarks together with bits of Dan Sperber and Pascal Boyer. The authors themselves repeatedly refer to a work by an actual folklorist (Michel Meurger's 1988 Lake Monster Traditions: A Cross-Cultural Analysis) in ways which make me eager to track down a copy.
Jeff Lemire and Dean Ormston, Black Hammer: Secret Origins
Alex Robinson's Lower Regions
Rick Remender, Eric Nguyen et al., Strange Girl
Kel Symons and Mathew Reynolds, The Mercenary Sea
Comic book mind candy, assorted.
Piero Sraffa, Production of Commodities by Means of Commodities: Prelude to a Critique of Economic Theory
This is a little book drafted in the 1920s and published in 1960, which became the subject of a huge literature. I have read a lot about it over the years, since it became a touchstone for some strands of heterodox economics, but never actually read it until this month. Having done so I find it very strange, not least because I feel like it could have been shortened still further, and yet clarified, if Sraffa had just used some basic theory for directed graphs and invoked the Frobenius-Perron theorem. (It's possible that the theory about directed graphs didn't exist when he first wrote, and even that the Frobenius-Perron theorem was then too obscure, but by 1960?) I am in fact tempted to re-write it doing just that, but I presume somebody out there in neo-Ricardian / post-Keynesian / post-Marxist land has done so, and I call upon the LazyWeb for a reference.
(Thanks to Z. M. Shalizi for lending me his copy.)
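To make the Frobenius-Perron remark in the Sraffa review concrete, here is a minimal sketch of one standard way the theorem enters models of this kind. It is not a reconstruction of Sraffa's own presentation, nor of the re-write contemplated above. The assumptions: `A[i][j]` is the amount of commodity i used up in producing one unit of commodity j, wages are zero, and the rate of profit R is uniform, so prices satisfy p = (1+R) p A. For an irreducible non-negative A, the theorem guarantees a unique positive price vector, belonging to the largest eigenvalue λ, so that R = 1/λ - 1. The 2x2 matrix is made up for illustration, and the function name is hypothetical.

```python
def perron_left_eigen(A, iters=500):
    """Power iteration for the dominant left eigenpair of a non-negative matrix.

    Returns (lam, p) with p A ~= lam * p, p normalized to sum to 1.
    Convergence assumes A is primitive (e.g. all entries positive).
    """
    n = len(A)
    p = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        # q = p A (row vector times matrix)
        q = [sum(p[i] * A[i][j] for i in range(n)) for j in range(n)]
        lam = sum(q)                 # since sum(p) == 1, this estimates the eigenvalue
        p = [x / lam for x in q]     # renormalize so the prices sum to 1
    return lam, p

# Made-up input coefficients: A[i][j] = units of commodity i per unit of commodity j
A = [[0.2, 0.3],
     [0.4, 0.1]]

lam, prices = perron_left_eigen(A)
R = 1.0 / lam - 1.0   # maximum uniform rate of profit under the zero-wage assumption
print("dominant eigenvalue:", lam)
print("relative prices:", prices)
print("maximum rate of profit:", R)
```

For this matrix the eigenvalues can be checked by hand (0.5 and -0.2), so the iteration should settle on λ ≈ 0.5, R ≈ 1, and a price ratio p₂/p₁ ≈ 0.75.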
Yi-Fu Tuan, Dominance and Affection: The Making of Pets [JSTOR]
This is a beautifully-written and thought-provoking, perhaps even disturbing, book. It's an examination across history and time of the ways people make others --- plants, animals, and indeed other people --- into playthings, into objects which they can manipulate, and consequently bestow affection upon. I am sure there are people who can read it without coming to look at their own affections in a different light, but I'd prefer not to know them.
This book is part of a loose series that Tuan wrote, looking at what one might call the moral psychology of different aspects of humans' experience of their environments --- Segmented Worlds and Self, Landscapes of Fear, Escapism, Cosmos and Hearth, etc. These are all marked by the same virtues as this book: vast learning worn lightly, smooth-flowing writing, and an acute ethical sensitivity that is never preachy. I recommend them all very highly indeed.
(Thanks to Jan Johnson for the gift of this book.)
Norbert Wiener, The Fourier Integral and Certain of Its Applications
Recommended purely for historical interest. If you already are familiar with Fourier analysis and are curious to see it at an earlier stage in its development, this is interesting work from a pioneer. (And it's full of curious sidelights, such as the fact that Wiener in 1933 doesn't have the word "convolution" in its modern mathematical-English sense, but uses the German Faltung for lack of any translation.) But I don't think there are insights or techniques which aren't fully assimilated into the modern mainstream.
Glenn C. Loury, The Anatomy of Racial Inequality
Re-read for course prep. If it was in print I'd probably make it a required text; as it is I expect to assign passages from chapters 2 ("Racial Stereotypes") and 3 ("Racial Stigma") in the unit on mechanisms that create and perpetuate inequalities.

Posted at June 30, 2021 23:59 | permanent link

## June 03, 2021

### Course Announcement: "Statistics of Inequality and Discrimination" (36-313)

Attention conservation notice: Advertisement for a course you won't take, at a university you don't attend. Even if the subject is of some tangential interest, why not check back in a few months to see if the teacher has managed to get himself canceled, and/or produced anything worthwhile?

In the fall I will, again, be teaching something new:

36-313, Statistics of Inequality and Discrimination
9 units
Time and place: Tuesdays and Thursdays, 1:25 -- 2:45 pm, location TBA
Description: Many social questions about inequality, injustice and unfairness are, in part, questions about evidence, data, and statistics. This class lays out the statistical methods which let us answer questions like Does this employer discriminate against members of that group?, Is this standardized test biased against that group?, Is this decision-making algorithm biased, and what does that even mean? and Did this policy which was supposed to reduce this inequality actually help? We will also look at inequality within groups, and at different ideas about how to explain inequalities between groups. The class will interweave discussion of concrete social issues with the relevant statistical concepts.
Prerequisites: 36-202 ("Methods for Statistics and Data Science") (and so also 36-200, "Reasoning with Data")

This is a class I've been wanting to teach for some years now, and I'm very happy to finally get the chance to ~~feel my well-intentioned but laughably inadequate efforts crushed beneath massive and justified opprobrium evoked from all sides~~ ~~bore and perplex some undergrads who thought they were going to learn something interesting in stats. class for a change~~ try it out.

#### Tentative topic schedule

1. "Recall": Reminders about probability and statistics: populations, distribution within a population, distribution functions, joint and conditional probability; samples and inference from samples. Reminders (?) about social concepts: ascriptive and attained social categories; status, class, race, caste, sex, gender, income, wealth.
2. Income and wealth inequality: What does the distribution of income and wealth look like within a population? How do we describe population distributions, especially when there is an extreme range of values (a big difference between the rich and poor)? Where does the idea of "the 1%" wealthy elite come from? How has income inequality changed over recent decades?
Statistical tools: measures of central tendency (median, mode, mean), of dispersion, and of skew; the concept of "heavy tails" (the largest values being orders of magnitude larger than typical values); log-normal and power law distributions; fitting distributions to existing data; positive feedback, multiplicative growth and "cumulative advantage" processes.
3. Income disparities: How does income (and wealth) differ across groups? How do we compare average or typical values? How do we compare entire distributions? How have income inequalities by race and sex changed over recent decades?
Statistical tools: permutation tests for differences in mean (and other measures of the average); two-sample tests for differences in distribution; inverting tests to find the range of differences compatible with the data; the "analysis of variance" method of comparing populations; the "relative distribution" method of comparing populations
4. Detecting discrimination in hiring: Do employers discriminate in hiring (or schools in admission, etc.)? How can we tell? When are differences in hiring rates evidence for discrimination? How do statistical perspectives on this question line up with legal criteria for "disparate treatment" and "disparate impact"?
Statistical tools: tests for differences in proportions or probabilities; adjusting for applicant characteristics; deciding what to adjust for
5. Detecting discrimination in policing: Do the police discriminate against members of particular racial groups? When do differences in traffic stops, arrests, or police-caused deaths indicate discrimination? Does profiling or "statistical discrimination" make sense for the police? Can groups simultaneously be over- and under-policed?
Statistical tools: test for differences in proportions; signal detection theory; adjusting for systematically missing data; self-reinforcing equilibria
6. Algorithmic bias: Can predictive or decision-making algorithms be biased? What would that even mean? Do algorithms trained on existing data necessarily inherit the biases of the world? What notions of fairness or unbiasedness can we actually implement for algorithms? What trade-offs are involved in enforcing different notions of fairness? Are "risk-prediction instruments" fair?
Statistical tools: Methods for evaluating the accuracy of predictions; differential error rates across groups; decision trees; optimization and multi-objective optimization.
7. Standardized tests: Are standardized tests for school admission biased against certain racial groups? What does it mean to measure qualifications, and how would we know whether tests really are measuring qualifications? What does it mean for a measurement to be biased? When do differences across groups indicate biases? (Disparate impact again.) Why correlating outcomes with test scores among admitted students may not make sense. The "compared to what?" question.
Statistical tools: Predictive validity; differential prediction; "conditioning on a collider"
8. Intelligence tests: Are intelligence tests biased? How do we measure latent attributes? How do we know the latent attributes even exist? What would it mean for there to be such a thing as "general intelligence", that could be measured by tests? What, if anything, do intelligence tests measure? What do rising intelligence test results (the Flynn Effect) tell us?
Statistical tools: correlation between test scores; factor models as an explanation of correlations; estimating factor values from tests; measurement invariance; alternatives to factor models
9. Implicit bias: Do "implicit association tests" measure unconscious biases? Again on measurement, as well as what it would mean for a bias to be "implicit" or "unconscious". What, if anything, do implicit association tests measure?
Statistical tools: Approaches to "construct validity".
10. Interventions on implicit bias: Can trainings or other interventions reduce implicit bias? How do we investigate the effectiveness of interventions? How do we design a good study of an intervention? How do we pool information from multiple studies? Do implicit bias interventions change behavior? Does having a chief diversity officer increase faculty diversity?
Statistical tools: Experimental design: selecting measurements of outcomes, and the importance of randomized studies; meta-analytic methods for combining information.
11. Explaining, or explaining away, inequality: To what extent can differences in outcomes between groups be explained by differences in their attributes (e.g., explaining differences in incomes by differences in marketable skills)? How should we go about making such adjustments? Is it appropriate to treat discrimination as the "residual" left unexplained? When does adjusting or controlling for a variable contribute to an explanation, and when is it "explaining away" discrimination? What would it mean to control for race, sex or gender?
Statistical tools: Observational causal inference; using regression to "control for" multiple variables at once; using graphical models to represent causal relations between variables; how to use graphical models to decide what should and what should not be controlled for; the causal model implicit in decisions about controls.
12. Self-organizing inequalities and "structural" or "systematic" inequalities: Models of how inequalities can perpetuate themselves even when nobody is biased. Models of how inequalities can appear even when nobody is biased. The Schelling model of spatial segregation as a "paradigm". How relevant are Schelling-type models to actual, present-day inequalities?
Statistical tools: Agent-based models; models of social learning and game theory.
13. Statistics and its history: The development of statistics in the 19th and early 20th century was intimately tied to the eugenics movement, which was deeply racist and even more deeply classist, but also often anti-sexist. The last part of the course will cover this history, and explain how many of the intellectual tools we have gone over to document, and perhaps to help combat, inequality and discrimination were invented by people who wanted to use them for quite different purposes. The twin learning objectives for this section are for students to grasp something of this history, and to grasp why the "genetic fallacy", of judging ideas by where they come from (their "genesis") is, indeed, foolish and wrong.
Statistical tools: N/A.
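As a taste of the statistical tools in the schedule above, here is a minimal sketch of a permutation test for a difference in means (topic 3). The function name and the two small samples are made up for illustration; nothing here is taken from the course materials.

```python
import random

def permutation_test_mean_diff(x, y, n_perm=10000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Repeatedly shuffles the pooled values, splits them into groups of the
    original sizes, and counts how often the shuffled difference in means
    is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = sum(x) / len(x) - sum(y) / len(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    n_y = len(pooled) - n_x
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / n_y
        # small tolerance guards against floating-point round-off in ties
        if abs(diff) >= abs(observed) - 1e-12:
            count += 1
    # add-one correction: the observed labelling counts as one permutation
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value

# Hypothetical incomes (in thousands) for two groups
group_a = [52, 61, 58, 70, 66, 73, 64, 59]
group_b = [41, 48, 55, 39, 50, 46, 44, 52]

observed, p_value = permutation_test_mean_diff(group_a, group_b)
print("observed difference in means:", observed)
print("permutation p-value:", p_value)
```

The logic is that if group membership were irrelevant to income, relabelling people at random should produce differences as large as the observed one fairly often. A small p-value says it does not.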

#### Evaluation

There will be one problem set per week; each of these homeworks will involve some combination of (very basic) statistical theory, (possibly less basic) calculations using the theory we've gone over, and analysis of real data sets using the methods discussed in class. There will also be readings for each class session, and a short-answer quiz after each session will combine questions based on lecture content with questions based on the readings.

There will not be any exams.

My usual policy is to drop a certain number of homeworks, and a certain number of lecture/reading questions, no questions asked. The number of automatic drops isn't something I'll commit to here and now (similarly, I won't make any promises here about the relative weight of homework vs. lecture-related questions).

#### Textbook, Lecture Notes

There is, unfortunately, no one textbook which covers the material we'll go over at the required level. You will, instead, get very detailed lecture notes after each lecture. There will also be a lot of readings from various books and articles. (I will not agree with every reading I assign.)

Corrupting the Young; Enigmas of Chance; Commit a Social Science

Posted at June 03, 2021 23:59 | permanent link

## May 31, 2021

### Books to Read While the Algae Grow in Your Fur, May 2021

Attention conservation notice: I have no taste.

Lauren Henderson, Dead White Female and Too Many Blondes
Mind candy mystery from the 1990s. I read all of the later books in this series with great delight as they came out, and while these first two are a bit rougher than her later work, they're still quite tasty, especially if you remember the fashions and mores of the time.
(I've included links to the old paper editions, but you'd be better off tracking down the electronic re-issues.)
Andre Norton, Gates to the Witch World (= Witch World, Web of the Witch World, Year of the Unicorn)
I don't remember what led me to pick this up, but damn could Norton write, and write compressedly. (For instance, a lot of the plot of the first two books here got recycled for the plot of Martha Wells's [very good] Fall of Ile-Rien series.) I had in fact read Year of the Unicorn as a boy, but remembered only a few fragments of the story.
ObLinkage: James Davis Nicoll on Witch World, Web of the Witch World, and Year of the Unicorn
Mind-candy thriller, which I think conforms to every one of the classical dramatic unities.
L. G. Estrella, Two Necromancers, a Bureaucrat, and an Army of Golems and Two Necromancers, a Dragon, and a Vampire
Mind candy of the fluffiest sort. These have the pleasure, and the feel, of a Dungeons & Dragons campaign where vastly over-powered PCs engage in cheerfully cartoonish banter, violence and pillage. It's a bit of a guilty, regressive pleasure for me, but a real pleasure nonetheless. (No links because they're only available in electronic formats.)

On a different note, over the semester I re-read a lot of textbooks and monographs for the undergrad statistical learning class, so I provide some links here for the ones I ~~mined for examples and problem sets~~ found especially useful:

Posted at May 31, 2021 23:59 | permanent link

## April 30, 2021

### Books to Read While the Algae Grow in Your Fur, April 2021

Attention conservation notice: I have no taste, and no qualifications to opine on ethics of any sort.

Michael J. Kearns and Aaron Roth, The Ethical Algorithm: The Science of Socially Aware Algorithm Design
There are, roughly speaking, three schools of thought when it comes to "fairness" and "ethics" in artificial intelligence / machine learning / predictive statistical modeling and data mining. I will caricature them as follows:
1. "Problem? I don't see any problem": maximize accuracy (or utility, etc.), and let the results take care of themselves.
2. "Everything is problematic": the data sets are biased, the objective functions to be maximized are biased (in some more obscure way), the very maximization algorithms are biased (in some yet more obscure way), and the only hope is to ~~appoint duly-certified ethicists as censors~~ trust that it can all somehow be re-imagined after the arrival of the millennium / revolution.
3. "Problems? I'm good at solving problems! what penalty term should we add to the Lagrangian?"
This book is the best presentation I have encountered, and indeed about the best I can imagine, for this third, temporizing school of thought. (It is, in case that's not clear, the tendency with which I have the most sympathy.) That is, this book tends to regard ethical and political desiderata as constraints which should be imposed on algorithms that are otherwise seeking to optimize some well-defined objective function (such as travel-time for mapping software, or "probability that the user will watch the recommended movie" for recommender systems, etc.). There is a strong analogy here to a certain kind of technocratic, American-sense liberal approach to public policy, in which private firms maximize profit, subject (ideally) to constraints imposed by regulation ("don't dump too much dioxin into the water supply"). (I don't recall the book making this analogy explicit.)
I used this book quite successfully in my data mining class, but my students there found the most technical parts (like "possibility frontiers") the most congenial, and the more rhetorical-argumentative bits about fairness more perplexing. I strongly suspect this reflects having a very unusual audience. I would cheerfully teach from it again, and strongly recommend it to readers interested in these subjects, perhaps especially if they're new to this area.
Disclaimers: I have been an admirer of Kearns's work since the 1990s, I know him a bit from conferences &c., and I requested an examination copy of this book before assigning it to my class.
Walter Jon Williams, Fleet Elements
Military space opera mind candy of the very highest grade. For one thing, it earns operatic levels of emotion.
Ausma Zehanat Khan, The Black Khan
Continuing an epic fantasy saga where a lot of the details are the recent history of Afghanistan and environs with the serial numbers filed off. Only in this installment we spend a lot of time at the court of ~~Isfahan~~ Ashfall (via the ruins of ~~Nishapur~~ Nightshaper), complete with a scheming ~~Nizam al-Mulk~~ Nizam al-Mulk. Also, there is even more angsty romance than in the first volume. (Fortunately, AZK writes angsty romance well.) There are at least two more volumes to the saga, which I intend to devour as soon as I can arrange suitably long stretches of un-interrupted time.
Dennis Culver and Justin Greenwood, Crone
Comic book swords-and-sorcery mind candy, in which the former ~~Red Sonja~~ Bloody Bliss, now the titular crone, is dragged out of retirement to re-confront a Dark Lord she knows she killed...
Tony Cliff, Delilah Dirk and the Pillars of Hercules
Comic book historical-fantasy mind candy. (Previously.)
K. C. Constantine, Joey's Case
Similar remarks to last month's entry.

Posted at April 30, 2021 23:59 | permanent link

## March 31, 2021

### Books to Read While the Algae Grow in Your Fur, March 2021

Attention conservation notice: I have no taste, and no qualifications to opine on the sociology of radio and the music industry, or on movies.

(I didn't finish a lot of books this month, since I'm not counting re-reading bits and pieces of arcane tomes on golem-making as needed for my own shambling creation.)

K. C. Constantine, Sunshine Enemies
Mind candy from 1990: the nth in a series of mystery novels set in the fictional western Pennsylvania town of Rocksburg, PA, somewhere in the environs of Pittsburgh — what I've heard called the yinzerlands. It's a good mystery novel, but what really sets it apart is the dialogue. Constantine has an incredible ear for the way locals of that generation spoke, and turns it into riveting dialogue. The depiction of the life-ways of these communities also feels authentic, but that's harder for me to judge. Strongly recommended if you like well-written detective novels, or are interested in fiction set around here.
Gabriel Rossman, Climbing the Charts: What Radio Airplay Tells Us about the Diffusion of Innovation
This is a short sociological treatise about, primarily, how songs become hits on commercial American radio, or fail to do so. It's well written (not just "well written for sociology"), and has a number of very interesting points to make about topics like the diffusion of innovation, corruption, the role of genres in popular culture, and more besides. The points which most interest me are the diffusion ones.
Rossman's starting point is to look at curves of cumulative adoption over time --- how many radio stations have, by a given date, ever played such-and-such a song? His main methodological tool is to distinguish between two types of adoption curves. One is the classic elongated-S curve, looking roughly like $\frac{e^{t\lambda}}{1+e^{t\lambda}}$, which one would expect to be produced by contagion, whether mediated by a network or by some more mean-field-ish process (like a best-seller list). The other ideal type of curve is "concave", indicating a constant probability of adoption per unit time, so looking like $1-e^{-t\lambda}$. The latter he interprets as indicating some shared external forcing. Most songs which become hits follow the latter pattern (though he has illuminating things to say about the exceptional endogenous hits). The obvious question is the identity of the external force. Rossman makes a compelling case that this is, in fact, the record companies, and not (e.g.) radio station chains; on this basis he goes in to an examination of the history and theory of payola. (Basically: radio "moves product" for the record companies, so you don't want to be the only record company which is not bribing radio stations to play your music.) He also has a less compelling but still fairly persuasive analysis showing that radio stations don't really decide what to play by imitating other radio stations (at least for one "format" of radio station, during one time period). I could go on --- Rossman packs a lot into only ~200 pages --- but forbear.
The central distinction here, between curves due to external forcing and curves due to endogenous contagion, is one that's persuasive in context, but isn't necessarily either airtight or generalizable. That promotional efforts by a record company would translate into a constant hazard for adoption seems plausible enough, but one could imagine a record company whose promotional efforts start small, ramp up rapidly when one song or another takes off, and taper when it becomes clear that the pool of new adoptees is almost exhausted, imitating a logistic, "endogenous" diffusion curve. (It doesn't seem like good business strategy, and I take Rossman's word for it that that's not, in fact, how record promotion works.) My efforts to come up with a "just so" story in which contagion produces a constant hazard are less convincing even to me, but I only gave five minutes to the effort. Returning to my perpetual hobbyhorse of the difficulty of establishing social contagion, I would say that this is an example of using subject-matter knowledge (i.e., actual science) to rule out alternatives, which couldn't be done on purely statistical grounds.
Recommended if you have any interest in the diffusion of innovations, or in social contagion. (Probably good if you're interested in the sociology of music, too.) Finally finished, 8 years (!) after I started it, because of reading a more recent paper by the author.
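To make the distinction between the two ideal-type adoption curves in the Rossman review concrete, here is a minimal sketch (mine, not Rossman's) that computes the hazard, i.e., the adoption rate among stations which have not yet adopted, for each curve. The function names, the numerical-derivative step and the choice $\lambda = 1$ are all assumptions made for illustration.

```python
import math

def logistic_curve(t, lam):
    """Cumulative adoption under contagion: e^{lam t} / (1 + e^{lam t})."""
    return 1.0 / (1.0 + math.exp(-lam * t))

def forcing_curve(t, lam):
    """Cumulative adoption under external forcing (t >= 0): 1 - e^{-lam t}."""
    return 1.0 - math.exp(-lam * t)

def hazard(curve, t, lam, dt=1e-4):
    """Adoption rate among the not-yet-adopted, via a central difference."""
    slope = (curve(t + dt, lam) - curve(t - dt, lam)) / (2.0 * dt)
    return slope / (1.0 - curve(t, lam))

lam = 1.0                      # illustrative rate, not an estimate from data
times = [0.5, 2.0, 5.0]
forcing_hazards = [hazard(forcing_curve, t, lam) for t in times]
contagion_hazards = [hazard(logistic_curve, t, lam) for t in times]

for t, hf, hc in zip(times, forcing_hazards, contagion_hazards):
    print(f"t={t}: forcing hazard={hf:.4f}, contagion hazard={hc:.4f}")
```

The forcing curve has a flat hazard equal to $\lambda$, while the logistic curve's hazard is $\lambda$ times the fraction that has already adopted, so it climbs as the song spreads. This says nothing about whether a given real curve can be told apart from the other type with noisy data.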
Chernobyl
Fukushima 50
Pandora's Promise
Hotel Rwanda
Watchers of the Sky
Human Flow
The Rest
This is what happens when you live with a historian writing a chapter about 1980--2020... Chernobyl is very well done; some scenes which I thought were imitations of Soviet science fiction movies were in fact imitations of archival footage. Fukushima is a much lower level of art, but still decent. (There is a whole essay to be written about the role of America in that movie, which I am utterly incompetent to do.) Quo Vadis, Aida? is almost unbearably sad. Hotel Rwanda is somehow more purely horrifying than sad. Watchers of the Sky was comparatively optimistic, but having a sincere and committed campaigner against genocide as our UN ambassador did less to improve things than one might wish. Human Flow is the most beautiful movie of the lot. The Rest is fine on its own terms, but diminished by the comparison to the previous movie (not as visually striking, not as thematically wide-ranging, and with too little of Ai Weiwei in the role of the planet's eccentric cat-guy uncle).
Pandora's Promise calls for special comment. I am, by temperament and training, receptive to nuclear power having more of a role than many on the left want it to. But this movie, if anything, pushed me away from that position, purely by reaction. The people it chose to showcase as advocates were, for the most part, completely unqualified, both in their earlier opposition and in their later advocacy. Shellenberger in fact seems like someone whose only real principle is attracting attention by outraging liberal piety, a well-trodden path. (Perhaps he's a lovely person and the movie showed him in a bad light.)
Turning from personalities to substance, the arguments here are just tissue thin. If the problem with solar and wind power is intermittency, the obvious solutions are (1) storage, (2) non-intermittent renewable power sources (like hydro power), and (3) a limited role for natural gas or other fossil fuels. (Humanity's carbon budget is not zero.) To listen to the movie, you'd think all of this was impossible, rather than well-studied. (Yes, there are technical challenges, but that'd lead to a serious comparison of alternatives, which the movie avoids at all costs.) Claims that Chernobyl was responsible for millions of deaths are absurd, and anti-nuclear campaigners who repeat them discredit themselves. But it's also absurd to claim that Chernobyl killed basically nobody. (Why oh why might Soviet successor states want to minimize the consequences, it is a mystery, and why might the UN and WHO fail to challenge even obviously falsified official figures, who can say? A village priest squatting in the exclusion zone insists none of his flock gets sick, obviously he's telling the truth.) Concerns about the safe disposal of waste for hundreds to tens of thousands of years, and about nuclear proliferation (particularly with the breeder reactors favored by the movie-makers) are dismissed remarkably glibly. (ObRecOfAnInfinitelyBetterMovie: Containment.) That there's a correlation between a country's energy usage and its average lifespan is perfectly true, but that's because countries which use a lot of energy are also ones with sanitation, adequate food, etc., etc. (Obviously it takes energy to provide these goods.) In any case the argument isn't about whether to use lots of energy (*), but how to supply it. I can't tell whether the poverty-porn shots of children in third world slums arise from a clumsy-but-sincere concern for the kids' well-being, from a calculation that "why do you hate brown kids?" is an easy way to morally blackmail the intended audience, or from a feeling that this'd be an amusing way to own the libs.
The only thing which gives me any pause about saying the movie is unmitigated dreck is that Stewart Brand and Richard Rhodes, who I otherwise find to be thoughtful and serious authors from whom I've learned much, agreed to participate. But by the end this had the effect of lowering them a bit in my estimation, which is sad.
After watching, I found this review, which seems very fair, because the movie is, in fact, very bad.
*: Of course there are people who wish humanity would plunge back to pre-industrial levels of energy usage, motivated by some combination of nostalgia for the idiocy of rural life and mis-guided Malthusianism. They are few in number and, thankfully, completely without influence, which will continue to be the case. (Any country where they might, incredibly, manage to impose their views would quickly be stomped by rivals whose madmen in authority were not quite that crazy, assuming their own people didn't do it first.)

Posted at March 31, 2021 23:59 | permanent link

## March 26, 2021

### Sub-Re-Intermediation

Attention conservation notice: 1000-word grudging concession that a bete noire might have a point, followed immediately and at much greater length by un-constructive hole-poking; about social media, by someone who's given up on using social media; also about the economics of recommendation engines, by someone who is neither an economist nor a recommendation engineer.

Because he hates me and wants to make sure that I never get back to any (other) friend or collaborator, Simon made me read Jack Dorsey endorsing an idea of Stephen Wolfram's. Much as it pains me to say, Wolfram has the germ of an interesting idea here, which is to start separating out different aspects of the business of running a social network, as that's currently understood. I am going to ignore the stuff about computational contracts (nonsense on stilts, IMHO), and focus just on the idea that users could have a choice about the ranking / content recommendation algorithms which determine what they see in their feeds. (For short I'll call them "recommendation engines" or "recommenders".) There are still difficulties, though.

#### "Editors. You've re-invented editors."

Or, more exactly, a choice of editorial lines, as we might have with different, competing newspapers and magazines. Well, fine; doing it automatically and at the volume and rate of the Web is something which you can't achieve just by hiring people to edit.

— Back in the dreamtime, before the present was widely distributed, Vannevar Bush imagined the emergence of people who'd make their livings by pointing out what, in the vast store of the Memex, would be worth others' time: "there is a new profession of trail blazers, those who find delight in the task of establishing useful trails through the enormous mass of the common record." Or, again, there's Paul Ginsparg's vision of new journals erecting themselves as front ends to arxiv. Appealing though such visions are, it's just not happened in any sustained, substantial way. (All respect to Maria Popova for Brain Pickings, but how many like her are there, who can do it as a job and keep doing it?) Maybe the obstacles here are ones of scale, and making content-recommendation a separate, algorithmic business could help fulfill the vision. Maybe.

#### Monsters Respond to Incentives

"Presumably", Wolfram says, "the content platform would give a commission to the final ranking provider". So the recommender is still in the selling-ads business, just as Facebook, Twitter, etc. are now. I don't see how this improves the incentives at all. Indeed, it'd presumably mean the recommender is a "publisher" in the digital-advertizing sense, and Facebook's and Twitter's core business situation is preserved. (Perhaps this is why Dorsey endorses it?) But the concerns about the bad and/or perverse effects of those incentives (e.g.) are not in the least alleviated by having many smaller entities channeled in the same direction.

On the other hand, I imagine it's possible that people would pay for recommendations, which would at least give the recommenders a direct financial incentive to please the users. This might still not be good for the users, but at least it would align them more with users' desires, and diversity of those desires could push towards a diversity of recommendations. Of course, there would be the usual difficulty of fee-based services competing against free-to-user-ad-supported services.

#### Imprimatur

To the extent there are concerns about certain content being banned by private companies, those are still there: the network operator, Facebook or Twitter or whatever, retains a veto over content. The recommenders are able to impose further vetoes, but not over-ride the operator.

Further: as Wolfram proposes it, the features used to represent content are already calculated by the operator. This can of course impose all sorts of biases and "editorial" decisions centrally, ones which the recommenders would have difficulty over-riding, if they could do so at all.

#### Increasing returns rule everything around me

Wolfram invokes "competition", but doesn't think about whether it will be effective. There are (at least) two grounds for thinking it wouldn't be, both based on increasing returns to scale.
1. Costs of providing the service: If I am going to provide a recommendation engine to a significant fraction of Facebook's audience, in a timely manner, I require a truly massive computational infrastructure, which will have huge fixed costs, though the marginal costs of each additional recommendation will be trivial. It's literally Econ 101 that this is a situation where competition doesn't work very well, and the market tends to either segment in to monopolistic competition or in to oligopoly (if not outright monopoly). As a counter-argument, I guess I could imagine someone saying "Cloud computing will take care of that", i.e., as long as we tolerate oligopoly among hardware operators, software companies will face constant scale costs for computing. (How could that possibly go wrong, technically or socially?)
2. Quality of the service: Machine learning methods work better with more data. This will mean more data about each user, and more data about more users. (In the very first paper on recommendation engines, back in 1995, Shardanand and Maes observed that the more users' data went in to each prediction, the smaller the error.) Result: the same algorithm used by company A, with $n$ users, will be less effective than if used by company B, with data on $2n$ users. Even when the recommendation engine doesn't explicitly use the social network, this will create a network externality for recommendation providers (*). And thus again we get increasing returns and throttled competition (cf.).

Normally I'd say there'd also be switching costs to lock users in to the first recommender they seriously use, but I could imagine the network operators imposing data formats and input-output requirements to make it easy to switch from one recommender to another without losing history.

— Not quite so long ago as "As We May Think", but still well before the present was widely distributed, Carl Shapiro and Hal Varian wrote a quietly brilliant book on the strategies firms in information businesses should follow to actually make money. The four keys were economies of scale, network externalities, lock-in of users, and control of standards. The point of all of these is to reduce competition. These principles work — it is no accident that Varian is now the chief economist of Google — and they will apply here.

#### Prior art

Someone else must have proposed this already. This conclusion is an example of induction by simple enumeration, which is always hazardous, but compelling with this subject. I would be interested to read about those earlier proposals, since I suspect they'll have thought about how it actually could work.

*: Back of the envelope, say the prediction error is $O(n^{-1/2})$, as it often is. The question is then how utility to the user scales with error. If it was simply inversely proportional, we'd get utility scaling like $O(n^{1/2})$, which is a lot less than the $O(n)$ claimed for classic network externalities by Metcalfe's ~~law~~ rule-of-thumb. On the other hand it feels more sensible to say that going from an error of $\pm 1$ on a 5 point scale to $\pm 0.1$ is a lot more valuable to users than going from $\pm 0.1$ to $\pm 0.01$, not much less valuable. Indeed we might expect that even perfect prediction would have only finite utility to users, so the utility would be something like $c-O(n^{-1/2})$. This suggests that we could have multiple very large services, especially if there is a cost to switch between recommenders. But it also suggests that there'd be a minimum viable size for a service, since if it's too small a customer would be paying the switching cost to get worse recommendations.
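The back-of-the-envelope scaling in this footnote can be put into a few lines of arithmetic. This is only a sketch: it takes the error to be exactly $n^{-1/2}$, and the ceiling `c` and the constant `k` are made-up values, not estimates of anything.

```python
def prediction_error(n):
    """Assumed error scaling: exactly n^(-1/2), with the constant set to 1."""
    return n ** -0.5

def bounded_utility(n, c=1.0, k=1.0):
    """Utility of the form c - k * error, capped at c even for perfect prediction."""
    return c - k * prediction_error(n)

def inverse_utility(n):
    """Alternative: utility inversely proportional to error, growing like n^(1/2)."""
    return 1.0 / prediction_error(n)

sizes = [10 ** p for p in range(2, 8)]   # user-base sizes from 100 to 10 million
gains = [bounded_utility(2 * n) - bounded_utility(n) for n in sizes]

for n, g in zip(sizes, gains):
    print(f"n={n:>8}: bounded utility={bounded_utility(n):.4f}, gain from doubling={g:.6f}")
```

Under the bounded form, the benefit of doubling the user base shrinks like $n^{-1/2}$, which is why several very large services could sit close together near the ceiling, while a small one remains visibly worse.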

Posted at March 26, 2021 14:03 | permanent link

### Actually, "Dr. Internet" Is the Name of the Monster's Creator

(I can't remember if Henry Farrell came up with this phrase, or I did, as the title for a possible joint project.)

Books to Read While the Algae Grow in Your Fur, July 2021
Hystories: Hysterical Epidemics and Modern Culture (Showalter)
Sub-Re-Intermediation
An Appeal to the Hive Mind (Ironically Enough)
Some Blogospheric Navel-Gazing, or, Strange Memories of the Recent Past
Books to Read While the Algae Grow in Your Fur, July 2017
Kill All Normies: Online Culture Wars from 4Chan and Tumblr to Trump and the Alt-Right (Nagle)
Experimental Considerations Touching on the Art of Winning Lotteries
The Presentation of Self in Internet Life
Speaking Truth to Power About Weblogs, or, How Not to Draw a Straight Line
One Roll of the Dice Will Never Abolish Warblogging
Someone Has Found a Way to Make Money from the Internet!

Posted at March 26, 2021 14:01 | permanent link

### An Appeal to the Hive Mind (Ironically Enough)

Attention conservation notice: Asking for help finding something that you don't know about, that you don't care about, and that a bad memory might have just confabulated.

I have a vivid memory of reading, in the 1990s, an online discussion (maybe just two people, maybe as many as four) about what online fora, search engines, the Web, "agents", etc., were doing to the way people acquire and use knowledge, and indeed to what we mean by "knowledge". My very strong impression is that one of the participants was linked somehow with the MIT Media Lab, and taking a very strong social-constructionist line (unsurprisingly, given that affiliation). At some point the discussion turned to her experiences with an online forum related to a hobby of hers (tropical fish? terraria?). The person I'm thinking of said something like, the consensus of that forum just were knowledge about \$HOBBY. One of her interlocutors made an objection on the order of, why do you trust those random people on the Internet to have any idea what they're talking about? To which the reply was, basically, come on, who'd just make stuff up about \$HOBBY?

I have (genuinely!) thought of this exchange often in the 20-plus years since I read it. But when I recently tried to find it again, to check my memory and to cite it in a work-in-glacial-progress, I've been unable to locate it. (The fact that I don't recall any names of the participants, or the venue, doesn't help.) I am prepared to learn that, because this is something I've thought of often, my mind has re-shaped it into a memorable anecdote, but I'd still like to see what this started from. Any leads readers could provide would be appreciated.

#### Update, the next day

~~The hive mind~~ Lucy Keer (with an assist from Mike Traven) delivers:

Specifically, the seed around which this story nucleated in my memory may have been a January 1996 piece by Prof. Bruckman in Technology Review — it has the right content (sci.aquaria!), the right date, my father subscribed to TR and I'd even have been visiting my parents when that issue was current. Only it's not a conversation between multiple people but a solo-author essay, it's not primarily about the social aspects of knowledge but about how to find congenial on-line communities and make (or re-make) ones that don't suck (the lost wisdom of the Internet's early Bronze Age), and contains nothing like "who'd just make stuff up about \$HOBBY?" (In short: Bartlett (1932) meets Radio Yerevan.)

More positively, I very much look forward to reading Bruckman's book (there's an excerpt/precis available on her website).

Posted at March 26, 2021 12:32 | permanent link

### Regression, Thermostats, Causal Inference: Some Finger Exercises


Attention conservation notice: An 800-word, literally academic exercise about an issue in causal inference. Its point is familiar to those in the field, and deservedly obscure to everyone else. Also, too cutesy and pleased with itself by at least half.
I wrote the first version of this for the class where we do causal inference long enough ago that I actually don't remember when --- 2011? 2013? (In retrospect I had probably read Milton Friedman's thermostat analogy but didn't consciously remember it at the time.) Posted now because I've gone over the point with two different people in the last month.

The temperature outside $(X)$ is a direct cause of the temperature inside my house $(Y)$. But every morning I measure the temperature, and adjust my heating/cooling system $(C)$ to try to maintain a constant temperature $y_0$. For simplicity, we'll say that all the relations are linear, so $\begin{eqnarray} X & \sim & \mathrm{whatever}\\ C|X & \leftarrow & a+bX + \epsilon_1\\ Y|X,C & \leftarrow & X-C + \epsilon_2 \end{eqnarray}$ where $\epsilon_1$ and $\epsilon_2$ are exogenous, independent, mean-zero noise terms. We can think of $\epsilon_1$ as a combination of my sloppiness in measuring the temperature and in tuning the heating/cooling system; $\epsilon_2$ is sheer fluctuations.

Exercise: Draw the DAG.

To ensure that the expectation of $Y$ remains at $y_0$, no matter the external temperature, we need $\begin{eqnarray} y_0 & = & \Expect{Y|X=x}\\ & = & \Expect{X - (a + bX + \epsilon_1) + \epsilon_2|X=x}\\ & = & (1-b)x -a \end{eqnarray}$ Since this must hold for all $x$, we need $b=1, a=-y_0$.

What follows from this?

• Internal temperature $Y$ is uncorrelated with external temperature $X$: $\begin{eqnarray} \Cov{X,Y} & = & \Expect{XY} - \Expect{X}\Expect{Y}\\ & = & \Expect{X\Expect{Y|X}} - \Expect{X}\Expect{Y}\\ & = & \Expect{X}y_0 - \Expect{X}y_0 = 0 \end{eqnarray}$ The internal temperature will fluctuate around the set-point $y_0$, but those fluctuations will not correlate with the external temperature.
• Internal temperature $Y$ is correlated with the control signal $C$ only through my sloppiness: $\begin{eqnarray} \Cov{C,Y} & = & \Expect{CY} - \Expect{C}\Expect{Y}\\ & = & \Expect{(-y_0 + X + \epsilon_1)(X+y_0-X-\epsilon_1+\epsilon_2)} - (\Expect{X}-y_0)y_0\\ & = & -y_0^2 - \Expect{\epsilon_1^2} + \Expect{X}y_0 -\Expect{X \epsilon_1} + \Expect{X\epsilon_2} + \Expect{\epsilon_1 \epsilon_2} - \Expect{X}y_0 + y_0^2\\ & = & -\Var{\epsilon_1} \end{eqnarray}$ since all the cross-expectations are zero, and $\Expect{\epsilon_1}=0$.
• The control signal $C$ is correlated with the external temperature: $\begin{eqnarray} \Cov{C,X} & = & \Expect{CX} - \Expect{C}\Expect{X}\\ & = & \Expect{(-y_0 + X+\epsilon_1)X} - (-y_0 +\Expect{X})\Expect{X}\\ & = & \Expect{X^2} - \left(\Expect{X}\right)^2\\ & = & \Var{X} \end{eqnarray}$
• A linear regression of $Y$ on $X$ and $C$ will consistently recover the correct coefficients, namely $+1$ and $-1$. To see this, recall (e.g., from here) that the OLS estimates will tend towards the coefficients of the optimal linear predictor. Those coefficients, in turn, are the solution to $\beta = {\left[ \begin{array}{cc} \Var{X} & \Cov{C,X}\\ \Cov{X,C} & \Var{C} \end{array}\right]}^{-1} \left[ \begin{array}{c} \Cov{Y,X}\\ \Cov{Y,C} \end{array}\right]$ Plugging in our previous results, $\beta = {\left[ \begin{array}{cc} \Var{X} & \Var{X}\\ \Var{X} & \Var{X}+\Var{\epsilon_1} \end{array}\right]}^{-1} \left[ \begin{array}{c} 0\\ -\Var{\epsilon_1} \end{array}\right]$ After some character-building algebra, you can confirm that the covariance matrix is invertible as long as $\Var{\epsilon_1} > 0$, and then, as promised $\beta = (1,-1)$.

Exercise: Build your character by doing the algebra.

So, as long as control isn't perfect, the naive statistician (or experienced econometrician...) who just does a kitchen-sink regression will actually get the relationship between $Y$, $X$ and $C$ right, concluding that external temperature and the climate control have equal and opposite effects on internal temperature. Sure, there will be sampling noise, but with enough data they'll approach the truth.
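For those who would rather check this by simulation than by algebra, here is a minimal sketch of the linear thermostat model. Everything numerical in it is an assumption made for illustration: the Gaussian distribution of $X$, the unit noise variances, the set-point, the sample size, and the deliberately exaggerated mis-calibrated gain of 0.5 in the second run.

```python
import random

def simulate(n=20000, y0=20.0, gain=1.0, seed=0):
    """Draw n days with C = -y0 + gain*X + eps1 and Y = X - C + eps2."""
    rng = random.Random(seed)
    xs, cs, ys = [], [], []
    for _ in range(n):
        x = rng.gauss(10.0, 5.0)                   # outside temperature (arbitrary choice)
        c = -y0 + gain * x + rng.gauss(0.0, 1.0)   # control signal, with sloppiness eps1
        y = x - c + rng.gauss(0.0, 1.0)            # inside temperature, with fluctuation eps2
        xs.append(x)
        cs.append(c)
        ys.append(y)
    return xs, cs, ys

def cov(u, v):
    """Sample covariance (dividing by n)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

def ols_slopes(xs, cs, ys):
    """Slopes of Y on (X, C), from inverting the 2x2 covariance matrix."""
    vxx, vxc, vcc = cov(xs, xs), cov(xs, cs), cov(cs, cs)
    vyx, vyc = cov(ys, xs), cov(ys, cs)
    det = vxx * vcc - vxc ** 2
    return (vcc * vyx - vxc * vyc) / det, (vxx * vyc - vxc * vyx) / det

# Exact calibration (gain = 1): X and Y should be nearly uncorrelated.
xs, cs, ys = simulate()
beta_x, beta_c = ols_slopes(xs, cs, ys)
print("Cov(X,Y):", cov(xs, ys), " Cov(C,Y):", cov(cs, ys))
print("slopes on X and C:", beta_x, beta_c)

# Mis-calibration (gain = 0.5): the cancellation fails, the regression still works.
xs2, cs2, ys2 = simulate(gain=0.5, seed=1)
beta_x2, beta_c2 = ols_slopes(xs2, cs2, ys2)
print("Cov(X,Y), mis-calibrated:", cov(xs2, ys2))
print("slopes on X and C, mis-calibrated:", beta_x2, beta_c2)
```

With these settings the theory above predicts a sample covariance between $X$ and $Y$ near zero in the first run, a covariance between $C$ and $Y$ near $-\Var{\epsilon_1} = -1$, and regression slopes near $+1$ and $-1$ in both runs.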

Exercise: What do you get if you regress $C$ on $X$ and $Y$?

I have implicitly assumed that I know the exact linear relationship between $X$ and $Y$, since I used that in deriving how the control signal should respond to $X$. If I mis-calibrate the control signal, say if $C = -y_0 +0.999X + \epsilon_1$, then there is not an exact cancellation and everything works as usual.

Exercise: Suppose that instead of measuring the external temperature $X$ directly, I can only measure yesterday's temperature $U$, again with noise. Supposing there is a linear relationship between $U$ and $X$, replicate this analysis. Does it matter if $U$ is the parent of $X$ or vice versa?

Exercise: "Feedback is a mechanism for persistently violating faithfulness"; discuss.

Exercise: "The greatest skill seems like clumsiness" (Laozi); discuss.

Posted at March 26, 2021 09:08 | permanent link

## February 28, 2021

### Books to Read While the Algae Grow in Your Fur, February 2021

Attention conservation notice: I have no taste, and no qualifications to opine about geology, policing, law, or the history of Islamic science.

Michael E. Wysession, How the Earth Works
This is basically a "rocks for jocks" geology course, but I learned stuff from it, so I'm not really in a position to review it.
Rosa Brooks, Tangled Up in Blue: Policing the American City
In which the author, almost on a whim, spends a sabbatical year from her career as a law professor becoming a reserve police officer in Washington, DC, including going through the (modified) police academy for those volunteers, working as a policewoman in Anacostia, etc. She stuck with it for over four years, basically until the book was about to come out. This was, as everyone told her and she freely acknowledges, crazy, but it produced a really good book, equal parts memoir, war stories, and intellectual reflection.
(There are also some quite personal passages about what was involved in doing this while having grown up as not just any red diaper baby, but specifically Barbara Ehrenreich's child. I thought those parts were interesting and even affecting, but I wonder if that shouldn't have been a separate essay.)
Leigh Bardugo, Ninth House
Mind candy contemporary fantasy: what if the Yale secret societies were literally pursuing magic? And what if Yale were to admit an utterly un-well-rounded student with serious issues who happened to be really good at magic? (Bardugo graduated from Yale a surprisingly small number of years ago.)
Jim Al-Khalili, The House of Wisdom: How Arabic Science Saved Ancient Knowledge and Gave Us the Renaissance
This is mostly great. The subtitle is misleadingly Eurocentric; it's mostly about new science under the Abbasid caliphate. As a practicing physicist, Al-Khalili is both very good at explaining the science done, and somewhat impatient of the meta-scientific subtleties. In particular he's very ready to claim that (a) there's such a thing as The Scientific Method, and (b) some at least of the scholars he's writing about grasped it and used it, along with, at a more implicit level, (c) those scholars were pursuing the same kind of endeavor that Al-Khalili is (and for that matter that I am). But I am frankly not familiar enough with the literature on al-Biruni or (especially) al-Haytham to be entitled to an opinion as to whether they thought they were pursuing something that might be called "the scientific method".
(Al-Khalili writes about "Arabic science" because that was the language used; he's quite clear that ethnically many if not most of these men weren't Arabs, any more than European scholars writing in Latin were Romans.)

Posted at February 28, 2021 23:59 | permanent link