## June 21, 2022

### Upcoming Talk: "Matching Random Features"

Attention conservation notice: You have better things to do with an hour of your precious, finite life than staring at a screen while an academic tries to give a hand-wavy summary and advertisement for technical work on an abstruse problem you don't care about.

I will be talking on Random-Feature Matching to the One World Approximate Bayesian Computation Seminar at 8:30 am Eastern time (=1:30 pm UK time) on Thursday, 23 June. If you are interested in simulation-based inference but have not (oddly) read my paper, or if you just want to marvel at how bad someone can be at giving a Zoom talk, two years on, please join. (Details on getting access to the Zoom session can be had by following that last link.)

Let me take this opportunity to thank the organizer both for the invitation, and for not insisting on the usual seminar time of 9:30 am UK time.

Posted at June 21, 2022 14:11 | permanent link

### Course Announcement: "Statistics of Inequality and Discrimination" (36-313)

Attention conservation notice: Advertisement for a course you won't take, at a university you don't attend, in which very human and passionately contentious topics deliberately have all the life sucked from them, leaving only the husk of abstractions and the dry bones of methodology.

In the fall I will, again, be teaching my class on inequality:

36-313, Statistics of Inequality and Discrimination
9 units
Time and place: Tuesdays and Thursdays, 1:25 -- 2:45 pm, in Wean Hall (WEH) 6403 (tentatively)
Description: Many social questions about inequality, injustice and unfairness are, in part, questions about evidence, data, and statistics. This class lays out the statistical methods which let us answer questions like Does this employer discriminate against members of that group?, Is this standardized test biased against that group?, Is this decision-making algorithm biased, and what does that even mean? and Did this policy which was supposed to reduce this inequality actually help? We will also look at inequality within groups, and at different ideas about how to explain inequalities between groups. The class will interweave discussion of concrete social issues with the relevant statistical concepts.
Prerequisites: 36-202 ("Methods for Statistics and Data Science") (and so also 36-200, "Reasoning with Data"), or similar with permission of the instructor

Last year was the first time I got to teach it, and it was a mixed experience. The students who stuck with it were, gratifyingly, uniformly very happy with it (and I am pretty sure they learned a lot!). But it also had the biggest "melt" of any class I've taught, with fully half of those who initially signed up for it eventually dropping it. The most consistent reason why --- at least, the one they felt comfortable telling me! --- was that they were expecting something with a lot more arguing about politics, and a lot less math and data analysis. I have taken this feedback to heart, and decided to do even more math and data analysis.

#### Tentative topic schedule

Slightly more than one week per. A more detailed listing, with related readings, can be found on the class homepage.
1. "Recall": Reminders about probability and statistics: populations, distribution within a population, distribution functions, joint and conditional probability; samples and inference from samples.
2. Income and wealth inequality: What does the distribution of income and wealth look like within a population? How do we describe population distributions, especially when there is an extreme range of values (a big difference between the rich and poor)? Where does the idea of "the 1%" wealthy elite come from? How has income inequality changed over recent decades?
Statistical tools: measures of central tendency (median, mode, mean), of dispersion (standard deviation etc.), and of skew; measures of concentration and inequality (ratios between percentiles, the Lorenz curve, Gini coefficient); the concept of "heavy tails" (the largest values being orders of magnitude larger than typical values); log-normal and power law distributions; fitting distributions to existing data; positive feedback, multiplicative growth and "cumulative advantage" processes.
3. Speed-run through social and economic stratification: Reminders (?) about social concepts: ascriptive and attained social statuses, and qualitative/categorical vs. more-or-less dimensions of differentiation. Important forms of differentiation, including (but not necessarily limited to): sex, gender, income, wealth, consumption, caste, race, ethnicity, citizenship, class, order, education. The legal notion of "protected categories".
4. Income disparities: How does income (and wealth) differ across groups? How do we compare average or typical values? How do we compare entire distributions? How have income inequalities by race and sex changed over recent decades?
Statistical tools: permutation tests for differences in mean (and other measures of the average); two-sample tests for differences in distribution; bootstrapping; inverting tests to find the range of differences compatible with the data; the "analysis of variance" method of comparing populations; the "relative distribution" method of comparing populations
5. Explaining, or explaining away, inequality: To what extent can differences in outcomes between groups be explained by differences in their attributes (e.g., explaining differences in incomes by differences in marketable skills)? How should we go about making such adjustments? Is it appropriate to treat discrimination as the "residual" left unexplained? When does adjusting or controlling for a variable contribute to an explanation, and when is it "explaining away" discrimination? What would it mean to control for race, sex or gender?
Statistical tools: Observational causal inference; using regression to "control for" multiple variables at once, with both linear models and nonparametrically (by means of matching or nearest-neighbors); using graphical models to represent causal relations between variables; how to use graphical models to decide what should and what should not be controlled for; the causal model implicit in decisions about controls.
6. Detecting discrimination in hiring, admissions, etc.: Do employers discriminate in hiring (or schools in admission, etc.)? How can we tell? When are differences in hiring rates evidence for discrimination? How do statistical perspectives on this question line up with legal criteria for "disparate treatment" and "disparate impact"?
Statistical tools: tests for differences in proportions or probabilities; adjusting for applicant characteristics (again)
7. Inequalities in health, disease and mortality: Quantifying differences in the incidence of diseases, in death rates, and in life expectancy. The "deaths of despair" controversy.
Statistical tools: differences in proportions and probabilities again; survival analysis and survival curves; some of the elements of demography.
8. Mobility and Transmission of Inequality: What does it mean to talk about social mobility? Conversely, what does it mean to say inequality can be transmitted from one generation to the next? What are the mechanisms this happens through? What are the large-scale patterns about mobility and transmission, over the last few decades?
Statistical tools: correlations; conditional probability modeling; Markov models.
9. Measuring segregation: What do we mean by "segregation"? Segregation in law ("de jure") and segregation in fact ("de facto"). Different ways of measuring de facto segregation. Trends in de facto racial segregation since the end of de jure racial segregation. Why different measures of segregation give different results. Segregation by income. Segregation by political partisanship. Consequences of segregation. Inter-generational transmission again.
Statistical tools: Standard measures of segregation; more recent measures of segregation based on information theory; spatial correlation; how do we make adjustments for changing distributions?
10. Algorithmic bias and/or fairness: Can predictive or decision-making algorithms be biased? What would that even mean? Do algorithms trained on existing data necessarily inherit the biases of the world? What notions of fairness or unbiasedness can we actually implement for algorithms? What trade-offs are involved in enforcing different notions of fairness? Are "risk-prediction instruments" fair?
Statistical tools: Methods for evaluating the accuracy of predictions; differential error rates across groups; decision trees; optimization and multi-objective optimization.
11. Standardized tests: Are standardized tests for school admission biased against certain racial groups? What does it mean to measure qualifications, and how would we know whether tests really are measuring qualifications? What does it mean for a measurement to be biased? When do differences across groups indicate biases? (Disparate impact again.) Why correlating outcomes with test scores among admitted students may not make sense. The "compared to what?" question.
Statistical tools: Predictive validity; differential prediction; "conditioning on a collider"
12. Intelligence tests: Are intelligence tests biased? How do we measure latent attributes? How do we know the latent attributes even exist? What would it mean for there to be such a thing as "general intelligence", that could be measured by tests? What, if anything, do intelligence tests measure? What do rising intelligence test results (the Flynn Effect) tell us?
Statistical tools: correlation between test scores; factor models as an explanation of correlations; estimating factor values from tests; measurement invariance; alternatives to factor models; item response theory
13. Measuring attitude and prejudice: How do we measure people's feelings about different groups? Why do different measures give different results? Do "implicit association tests" measure unconscious biases? What, if anything, do implicit association tests measure?
Statistical tools: More on measurement; the distinction between reliability and validity; why it's much easier to quantify reliability than validity; approaches to "construct validity".
14. Evaluating inequality-reducing interventions: If we try to do something to reduce inequality, how do we know whether or not it worked? How do we design a good study of an intervention? How do we pool information from multiple studies? What can we do if only bad studies are available? Do implicit bias interventions change behavior? Does having a chief diversity officer increase faculty diversity? What does, in fact, seem to work?
Statistical tools: Design and analysis of studies; experimental design: selecting measurements of outcomes, and the importance of randomized studies; meta-analytic methods for combining information
15. Policing and crime: When do differences in traffic stops, arrests, or police-caused deaths indicate discrimination? How do we know how many traffic stops, arrests and police-caused deaths there are to begin with? Does "profiling" or "statistical discrimination" make sense for the police, whether or not it's socially desirable? How can the same group be simultaneously over- and under- policed?
Statistical tools: tests for differences in proportions; signal detection theory; adjusting for systematically missing data; self-reinforcing equilibria
16. Self-organizing inequalities and "structural" or "systematic" inequalities: Models of how inequalities can perpetuate themselves even when nobody is biased. Models of how inequalities can appear even when nobody is biased. The Schelling model of spatial segregation as a "paradigm". How relevant are Schelling-type models to actual, present-day inequalities?
Statistical tools: Agent-based models; models of social learning and game theory.
17. Statistics and its history: The development of statistics in the 19th and early 20th century was intimately tied to the eugenics movement, which was deeply racist and even more deeply classist (but also often anti-sexist). The last part of the course will cover this history, and explain how many of the intellectual tools we have gone over to document, and perhaps to help combat, inequality and discrimination were invented by people who wanted to use them for quite different purposes. The twin learning objectives for this section are for students to grasp something of this history, and to grasp why the "genetic fallacy", of judging ideas by where they come from (their "genesis") is, indeed, foolish and wrong.
Statistical tools: N/A.
18. How do we know what we do about inequalities? Social data-collection systems and institutions. Measurement again, and measurement as a social process. Difficulties in reducing social reality to data; the case of race in the US census as an example. What systematic data collection leaves out.
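As a taste of the calculations behind topic 2, here is a minimal sketch of the Lorenz curve and the Gini coefficient. Python is an illustrative choice only; the function names `gini` and `lorenz_curve`, and the simulated log-normal "incomes", are assumptions made up for this example and are not course materials.

```python
import random


def gini(values):
    """Gini coefficient of non-negative values, via the sorted-rank formula."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive value")
    # G = 2 * sum_i(i * x_(i)) / (n * sum_i x_i) - (n + 1) / n, ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def lorenz_curve(values):
    """Points (cumulative population share, cumulative income share)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / total))
    return points


if __name__ == "__main__":
    # Hypothetical heavy-tailed incomes: log-normal with made-up parameters.
    random.seed(313)
    incomes = [random.lognormvariate(10.5, 1.0) for _ in range(10_000)]
    print("Gini coefficient:", round(gini(incomes), 3))
    # Income share held by the bottom half of the simulated population.
    print("Bottom-half share:", round(lorenz_curve(incomes)[5_000][1], 3))
```

A perfectly equal population gives a Gini coefficient of 0, while one person holding everything gives (n - 1)/n, which approaches 1 as the population grows.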

#### Evaluation

There will be one problem set per week; each of these homeworks will involve some combination of (very basic) statistical theory, (possibly less basic) calculations using the theory we've gone over, and analysis of real data sets using the methods discussed in class. There will also be readings for each class session, and a short-answer quiz after each session will combine questions based on lecture content with questions based on the readings.

There will be no exams.

My usual policy is to drop a certain number of homeworks, and a certain number of lecture/reading questions, no questions asked. The number of automatic drops isn't something I'll commit to here and now (similarly, I won't make any promises here about the relative weight of homework vs. lecture-related questions).

#### Textbook, Lecture Notes

There is, unfortunately, no one textbook which covers the material we'll go over at the required level. You will, instead, get very detailed lecture notes after each lecture. There will also be a lot of readings from various books and articles. (I will not agree with every reading I assign.)

Teaching: Statistics of Inequality and Discrimination; Corrupting the Young; Enigmas of Chance; Commit a Social Science

Posted at June 21, 2022 13:45 | permanent link

## May 31, 2022

### Books to Read While the Algae Grow in Your Fur, May 2022

Attention conservation notice: I have no taste, and no qualifications to opine on the archaeology of the Southwest, the pre-history of diversity training, or trends in American economic inequality.

Walter Jon Williams, Metropolitan and City on Fire
These are two novels Williams wrote in the '90s about intrigue and machinations in a world-spanning city, where the geomantic forces generated by covering the planet in concrete, metal and plastic are carefully harvested and metered, and our heroine longs to smash it all. They're some of the best stuff Williams has ever done, which is saying a lot. Strictly speaking, they are fantasy, even "urban fantasy", but very much in the manner of well-thought-through science fiction.
As a character, Aiah has something in common with Williams's Caroline Sula and even (when it comes to learning to lie and manipulate) Dagmar Shaw, but she is her own, vivid and plausible, person.
I last read these in 1999; I re-read them because Williams recently said that the long, long delayed third volume will finally happen. I am very eager. §
John Kantner, Ancient Puebloan Southwest
This is a well-written, semi-popular account of the archaeology of the American Southwest, focusing on the period from the rise of Chaco Canyon to the early years of Spanish rule. The writing is mostly smooth and expository (*), and I learned a lot of fascinating-to-me details from it. Kantner does do the usual archaeologist thing of making very confident-sounding assertions about social organization which he must know are far more conjectural than he makes them sound. (**) But this is par for the archaeological course. If you have a non-expert interest in the subject, and can handle the lack of a definite article in the title, this is a worthwhile book. I would read a second edition. §
*: Though inconsistently so; he explains "inference", but not "dendrochronology" or "palynological". --- On a different plane, Kantner persistently writes "inequity" (an evaluative, qualitative judgment) when he should write "inequality" (a descriptive and quantitative comparison). Unless, that is, he regards every inequality as inequitable, which is his right but not something to be just assumed... ^
**: To paraphrase, he does things like assert that a division of such-and-such a community into "moieties" can be inferred from the construction of a wall dividing a building in two. Or, again, there are assertions that one community couldn't have politically dominated another because the latter kept making pots in its old way. This sort of thing just shows a failure of imagination. (I used to part-own a house that had been built for one large family around 1900, and later split with a wall down the middle. While Pittsburgh has some peculiarities it does not divide duplex residents into two endogamous groups, so that I am expected to regard all North-Halfers as some kind of kin.) It also, I think, betrays a failure to check this sort of inference against cases where much more is known about society and politics from written records. ^
Elisabeth Lasch-Quinn, Race Experts: How Racial Etiquette, Sensitivity Training, and New Age Therapy Hijacked the Civil Rights Revolution (2001)
This is, obviously (?), a work of cultural criticism, but it's done with the tools of a serious historian who is trying to excavate where things like diversity training came from, and why they both emerged when and where they did, and how they survived that initial context. To oversimplify and exaggerate: the late 1960s/early 1970s were a weird time, when plenty of people on the fringes of psychology felt entitled to make stuff up because it sounded good and vibed with their politics, with very little reality-testing. Add the "triumph of the therapeutic" and of self-esteem, plus corporate concerns to ward off liability by claiming to do something (however ineffective), plus the continuing attraction of racialist thinking under another guise (*), and we get a mess.
There are, equally obviously, some political and ethical commitments animating this book, but they are transparent, and honestly ones I have a lot of sympathy for, even if I suspect she and I would often disagree on concrete policies. I would pay very good money to read Lasch-Quinn writing seriously about 2020; unfortunately this is not the kind of work which can be done that quickly, and anyway she seems to have moved on to other topics. §
*: Lasch-Quinn does not use phrases like "reinscribing an essentialized racial binary", but they would actually fit her argument.
Elizabeth Kolbert, Under a White Sky: The Nature of the Future
A collection of journalistic essays. The formula each time is Kolbert visiting some place --- an electrified anti-invasive fish barrier on the reverse-flowing Chicago river, the mouth of the Mississippi, a cave in the Nevada desert where a unique native fish species is being quixotically maintained, the Great Barrier Reef, a carbon-sequestration site in Iceland --- where she can see (as the saying went) "the Earth as transformed by human action", and talk to the workers. Often enough, the reason these efforts are necessary is to deal with side-effects of earlier efforts at control, which Kolbert presents as ironic but unavoidable; we've gone too far down this path to turn back now. (Though she doesn't say so, we'd gone too far when Gilgamesh was king in Uruk.) Stewart Brand is quoted, aptly; so is John McPhee's classic The Control of Nature.
Speaking of McPhee: this is one of the most New Yorker-y books I've ever read. It has all the characteristic virtues: easy prose, lively (but not startling) intelligence, an eye for detail expressed through original (but not outlandish) metaphors, judiciously-chosen historical anecdotes, sympathetic if amused pen-portraits of interesting characters; you come away feeling like you've understood something, without having been taxed. I realize my description may sound a bit barbed, because it is. On the one hand, I want to acknowledge how hard such writing is to pull off --- being scholarly and exhaustive actually takes much less effort and skill --- and record my admiration, indeed my envy. But on the other hand, the reader puts the book down feeling like they've understood something, without necessarily having done so. On the topics where I know enough to think I could judge (mostly having to do with climatology), Kolbert seems accurate, which increases my confidence in the rest of her work. But somehow I was more conscious of the art, and more suspicious of its effects, than I normally am.
This was the first book by Kolbert I've read; I will certainly read more. §
Gino C. Segrè and John D. Stack, Unearthing Fermi's Geophysics
This is a perfectly nice little introduction to geophysics, suitable for third- or fourth-year physics majors. (That is, you are expected to have forgotten undergraduate classical mechanics, thermo, and E&M; fluid and continuum mechanics are introduced here as needed.) The hook here is that this is based on the notes for such a course which Fermi taught, and which Segrè discovered in the archives. Of course it has been vastly fleshed out (the authors reproduce selected pages from Fermi's notes, and "telegraphic" hardly does it justice), and there are a few places where it's been brought up to date, primarily by comparing Fermi's numerical figures with modern measurements. There is thus no discussion of continental drift or of climate change, to name just two important topics. Still, I enjoyed the gimmick, and it's a nice introduction to interesting and important topics in physics. I would imagine that it would suffer, in terms of classroom use or even serious self-study, from lacking exercises. (It would be very interesting to see Fermi's idea of good homework problems!) §
Rebecca M. Blank, Changing Inequality
This is essentially a huge exercise in comparing the American Community Survey's economic statistics in 1979 with those in 2007. The headline is that households at (almost) every level had substantially higher incomes in 2007 than in 1979, even after making all kinds of allowances for changes in the cost of living (*). There was also vastly more inequality, particularly but not only towards the top.
The thing which makes this book more interesting than that sounds is the way Blank does very careful comparisons --- she calls them "simulations" --- which try to tease out the factors which have contributed to these shifts (**). Thus she tries to work out how much of the changes in typical incomes and in measures of inequality can be explained by changes in family structure, by changes in labor-force participation, by changes in income by education level, etc., leaving other factors at their 1979 values. Thus she can give answers to questions like "How much richer-but-unequal would we be just from our being more educated, if salaries and marriage patterns still looked like 1979?" Or, rather, she can give reasonable but still conjectural answers to such questions; any sort of counterfactual assertion rests on untestable hypotheses.
To summarize, much of the increase in typical household incomes comes from increased female labor-force participation. Some of the increased inequality is related; it comes from the increased tendency of highly educated men to be married to highly educated women who also work in well-paid jobs. But lots of the increasing inequality, which takes the form of higher household incomes increasing much faster than those at the median (or even the 80th percentile...) can't be explained in these ways. These findings in turn let Blank say some sensible things about how different policies might reduce inequality. (One finding, at first startling, is that bringing every poor household up to the poverty line would actually do very little to reduce inequality by any of the usual metrics.)
This isn't a scintillating read, but it's serious, sober and (as we used to say) reality-based. I read it in part as fodder for my inequality class, and I am seriously considering having The Kids do (simplified) versions of Blank's comparisons. If you have a serious concern with economic inequality, or social change in America since the 1970s, this is very worth reading. §
*: One important limitation to this conclusion, which Blank duly acknowledges, comes with this data. Because the ACS doesn't track households from one year to another, it doesn't let us say anything about the stability or security of income. In particular, it doesn't let us say whether a household at the median in 1979 could be more confident of staying at the median than their counterparts in 2007. There is evidence that incomes fluctuate more now than they used to, which, if you believe standard economic theory, would reduce the value of any given level of income. ^
**: Mathematically, I think what she does amounts to a piece-wise constant approximation of Handcock and Morris's "relative distribution" method, which was also invented for studying shifts in inequality. But I haven't ground through the algebra and there might be subtle differences. ^
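To make the logic of such a comparison concrete, here is a minimal sketch of a composition-shift counterfactual for a mean: swap in the 2007 mix of groups while holding each group's income at its 1979 level. This is a generic textbook-style decomposition, not Blank's actual procedure (which deals with whole distributions and inequality measures, not just means). The function `mean_income`, the two group labels, and every number below are made up for illustration.

```python
def mean_income(shares, group_means):
    """Overall mean income as a share-weighted average of group means."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(shares[g] * group_means[g] for g in shares)


# Entirely hypothetical inputs: population shares and mean incomes
# (in thousands of dollars) by education group, for the two years.
shares_1979 = {"no_degree": 0.8, "degree": 0.2}
shares_2007 = {"no_degree": 0.7, "degree": 0.3}
means_1979 = {"no_degree": 40.0, "degree": 60.0}
means_2007 = {"no_degree": 42.0, "degree": 80.0}

actual_1979 = mean_income(shares_1979, means_1979)
actual_2007 = mean_income(shares_2007, means_2007)

# Counterfactual: the 2007 education mix, but 1979 incomes within each group.
counterfactual = mean_income(shares_2007, means_1979)

# Part of the change attributable to the shift in education mix alone,
# and the part left over (changes in income within groups).
composition_effect = counterfactual - actual_1979
remaining_change = actual_2007 - counterfactual

if __name__ == "__main__":
    print("Change due to education mix alone:", round(composition_effect, 2))
    print("Change left unexplained by the mix:", round(remaining_change, 2))
```

The two pieces add up to the total change by construction, but the split depends on which year's values are held fixed, which is one reason the answers remain conjectural.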
A. M. Stuart, Singapore Sapphire, Revenge in Rubies, Evil in Emerald
Mind-candy historical mysteries, set in Singapore, mostly among just-barely-genteel Britishers, in the years immediately before World War I. Enjoyable period color, though family tradition requires me to make dark asides about British imperialism as I read. §

Posted at May 31, 2022 23:59 | permanent link

## May 28, 2022

### Don't @ Me

Attention conservation notice: Rationalizing my gut-level dislike of a social medium as Objectively Correct. First drafted in mid-2017, left to rest in my drafts folder because, while sincere, it feels a bit mean. Posted now because I found myself re-writing the next-to-last paragraph.

If, as Leibniz has prophesied, libraries one day become cities, there will still be dark and dismal streets and alleyways as there are now. --- Lichtenberg
I mentioned, some years ago, that in response to reader requests I have a Twitter account. I use this only for announcing new posts here. Messages sent to it will go unread; attempts to communicate through it will be fruitless.

I have, nonetheless, put some time over the years into observing Twitter; I wish I had it back again. There are, so far as I can see, only four good uses for Twitter:

1. Announcements of actual, substantive posts, resources or discussions elsewhere. (But we have e-mail and RSS already.)
2. Announcing off-line events, details given elsewhere.
3. Snapshots of cute animals, pretty landscapes, children's birthday parties, and the like.
4. Jokes.

For everything else, well, if someone had deliberately tried to combine the worst features of comments sections and Usenet, they could hardly have done better --- except by first imposing silly length restrictions, followed by kludged-on threads that make Usenet seem a model of clear organization, plus of course an interface that channels people towards the outrage (or main character) of the moment.

I don't know whether it makes people unhappy and angry, or whether only unhappy, angry people persist in using it, but I am not joking when I say that we would all be better off if it disappeared immediately.

--- One of my long-held semi-crank notions is this: all online communication, being through writing, reproduces the social dynamics of literary communities, especially print-literary communities. This law holds independent of the educational level or even intellectual seriousness of the participants. Thus flame-wars, sock-puppets, selective quotation, trawling through the archive for discreditable episodes, "the lurkers support me in e-mail", creating isolated fora to incubate increasingly weird ideas, recycling from supposedly-authoritative source texts long after they're debunked (if they were ever bunked in the first place), spastic attention cascades in which "all fandom was plunged into war", etc., escape from the pages of the little magazines (such as the Philosophical Transactions of the Royal Society), to become part of everyone's life. Twitter has raised this to a new level of awfulness, by making it very hard to actually contribute anything of value, or, having done so, for others to find it and build on it, while still preserving the affordances for weirdness, meanness, and spasm-proneness.

That is my opinion; and it is further my opinion that you people should get off my lawn.

Update, 28 May 2022, further to the theme, in no particular order:

*: Some comments on Frost's review, without having read the book being reviewed. (1) I am, unsurprisingly, extremely sympathetic to the position that hashtag activism is basically futile. (If the authors really neglect Tufekci's empirical and theoretical work as much as Frost says they do, it's pretty damning.) (2) Not examining right-wing hashtag activism seems like an obvious analytical flaw. (Even if your primary interest is in left-wing movements, the comparisons are essential.) (3) It's true that Twitter isn't accountable to its users, or to the people-as-represented-by-government, but Frost for her part never makes clear which of the flaws she identifies would be remedied by such accountability. (4) Doing something about the opioid epidemic by tinkering with drug policy seems a hell of a lot more practical to me than doing something about it by overthrowing American capitalism, or even reversing the trends in inequality over the last half-century. (I would like to see those trends reversed.) ^

Posted at May 28, 2022 12:56 | permanent link

## April 30, 2022

### Books to Read While the Algae Grow in Your Fur, April 2022

Attention conservation notice: I have no taste, and no qualifications to opine on U.S. politics, or the lives and works of 20th century Marxist intellectuals.

Charles Willeford, Miami Blues and New Hope for the Dead
Mind candy mystery: First two "Hoke Moseley" mystery novels, written and set in Miami c. 1980. They're still funny and satisfying crime fiction, but very much artifacts of a vanished age. (The cover of the in-print edition of Miami Blues is more than usually misleading.) The community-college bits in Miami Blues, and particularly the pontificating English professor, are made more amusing by learning that Willeford's day job was, precisely, being an English professor at a Miami community college. §
Elizabeth Hand, Available Dark
Sequel to Generation Loss, which I re-read. This time around, Cass gets mixed up with the confluence of Nordic death metal, neo-paganism, bizarre art photography, and Iceland's role in the financial crisis of 2008. Stirring this together with drugs, booze, toxic nostalgia and her convincingly awful combination of bad decisions and sudden insight produces truly absorbing Plot.
Something which registered on the re-read of Generation Loss, but which eluded me the first time around: Cass isn't from just any podunk town in upstate New York, but from the literally haunted town in Hand's Black Light, whose inhabitants have made a deal with, if not the Devil, then at least a nasty avatar of Dionysus. I now believe that it is legitimate to take Cass's visions not as [just] drug-induced hallucinations, but factual descriptions of supernatural experiences. In particular, I think Cass is, if not exactly a valkyrie or banshee, then something in that line, a walking, talking, bourbon-and-meth-swilling, shutter-happy harbinger of doom, and the birds know it. All of which said, these books are eminently enjoyable on a "straight", non-fantastic level, which is a neat trick.
I eagerly look forward to her further mis-adventures. §
Graydon Saunders, A Succession of Bad Days
Mind candy fantasy, the sorcerers' apprenticeship division: 900-or-so pages of the education of wizards, in the same world as Saunders's The March North, with detailed thermodynamics. (It's not called thermodynamics but I dare say anyone who will enjoy this will recognize what is going on.) I did not enjoy this as much as I did The March North, at least in part because all of the characters tend to sound a bit too much the same, i.e., like Saunders. But I enjoyed it enough to keep reading all the way to the end. §
Jennifer Nicoll Victor, Understanding the U.S. Government
The fact that I listened to a course of Poli. Sci. 1 lectures, and learned from them, shows I am not qualified to actually review them. But I enjoyed this. §
Disclaimer: Prof. Victor and I actually collaborated once, in supervising a student project which tried to use social network analysis to get at the question of whether campaign donations affect Congressional outcomes. It was never published because we got null results (and the student moved on to other things). In retrospect, my guess is that resources (including funds) do matter, but that it's rare for disputed issues to have lots of resources on only one side of the dispute (if they did, the dispute wouldn't stay on the agenda for long), and the study wasn't well-positioned to get at the counter-factuals. But, like I said, I learned stuff about how my government works from these lectures, so you probably shouldn't listen to me!
Stanley Pierson, Leaving Marxism: Studies in the Dissolution of an Ideology
Mostly, this is three biographies of three very different intellectuals who all ended up ex-Marxists: Henri de Man, the Belgian advocate of planning and WWII-collaborator; Max Horkheimer of the Frankfurt School; and Leszek Kolakowski. Pierson emphasizes that, like many Marxist intellectuals, they came from bourgeois backgrounds, were drawn to socialism and to Marxism by its resonance with their bourgeois values, and ultimately left Marxism because of those same values. (He does not inquire into how they differed from intellectuals of bourgeois origins who remained Marxists, or the rare 20th-century Marxist intellectuals from humbler backgrounds like Gramsci.) There are no great revelations here, but they're well-written and well-researched biographical studies. Recommended if you care about intellectuals in politics, or the Marxist tradition. §

Posted at April 30, 2022 23:59 | permanent link

## April 25, 2022

### Intermittent Finds in Complex Systems and Stuff, No. 2

Attention conservation notice: Links to forbiddingly-technical scientific papers and lecture notes, about obscure corners of academia you don't care about, and whose only connecting logic is having come to the attention of someone with all the discernment and taste of a magpie (who's been taught elementary probability theory).
Or whatever the heck it is I study these days. (I did promise that this series would be intermittent.) In no particular order.
Modibo K. Camara, "Computationally Tractable Choice" [PDF]
I'll quote the abstract in full:
I incorporate computational constraints into decision theory in order to capture how cognitive limitations affect behavior. I impose an axiom of computational tractability that only rules out behaviors that are thought to be fundamentally hard. I use this framework to better understand common behavioral heuristics: if choices are tractable and consistent with the expected utility axioms, then they are observationally equivalent to forms of choice bracketing. Then I show that a computationally-constrained decisionmaker can be objectively better off if she is willing to use heuristics that would not appear rational to an outside observer.
If you like seeing SATISFIABILITY reduced to decision-theoretic optimization problems, this is the paper for you. I enjoyed this partly out of technical interest, and partly to see Simon and Lindblom's heuristic arguments from the 1950s rigorously validated.
One last remark: the slippage of "rationality" in the last sentence of the abstract is fascinating. We started by wanting to define "rational behavior" as being about effectively adapting means to ends; we had an intuition, inherited from 18th century philosophy, that calculating the expectation values in terms of rat orgasm equivalents would be a good way to adapt means to ends; we re-defined "rational behavior" as "acting as though one were calculating and then maximizing an expected number of rat orgasm equivalents"; now it turns out that that is provably an inferior way of adapting means to ends, and we have to worry about what it says about rationality. There's something very wrong with this picture! §
(Thanks to Suresh Naidu for sharing this paper with me.)
Carlos Fernández-Loría and Foster Provost, "Causal Decision Making and Causal Effect Estimation Are Not the Same... and Why It Matters", arxiv:2104.04103
To make an (admirably simple) argument even simpler: Think of decision-making as a classification problem, rather than estimation. If your classifier mis-estimates $\mathbb{P}\left( Y|X=x \right)$, but you're nonetheless on the correct side of 1/2 (or whatever your optimal boundary might be), it doesn't matter for classification accuracy! So if you over-estimate the benefits of treatment for those you decide to treat, well, you're still treating them...
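A toy illustration of that point, with entirely made-up numbers (nothing here comes from the paper): `over_estimate` badly over-states the benefit for everyone it treats but stays on the right side of the boundary, while `near_miss` is much closer on average but crosses the boundary for one unit. The names, the numbers, and the threshold of 1/2 are all assumptions for the sketch.

```python
def decide(p_hat, threshold=0.5):
    """Treat exactly those units whose estimated probability clears the threshold."""
    return [p >= threshold for p in p_hat]

def mean_abs_error(p_hat, p_true):
    """Average estimation error, ignoring what the estimates are used for."""
    return sum(abs(a - b) for a, b in zip(p_hat, p_true)) / len(p_true)

# Hypothetical "true" probabilities of benefit for five units.
true_p = [0.9, 0.7, 0.55, 0.3, 0.1]

# Over-estimates the benefit for everyone we would treat anyway.
over_estimate = [1.0, 0.9, 0.75, 0.3, 0.1]

# Much more accurate on average, but wrong side of 1/2 for the third unit.
near_miss = [0.9, 0.7, 0.45, 0.3, 0.1]

same_decisions = decide(over_estimate) == decide(true_p)
changed_decisions = decide(near_miss) != decide(true_p)
```

Here the worse estimator makes the same decisions as the truth, and the better estimator does not, which is the whole argument in miniature.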
Ira Globus-Harris, Michael Kearns, Aaron Roth, "Beyond the Frontier: Fairness Without Privacy Loss", arxiv:2201.10408
My comments got long enough to go elsewhere.
Hrayr Harutyunyan, Maxim Raginsky, Greg Ver Steeg, Aram Galstyan, "Information-theoretic generalization bounds for black-box learning algorithms", arxiv:2110.01584
I was very excited to read this --- look at the authors! --- and it did not disappoint. It's a lovely paper which both makes a lot of sense at the conceptual level and gives decent, calculable bounds for realistic situations. I'd love to teach this in my learning-theory class, even though I'd have to cut other stuff to make room for the information-theoretic background.
Adityanarayanan Radhakrishnan, Karren Yang, Mikhail Belkin, Caroline Uhler, "Memorization in Overparameterized Autoencoders", arxiv:1810.10333
I was blown away when Uhler demonstrated some of the results in a talk here, and the paper did not disappoint.
Mikhail Belkin, "Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation", arxiv:2105.14368
Further to the theme.
Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel, "Extracting Training Data from Large Language Models", arxiv:2012.07805
Demonstrates that from GPT-2 they can extract "(public) personally identifiable information (names, phone numbers, and email addresses), IRC conversations, code, and 128-bit UUIDs", even though "each of the above sequences are included in just one document in the training data".
• I don't understand why they compare zlib entropy to language-model perplexity, when entropy density is basically log(perplexity). This probably wouldn't make a big difference to any results but it bugged me.
• This has to be connected to Radhakrishnan et al., right?
• I'd really like to see someone throw this many parameters, and this much data, at something like Pereira, Singer and Tishby 1996 and see how it does in comparison, both in terms of the usual performance metrics and memorizing random (and inappropriate) bits of the training data. (Pereira may be in a position to do the experiment!)
• Some people will, of course, interpret this as evidence that GPT-2 knows who you are, and so is that much closer to judging the quick and the dead basilisk-dom being amenable to bargaining under the canons of timeless decision theory.
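To spell out the grumble in the first bullet: perplexity is just the exponential of the per-symbol cross-entropy, so comparing a compressed length to a perplexity is comparing an entropy-like quantity to the exponential of one. A minimal sketch, where a character-unigram model fit to the text itself stands in for the language model (that stand-in, and the function names, are my assumptions, not anything from the paper):

```python
import math
import zlib
from collections import Counter

def zlib_bits_per_byte(text):
    """Compressed size in bits, per byte of input: a crude entropy-rate estimate."""
    raw = text.encode("utf-8")
    return 8 * len(zlib.compress(raw)) / len(raw)

def unigram_entropy_nats(text):
    """Per-character entropy (in nats) of a unigram model fit to the text."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def perplexity(text):
    """Perplexity is exp(entropy per symbol); taking logs undoes it."""
    return math.exp(unigram_entropy_nats(text))

sample = "ab" * 1000
entropy_nats = unigram_entropy_nats(sample)
entropy_bits = entropy_nats / math.log(2)  # same quantity in the units zlib uses
ppl = perplexity(sample)
```

For this sample the unigram model sees two equally likely symbols, so the perplexity is 2 and its log is the entropy; zlib, which can exploit the repetition, reports far fewer bits per byte.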
Gabriel Rossman and Jacob C. Fisher, "Network hubs cease to be influential in the presence of low levels of advertising", Proceedings of the National Academy of Sciences 118 (2021): e2013391118
In a pure social-contagion/diffusion-of-innovations process, the contagion/innovation will spread farther, and spread faster, if it begins at one of the most central nodes in the network, than if it begins at a randomly chosen node, let alone a deliberately-peripheral one. This motivates a lot of effort in applications to search for influential figures and target them. What Rossman and Fisher do is extend the model very modestly, to model "advertising", i.e., a probability for nodes to contract the contagion / adopt the innovation spontaneously, without direct contact with an infected / adopter node. What they show is that even a very small amount of advertising massively reduces the advantage of beginning at a central node. It's a very convincing, lovely, and potentially-applicable result. I also strongly suspect there's a genuine phase transition here, with the transition point moving towards zero external field as the size of the network goes to infinity, but I haven't been able to show that (yet). --- Many thanks to Prof. Rossman for presenting this paper to CMU's Networkshop.
Yuan Zhang, Dong Xia, "Edgeworth expansions for network moments", arxiv:2004.06615
This is technical, but valuable for all of us interested in being able to quantify uncertainty in network data analysis, especially for those of us working in the graph-limits/graphons/conditionally-independent-dyads framework. --- Thanks to Prof. Zhang for a very enjoyable conversation about this paper during a "visit" to Ohio State via Zoom.
David Childers, "Forecasting for Economics and Business"
Great materials for an undergraduate economics course (73-423) at CMU. Thanks to David for the pointer.
Vera Melinda Galfi, Valerio Lucarini, Francesco Ragone, Jeroen Wouters, "Applications of large deviation theory in geophysical fluid dynamics and climate science", La Rivista del Nuovo Cimento 44 (2021): 291--363, arxiv:2106.13546
The laws of large numbers say that, on large enough scales, random systems converge on their expected values. ("Large scales" here might indeed be number of samples, or length of time series, or something similar.) In symbols which you should not take too literally here, as $n \rightarrow \infty$, $\mathbb{P} \left( |A_n - a_{\infty}| > \epsilon \right) \rightarrow 0$ for every $\epsilon > 0$, where $a_{\infty}$ is the limiting behavior of the process. Large deviations theory is about fluctuations away from the expected behavior, and specifically about finding rate functions $r$ such that $\mathbb{P} \left( |A_n - a_{\infty}| \geq \epsilon \right) \sim \exp{\left( -n r(\epsilon)\right) }$. This is a "large" deviation because the size $\epsilon$ is staying the same as $n$ grows. We'd anticipate seeing this kind of behavior if $A_n$ was the result of some number $\propto n$ of independent random variables, all of which had to cooperate in order to produce that $\epsilon$-sized fluctuation. More specifically, a good point-wise rate function $I$ will let us say that, for a set of outcomes $B$, $\frac{1}{n}\log{\mathbb{P}\left( A_n \in B \right) } \rightarrow - \inf_{x \in B}{I(x)}$ so that, as the saying goes, an unlikely large deviation is overwhelmingly (exponentially) likely to happen in the least unlikely possible way. Large deviations theory gives us lots of tools for calculating rate functions, and so saying how unlikely various large deviations are (at least to within asymptotic log factors), and for characterizing those least-unlikely paths to improbable events. (I am glossing over all kinds of lovely mathematical details, but follow some links.)
Now climate systems contain a lot of random variables, which are mostly tightly dependent on each other but not completely so. And a lot of what we should worry about with climate comes from large fluctuations away from typical behavior. (E.g., transitions from one meta-stable state of the climate, where, say, there is a Gulf Stream in the North Atlantic keeping western Europe warmer than Labrador or Kamchatka, to another meta-stable state where there is not.) So climate modeling is actually a very natural application for large deviations theory. This is a well-written review paper surveying those applications, with a minimum of mathematical apparatus. (The implied reader does, however, remember fluid mechanics and thermodynamics.) It makes me want to learn more about rare-event simulation techniques. §
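For the skeptical, the rate-function statement in the entry above can be checked numerically in the simplest possible case, which is my choice of example and not anything from the review paper: $A_n$ is the fraction of heads in $n$ tosses of a coin with bias $p$, and $B = [a, 1]$ with $a > p$. The textbook (Cramér) rate function is then the relative entropy $I(x) = x \log{\frac{x}{p}} + (1-x)\log{\frac{1-x}{1-p}}$, whose infimum over $B$ is $I(a)$. The sketch computes the exact binomial tail in log space (to dodge underflow) and compares $-\frac{1}{n}\log{\mathbb{P}(A_n \geq a)}$ to $I(a)$; all function names are mine.

```python
import math

def log_upper_tail(n, p, a):
    """log P(S_n / n >= a) for S_n ~ Binomial(n, p), summed in log space."""
    k_min = math.ceil(n * a - 1e-9)  # small slack against floating-point round-up
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
        for k in range(k_min, n + 1)
    ]
    m = max(log_terms)  # log-sum-exp trick
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

def rate(a, p):
    """Relative-entropy rate function for Bernoulli(p) sample means, 0 < a < 1."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))

def empirical_rate(n, p=0.5, a=0.6):
    """-(1/n) log P(A_n >= a), which should approach rate(a, p) as n grows."""
    return -log_upper_tail(n, p, a) / n

gap_small_n = empirical_rate(100) - rate(0.6, 0.5)
gap_large_n = empirical_rate(5000) - rate(0.6, 0.5)
```

The gap is positive (the Chernoff bound guarantees the exponential rate is never beaten) and shrinks roughly like $\log{n}/n$, which is what "to within asymptotic log factors" is hiding.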

Posted at April 25, 2022 10:41 | permanent link

### Positive-Definite Tab Closure

Attention conservation notice: A link-dump piece, where some of the links were first opened in 2015.

Tabs I have closed recently, which are of a positive and/or constructive and/or cheerful nature:

(I am sure that I am forgetting to credit sources for these links, and can only plead for forgiveness.)

Posted at April 25, 2022 10:40 | permanent link

## March 31, 2022

### Books to Read While the Algae Grow in Your Fur, March 2022

Attention conservation notice: I have no taste, and no credentials to opine on the sociology of education, political and moral philosophy, medieval Islamic science, or even, strictly speaking, pure mathematics.

Dana Stabenow, A Cold Day for Murder, A Fatal Thaw, Dead in the Water, A Cold-Blooded Business, Play with Fire
Mind candy mysteries, where the Alaskan environment is as much a character as any human being, or husky. Stabenow was, I believe, originally a science fiction and fantasy writer, and I think some of that comes through in the way the very strange world of Alaska is unfolded before the reader. It also comes through in the character of Kate Shugak, a hero of basically-royal birth who lives on the border between civilization and the wilderness, and who roams the countryside defeating monsters and malefactors, especially those who have offended against the laws of kinship and hospitality. (There are a lot of explicit references to Greek myths and I do not believe any of this is coincidence or even unconscious.) The fact that I read five of these in a month, and have more in the queue, tells you how easily they go down. §
Douglas B. Downey, How Schools Really Matter: Why Our Assumption about Schools and Inequality Is Mostly Wrong
Downey studies some nationally-representative longitudinal data sets, which measure student achievement in reading and math at multiple points in the school year, over multiple years. "Longitudinal" here means that each student is being measured multiple times, allowing one to draw inference about how much was learned when. The basic finding Downey extracts from this is that during the school year, richer and poorer students, and black and white students, learn at basically the same rate. But they arrive at school at very different average levels of achievement, and their gaps grow while out of school each year. Thus, on this evidence, schools for the disadvantaged are in fact doing about as well at teaching reading and math as other schools. The inequality in educational outcomes, then, isn't due to inequality in schooling, but to (as Downey puts it) the other 87% of students' lives.
This is remarkably contrary to received opinion, what Downey calls "The Assumption", that schools for the poor are poor schools which do not teach effectively. I get the impression that Downey started by wanting to be talked out of this position, but came to embrace it for lack of intelligent opposition:
I don't think that the people questioning the evidence are bad people, but they are reluctant to let go of the dominant narrative about schools. It would be one thing if the reason was because they had issues with whether the ECLS-K item-response theory scales of reading can be considered truly interval, or if they questioned whether nonschool investments in children are constant across seasons, or if they thought that the approach scholars use to model the overlap days between test dates and the beginnings and ends of school years was insufficient. ... But while many have resisted the empirical patterns in chapters 1--4 and remain committed to The Assumption, the quality of evidence doesn't seem to be the obstacle. [p. 97]
I join Downey's audiences in astonishment. I also join him in thinking that "we really need to reform the distribution of rewards in the broader society", but I just have a hard time swallowing the findings. (Among other things, if he's right, why are parents so convinced otherwise?) But I also don't have any clever explanations to make this pattern in the data into a mere artifact. As a statistician, I do wonder about whether these surveys really cover a nationally representative sample of students and schools. (Though it's far from clear what sort of sampling bias would produce this pattern!) There is also the issue (which Downey highlights in the quote above) of whether these reading and math scores are really "interval". Concepts like "median" make sense with merely ordinal variables, but something like "the change in the median poor kid's reading score from September to May is equal to the change in median scores for rich kids", $X_p(2) - X_p(1) = X_r(2) - X_r(1)$, needs us to be able to compare differences at arbitrary points along the scale. So this is resting a lot on the ways the survey researchers translate students' answers into numerical values, and I'd have liked to see a lot more about that. In particular I'd want to make really sure that this sort of parallel trajectory isn't an artifact of the scaling procedure.
It is unlikely, but not I guess impossible, that I will actually investigate this properly. In the meanwhile, I am informed, but puzzled and unsettled. §
(Text lightly edited 3 June 2022, to resolve some ambiguous pronouns etc.)
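To make the interval-scale worry in the Downey review concrete, here is a toy with entirely hypothetical scores (not ECLS-K data, and the squaring transformation is an arbitrary stand-in for "some other equally defensible scaling"). A monotone re-scaling leaves every ordinal fact alone, including which student is the median, but it can turn equal median gains into unequal ones.

```python
from statistics import median

# Hypothetical September and May scores for two groups of three students.
poor_sept, poor_may = [10, 20, 30], [20, 30, 40]
rich_sept, rich_may = [40, 50, 60], [50, 60, 70]

def median_gain(before, after, scale=lambda x: x):
    """Change in the median score, after applying a scaling to every score."""
    return median(scale(x) for x in after) - median(scale(x) for x in before)

def square(x):
    """An arbitrary strictly increasing re-scaling (for positive scores)."""
    return x * x

raw_gains = (median_gain(poor_sept, poor_may), median_gain(rich_sept, rich_may))
rescaled_gains = (
    median_gain(poor_sept, poor_may, square),
    median_gain(rich_sept, rich_may, square),
)

# The re-scaling preserves order, so the median student is the same person.
median_is_preserved = median(square(x) for x in poor_may) == square(median(poor_may))
```

On the raw scale both groups gain 10 points; on the squared scale the second group gains more than twice as much, even though no student changed rank. Whether the real scaling procedure could manufacture (or hide) parallel trajectories in this way is exactly what I'd want checked.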
Jürgen Jost, Postmodern Analysis
I should begin by admitting that I took real analysis as a sophomore, scraped out a C through the kindness of the teacher, and became a physicist. (I did eventually learn measure-theoretic probability.) So the idea of anyone taking advice from me on pure math textbooks is preposterous.
I should also say that I met Jürgen through Santa Fe more than twenty years ago, admire his work on information geometry and complex systems, have given talks at the Max Planck Institut he directs, etc. If I read one of his books and didn't like it, I'd just say nothing publicly.
With my throat now hopefully adequately cleared: When we all went home in March 2020, I got the idea that this would be when I finally learned some important areas of math properly. This fantasy led to downloading a large number of books from the library, and discovering that I would never read most of them for good reason. But this one I stuck with. It's a really good survey of crucial topics in analysis, starting with the basics of differentiation and Riemann integration, visiting things like ordinary differential equations as dynamical systems, Lebesgue integration, and function approximation, and ending up with the calculus of variations and partial differential equations and their interconnections. It's "postmodern" only in the sense that it comes after the classical works on modern analysis of the mid- / late- 20th century, and tries to give a survey of what a bright young mathematician should know now. The exposition is great, consistently just rigorous enough that I needed to inhibit my lizard-brain physicist impulses ("it'd be nice if that equation had a square-integrable solution, therefore it does"), but always with an eye on applications, i.e., on reality. It's really quite enjoyable, and makes me want to read Jost's other textbooks. §
(The obvious question is whether I would have done any better, as an undergrad, if this had been the text in my real analysis course. Honesty compels me to say: "not on your life"; our textbook was forgettable but decent, the problem was teenage me.)
Final disclaimer: I read the second (2003) edition; the third (2005) edition seems to mostly correct mis-prints, and add some results on coverings in the chapter on $L^p$ function spaces. But I cannot swear to its content the way I can to the 2nd edition.
Stuart Hampshire, Justice Is Conflict
This is a strange (and short) little book of philosophy. The starting point is Plato's analogy, in the Republic, between conflict within the soul and conflict within the city (= polity). Hampshire says that, pace Plato, the way we really resolve conflict in the city is to make sure that all (he says "both") sides know that they have been able to make their case and be heard, even if they cannot get what they want. What ultimately matters is that there was a fair procedure, rather than a substantively just outcome. In the analogy of inner conflict, individual people just have more-or-less incompatible values, and we should not expect to find some way of reconciling them or subordinating one to the most correct values. Nor, he says, should we even want such a reconciliation or ordering.
I am sympathetic --- in some sense he's getting at the core of liberalism --- but I found the argument lacking. The analogy is obviously a bit weak: I don't think he ever really addresses what would correspond to a fair procedure in the soul. (Adversarial or critical thinking is all very well to endorse, but being your own critic has obvious limits.) Also, I think he equivocates about whether unifying values is impossible, or merely undesirable. That's fine by me, because I am strongly in the "impossible" camp --- I encountered "A heterarchy of values determined by the topology of nervous nets" at an impressionable age, and still regard it as irrefutable --- but philosophically a bit unsatisfying.
More frustrating was that Hampshire is fully aware that there are often disputes about which procedures are fair, and this doesn't seem to help us figure that out at all. To use a (banausic and depraved) analogy of my own: if I am writing new code to perform some task, i.e., devising a procedure, I check whether it works right by seeing if it gives the correct answer on test cases, i.e., is substantively correct in particular circumstances. But of course, just to make things circular, in other cases I work out what the answer is by using my procedure. At a much more elevated plane than numerical software, something like this would seem to be at work here, and could use some philosophical illumination. That is, I wish Hampshire would absorb something like Laudan's Science and Values. §
George Malagaris, Biruni [doi:10.1093/oso/9780190124021.001.0001]
Brief historical study of Abu Rayhan Muhammad ibn Ahmad al-Biruni (973 -- 1050?), emphasizing the historical context of Central Asia and the eastern Islamic world in general, giving the main facts of Biruni's biography (including puncturing some picturesque stories), and surveying his major works. Pride of place in Malagaris's treatment goes to Biruni's India, fairly enough, but he's pretty comprehensive, and seems to understand the math. (I was astonished to learn that Biruni translated/adapted the Yoga sutras of Patanjali, which must have made some heads explode.) There's also some treatment of his correspondence with ibn Sina; it is simultaneously reassuring and depressing to see that a millennium ago, great scholars were just as capable of mutual incomprehension, dismissal, and pettiness as their modern counterparts, or online posters (cf.). (Actually, I suspect there's the possibility for a very interesting study of different conceptions of "science" in this exchange, and I wonder if someone has done it.) The book concludes with a treatment of Biruni's place in later historical memory, including the way he is claimed by multiple modern nation-states as part of their illustrious past. §
John Scalzi, The Kaiju Preservation Society
Mind candy comic science fiction. It's Scalzi, which means it's funny and mostly but not entirely lighthearted, and reads extremely smoothly. §
Jane Langton, The Dante Game
Mind candy mystery: the umpteenth book in Langton's series, in which Homer Kelly stumbles his way into an artistic or literary enthusiasm and a homicide investigation. This time it's Dante, and the city of Florence, and the new pope's anti-drug crusade, which is far too successful for some people's liking. It's an old favorite which holds up very well. (Previously.) §

Posted at March 31, 2022 23:59 | permanent link

## February 28, 2022

### Books to Read While the Algae Grow in Your Fur, February 2022

Attention conservation notice: I have no taste, and no qualifications to opine on the history of Central Asia, the philosophy of science, the anthropology of New Guinea and/or cultural creativity, archaeology, Antarctic exploration, or the philosophy of Spinoza.

Adeeb Khalid, Central Asia: A New History from the Imperial Conquests to the Present
By "central Asia", Khalid means "Turkestan", both the eastern parts conquered by the Qing in the 1700s and the western parts conquered by the Romanovs in the 1800s. (Thus Afghanistan, Tibet, Mongolia, etc., feature only incidentally.) He begins with those conquests, after a little scene-setting to make their events comprehensible, and then goes down to 2020 and the on-going police state and cultural genocide in Xinjiang. Khalid's great (and persuasive) theme is how ordinary this history is, in a global perspective --- imperial conquest, the arrival of modernity, the development of nationalism and the construction of national cultures (he doesn't use the phrase "peasants into Uzbeks", but he comes close), Communism as a vehicle for nationalism, ambitious-to-mad state projects to develop economies, to transform nature and/or transform society, widening entanglement with global culture and economic forces... This is what the 19th, 20th and 21st centuries were like, for much if not most of the world. It's extremely scholarly --- Khalid has clearly read and synthesized almost everything --- but still very readable. If you are at all interested in this part of the world, it's very much worth your time. §
Wesley C. Salmon, with Richard C. Jeffrey and Jeffrey G. Greeno, Statistical Explanation and Statistical Relevance (Pittsburgh: University of Pittsburgh Press, 1971)
1300 words of review: Distinctions That Make Differences to Chances.
Annalee Newitz, Four Lost Cities: A Secret History of the Urban Age
I have mixed feelings about this. On the one hand, it's pleasantly-written and engaging popular social science about four interesting and important cities that were, for one reason or another, abandoned and (largely) forgotten: Çatalhöyük, Pompeii, Angkor and Cahokia. I learned from it, and I mostly enjoyed reading it. On the other hand, I sometimes found myself irritated by the sensation that Newitz was pandering to the prejudices of people like me --- all the cities were full of diverse immigrants, etc., etc. (Looking around after writing that, I see James Palmer had a similar reaction to those bits.)
Beyond those matters of tone, though, I do want to quibble with the way Newitz presents these cities. Many archaeologists have a bad tendency to present speculative interpretations as though they were facts. (They are not, of course, alone in this, and I've complained about this before.) This tendency seems to be very much on display here in the chapters on Çatalhöyük and Cahokia, where we have no writings to fill us in on ideologies and structures of inequality (not to say oppression). I can't help but suspect that this makes those cities better screens for modern projections than Pompeii and Angkor. There's also some trash-talking of V. Gordon Childe that strikes me as unfair, and dismissal of the idea that there are developmental trajectories to more hierarchy, size and complexity as Eurocentric myths, rather than cross-cultural empirical regularities. (And of course a key part of the Enlightenment world-view was seeing Europe as a place which had regressed in these regards for a millennium of barbarism, "mired in the superstitions and brutal monarchies of the Middle Ages", as Newitz puts it on p. 210.)
On re-reading this, I see I've given more space to what irritated me, which is mostly incidental, than to what I enjoyed --- so I will just re-iterate that despite my quibbles, I did enjoy. §
(Thanks to Jan Johnson for my copy of the book.)
Fredrik Barth, Cosmologies in the Making: A Generative Approach to Cultural Variation in Inner New Guinea
750-plus words of review: Cosmology and Cosmologists --- The Modern Ok School.
(I forget what chain of references first put this on my radar --- probably something in the Dan Sperber / Pascal Boyer nexus, but that's honestly just me guessing.)
Edmund Stump, The Roof at the Bottom of the World: Discovering the Transantarctic Mountains
A scientist's winningly enthusiastic history of exploration in the Antarctic mountains, from the first visits to the continent, through the heroic era, to the early 1960s. (It's startling just how much more massive the US's post-1945 efforts were than everything that came before.) The stories are supplemented with Stump's own memories of decades of geologizing on the continent, and his very good photographs. §
Steven Nadler, A Book Forged in Hell: Spinoza's Scandalous Treatise and the Birth of the Secular Age
Partly exposition of the Theological-Political Treatise, partly a biography of Spinoza, partly intellectual, political and religious history to set the context. I enjoyed it, but since I've never actually read the Treatise, despite an interest in Spinoza, I'm in no position to judge it. §

Posted at February 28, 2022 23:59 | permanent link

## January 31, 2022

### Books to Read While the Algae Grow in Your Fur, January 2022

Attention conservation notice: I have no taste, and no qualifications to opine on the history and geopolitical context of Antarctic exploration, the social structure of medieval China, or philosophy of any kind.

Berlin Station
MI-5
For reasons I will not elaborate on, I binge-watched the entirety of these two spy drama series over a period of about eight weeks. (My viewing-partner needed a lot of distraction, and had lived in both Berlin and London.) Both had ripped-from-the-headlines plots and some good acting [*], including some overlapping cast. Over-all, I liked Berlin Station better, since it had more ambitious and more coherent plots, though there was a development late in the third and final season which at last made me get why people write "fix it" fanfic. (Yes, I went looking and found that people had indeed written the relevant fixit fics. Yes, I read them. No, I will not link to them.) There is a dissertation to be written about the absurdity of many of the plots in MI-5 (the economics alone -- oy vey). One nice question to investigate in such a dissertation would be whether those hare-brained notions arise from the writers' sincere ideas about how the world works, the audience's ideas about the world, the writers' ideas of the audience's ideas about the world, or the writers' ideas of what the audience will tolerate in escapist entertainment. §
*: Except for the painful imitations of American accents in MI-5.
Adrian Howkins, Frozen Empires: An Environmental History of the Antarctic Peninsula
A solid history of political conflicts over the Antarctic Peninsula between the British Empire, Argentina, Chile, the US and the Soviet Union, with other parties showing up as bit players. Howkins makes a big deal out of a contrast between the imperial powers' claiming "environmental authority", in the sense of producing universally-valid and useful scientific knowledge about the environment, and the "environmental nationalism" of Argentina and Chile, claiming a more intimate, specific and un-generalizable connection to Antarctica and its environment. (I'd like to read some of the literary works Howkins references, but lack the Spanish.) In this view, the Antarctic Treaty, which suspends sovereignty claims over the continent but limits influence to countries engaged in serious scientific research, constitutes a full, apparently final, victory of environmental authority over environmental nationalism. The actual Antarctic environment and its history are thus not in the foreground; they appear more by way of an obstacle to (e.g.) Chile trying to actually have a naval or administrative presence on the Peninsula, or of whaling becoming unimportant, than as subjects in their own right. While I began this very skeptical that there was anything interesting to say about imperialism in the only part of the world where there wasn't anyone to imperialize, by the end Howkins had me convinced this was, in fact, a real part of the history of Antarctica. (That Argentine and Chilean nationalists were an alternative to imperial environmental authority, as opposed to just wanting to be the authoritative imperialists themselves --- there I was less persuaded.) §
Nicolas Tackett, The Destruction of the Medieval Chinese Aristocracy
This is awesome: it's a social network study of the office-holding elite of the later Tang dynasty (after the An Lushan rebellion*), based on funerary inscriptions that gave extensive biographical and genealogical details. Archaeologists have dug up thousands of these, along with others recorded by epigraphers; in some cases these can be connected to biographies in the official dynastic histories (and the two sources usually agree). By assembling a database of these inscriptions, Tackett is able to, in turn, construct a social network of the Tang elite --- rich families that held high office, for many generations on end, in many cases over multiple dynasties. Tackett documents their persistence in office, their peregrinations around the empire, their residences in or between the two capital cities of Chang'an and Luoyang, and their intermarriages and ties of patronage. (Interestingly, the marriage network seems to show two modules or blocks**, one centered on the imperial family. I would have expected more; this would be worth investigating with good community-discovery methods.)
Tackett's argument, convincing to this non-expert, is that this elite was incredibly successful at maintaining their position, despite all the challenges put in their way --- not just An Lushan, but the rise of more-or-less recognized hereditary warlords in the northeast, and the examination system. (My fellow Eisensteinians will perk up when Tackett discusses the role of family manuscript libraries in preparing for competitive examinations in a pre-print society.) In this account, this elite was perfectly set to continue perpetuating itself for generations to come, until the Huang Chao rebellion captured and wrecked the capital cities in 880--881, and in doing so just flat-out killed an immense proportion of those elites. This was the destruction of the title, and more or less the close of Tackett's story.
Now obviously I am not any kind of expert on medieval China, and so it would be presumptuous of me to judge whether Tackett has fairly encompassed all the relevant evidence, and so render a judgment on his account of both the continued pre-eminence of this elite, and its extinction. But it makes a great deal of sense, and I really want to get my hands on the data. I'd recommend it for anyone interested in historical social networks, especially recovering social networks from text, at least if they have basic familiarity with the outlines of pre-modern Chinese history. §
*: While it's tangential to his point, Tackett cannot resist pointing out that Steven Pinker, in describing the An Lushan rebellion as proportionally the worst disaster in human history, relied on a source which obviously confused a decline in the Tang state's ability to enumerate (and so tax and conscript) its subjects with an actual death toll.
**: Tackett says "cliques", but clearly doesn't mean the word in its graph-theoretic sense.
Ernest Gellner, The Devil in Modern Philosophy
1974 essay collection by one of my gurus; I first read it in 1997 when I'd just discovered Gellner and was tearing through everything of his I could find, and re-read it now because the CMU library got electronic access. The essays here range in time from the 1950s, when Gellner was attacking Wittgenstein and "ordinary language" philosophy, through the early 1970s. So the oldest layer here consists of companion pieces to Words and Things, while the most recent are studies for Legitimation of Belief. On re-reading, what I found the most interesting was that top-most layer. I would particularly single out the study of French 18th century materialism, as exemplified by d'Holbach's System of Nature, and the final essay "On Chomsky". Gellner's point in the latter is that what made Chomsky truly revolutionary was his insistence that ordinary human "lifeworld" competences require explanation, and that real explanations must be impersonal, mechanistic, structural. In Gellner's rendition, Chomsky's real objection to behaviorism wasn't that it was inhuman, but that it only pretended to give mechanistic explanations. (I think this is right.)
I can't recommend this to anyone who isn't already deeply into Gellner, but I do want to take the occasion to plug Legitimation of Belief, which is terrific. §
Susanna Clarke, Piranesi
This is radically different from Jonathan Strange and Mr. Norrell, but still amazing. Having carefully preserved myself from spoilers, there were only one or two points where I could see what was coming before the narrator did, and that was, for me, part of the charm, so I will keep my mouth shut about the marvelous transformations you will experience as you read this. You should read this. §

Posted at January 31, 2022 23:59 | permanent link

## December 31, 2021

### Books to Read While the Algae Grow in Your Fur, December 2021

Attention conservation notice: I have no taste, and no qualifications to opine on the fountainheads of the western philosophical tradition, the history of 17th century science, political philosophy, cognitive psychology, the transmission of inequality, or even social-scientific measurement.

Plato, trans. and ed. Christopher Rowe, Theaetetus and Sophist
Theaetetus is about knowledge, and more specifically how false belief is even possible --- say, falsely identifying someone else as Socrates, if we (supposedly) know Socrates. It's notable for Socrates propounding at least three distinct theories of knowledge, and undermining them all, ending in perplexity. There are some deeply interesting pieces here, including bits (like the analogies of the wax impressions, and of the aviary) where Plato is trying to think through how to make something knowledge-like work. Then there are the bits of metaphysics about being and not being which I frankly cannot comprehend, and have to hope sounded more plausible in Greek. (I do not think this is Rowe's fault.)
(The dialogue is also notable in that early on Socrates makes a big song and dance about how he's just a "midwife" and is only going to help bring out the ideas already in young Theaetetus's mind. Then the whole rest of the dialogue is Socrates setting up and knocking down theories, with one piece of criticism from Theaetetus's teacher Theodorus [161]; the youth contributes exactly nothing, beyond the usual "just as you say, Socrates" or "I do not altogether follow, Socrates". [See also.])
Sophist is, supposedly, a sequel, where Theaetetus converses with another distinguished visitor, an unnamed philosopher from Elea. (Socrates has vanished.) The goal here is to try to define the character of the sophist, by means of a series of binary distinctions. The visitor propounds a series of very distinct-looking definitions, all unflattering, which are held to be equivalent. To give something of the flavor, one definition (223) is
Then according to what we are saying now, Theaetetus, it seems that if we take expertise in appropriation, in hunting, in animal-hunting, in land-animal-hunting, in the hunting of humans, by persuasion, in private, involving selling for hard cash, offering a seeming education, the part of it that hunts rich and reputable young men is --- to go by what we are saying now --- what we should call the expertise of the sophist.
while another (268) is
The expert in imitation, then, belonging to the contradiction-producing half of the dissembling part of belief-based expertise, the word-conjuring part of the apparition-making kind from image-making, a human sort of production marked off from its divine counterpart --- if someone says that the one who is 'of this family kind, of this blood' is the real sophist, it seems his account will be the truest.
In between, there is a lot of discussion of, essentially, how multiple statements can all be true of the same object.
(Theaetetus opens with a frame-story about someone having witnessed, and taken notes on, the original conversation between Theaetetus, Socrates and Theodorus, and ordering his slave to read the dialogue that follows. This conceit is forgotten in Sophist.)
I am impressed with Theaetetus (though not with Theaetetus), but both books are strange, and left me feeling I'd missed the point. §
Mary Sisson, Tribulations
Mind candy science fiction, sequel to Trang and Trust. It's deeply enjoyable and I hope we don't have to wait another seven years for more. §
Lois McMaster Bujold, Penric and the Shaman, Penric's Mission, Mira's Last Dance, The Prisoner of Limnos, The Orphans of Raspay, The Physicians of Vilnoc, The Assassins of Thasalon, Knot of Shadows
Mind candy fantasy, following on from Penric's Demon but all, I think, self-contained. These are short, minor Bujolds (except for Assassins, which is a full-length novel), but even minor Bujold is a treat. (No purchase link since these only seem available electronically.) §
Domenico Bertoloni Meli, Mechanism: A Visual, Lexical, and Conceptual History
This is a brief but deeply erudite historical study of what "mechanism", "the mechanical philosophy" and mechanical explanations meant during the long 17th century that gave us the Scientific Revolution. Bertoloni Meli has read, seemingly, absolutely everything, in multiple languages, and can move skillfully and insightfully from historiographic debates about "the mechanization of the world picture" to contemporary ideas in the philosophy of science about explanation by mechanisms to the details of how ligatures of arteries were drawn in anatomical texts, and what this tells us about how doctors' understanding of what ligatures did changed. All of this is done with very graceful writing and elegantly-chosen illustrations. It's incredibly impressive and makes me want to read a lot more of his work. §
(On a local and merely personal note, this book is based on lectures given at the University of Pittsburgh in 2016. I was told about those lectures and invited to attend them by a then-new acquaintance who worked in the history of science. Only in retrospect did I get why she seemed so disappointed when I had to cancel on short notice. I am not very swift on the uptake, but --- Reader, I married her.)
Joseph Heath, Enlightenment 2.0: Restoring Sanity to Our Politics, Our Economy, and Our Lives
2800 word review: Enlightenment Is Other People. §
Patrick Sharkey, Stuck in Place: Urban Neighborhoods and the End of Progress toward Racial Equality
Sharkey's primary empirical finding is that, among all black families, there is a substantial minority of very poor black families living from generation to generation in neighborhoods with many other poor black families, and who mostly move (if they move at all) from one such neighborhood to another. Moreover, these families are really much worse off than typical Americans, in every way which we can measure, and which drags down over-all averages for blacks as a group. What Sharkey wants to argue, on this basis, is that part of the reason for these persistently bad outcomes is that concentrating these poor, troubled families in neighborhoods with lots of other poor, troubled families makes it harder for any of them to improve their situation.
The natural methodological worry goes like so: suppose that there are some poor, troubled families who will struggle to improve their situation, partly because of internal issues, partly because of larger social forces which would afflict them wherever they lived. But because they are poor and troubled, all sorts of processes, starting with housing costs, will concentrate them in neighborhoods with other families in similar situations. Even if the neighborhood has no effect on life prospects, it would still be a sign of those prospects. Under mild assumptions, it'd be a stronger sign the longer a family has been stuck in such a place. More plausibly: regressions of life outcomes on neighborhood of residence, neighborhood of origin, parents' or grandparents' neighborhoods, etc., could all be explained through infinitely many combinations of genuine neighborhood effects, and neighborhoods acting as signs.
Now there are ways you can begin to pick apart this causal-inference tangle, and in various of the journal papers on which this book is based Sharkey does so. (Some, but not all, of this material is covered in the online appendix.) In particular, his joint paper with Felix Elwert on the inheritance of dis-advantage is actually just as good as I'd expect of Felix. But in this book I grew impatient, while reading, with the feeling that I was just being told about every possible linear regression which you could run on the Panel Study of Income Dynamics where both race and neighborhood poverty rate were regressors, as though that addressed the issue. I realize that this says more about my professional deformations than the merits of this book.
I read this for the inequality class, and while I didn't assign any of it this time, I might well do so if I re-teach it. I will definitely be recommending the backing papers as supplemental reading. §
Richard A. Zeller and Edward G. Carmines, Measurement in the Social Sciences: The Link between Theory and Data (1980)
I wish I liked this more, because its heart is in the right place. In particular, trying to see what remains of psychometrics' classical test theory after admitting that systematic error is possible is a worthwhile undertaking! But this book's faith in what can be achieved through factor analysis and comparing correlation coefficients is utterly misguided. (Cf., though Clark doesn't discuss this book explicitly, Glymour.) I had hoped I could recommend this to The Kids, but an adequate exposition of necessary caveats would rival the text itself for length. §

Posted at December 31, 2021 23:59 | permanent link

## November 30, 2021

### Books to Read While the Algae Grow in Your Fur, November 2021

Attention conservation notice: I have no taste, and no qualifications to opine about how to conduct either social science, or the German Social Democratic Party at the end of the 19th century.

Eduard Bernstein, The Preconditions of Socialism (1899; trans. and ed. Henry Tudor)
The original revisionist. Here are some of Bernstein's more important and representative heresies, from the viewpoint of orthodox, Second International Marxism: the dialectic is unhelpful and not actually essential to Marx and Engels's best work; the number of people who own capital is growing, not shrinking; class structure is not simplifying to a stark opposition of capitalists and proletarians; workers are not being increasingly immiserated; formal democracy is essential; it turns out that in even partially-democratic states, organized political action can do a lot to improve workers' lives, without waiting for the revolution; the state couldn't just take over running the economy successfully; etc., etc. As should be obvious from my tone, I find a lot of these ideas extremely congenial, though Bernstein was, it must be said, rather more sanguine about colonialism, and especially about European nationalism, than looks wise in retrospect. (Since, 15 years after this book, he was opposing World War I in the Reichstag, I wonder if he ever explicitly admitted errors on those points.) A dedicated proponent of orthodoxy could, naturally, argue that while the prophecies haven't been fulfilled yet, their hour will come round at last...
This edition is the first un-abridged English translation, with helpful footnotes explaining now-dated references, and giving full citations for his quotations &c. (The first, seriously abridged, English translation is online.) It says something about me that I found this an exciting read.
Scott Ashworth, Christopher R. Berry and Ethan Bueno de Mesquita, Theory and Credibility: Integrating Theoretical and Empirical Social Science
My remarks having passed the 900 word mark, they became a separate review.
Mind-candy fantasy, in a world of little magic, but a lot of superstition and a lot of desire for vengeance. Not Cherryh's identity-bending best, but I wanted a comfort re-read and this delivered.
Mind candy thriller. What if (I refuse to regard this as a spoiler) Dexter, but the serial killer who hunts killers was a Sydney homicide detective? (I haven't bothered to go check the publication dates to see if that actually explains it, or it's just convergent evolution in the space of psycho-killer mysteries.) OK but left me without any desire to continue the series.
Lee Goldberg, Gated Prey
Extremely fluffy mind-candy mystery. (Previously.)

Posted at November 30, 2021 23:59 | permanent link

## November 23, 2021

### Call to Pittsburgh (2021 edition)

We are looking to hire this year, both on the teaching track and the tenure track. It's a great department and you should apply if you're at all interested in professing statistics, even or indeed especially if your background isn't traditional stats. (I say this despite the fact that every application we get now means more work for me later.) If any reader has questions I might be able to answer, please don't hesitate to get in touch.

Posted at November 23, 2021 11:00 | permanent link

### Import Substitution Is a Harsh Mistress

Attention conservation notice: 1400 words on the development economics of space colonization from someone who is neither an economist nor even a rocket scientist. Yet another semi-crank notion, quietly nursed for many years, drafted in this form in 2011, posted a decade later because I can't stand to do any more grading and want to procrastinate for Very Important Reasons I am not at liberty to reveal at this time.

So, what with the end of space shuttle flights and all, my feed-reader has been filled with people bemoaning the state of human space flight. While I share the sheer romantic longing for it (expressed with greater or lesser sophistication), if we want to consider other rationales for sending people into space, it's hard to come up with anything which can't be done better by robots. The only one I can think of is providing, as it were, a distributed back-up system for humanity --- places which could carry on the species should the Earth become uninhabitable. If this is the point, it imposes some constraints which are not, I think, sufficiently appreciated.

Colonies which could help in this way have to be at least potentially self-sufficient, without dependence on the Earth --- no spare parts, no processed intermediate inputs, nothing. Since there are no natural environments off Earth in which people can live, they will have to create artificial environments, which means that extra-terrestrial human societies must be industrial civilizations. Self-sufficiency means creating, in miniature, a whole industrial ecology.

Go read Brian Hayes's Infrastructure if you haven't already; I'll wait. We're talking about replicating all of those functions, and more. Now, remember that all the technologies whose complexities Hayes documents so lovingly have been developed to assume, and to make use of: gravity of 9.8 m s$^{-2}$, ambient temperatures between ~230 and ~320 K, an unlimited supply of atmosphere which is about 20% oxygen at a pressure of about $10^5$ N m$^{-2}$, abundant and cheap liquid water, etc. Moreover, our technologies assume that their environment is big, so they can dump waste products, starting with heat and mechanical vibrations, into the environment. Simply sticking terrestrial machinery inside a small, fragile, carefully-controlled artificial environment is not going to work well. (You want to try running a smelter inside your space habitat?) So duplicating these capacities for a space colony will mean re-designing everything to fit local conditions profoundly different from anything we've faced before.

This will take a lot of design work and trial-and-error, hence it will be expensive: the workers and designers could have been doing other things, the gear and machine parts and material resources could have been put to other uses. How are these development costs to be recovered? The extra-terrestrial market, we will have to assume, will begin and long remain very much smaller than Earth's, so sharing those fixed development costs over a small population implies high average costs. (Colonies in different parts of the solar system will face different local conditions, and need to develop largely different technologies, so we can treat this colony by colony.) What about expanding the market by exporting? Suppose momentarily a complete subsidy for the fixed costs, and so think about marginal cost pricing. For exportable items, their cost at Earth will equal marginal cost of production in space plus marginal cost of interplanetary transport. Unless making comparable items on Earth is (almost literally) astronomically more expensive, there will be no export market for the colonies. And this is assuming, again, that investors were willing to write off all development costs.
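To spell the export condition out in symbols (all introduced here purely for illustration): let $c_S$ be the marginal cost of producing an item in space, $c_T$ the marginal cost of shipping it to Earth, and $c_E$ the marginal cost of making a comparable item on Earth. Even with development costs fully written off, the item can only find buyers on Earth if

$$ c_S + c_T < c_E , \qquad \text{i.e.} \qquad c_E - c_S > c_T , $$

that is, Earth's production cost must exceed the colony's by more than the whole cost of interplanetary transport.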

(At this point, readers may be tempted to invoke comparative advantage, and say that even if Space is less efficient at producing everything than Earth is, both Space and Earth will be better off if Space makes what it is relatively better at. Carefully examined, however, what the classic Ricardian argument proves is that there is an opportunity cost to not using the less-efficient country's factors of production, viz., the stuff which it could have, inefficiently, produced. To minimize the opportunity cost of letting those factors go idle, they should be employed in their least-inefficient use. So even if making widgets costs 1000 times as much in Space as on Earth, if widgets are the least-inefficient use of Space's factors of production, it should make widgets, and trade them for other things. But this presumes that Space and its factors would exist without the trade. Since, for us, the whole question is whether there should be any workers, capital, etc., in Space, this line of argument just doesn't apply.)

Unless people come up with something valuable which can be made in space but cannot, or almost cannot, be made on Earth, it's hard to think of any manufactured goods which it would be sensible to export from space. What might make sense would be for space colonies to find comparatively cheap natural resources, requiring minimal on-site processing, and export them to Earth, in exchange for, well, everything else. Ideally the exports from the colonies would also be very stable physically and chemically, so they could be sent by slow, low-energy, automated (and therefore cheap) orbits to Earth. When you figure out what those resources are, especially ones that Earth doesn't already have in abundance, let the worlds know; please don't say "helium 3". Alternatively, one thing which can be produced on (say) Titan vastly more cheaply than on Earth is the experience of being on Titan: encapsulated in the form of science or entertainment, that experience could be shipped very cheaply to Earth, which might be willing to pay for it. Of course, neither an economy based on resource-extraction nor one based on scientific papers and reality TV would be self-sufficient. The logic of endogenous comparative advantage would, in fact, lock in place the mother of all core-periphery divisions, with the space colonies as the eternally dependent periphery.

A colony could, I suppose, decide to impose on itself the costs of developing its own industrial infrastructure, so as to replace imports from Earth. Those costs, to repeat, would be very high. Moreover, there's really no substitute for experience and experiment in improving technologies, so the initial quality and reliability will be low. Since, again, the local market will be small, it will not be able to support many producers, perhaps just one in each sector. There will be little scope for a diversity of local approaches to the problems of the industry, slowing innovation. There will also be little or no competition, with all that entails.

The picture of space colonies which might actually become self-sufficient, then, looks something like this. The population is forced by its leaders to endure endless privations to build monopolistic industries which produce inferior goods to those already available on the universal market, grimly tending towards autarky while exporting primary goods for the time being, on the promise that one day all of these sacrifices will be redeemed when they become the future of humanity. Somehow, I doubt there are many who find the idea of building socialism in one habitat compelling; Ken MacLeod may know them all by name.

(I have assumed everything stays within the solar system, because, pace Krugman, interstellar trade makes no sense at all. A civilization which could command enough energy to accelerate a large object to a significant fraction of the speed of light, so that trips between nearby stars take only decades, has no economic problem. At perhaps-attainable velocities, with thousands or tens of thousands of years of travel time, exchange is economically irrelevant, though it might still be attempted for cultural reasons. The obstacles in the way of human interstellar travel are of course immense. I have long thought it vastly more plausible to send robots which could then build suitable environments in which to grow human beings [also recently proposed by Charlie Stross], and that involves bio-engineering hand-waving of epic proportions.)

Comment, Nov. 2021: On re-reading, my treatment of the Ricardian argument is a little cavalier, but I don't feel energetic enough to write out and solve a New Economic Geography model where population and comparative advantage are both endogenous. If anyone is inspired to do this properly, though, I'd be genuinely fascinated to read it, and promise to link here.

Update, 16 January 2022: Tweaked the phrasing about opportunity costs in the 4th paragraph a little (and I hope removed more typos than I added).

Posted at November 23, 2021 10:45 | permanent link

## November 17, 2021

### Random-Feature Matching

$\newcommand{\ModelDim}{d}$

So I have a new preprint:

CRS, "A Note on Simulation-Based Inference by Matching Random Features", arxiv:2111.09220
We can, and should, do statistical inference on simulation models by adjusting the parameters in the simulation so that the values of randomly chosen functions of the simulation output match the values of those same functions calculated on the data. Results from the "state-space reconstruction" or "geometry from a time series" literature in nonlinear dynamics indicate that just $2\ModelDim+1$ such functions will typically suffice to identify a model with a $\ModelDim$-dimensional parameter space. Results from the "random features" literature in machine learning suggest that using random functions of the data can be an efficient replacement for using optimal functions. In this preliminary, proof-of-concept note, I sketch some of the key results, and present numerical evidence about the new method's properties. A separate, forthcoming manuscript will elaborate on theoretical and numerical details.

I've been interested for a long time in methods for simulation-based inference. It's increasingly common to have generative models which are easy (or at least straightforward) to simulate, but where it's completely intractable to optimize the likelihood --- often it's intractable even to calculate it. Sometimes this is because there are lots of latent variables to be integrated over, sometimes due to nonlinearities in the dynamics. The fact that it's easy to simulate suggests that we should be able to estimate the model parameters somehow, but how?

An example: My first Ph.D. student, Linqiao Zhao, wrote her dissertation on a rather complicated model of one aspect of how financial markets work (limit-order book dynamics), and while the likelihood function existed, in some sense, the idea that it could actually be calculated was kind of absurd. What she used to fit the model instead was a very ingenious method which came out of econometrics called "indirect inference". (I learned about it by hearing Stephen Ellner present an ecological application.) I've expounded on this technique in detail elsewhere, but the basic idea is to find a second model, the "auxiliary model", which is mis-specified but easy to estimate. You then adjust the parameters in your simulation until estimates of the auxiliary from the simulation match estimates of the auxiliary from the data. Under some conditions, this actually gives us consistent estimates of the parameters in the simulation model. (Incidentally, the best version of those regularity conditions known to me are still those Linqiao found for her thesis.)
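To make the recipe concrete, here is a minimal sketch of indirect inference on a toy problem. Everything specific in it is an assumption chosen for illustration, not the model or code from the dissertation: the generative model is a moving-average process with one parameter `theta`, the auxiliary model is a first-order autoregression fitted by least squares (deliberately mis-specified), and the "adjustment" is a plain grid search that re-uses one set of simulated shocks for every candidate `theta`.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_ma1(theta, shocks):
    """Toy generative model (an assumption): X_t = e_t + theta * e_{t-1}."""
    return shocks[1:] + theta * shocks[:-1]

def auxiliary_estimate(x):
    """Auxiliary model: AR(1) slope by least squares (mis-specified for MA data)."""
    x = x - x.mean()
    return np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1])

# Auxiliary estimates from simulations, one per candidate theta.
# The same shocks are re-used so the curve is smooth in theta.
grid = np.linspace(-0.9, 0.9, 181)
sim_shocks = rng.normal(size=50_001)
aux_sim = np.array([auxiliary_estimate(simulate_ma1(th, sim_shocks))
                    for th in grid])

def indirect_inference(data):
    """Pick the theta whose simulated auxiliary estimate best matches the data's."""
    return grid[np.argmin((aux_sim - auxiliary_estimate(data)) ** 2)]

theta_true = 0.5
data = simulate_ma1(theta_true, rng.normal(size=5_001))
theta_hat = indirect_inference(data)
print(f"true theta = {theta_true}, indirect-inference estimate = {theta_hat:.2f}")
```

The grid is restricted to $|\theta| < 1$ because, in this toy model, the lag-one autocorrelation is the same for $\theta$ and $1/\theta$; that is exactly the kind of requirement on the auxiliary model that has to be checked by hand.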

Now the drawback of indirect inference is that you need to pick the auxiliary model, and the quality of the model affects the quality of the estimates. The auxiliary needs to have at least as many parameters as the generative model, the parameters of the auxiliary need to shift with the generative parameters, and the more sensitive the auxiliary parameters are to the generative parameters, the better the estimates. There are lots of other techniques for simulation-based inference, but basically all of them turn on this same issue of needing to find some "features", some functions of the data, and tuning the generative model until those features agree between the simulations and the data. This is where people spend a lot of human time, ingenuity and frustration, as well as relying on a lot of tradition, trial-and-error, and insight into the generative model.

What occurred to me in the first week of March 2020 (i.e., just before things got really interesting) is that there might be a short-cut which avoided the need for human insight and understanding. That week I was teaching kernel methods and random features in data mining, and starting to think about how I wanted to revise the material on simulation-based inference for my "data over space and time" in the fall. The two ideas collided in my head, and I realized that there was a lot of potential for estimating parameters in simulation models by matching random features, i.e., random functions of the data. After all, if we think of an estimator as a function from the data to the parameter space, results in Rahimi and Recht (2008) imply that a linear combination of $k$ random features will, with high probability, give an $O(1/\sqrt{k})$ approximation to the optimal function.
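A rough numerical illustration of the random-features idea, under loudly stated assumptions: the target function `tanh(2x)`, the frequency scale, and the use of ordinary least squares to pick the weights are all stand-ins chosen here, and least squares is not the constrained fit analyzed by Rahimi and Recht. The sketch only shows that a linear combination of more random cosine features fits a fixed function better.

```python
import numpy as np

rng = np.random.default_rng(1)

# A hypothetical target standing in for the "optimal function".
x = np.linspace(-3.0, 3.0, 400)
target = np.tanh(2.0 * x)

# Random Fourier features: cos(omega * x + alpha), with random omega and alpha.
k_max = 256
omega = rng.normal(scale=2.0, size=k_max)
alpha = rng.uniform(-np.pi, np.pi, size=k_max)
Phi = np.cos(np.outer(x, omega) + alpha)      # shape (400, k_max)

def rms_error(k):
    """RMS error of the least-squares fit using only the first k features."""
    coef, *_ = np.linalg.lstsq(Phi[:, :k], target, rcond=None)
    return np.sqrt(np.mean((Phi[:, :k] @ coef - target) ** 2))

ks = [4, 16, 64, 256]
errors = [rms_error(k) for k in ks]
for k, err in zip(ks, errors):
    print(f"k = {k:3d}   RMS error = {err:.2e}")
```

Because the features for smaller `k` are a subset of those for larger `k`, the in-sample error can only go down as `k` grows; how fast it goes down depends on the target and the frequency distribution.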

Having had that brainstorm, I then realized that there was a good reason to think a fairly small number of random features would be enough. As we vary the parameters in the generative model, we get different distributions over the observables. Actually working out that distribution is intractable, that's why we're doing simulation-based inference in the first place, but it'll usually be the case that the distribution changes smoothly with the generative parameters. That means that if there are $\ModelDim$ parameters, the space of possible distributions is also just $\ModelDim$-dimensional --- the distributions form a $\ModelDim$-dimensional manifold.

And, as someone who was raised in the nonlinear dynamics sub-tribe of physicists, $\ModelDim$-dimensional manifolds remind me about state-space reconstruction and geometry from a time series and embedology. Specifically, back behind the Takens embedding theorem used for state-space reconstruction, there lies the Whitney embedding theorem. Suppose we have a $\ModelDim$-dimensional manifold $\mathcal{M}$, and we consider a mapping $\phi: \mathcal{M} \to \mathbb{R}^k$. Suppose that each coordinate of $\phi$ is $C^1$, i.e., continuously differentiable. Then once $k=2\ModelDim$, there exists at least one $\phi$ which is a diffeomorphism, a differentiable, 1-1 mapping of $\mathcal{M}$ to $\mathbb{R}^k$ with a differentiable inverse (on the image of $\mathcal{M}$). Once $k \geq 2\ModelDim+1$, diffeomorphisms are "generic" or "typical", meaning that they're the most common sort of mapping, in a certain topological sense, and dense in the set of all mappings. They're hard to avoid.

In time-series analysis, we use this to convince ourselves that taking $2\ModelDim+1$ lags of some generic observable of a dynamical system will give us a "time-delay embedding", a manifold of vectors which is equivalent, up to a smooth change of coordinates, to the original, underlying state-space. What I realized here is that we should be able to do something else: if we've got $\ModelDim$ parameters, and distributions change smoothly with parameters, then the map between the parameters and the expectations of $2\ModelDim+1$ functions of observables should, typically or generically, be smooth, invertible, and have a smooth inverse. That is, the parameters should be identifiable from those expectations, and small errors in the expectations should track back to small errors in the parameters.

Put all this together: if you've got a $\ModelDim$-dimensional generative model, and I can pick $2\ModelDim+1$ random functions of the observables which converge on their expectation values, I can get consistent estimates of the parameters by adjusting the $\ModelDim$ generative parameters until simulation averages of those features match the empirical values.

Such was the idea I had in March 2020. Since things got very busy after that (as you might recall), I didn't do much about this except for reading and re-reading papers until the fall, when I wrote it up as a grant proposal. I won't say where I sent it, but I will say that I've had plenty of proposals rejected (those are the breaks), but never before have I had feedback from reviewers which made me go "Fools! I'll show them all!". Suitably motivated, I have been working on it furiously all summer and fall, i.e., wrestling with my own limits as a programmer.

But now I can say that it works. Take the simplest thing I could possibly want to do, estimating the location $\theta$ of a univariate, IID Gaussian, $\mathcal{N}(\theta,1)$. I make up three random Fourier features, i.e., I calculate $F_i = \frac{1}{n}\sum_{t=1}^{n}{\cos{(\Omega_i X_t + \alpha_i)}}$ where I draw $\Omega_i \sim \mathcal{N}(0,1)$ independently of the data, and $\alpha_i \sim \mathrm{Unif}(-\pi, \pi)$. I calculate $F_1, F_2, F_3$ on the data, and then use simulations to approximate their expectations as a function of $\theta$ for different $\theta$. I return as my estimate of $\theta$ whatever value minimizes the squared distance from the data in these three features. And this is what I get for the MSE:
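To make the recipe concrete, here is a minimal Python sketch of the procedure just described. It is not the code behind the results in the paper: the function names, the seed, the sample sizes, the illustrative true value $\theta = 1.5$, and the use of a plain grid search (with one fixed simulation draw reused for every candidate $\theta$) are all assumptions made for the sake of a short example.

```python
import numpy as np

def fourier_features(x, omega, alpha):
    """Sample averages of cos(omega_i * x_t + alpha_i), one entry per feature."""
    return np.cos(np.outer(x, omega) + alpha).mean(axis=0)

def match_location(observed, omega, alpha, grid, noise):
    """Return the grid value of theta whose simulated features are closest,
    in squared distance, to the observed features.
    (Grid search is an assumption for simplicity, not the paper's optimizer.)"""
    losses = [
        np.sum((fourier_features(theta + noise, omega, alpha) - observed) ** 2)
        for theta in grid
    ]
    return grid[int(np.argmin(losses))]

rng = np.random.default_rng(1)               # arbitrary seed
omega = rng.standard_normal(3)               # Omega_i ~ N(0, 1)
alpha = rng.uniform(-np.pi, np.pi, 3)        # alpha_i ~ Unif(-pi, pi)

theta_true = 1.5                             # illustrative value only
x = theta_true + rng.standard_normal(1000)   # the "data": IID N(theta, 1)
noise = rng.standard_normal(5000)            # one simulation draw, reused for every theta
grid = np.linspace(-3.0, 3.0, 601)           # candidate values of theta

observed = fourier_features(x, omega, alpha)
theta_hat = match_location(observed, omega, alpha, grid, noise)
print(f"true theta = {theta_true}, estimate = {theta_hat:.2f}")
```

Reusing the same simulated noise for every candidate $\theta$ keeps the objective a deterministic function of $\theta$, which makes the minimization better behaved than if fresh noise were drawn each time.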

OK, it doesn't fail on the simplest possible problem --- in fact it's actually pretty close to the performance of the MLE. Let's try something a bit less well-behaved, say having $X_t \sim \theta + T_5$, where $T_5$ is a $t$-distributed random variable with 5 degrees of freedom. Again, it's a one-parameter location family, and the same 3 features I used for the Gaussian family work very nicely again:

OK, it can do location families. Since I was raised in nonlinear dynamics, let's try a deterministic dynamical system, specifically the logistic map: $S_{t+1} = 4 r S_t(1-S_t)$. Here the state variable $S_t \in [0,1]$, and the parameter $r \in [0,1]$ as well. Depending on the value of $r$, we get different invariant distributions over the state-space. If I sampled $S_1$ from that invariant distribution, this'd be a stationary and ergodic stochastic process; if I just make it $S_1 \sim \mathrm{Unif}(0,1)$, it's still ergodic but only asymptotically stationary. If I use the same 3 random Fourier features, well, this is the distribution of estimates from time series of length 100, when the true $r=0.9$, so the dynamics are chaotic:
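A rough Python sketch of this experiment, under stated assumptions, might look as follows. The helper names, the number of simulated runs per candidate $r$, the grid, and the seeds are hypothetical choices for illustration, and the grid search stands in for whatever optimizer is used in the paper.

```python
import numpy as np

def logistic_paths(r, n, n_rep, rng):
    """n_rep independent runs of S_{t+1} = 4 r S_t (1 - S_t), with S_1 ~ Unif(0, 1)."""
    s = np.empty((n_rep, n))
    s[:, 0] = rng.uniform(size=n_rep)
    for t in range(1, n):
        s[:, t] = 4.0 * r * s[:, t - 1] * (1.0 - s[:, t - 1])
    return s

def mean_features(paths, omega, alpha):
    """Time-average cos(omega_i S_t + alpha_i) within each run, then average over runs."""
    per_run = np.cos(paths[..., None] * omega + alpha).mean(axis=1)
    return per_run.mean(axis=0)

def estimate_r(observed, omega, alpha, grid, n=100, n_rep=50, seed=0):
    """Grid search over r; re-seeding gives every candidate r the same initial conditions."""
    losses = []
    for r in grid:
        sim = logistic_paths(r, n, n_rep, np.random.default_rng(seed))
        losses.append(np.sum((mean_features(sim, omega, alpha) - observed) ** 2))
    return grid[int(np.argmin(losses))]

rng = np.random.default_rng(2)                 # arbitrary seed
omega = rng.standard_normal(3)
alpha = rng.uniform(-np.pi, np.pi, 3)

r_true = 0.9
data = logistic_paths(r_true, 100, 1, rng)     # one observed series of length 100
grid = np.linspace(0.0, 1.0, 201)              # candidate values of r
r_hat = estimate_r(mean_features(data, omega, alpha), omega, alpha, grid)
print(f"true r = {r_true}, estimate = {r_hat:.3f}")
```

Averaging the simulated features over several runs is one simple way to approximate their expectation at each $r$; a single long simulated run would be another reasonable choice.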

I get very similar results if I use random Fourier features that involve two time points, i.e., time-averages of $\cos{(\Omega_{i1} X_{t} + \Omega_{i2} X_{t-1} + \alpha_i)}$, but I'll let you look at those in the paper, and also at how the estimates improve when I increase the sample size.
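For readers who want to see what such a two-time-point feature looks like in code, here is a small sketch. The function name and array layout are assumptions for illustration; the example series is a logistic map run with $r = 0.9$.

```python
import numpy as np

def lagged_fourier_features(x, omega, alpha):
    """Time-averages of cos(omega[i, 0] * x_t + omega[i, 1] * x_{t-1} + alpha[i]).

    x: series of length n; omega: array of shape (k, 2); alpha: array of shape (k,).
    """
    pairs = np.column_stack([x[1:], x[:-1]])   # row t holds (x_t, x_{t-1})
    return np.cos(pairs @ omega.T + alpha).mean(axis=0)

rng = np.random.default_rng(3)                 # arbitrary seed
omega = rng.standard_normal((3, 2))            # one frequency per time point, per feature
alpha = rng.uniform(-np.pi, np.pi, 3)

# Example series: logistic map with r = 0.9, started from Unif(0, 1)
x = np.empty(100)
x[0] = rng.uniform()
for t in range(1, 100):
    x[t] = 4.0 * 0.9 * x[t - 1] * (1.0 - x[t - 1])

print(lagged_fourier_features(x, omega, alpha))
```

Setting the second column of `omega` to zero recovers the single-time-point features, computed over $t = 2, \ldots, n$.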

Now I try estimating the logistic map, only instead of observing $S_t$ I observe $Y_t = S_t + \mathcal{N}(0, \sigma^2)$. The likelihood function is no longer totally pathological, but it's also completely intractable to calculate or optimize. But matching 5 ($=2\times 2 + 1$) random Fourier features works just fine:
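A sketch of the two-parameter version, again under assumptions: the noise level $\sigma = 0.1$, the series lengths, the grids over $(r, \sigma)$, the seeds, and the brute-force search are all illustrative choices, not the settings used in the paper.

```python
import numpy as np
from itertools import product

def noisy_logistic(r, sigma, n, rng):
    """Logistic map observed with additive noise: Y_t = S_t + N(0, sigma^2)."""
    s = np.empty(n)
    s[0] = rng.uniform()
    for t in range(1, n):
        s[t] = 4.0 * r * s[t - 1] * (1.0 - s[t - 1])
    return s + sigma * rng.standard_normal(n)

def fourier_features(y, omega, alpha):
    """Time-averages of cos(omega_i * y_t + alpha_i), one entry per feature."""
    return np.cos(np.outer(y, omega) + alpha).mean(axis=0)

def estimate_r_sigma(observed, omega, alpha, r_grid, sigma_grid, n_sim=500, seed=0):
    """Brute-force search over (r, sigma); re-seeding reuses the same random draws."""
    best, best_loss = None, np.inf
    for r, sigma in product(r_grid, sigma_grid):
        sim = noisy_logistic(r, sigma, n_sim, np.random.default_rng(seed))
        loss = np.sum((fourier_features(sim, omega, alpha) - observed) ** 2)
        if loss < best_loss:
            best, best_loss = (r, sigma), loss
    return best

rng = np.random.default_rng(4)                 # arbitrary seed
omega = rng.standard_normal(5)                 # 5 = 2 * 2 + 1 features for 2 parameters
alpha = rng.uniform(-np.pi, np.pi, 5)

r_true, sigma_true = 0.9, 0.1                  # illustrative values only
y = noisy_logistic(r_true, sigma_true, 1000, rng)
r_grid = np.linspace(0.5, 1.0, 51)
sigma_grid = np.linspace(0.0, 0.3, 7)

r_hat, sigma_hat = estimate_r_sigma(
    fourier_features(y, omega, alpha), omega, alpha, r_grid, sigma_grid
)
print(f"estimates: r = {r_hat:.2f}, sigma = {sigma_hat:.2f}")
```

Note that nothing here touches the likelihood: the only requirement is the ability to simulate from the model at any candidate $(r, \sigma)$.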

At this point I think I have enough results to have something worth sharing, though there are of course about a bazillion follow-up questions to deal with. (Other nonlinear features besides cosines! Non-stationarity! Spatio-temporal processes! Networks! Goodness-of-fit testing!) I will be honest that I partly make this public now because I'm anxious about being scooped. (I have had literal nightmares about this.) But I also think this is one of the better ideas I've had in years, and I've been bursting to share.

As $r$ in the logistic map varies from 0 (dark blue) to 1 (light pink), time-averages of 3 random Fourier features trace out a smooth, one-dimensional manifold in three-dimensional space. Different choices of random features would give different embeddings of the parameter space, but that three random features give an embedding is generic.

Update, 21 June 2022: a talk on this, in two days' time.

Posted at November 17, 2021 20:30 | permanent link