## April 27, 2018

### Course Announcement: Data over Space and Time (36-467/667)

Attention conservation notice: Notice of an advanced statistics class at a university you probably don't attend, covering abstruse topics you probably don't care about. Also, it's the first time the class is being offered, so those who do take it will have the fun of helping me debug it.

This course is an introduction to the opportunities and challenges of analyzing data from processes unfolding over space and time. It will cover basic descriptive statistics for spatial and temporal patterns; linear methods for interpolating, extrapolating, and smoothing spatio-temporal data; basic nonlinear modeling; and statistical inference with dependent observations. Class work will combine practical exercises in R, some mathematics of the underlying theory, and case studies analyzing real problems from various fields (economics, history, meteorology, ecology, etc.). Depending on available time and class interest, additional topics may include: statistics of Markov and hidden-Markov (state-space) models; statistics of point processes; simulation and simulation-based inference; agent-based modeling; dynamical systems theory.

Co-requisite: For undergraduates taking the course as 36-467, 36-401. For graduate students taking the course as 36-667, consent of the professor.

Course materials will be posted publicly on the class website (once that's up).

Posted at April 27, 2018 09:19 | permanent link

## April 12, 2018

### Major depression, qu'est-ce que c'est?

Attention conservation notice: 1100+ words on a speculative scientific paper, proposing yet another reformation of psychopathology. The post contains equations and amateur philosophy of science. Reading it will not make you feel better. — Largely written in 2011 and then forgotten in my drafts folder, dusted off now because I chanced across one of the authors making related points.

As long-time readers may recall, I am a big fan of Denny Borsboom's work on psychometrics, and measurement problems more generally, so I am very pleased to be able to plug this paper:

Denny Borsboom, Angélique O. J. Cramer, Verena D. Schmittmann, Sacha Epskamp and Lourens J. Waldorp, "The Small World of Psychopathology", PLOS ONE 6 (2011): e27407 [Data, code, etc., not verified by me]
Abstract: Mental disorders are highly comorbid: people having one disorder are likely to have another as well. We explain empirical comorbidity patterns based on a network model of psychiatric symptoms, derived from an analysis of symptom overlap in the Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV).
We show that a) half of the symptoms in the DSM-IV network are connected, b) the architecture of these connections conforms to a small world structure, featuring a high degree of clustering but a short average path length, and c) distances between disorders in this structure predict empirical comorbidity rates. Network simulations of Major Depressive Episode and Generalized Anxiety Disorder show that the model faithfully reproduces empirical population statistics for these disorders.
In the network model, mental disorders are inherently complex. This explains the limited successes of genetic, neuroscientific, and etiological approaches to unravel their causes. We outline a psychosystems approach to investigate the structure and dynamics of mental disorders.

In the initial construction of the graph here, two symptoms are linked if they are mentioned in the DSM as criteria for the same disorder. That is, Borsboom et al. think of the DSM as a bipartite graph of symptoms and disorders, and project down to just symptoms. (There is some data-tidying involved in distinguishing symptoms and disorders.)
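A minimal sketch of that projection step, in Python. The disorder and symptom names below are a made-up toy table, not the actual DSM-IV criteria, and `project_to_symptoms` is a hypothetical helper name, not anything from the paper's code.

```python
from itertools import combinations

def project_to_symptoms(criteria):
    """Link two symptoms whenever they are criteria for the same disorder.

    `criteria` maps each disorder to its list of symptoms (the bipartite
    graph); the result is a set of undirected symptom-symptom edges.
    """
    edges = set()
    for symptoms in criteria.values():
        # Sort so each undirected edge has one canonical (s1, s2) form.
        for s1, s2 in combinations(sorted(set(symptoms)), 2):
            edges.add((s1, s2))
    return edges

# Hypothetical toy input, NOT the real DSM-IV symptom lists.
toy_criteria = {
    "disorder_A": ["insomnia", "fatigue", "low mood"],
    "disorder_B": ["insomnia", "worry", "fatigue"],
}
toy_edges = project_to_symptoms(toy_criteria)
```

Here "low mood" and "worry" stay unlinked because no single toy disorder lists both, while "fatigue" and "insomnia" are linked once even though two disorders share them. A weighted projection would instead count shared disorders per pair.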

The small-world stuff leaves me cold --- by this point it might be more interesting to run across a large-world network --- but the model is intriguing. Each node (i.e., symptom) is a binary variable. The probability that node $i$ gets activated at time $t$, $p_{it}$, is a function of the number of activated neighbors, $A_{i(t-1)}$: $p_{it} = a + (1-a) \frac{e^{b_i A_{i(t-1)}-c_i}}{1+e^{b_i A_{i(t-1)}-c_i}}$ In words, the more linked symptoms are present, the more likely it is for symptom $i$ to be present too, but symptoms can also just appear out of nowhere.

Statistically, this is a logistic regression: $b_i$ is how much symptom $i$ is activated by its neighbors in the graph, $c_i$ is a threshold specific to that symptom, and $a$ controls the over-all rate of spontaneous symptom appearance and disappearance. Using a very interesting data set (the National Comorbidity Survey Replication of about 9200 US adults), Borsboom et al. in fact fixed the $b_i$ and $c_i$ parameters by running logistic regressions. The $a$ parameter, which was kept the same across symptoms, was tweaked to make the rate of spontaneous occurrence not too unreasonable.

What Borsboom et al. did with this model was to run it forward for 365 steps (i.e., a year), and then look at whether, in the course of the previous year, it would have met the DSM criteria for major depression, and for generalized anxiety disorder, and then repeat across multiple people. It did a pretty good job of matching the prevalence of both disorders, and got their co-morbidity a bit too high but not crazily so.
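To make the procedure concrete, here is a rough sketch of such a simulation in Python. Several things are my assumptions rather than details from the paper: the neighbor-driven term is taken to be the standard logistic function $e^x/(1+e^x)$, all symptoms are updated synchronously from the previous day's state, everyone starts symptom-free, and the three-node network and parameter values are invented for illustration.

```python
import math
import random

def activation_prob(n_active, a, b, c):
    """p = a + (1 - a) * logistic(b * n_active - c)."""
    logistic = 1.0 / (1.0 + math.exp(-(b * n_active - c)))
    return a + (1.0 - a) * logistic

def simulate(neighbors, a, b, c, steps=365, rng=None):
    """Run one simulated person forward; return the list of daily states."""
    rng = rng or random.Random()
    state = {s: 0 for s in neighbors}  # assumption: start with no symptoms
    history = []
    for _ in range(steps):
        # Synchronous update (an assumption): each symptom sees yesterday's state.
        counts = {s: sum(state[n] for n in neighbors[s]) for s in neighbors}
        state = {
            s: int(rng.random() < activation_prob(counts[s], a, b[s], c[s]))
            for s in neighbors
        }
        history.append(state)
    return history

# Hypothetical toy network and parameters, not the fitted NCS-R values.
toy_neighbors = {
    "insomnia": ["fatigue"],
    "fatigue": ["insomnia", "low mood"],
    "low mood": ["fatigue"],
}
toy_b = {s: 1.0 for s in toy_neighbors}
toy_c = {s: 4.0 for s in toy_neighbors}
history = simulate(toy_neighbors, a=0.01, b=toy_b, c=toy_c,
                   steps=365, rng=random.Random(0))
```

Checking a simulated year against diagnostic criteria would then be a matter of scanning `history` for windows in which enough of the relevant symptoms are active; that rule is not sketched here, since it depends on the DSM criteria themselves.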

Now, as a realistic model, this is rubbish, for a host of reasons. Lots of the edges have to be wrong; the edges should be directed rather than undirected; the edges should be weighted; the logistic form owes more to what psychologists are used to than to any scientific plausibility. (Why should psychopathology be a spin glass?) The homogeneity of parameters across people could easily fail. And yet even so it comes within spitting distance of reproducing the observed frequencies of different conditions, and their co-morbidities.

Notice that despite this, there are no underlying disease variables in this network, just symptoms. So why do we believe that there are unitary disease entities? I can see at least three routes to that:

1. Perhaps this symptom-network model simply fails to match the detailed statistics of the data, while latent-disease-entity models can. This might be a bit boring, perhaps, but it would be persuasive if one could show that no model without the disease entities could work. (I find that dubious, but my doubt is not evidence.)
2. Alternately, one might appeal to causal autonomy. The temperature of a gas, in a strong sense, amounts to the average kinetic energy of its molecules, and one can accurately simulate gases at a molecular level without ever invoking the notion of temperature. But if I manipulate the gas to have a certain temperature, then, very quickly, the effects on pressure and volume, and even the velocity distribution of individual molecules, are always (pretty much) the same, no matter how I bring the temperature about. This is what lets us give sensible causal, counter-factual accounts at the level of temperature, and thermodynamics more generally. (Cf. Glymour.)
Now, in the network model, we can imagine "giving someone" generalized anxiety disorder, by activating some set of nodes which meets the DSM criteria for that condition. There are actually multiple, only partially-overlapping symptom sets which will do. In the network model, these different instantiations of generalized anxiety disorder will have similar but not identical consequences (for other symptoms, duration of the condition, response to treatments, etc). If, in reality, it makes no difference how someone comes to meet the criteria for generalized anxiety disorder, and the implications for the future are always the same, that would be a powerful argument that the disorder is something real.
More medically: think how we distinguish diabetes into type 1 (the body doesn't make enough insulin) and type 2 (the body doesn't respond properly to insulin). This is, I'd say, because they differ greatly in their causal implications, but once you find yourself in one of these classes, it makes little difference how you got there.
3. It could be that a description in terms of higher-level entities like depression allows for a higher efficiency of prediction than just sticking with symptoms. This notion could even be made fairly precise; it may also end up being the same as the second route.

Of course, it might be that to make any of these three defenses (or others which haven't occurred to me) work properly, we'd have to junk our current set of disorders and come up with others...

Posted at April 12, 2018 14:30 | permanent link

## April 01, 2018

### An _Ad Hominid_ Argument for Animism

Attention conservation notice: Note the date.

A straight-forward argument from widely-accepted premises of evolutionary psychology shows that humans evolved in an environment featuring invisible beings with minds and the ability to affect the material world, especially through what we'd call natural forces.

1. (Premise) Humans have evolved psychological modules, which carry out specific sorts of computations on very specific sorts of representations, as triggered by environmental conditions. These modules are in fact adaptations to the "environment of evolutionary adaptation", or, rather, environments.
2. (Premise) Indeed, when we encounter a human cognitive module, we should presume that it is an evolved adaptation.
3. (Premise) Humans have modules for theory-of-mind, social exchange, and otherwise dealing with intentional agents by reckoning with their beliefs, desires, intentions, and (crucially) capacities to act on those intentions.
4. Therefore, the human modules for theory-of-mind, social exchange, and dealing with intentional agents are evolved adaptations to our ancestral environment.
5. (Premise) Humans often engage those modules when dealing with invisible beings, often manifesting as (what scientists categorize as) natural forces.
(In fact, such engagement of those modules was near-universal up to the emergence of WEIRD societies. The historical record shows aberrant individuals who did not do this, but it's plain even from texts those individuals authored, when they have come down to us, that their bizarre behavior had absolutely no traction on the vast, neurotypical majorities of their societies. [One is reminded of the militantly color-blind trying to convince others that colors do not exist.] Moreover, treating natural forces as manifestations of invisible beings who are intentional agents, amenable to bargaining, threats, supplication, etc., etc., is still very common in WEIRD societies, perhaps even modal.)
6. (Premise) Engaging a wrong or inappropriate module is expensive, even potentially dangerous, and thus mal-adaptive, and so should be selected against.
7. If natural forces are mindless and invisible beings did not exist in the EEA, then engaging theory-of-mind and social-exchange modules to deal with natural forces and invisible beings would be mal-adaptive.
(Occasionally, people suggest that it's so dangerous to ignore another intentional agent that it was adaptive for our ancestors to suspect intentionality everywhere, on "better safe than sorry" grounds. I have never seen this supported by a concrete calculation of the costs, benefits and frequencies of the relevant false-positive and false-negative errors. I have also never seen it supported by a design analysis of why our ancestors could not have evolved to realize that storms, earthquakes, droughts, diseases, etc., were no more intentional agents than, say, fruit, or stone tools.)
8. Since those modules are adaptive, we must conclude that invisible beings with beliefs, desires, intentions, and the power to act on them, especially through "natural" forces, were a common, recurring, predictable feature of the environments of evolutionary adaptation.

Of course, none of this implies that those invisible beings aren't as extinct as mammoths.

To spoil the [not very funny] joke: even if the relevant modules exist, they are engaged not by intentional-agent-detectors, but by human mental representations of intentional agents. Once the idea starts that storms are the wrath of some invisible being, that can be self-propagating. For further details, I refer to the works of Dan Sperber, especially Explaining Culture (and to some extent Rethinking Symbolism). Credit for the phrase "ad hominid argument" goes, I believe, to Belle Waring, back in the Early Classic period of blogging.

Posted at April 01, 2018 22:59 | permanent link

## February 28, 2018

### Books to Read While the Algae Grow in Your Fur, February 2018

Attention conservation notice: I have no taste.

Joel Michell, Measurement in Psychology: A Critical History of a Methodological Concept
Comments having passed the 1500 word mark, including long quotations, this will have to be a separate review.
H. P. Lovecraft, At the Mountains of Madness
This is an umpteenth re-read, of course. (I tend to do them in the winter.) This one made me want to read a history of subsequent Elder Thing archaeology, where the mountains and the city were revisited during the International Geophysical Year, and it's become obvious that 99% of this is as much a product of the discoverers' imagination and preconceptions as, say, Arthur Evans's views of the Minoans. (But that 1%...)
Lauren Willig, The English Wife
Mind candy historical mystery, set in New York and London just a bit before 1900. An interesting aspect of the writing is that here, as in her historical romance novels, Willig uses two time-lines, where the characters in one time-line are trying to discover what happened in the other. But in the romances the time-lines are parallel, whereas here they converge; what this signifies, I couldn't say.
Peter Godfrey-Smith, Theory and Reality: An Introduction to the Philosophy of Science
I can easily say that this is one of the best modern introductory books on the philosophy of science I've ever read. (Another, of a very different sort, is William Poundstone's Labyrinths of Reason.) It's presented roughly historically, beginning with Logical Positivism and moving forward, through Popper, Kuhn, such post-Kuhnians as Lakatos, Feyerabend and Laudan, and classic 1970s/1980s "sociology of scientific knowledge", before ending with a range of contemporary topics. Throughout, Godfrey-Smith strikes a good balance between persuading the reader that there are problems worth wrestling with, and that they're not hopeless.
To the former: too many scientists, encountering issues from the philosophy of science, find them pointless, or at most things which could be cleared up in an afternoon with a little clear thinking and maybe some algebra. (Occasionally this results in weird little cults like self-styled "strong inference", which is firmly put in its place here.) Godfrey-Smith is very good at conveying how there are real issues here, which very smart people have wrestled with, without coming to any truly satisfactory answers.
This then raises the possibility that the exercise is futile, not because it's unimportant but because it's doomed, that the problems are just too hard for us. Against this, Godfrey-Smith is good at conveying how, if we're still confused about questions like "When does observing something that a theory predicts confirm the theory?", or "How can the social organization of a scientific community support its cognitive goals?", we're at least understanding the issues much better. (For example, it's become very clear that social organization does matter.)
This book is worthwhile reading for any scientist interested in philosophical issues. It might be even more worthwhile for those who aren't interested, but...
--- Two thoughts which occurred to me while reading Godfrey-Smith's discussion of how "naturalistic" philosophy of science is anti-foundationalist, in the sense of eschewing the search for philosophical foundations for the sciences which are somehow prior to the sciences themselves.
1. Strong forms of this would say that such foundations are impossible or undesirable. A weaker form, however, would compare the track-records of philosophy and science, and say that it's rash to expect philosophy to be more secure than (say) neurophysiology any time soon. (Where this would leave, say, social psychology is a nice question.) I am not sure whether anyone has taken this position within the philosophical literature, or even what it would be called.
2. Saying that we will use the results of scientific inquiry to understand the process of scientific inquiry can sound like a vicious circle, but can also, more reasonably, be just a self-consistency check. If our best scientific understanding of the world and ourselves implied that scientific inquiry was unreliable, we would have a real problem. Worries about science being self-undermining are a long-running theme in the history of the sort of philosophy of science that Godfrey-Smith writes about, going back before the Logical Positivists into the nineteenth century (see, e.g., Leszek Kolakowski's The Alienation of Reason / History of Positivist Thought from Hume to the Vienna Circle and his Husserl and the Search for Certitude), and continue on today (naturally in meme format). Even if all naturalistic philosophy of science achieves is showing that science doesn't undermine itself the way that the more ambitious and outrageous forms of sociology of knowledge do, this would be a real accomplishment.
Richard Thompson Ford, The Race Card: How Bluffing about Bias Makes Race Relations Worse
Let me spoil the ending:
No doubt some readers will wish to ask whether I really think playing the race card is now the biggest racial justice issue this society faces. No, I don't. I hope it's clear that I believe old-school bigotry remains a severe social problem and that subtler and systemic racial disadvantages --- even when they can't be blamed on "racists" --- are profound social evils that demand redress. These are bigger problems than playing the race card. But the race card is an impediment to dealing with these problems. It distracts attention from larger social injustices. It encourages vindictiveness and provokes defensiveness when open-mindedness and sympathy are needed. It leads to an adversarial, tit-for-tat mind-set ("You're a bigot!" "No, you're just playing the race card!") when a cooperative spirit of dialogue is required.
The race card is symptomatic of a real crisis in the way we currently think and talk about race: a crisis borne of our failure to keep up with a changing social world, a crisis of social change and of intellectual stasis. We need new intellectual tools and new language to deal with the new realities of American racism. Thus far we've failed to develop them, so we find ourselves increasingly unable to discuss issues of race intelligently and convincingly. We find ourselves listening to and repeating the slogans and catch-phrases of the past, whether or not they apply, like a catechism that's long since lost its power to invoke or inspire, or like a curse that damns guilty perpetrator and innocent bystander with indiscriminate contempt. [p. 349]
And this was in 2008! (Ford's skepticism about the Implicit Association Test is looking pretty good these days. His confidence that open expressions of outright racism have been driven to the fringes of American public life, maybe not so much.)
More constructively, I found chapter 2's discussion of "racism by analogy" thought-provoking, and chapter 3 on legal criteria for discrimination and disparate impact quite eye-opening.
John Pfaff, Locked In: The True Causes of Mass Incarceration --- and How to Achieve Real Reform
This is a thoughtful book about the causes of mass incarceration, and what can and should be done to reverse it. I should say at the beginning that Pfaff is as outraged as anyone about how many people we have in prison (or otherwise subject to "corrections"), so that when Pfaff challenges elements of what he calls the "standard story", it's not to minimize the disaster and disgrace, it's to help efforts at reform actually improve things. I found a lot of it convincing, but I should say up-front that I haven't tried to independently check any of Pfaff's figures or calculations.
The most convincing parts of the preliminary de-bunking are as follows:
1. Private prisons are awful, but they are quantitatively too small to account for mass incarceration. Also, the lobbying efforts of private prison corporations are too small, and come too late in the surge in incarceration, to explain it.
2. Most of our prison population isn't there for drug offenses, or non-violent offenses in general, but for violent crimes, and so undoing mass incarceration will mean changing how we deal with those convicted of violence. Pfaff presents this as a refutation of the idea that mass incarceration is due to the war on drugs, which I think is a bit too hasty (as I will explain below).
3. Maximum legal prison sentences have gone way up, and longer prison terms would naturally lead to more people being in prison. But this can't explain most of the growth in incarceration, because the actual average length of time served hasn't increased very much.
It then behooves Pfaff to explain why, in his view, we have so many more people in prison than we used to, even adjusting for population. Implicitly --- this is a popular book and he does no explicit models here --- he works with a "compartment" model, where the compartments or stages are something like: $[\text{Commit crime}] \Rightarrow [\text{Arrested}] \Rightarrow [\text{Charged}] \Rightarrow [\text{Convicted}] \Rightarrow [\text{Prison}] \Rightarrow [\text{Release/Parole}]$ where at each stage before prison one might be diverted away (e.g., arrested but not charged), and prison is of course of variable duration. The advantage of approaching the question "why are so many people in prison?" this way is that if you can track the number of people in each stage, and the flows of people from one stage to the next, they have to add up: the number of people in prison on 1 July 2018 will be the number who were in prison on 1 July 2017, plus those convicted and sentenced over the year, minus those released over the year. (At the risk of being dis-respectful, I am counting deaths in prison under "release".) Changing the proportions who go on from one stage to the next changes the flows, and those changed flows accumulate over time in the number of those in prison.
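The accounting identity is simple enough to sketch in a few lines of Python. Everything numerical below is invented purely for illustration (these are not Pfaff's figures), the function names are hypothetical, and releases are treated as a fixed yearly number rather than as depending on time served, which is a deliberate simplification.

```python
def admissions_from_arrests(arrests, p_charged, p_convicted, p_prison):
    """Flow into prison: arrests thinned by the proportion passing each stage."""
    return arrests * p_charged * p_convicted * p_prison

def prison_population(start, admissions, releases):
    """Apply stock[t+1] = stock[t] + admissions[t] - releases[t] year by year.

    Deaths in prison are counted under releases, as in the text.
    """
    stocks = [start]
    for admitted, released in zip(admissions, releases):
        stocks.append(stocks[-1] + admitted - released)
    return stocks

# Made-up numbers: same arrests and releases, different charging proportions.
years = 10
low_flow = admissions_from_arrests(1000, p_charged=0.4, p_convicted=0.5, p_prison=0.5)
high_flow = admissions_from_arrests(1000, p_charged=0.6, p_convicted=0.5, p_prison=0.5)
low_path = prison_population(500, [low_flow] * years, [100] * years)
high_path = prison_population(500, [high_flow] * years, [100] * years)
```

With identical arrests and releases, the gap between `high_path` and `low_path` widens every year, which is the sense in which a change in a stage-to-stage proportion accumulates in the stock.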
Pfaff claims that the big change which drove up the number of people in prison wasn't at the stage of being arrested, or convicted, or even the length of time spent in prison, but rather in the proportion of those arrested who are actually charged with a crime. This is a decision made by local public prosecutors. If we believe Pfaff's numbers, this locates a key source of the problem.
Unfortunately, as he is at pains to say, we have very little systematic information on prosecutors' offices and how they make their decisions. We do know that they face a somewhat perverse set of incentives, in that declining to charge someone who goes on to do something bad is electoral poison, but charging someone who's really harmless has almost no downside (for the prosecutor; it has plenty for the person charged, and their family and community). Prosecutors also face little opposition from public defenders, which is a big part of why almost all criminal charges are settled by plea deals, not brought to trial. The whole business is a mess, with almost no accountability (either to hierarchical superiors or to the democratic public), and scarcely any systematic reporting. Pfaff does not attempt to say why any of these issues should have gotten worse during this period, however.
Popular books about policy or social problems usually have a last chapter which talks about what to do about the issue. Pfaff follows this practice, and, as usual, it's the weakest part of the book, because his proposals are so much smaller than his own account of the scale of the problems. (Whether this is better or worse than the alternative tradition, of proposing measures which would solve the problem but also be totally unworkable, is a nice question.) In no small part this is because he has fairly convincingly localized the problem, but he's localized it not so much to a black box as to a mob of 3,000-odd ill-coordinated black boxes.
--- I said above that I am not sure Pfaff is entirely fair to the blame-the-drug-war camp; in particular, I think he ignores a fairly obvious counter-argument. He attacks the idea that the growth in incarceration is a result of the war on drugs, by pointing out that only a minority of those in prison are there for non-violent drug offenses, while the majority are there for violent crimes. Grant that this is true (as I said, I haven't checked his figures*). How much of that violence is due to the war on drugs? Legal businesses get robbed, of course, and from time to time one even reads of, say, dentists conspiring to assassinate rival dentists, but this sort of thing is rare in trades where the law is available to settle disputes and protect property. A criminalized but lucrative drug trade, on the other hand, seems conducive to violence. Localizing the trade to specific neighborhoods makes those dangerous, law-less places, further inciting violence (cf. Allen and Levoy). Effects like these are hard to quantify --- we can't just read them off from administrative data, as Pfaff likes to do --- but they could be very important. I'm not sure where this leaves us.
*: One point which would be good to check is how possession of a firearm while committing another crime gets coded in these records. If every drug-dealer who gets busted while also carrying a gun counts as "violent", for example, that might make a substantial difference. (Or it might not; that's why someone should check.)
Danielle Allen, Cuz: The Life and Times of Michael A.
A memoir of the life, imprisonment and death of Allen's cousin Michael. It's at once the specific story of a unique person and their family, and a slice through what's gone wrong with our country*, that someone could be thrown in prison for eleven years for some stupid crimes committed at fifteen (where Michael was the only one hurt), ultimately setting his life on a path where, at age 29, his corpse was found in a shot-up car on the street. Michael made bad choices, which Danielle never shies from, but he made them in a foolishly, evilly un-forgiving context, in a society which essentially threw his life away for no good reason, and that is messed up. It's horribly, horribly sad, but beautifully told.
Disclaimer: I know Prof. Allen, and have participated in a series of workshops she organized and contributed to a book she edited, but I feel under no obligation to write a positive notice of her books.
*: One of the things which makes this a complicated book is that it is also, implicitly and in glimpses, the story of what has gone right with our country that it now creates people like its author.

Posted at February 28, 2018 23:59 | permanent link

## January 31, 2018

### Books to Read While the Algae Grow in Your Fur, January 2018

Attention conservation notice: I have no taste.

Maria Konnikova, The Confidence Game: And Why We Fall for It... Every Time
An engaging popular-science look at confidence games, their players and their marks. (Konnikova references a lot of the social psychology literature, which is certainly better than ignoring it, but I haven't had the heart to check how many of those studies have failed to replicate.)
Yuri Slezkine, The House of Government: A Saga of the Russian Revolution