April 27, 2018

Course Announcement: Data over Space and Time (36-467/667), Fall 2018

Attention conservation notice: Notice of an advanced statistics class at a university you probably don't attend, covering abstruse topics you probably don't care about. Also, it's the first time the class is being offered, so those who do take it will have the fun of helping me debug it.

This course is an introduction to the opportunities and challenges of analyzing data from processes unfolding over space and time. It will cover basic descriptive statistics for spatial and temporal patterns; linear methods for interpolating, extrapolating, and smoothing spatio-temporal data; basic nonlinear modeling; and statistical inference with dependent observations. Class work will combine practical exercises in R, some mathematics of the underlying theory, and case studies analyzing real problems from various fields (economics, history, meteorology, ecology, etc.). Depending on available time and class interest, additional topics may include: statistics of Markov and hidden-Markov (state-space) models; statistics of point processes; simulation and simulation-based inference; agent-based modeling; dynamical systems theory.

Co-requisite: For undergraduates taking the course as 36-467, 36-401. For graduate students taking the course as 36-667, consent of the professor.

Course materials will be posted publicly on the class website (once that's up).

Corrupting the Young; Enigmas of Chance

Posted at April 27, 2018 09:15 | permanent link

April 12, 2018

Major depression, qu'est-ce que c'est?

Attention conservation notice: 1100+ words on a speculative scientific paper, proposing yet another reformation of psychopathology. The post contains equations and amateur philosophy of science. Reading it will not make you feel better. — Largely written in 2011 and then forgotten in my drafts folder, dusted off now because I chanced across one of the authors making related points.

As long-time readers may recall, I am a big fan of Denny Borsboom's work on psychometrics, and measurement problems more generally, so I am very pleased to be able to plug this paper:

Denny Borsboom, Angélique O. J. Cramer, Verena D. Schmittmann, Sacha Epskamp and Lourens J. Waldorp, "The Small World of Psychopathology", PLOS ONE 6 (2011): e27407 [Data, code, etc., not verified by me]
Abstract: Mental disorders are highly comorbid: people having one disorder are likely to have another as well. We explain empirical comorbidity patterns based on a network model of psychiatric symptoms, derived from an analysis of symptom overlap in the Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV).
We show that a) half of the symptoms in the DSM-IV network are connected, b) the architecture of these connections conforms to a small world structure, featuring a high degree of clustering but a short average path length, and c) distances between disorders in this structure predict empirical comorbidity rates. Network simulations of Major Depressive Episode and Generalized Anxiety Disorder show that the model faithfully reproduces empirical population statistics for these disorders.
In the network model, mental disorders are inherently complex. This explains the limited successes of genetic, neuroscientific, and etiological approaches to unravel their causes. We outline a psychosystems approach to investigate the structure and dynamics of mental disorders.

In the initial construction of the graph here, two symptoms are linked if they are mentioned in the DSM as criteria for the same disorder. That is, Borsboom et al. think of the DSM as a bipartite graph of symptoms and disorders, and project down to just symptoms. (There is some data-tidying involved in distinguishing symptoms from disorders.)
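The projection step can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `project_to_symptoms` and the toy criteria lists are made up here, and the real DSM-IV lists are far longer and need the tidying mentioned above.

```python
from itertools import combinations

def project_to_symptoms(criteria):
    """Project a disorder -> symptoms mapping onto a symptom-symptom graph.

    Two symptoms become neighbors if some disorder lists both of them.
    Returns a dict mapping each symptom to the set of its neighbors.
    """
    neighbors = {}
    for symptoms in criteria.values():
        unique = sorted(set(symptoms))
        for s in unique:
            neighbors.setdefault(s, set())
        for s, t in combinations(unique, 2):
            neighbors[s].add(t)
            neighbors[t].add(s)
    return neighbors

# Toy, invented criteria lists (NOT the real DSM-IV criteria).
toy_criteria = {
    "disorder_A": ["s1", "s2", "s3"],
    "disorder_B": ["s3", "s4", "s5"],
}
toy_graph = project_to_symptoms(toy_criteria)
```

In the toy example, `s3` is listed under both disorders, so it ends up linked to every other symptom, while `s1` and `s4` share no disorder and stay unlinked. Shared symptoms of this kind are what connect different disorders in the projected graph.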

The small-world stuff leaves me cold — by this point it might be more interesting to run across a large-world network — but the model is intriguing. Each node (i.e., symptom) is a binary variable. The probability that node $i$ gets activated at time $t$, $p_{it}$, is a function of the number of activated neighbors, $A_{i(t-1)}$: \[ p_{it} = a + (1-a) \frac{e^{b_i A_{i(t-1)}-c_i}}{(1-a)+e^{b_i A_{i(t-1)}-c_i}} \] In words, the more linked symptoms are present, the more likely it is for symptom $i$ to be present too, but symptoms can just appear out of nowhere.

Statistically, this is a logistic regression: $b_i$ is how much symptom $i$ is activated by its neighbors in the graph, $c_i$ is a threshold specific to that symptom, and $a$ controls the over-all rate of spontaneous symptom appearance and disappearance. Using a very interesting data set (the National Comorbidity Survey Replication of about 9200 US adults), Borsboom et al. in fact fixed the $b_i$ and $c_i$ parameters by running logistic regressions. The $a$ parameter, which was kept the same across symptoms, was tweaked to make the rate of spontaneous occurrence not too unreasonable.
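To get a feel for how the three parameters act, here is a small Python transcription of the activation formula exactly as displayed above. The function name and the parameter values in the example are mine, chosen only for illustration; they are not the estimates from the paper. The sketch assumes $0 \leq a < 1$.

```python
import math

def activation_probability(n_active, a, b, c):
    """p = a + (1 - a) * e^x / ((1 - a) + e^x), with x = b * n_active - c.

    n_active: number of active neighbors at the previous step
    a: floor on the activation probability (spontaneous appearance)
    b: sensitivity of this symptom to its neighbors
    c: threshold of this symptom
    """
    x = b * n_active - c
    if x >= 0:
        # Divide through by e^x so large x cannot overflow.
        ratio = 1.0 / ((1.0 - a) * math.exp(-x) + 1.0)
    else:
        e = math.exp(x)
        ratio = e / ((1.0 - a) + e)
    return a + (1.0 - a) * ratio

# Illustrative values only: probability rises with the number of active neighbors.
curve = [activation_probability(k, a=0.05, b=1.0, c=3.0) for k in range(7)]
```

With no active neighbors and a high threshold the probability sits just above $a$; as neighbors switch on it climbs towards 1, and with $a = 0$ the expression reduces to the ordinary logistic curve.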

What Borsboom et al. did with this model was to run it forward for 365 steps (i.e., a year), and then look at whether, in the course of the previous year, it would have met the DSM criteria for major depression, and for generalized anxiety disorder, and then repeat across multiple people. It did a pretty good job of matching the prevalence of both disorders, and got their co-morbidity a bit too high but not crazily so.
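A rough Python sketch of that procedure follows. It makes several assumptions of its own: everyone starts symptom-free, all nodes update synchronously, and "meeting the criteria" is reduced to having at least `k` of the listed symptoms active on the same day, which is much cruder than the actual DSM rules. The graph and parameter values are toys, not the fitted ones.

```python
import math
import random

def activation_probability(n_active, a, b, c):
    """Formula from the post, as displayed: a + (1-a) e^x / ((1-a) + e^x)."""
    x = b * n_active - c
    if x >= 0:
        ratio = 1.0 / ((1.0 - a) * math.exp(-x) + 1.0)
    else:
        e = math.exp(x)
        ratio = e / ((1.0 - a) + e)
    return a + (1.0 - a) * ratio

def simulate_person(neighbors, a, b, c, steps=365, rng=random):
    """Return a list of daily states, each a dict symptom -> 0 or 1."""
    state = {s: 0 for s in neighbors}  # assumption: start symptom-free
    history = []
    for _ in range(steps):
        # Synchronous update: every node looks at yesterday's neighbors.
        state = {
            s: int(rng.random() < activation_probability(
                sum(state[n] for n in neighbors[s]), a, b[s], c[s]))
            for s in neighbors
        }
        history.append(state)
    return history

def ever_met(history, symptoms, k):
    """Simplified criterion: at least k listed symptoms active on one day."""
    return any(sum(day[s] for s in symptoms) >= k for day in history)

def prevalence(neighbors, a, b, c, symptoms, k, n_people=100, seed=0):
    """Fraction of simulated people who met the criterion during the run."""
    rng = random.Random(seed)
    hits = sum(
        ever_met(simulate_person(neighbors, a, b, c, rng=rng), symptoms, k)
        for _ in range(n_people)
    )
    return hits / n_people

# Toy three-symptom network with made-up parameters.
toy_neighbors = {"s1": {"s2", "s3"}, "s2": {"s1", "s3"}, "s3": {"s1", "s2"}}
toy_b = {s: 1.0 for s in toy_neighbors}
toy_c = {s: 4.0 for s in toy_neighbors}
rate = prevalence(toy_neighbors, 0.01, toy_b, toy_c, ["s1", "s2", "s3"], k=2)
print(rate)
```

Comparing `prevalence` for two symptom lists, and the fraction of simulated people who meet both, against survey figures is the kind of check described above.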

Now, as a realistic model, this is rubbish, for a host of reasons. Lots of the edges have to be wrong; the edges should be directed rather than undirected; the edges should be weighted; the logistic form owes more to what psychologists are used to than any scientific plausibility. (Why should psychopathology be a spin glass?) The homogeneity of parameters across people could easily fail. And yet even so it comes within spitting distance of reproducing the observed frequencies of different conditions, and their co-morbidities.

Notice that it does this with no underlying disease variables in the network, just symptoms. So why do we believe that there are unitary disease entities? I can see at least three routes to that:

  1. Perhaps this symptom-network model simply fails to match the detailed statistics of the data, while latent-disease-entity models can. This might be a bit boring, perhaps, but it would be persuasive if one could show that no model without the disease entities could work. (I find that dubious, but my doubt is not evidence.)
  2. Alternately, one might appeal to causal autonomy. The temperature of a gas, in a strong sense, amounts to the average kinetic energy of its molecules, and one can accurately simulate gases at a molecular level without ever invoking the notion of temperature. But if I manipulate the gas to have a certain temperature, then, very quickly, the effects on pressure and volume, and even the velocity distribution of individual molecules, are always (pretty much) the same, no matter how I bring the temperature about. This is what lets us give sensible causal, counter-factual accounts at the level of temperature, and thermodynamics more generally. (Cf. Glymour.)
    Now, in the network model, we can imagine "giving someone" generalized anxiety disorder, by activating some set of nodes which meets the DSM criteria for that condition. There are actually multiple, only partially-overlapping symptom sets which will do. In the network model, these different instantiations of generalized anxiety disorder will have similar but not identical consequences (for other symptoms, duration of the condition, response to treatments, etc.). If, in reality, it makes no difference how someone comes to meet the criteria for generalized anxiety disorder, and the implications for the future are always the same, that would be a powerful argument that the disorder is something real.
    More medically: think how we distinguish diabetes into type 1 (the body doesn't make enough insulin) and type 2 (the body doesn't respond properly to insulin). This is, I'd say, because they differ greatly in their causal implications, but once you find yourself in one of these classes, it makes little difference how you got there.
  3. It could be that a description in terms of higher-level entities like depression allows for a higher efficiency of prediction than just sticking with symptoms. This notion could even be made fairly precise; it may also end up being the same as the second route.

Of course, it might be that to make any of these three defenses (or others which haven't occurred to me) work properly, we'd have to junk our current set of disorders and come up with others...

Minds, Brains, and Neurons; Networks; Enigmas of Chance

Posted at April 12, 2018 14:30 | permanent link

April 01, 2018

An _Ad Hominid_ Argument for Animism

Attention conservation notice: Note the date.

A straight-forward argument from widely-accepted premises of evolutionary psychology shows that humans evolved in an environment featuring invisible beings with minds and the ability to affect the material world, especially through what we'd call natural forces.

  1. (Premise) Humans have evolved psychological modules, which carry out specific sorts of computations on very specific sorts of representations, as triggered by environmental conditions. These modules are in fact adaptations to the "environment of evolutionary adaptation", or, rather, environments.
  2. (Premise) Indeed, when we encounter a human cognitive module, we should presume that it is an evolved adaptation.
  3. (Premise) Humans have modules for theory-of-mind, social exchange, and otherwise dealing with intentional agents by reckoning with their beliefs, desires, intentions, and (crucially) capacities to act on those intentions.
  4. Therefore, the human modules for theory-of-mind, social exchange, and dealing with intentional agents are evolved adaptations to our ancestral environment.
  5. (Premise) Humans often engage those modules when dealing with invisible beings, often manifesting as (what scientists categorize as) natural forces.
    (In fact, such engagement of those modules was near-universal up to the emergence of WEIRD societies. The historical record shows aberrant individuals who did not do this, but it's plain even from texts those individuals authored, when they have come down to us, that their bizarre behavior had absolutely no traction on the vast, neurotypical majorities of their societies. [One is reminded of the militantly color-blind trying to convince others that colors do not exist.] Moreover, treating natural forces as manifestations of invisible beings who are intentional agents, amenable to bargaining, threats, supplication, etc., etc., is still very common in WEIRD societies, perhaps even modal.)
  6. (Premise) Engaging a wrong or inappropriate module is expensive, even potentially dangerous, and thus mal-adaptive, and so should be selected against.
  7. If natural forces are mindless and invisible beings did not exist in the EEA, then engaging theory-of-mind and social-exchange modules to deal with natural forces and invisible beings would be mal-adaptive.
    (Occasionally, people suggest that it's so dangerous to ignore another intentional agent that it was adaptive for our ancestors to suspect intentionality everywhere, on "better safe than sorry" grounds. I have never seen this supported by a concrete calculation of the costs, benefits and frequencies of the relevant false-positive and false-negative errors. I have also never seen it supported by a design analysis of why our ancestors could not have evolved to realize that storms, earthquakes, droughts, diseases, etc., were no more intentional agents than, say, fruit, or stone tools.)
  8. Since those modules are adaptive, we must conclude that invisible beings with beliefs, desires, intentions, and the power to act on them, especially through "natural" forces, were a common, recurring, predictable feature of the environments of evolutionary adaptation.

Of course, none of this implies that those invisible beings aren't as extinct as mammoths.

To spoil the [not very funny] joke: even if the relevant modules exist, they are engaged not by intentional-agent-detectors, but by human mental representations of intentional agents. Once the idea starts that storms are the wrath of some invisible being, that can be self-propagating. For further details, I refer to the works of Dan Sperber, especially Explaining Culture (and to some extent Rethinking Symbolism). Credit for the phrase "ad hominid argument" goes, I believe, to Belle Waring, back in the Early Classic period of blogging.

Learned Folly; Minds, Brains, and Neurons

Posted at April 01, 2018 22:59 | permanent link

Three-Toed Sloth