February 05, 2019

"Causal inference in social networks: A new hope?" (Friday at the Ann Arbor Statistics Seminar)

Attention conservation notice: Self-promoting notice of a very academic talk, at a university far from you, on a very recondite topic, solving a problem that doesn't concern you under a set of assumptions you don't understand, and wouldn't believe if I explained to you.

I seem to be giving talks again:

"Causal inference in social networks: A new hope?"
Abstract: Latent homophily generally makes it impossible to identify contagion or influence effects from observations on social networks. Sometimes, however, homophily also makes it possible to accurately infer nodes' latent attributes from their position in the larger network. I will lay out some assumptions on the network-growth process under which such inferences are good enough that they enable consistent and asymptotically unbiased estimates of the strength of social influence. Time permitting, I will also discuss the prospects for tracing out the "identification possibility frontier" for social contagion.
Joint work with Edward McFowland III
Time and place: 11:30 am -- 12:30 pm on 8 February 2019, in 411 West Hall, Statistics Department, University of Michigan

--- The underlying paper grows out of an idea that was in my paper with Andrew Thomas on social contagion: latent homophily is the problem with causal inference in social networks, but latent homophily also leads to large-scale structure in networks, and allows us to infer latent attributes from the graph; we call this "community discovery". Some years later, my student Hannah Worrall, in her senior thesis, did an extensive series of simulations showing that controlling for estimated community membership lets us infer the strength of social influence, in regimes where community-discovery is consistent. Some years after that, Ed asked me what I was wanting to work on, but wasn't, so I explained about what seemed to me the difficulties in doing some proper theory about this. As I did so, the difficulties dissolved under Ed's questioning, and the paper followed very naturally. We're now revising in reply to referees (Ed, if you're reading this --- I really am working on it!), which is as pleasant as always. But I am very pleased to have finally made a positive contribution to a problem which has occupied me for many years.

Posted at February 05, 2019 21:04 | permanent link

February 03, 2019

On Godzilla and the Nature and Conditions of Cultural Success; or, Shedding the Skin

Attention conservation notice: 1100+ words of Deep Thoughts on a creature-feature monster and cultural selection, from someone with no qualifications to write on either subject. Expresses long-held semi-crank notions; composed while simultaneously reading Morin on diffusion chains and drinking sake; revived over a year after it was drafted because Henry was posting about similar themes, finally posted because I am procrastinating on finishing a grant proposal (or rather, celebrating submitting a grant proposal on time).

Godzilla is an outstanding example of large-scale cultural success, and of how successful cultural items become detached from their original meanings.

Godzilla's origins are very much in a particular time and place, namely Japan, recently (if not quite immediately) post-WWII and the national trauma of the atomic bombings and their lingering effects. This is a very particular setting, on the world-historical scale. It is now seven decades in the past, and so increasingly gone from living memory, even for the very long-lived population of Japan.

Against this, Godzilla has been tremendously successful culturally all over the world, over basically the whole time since it appeared. I don't mean that it's made money (though it has) --- I mean that it has been popular, that people have liked consuming stories (and images and toys and other representations) about it, that they have liked creating such representations, and that they have liked thinking about and with Godzilla. (In contemporary America, for instance, Godzilla is so successful that the suffix "-zilla" is a morpheme, denoting something like "a destructive, mindlessly-enraged form of an entity".) Necessarily, the vast majority of this success and popularity has been distant in time, space, social structure and cultural context from 1950s Japan. How can these two observations --- the specificity of origins and the generality of success --- be reconciled?

To a disturbing extent, of course, any form of cultural success can be self-reinforcing (cf. Salganik et al.), but there is generally something to the representations which succeed (cf., again, Salganik et al.). But, again, Godzilla is endemic in many contexts remote in space, time and other cultural features from immediately-post-war Japan. So it would seem that whatever makes it successful in those contexts, including here and now as I write this, must be different from what made it successful at its point of origin.

It could be that Godzilla is successful in 1950s Japan and in 2010s USA because it happened to fit two very different but very specific cultural niches --- the trauma of defeat culminating in nuclear war, on the one hand; and (to make something up) a compulsive desire for re-enactments of 9/11 on the other hand. But explaining wide-spread success by a series of particular fits falters as we consider all the many other social contexts in which Godzilla has been popular. Maybe it happened, by chance, to appeal narrowly to one new context, but two? three? ten?

An alternative is that Godzilla has managed to spread because it appeals to tastes which are not very context-specific, but on the contrary very widely distributed, if not necessarily constant and universal. In the case of Godzilla, we have a monster who breaks big things and breathes fire: an object of thought, in other words, enduringly relevant to crude interests in predators, in destruction, and in fire. Since those interests are very common across all social contexts, something which appeals to them has a very good source of "pull".

This is not to say that Godzilla wasn't, originally, all about being the only country ever atom-bombed into submission. But it is to say that we can draw a useful distinction between the meanings successful cultural products had originally and those attached to them as they diffuse. It is analogous to the distinction the old philosophy of science used to draw between an idea's "context of discovery" and its "context of justification", though that had a normative force I am not aiming at. (For the record, I think that many of the criticisms of the discovery-justification distinction are weak, mis-conceived or just flat wrong, and that it's actually a pretty useful distinction. But that's another story for another time.)

For Godzilla, like many other successful cultural products, the "context of invention" was a very historically-specific confluence of issues, concerns and predecessors. But the "context of diffusion" was that it could appeal to vastly more generic tastes, and make use of vastly more generic opportunities. These are still somewhat historically-specific (e.g., no motion-picture technology, no Godzilla), but much less so. I am even tempted to formulate a generalization: the more diffused a cultural product is, in space or time or social position, the less its appeal owes to historically-specific contexts, and the more it owes to forces which are nearly a-historical and constant.

What holds me back from declaring cultural diffusion to be a low-pass filter is that it is, in fact, logically possible for a cultural product to succeed in many contexts because it seems to be narrowly tailored to them all. What's needed, as a kind of meta-ingredient, is for the cultural product to be suggestively ambiguous. It is ambiguity which allows very different people to find in the same artifact the divergent but specific meanings they seek; but it also has to somehow suggest to many people that there is a specific, compelling meaning to be found in it. When we consider cultural items which have endured for a very long time, like some sacred texts or other works of literature, then I suspect we are seeing representations which have been strongly selected for suggestive ambiguity.

It is a cliche of literary criticism that each generation gives its own interpretation of these great works. It is somewhat less of a cliche, though equally true, that every generation finds a reason to interpret them. Pace Derrida and his kin, I don't think that every text or artifact is equally amenable to this sort of re-interpretation and re-working. (Though that notion may have seemed more plausible to literary scholars who were most familiar with a canon of books inadvertently selected, in part, for just such ambiguity.) There are levels of ambiguity, and some things are just too straightforward to succeed this way [1]. It is also plainly not enough just to be ambiguous, since ambiguous representations are very common, and usually dismal failures at propagating themselves. The text or artifact must also possess features which suggest that there is an important meaning to be found in it [2]. What those features are, in terms of rhetorical or other sorts of design, is a nice question, though perhaps not beyond all conjecture. (I strongly suspect Gene Wolfe of deliberately aiming for such effects.) Something keeps the great works alive over time and space, saving them from being as dead as Gilgamesh, of merely historical interest. Because they are interpreted so variously, they can't be surviving because any one of their interpretations is the right one, conveying a compelling message that assures human interest. Rather, works outlast ages precisely because they simultaneously promise and lack such messages. This quality of suggestive ambiguity could, of course, also contribute to academic and intellectual success --- making it seem like you have something important to say, while leaving what that thing is open to debate, is one route to keeping people talking about you for a long time.

… or so I think in my more extreme moments. In another mood, I might try to poke holes in my own arguments. As for Godzilla, I suspect it's too early to tell whether it possesses this quality of suggestive ambiguity, but my hunch is that this dragon is not a shape-shifter.

1. I seem to recall that Umberto Eco once, to make this point, had a parable about employing a screw-driver to clean out your ears. But if my memory has not invented this, I cannot now find the passage.

2. Though, again, we should be aware of the self-reinforcing nature of cultural success, the way that something might seem important to re-interpret or re-work in part because it is already widely known.

Posted at February 03, 2019 15:08 | permanent link

Data Over Space and Time

Collecting posts related to this course (36-467/36-667).

Posted at February 03, 2019 14:15 | permanent link

January 31, 2019

Books to Read While the Algae Grow in Your Fur, January 2019

Attention conservation notice: I have no taste. I also have no qualifications to discuss the history of millenarianism, or really even statistical graphics.

Bärbel Finkenstädt, Leonhard Held and Valerie Isham (eds.), Statistical Methods for Spatio-Temporal Systems
This is an edited volume arising from a conference, with all the virtues and vices that implies. (Several chapters have references to the papers which first published the work expounded in other chapters.) I will, accordingly, review the chapters in order.
Chapter 1: "Spatio-Temporal Point Processes: Methods and Applications" (Diggle). Mostly a precis of case studies from Diggle's (deservedly standard) books on the subject, which I will get around to finishing one of these years.
Chapter 2: "Spatio-Temporal Modelling --- with a View to Biological Growth" (Vedel Jensen, Jónsdóttir, Schmiegel, and Barndorff-Nielsen). This chapter divides into two parts. One is about "ambit stochastics". In a random field $Z(s,t)$, the "ambit" of the space-time point-instant $(s,t)$ is the set of point-instants $(q,u)$, $u < t$, where $Z(q,u)$ is (causally) relevant to $Z(s,t)$. (This is what, in my own work, I've called the "past cone" of $(s,t)$.) Having a regular geometry for the ambit imposes some tractable restrictions on random fields, which are explored here for models of growth-without-decay. The second part of this chapter will only make sense to hardened habitués of Lévy processes, and perhaps not even to all of them.
Chapter 3: "Using Transforms to Analyze Space-Time Processes" (Fuentes, Guttorp, and Sampson): A very nice survey of Fourier transform, wavelet transform, and PCA approaches to decomposing spatio-temporal data. There's a good account of some tests for non-stationarity, based on the idea that (essentially) we should get nearly the same transforms for different parts of the data if things really are stationary. (I should think carefully about the assumptions and the implied asymptotic regime here, since the argument makes sense, but it also makes sense that sufficiently slow mean-reversion is indistinguishable from non-stationarity.)
Chapter 4: "Geostatistical Space-Time Models, Stationarity, Separability, and Full Symmetry" (Gneiting, Genton, and Guttorp): "Geostatistics" here refers to "kriging", or using linear prediction on correlated data. As every schoolchild knows, this boils down to finding the covariance function, $\mathrm{Cov}[Z(s_1, t_1), Z(s_2, t_2)]$. This chapter considers three kinds of symmetry restrictions on the covariance functions: "separability", where $\mathrm{Cov}[Z(s_1, t_1), Z(s_2, t_2)] = C_S(s_1, s_2) C_T(t_1, t_2)$; the weaker notion of "full symmetry", where $\mathrm{Cov}[Z(s_1, t_1), Z(s_2, t_2)] = \mathrm{Cov}[Z(s_1, t_2), Z(s_2, t_1)]$; and "stationarity", where $\mathrm{Cov}[Z(s_1, t_1), Z(s_2, t_2)] = \mathrm{Cov}[Z(s_1+q, t_1+h), Z(s_2+q, t_2+h)]$. As the authors explain, while separable covariance functions are often used because of their mathematical tractability, they look really weird; "full symmetry" can do a lot of the same work, at less cost in implausibility.
Chapter 5: "Space-Time Modelling of Rainfall for Continuous Simulations" (Chandler, Isham, Bellone, Yang and Northrop): A detailed exposition of two models for rainfall, at different spatio-temporal scales, and how they are both motivated by and connected to data. I appreciate their frankness about things that didn't work, and the difficulties of connecting the different models.
Chapter 6: "A Primer on Space-Time Modeling from a Bayesian Perspective" (Higdon): Here "space-time modeling" means "Gaussian Markov random fields". Does what it says on the label.
All the chapters combine theory with examples --- chapter 2 is perhaps the most mathematically sophisticated one, and also the one where the examples do the least work. The most useful, from my point of view, were Chapters 3 and 4, but that's because I was teaching a class where I did a lot of kriging and PCA, and (with some regret) no point processes. If you have a professional interest in spatio-temporal statistics, and a fair degree of prior acquaintance, I can recommend this as a useful collection of examples, case studies, and expositions of some detailed topics.
Errata, of a sort: There are supposed to be color plates between pages 142 and 143. Unfortunately, in my copy these are printed in grey, not in color.
Disclaimer: The publisher sent me a copy of this book, but that was part of my fee for reviewing a (different) book proposal for them.
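Aside, on the symmetry notions from Chapter 4 above: a minimal numerical sketch, in Python, of how separability and full symmetry come apart. Everything specific here is my own assumption for illustration and not taken from the chapter: one-dimensional space, the exponential and squared-exponential forms of the hypothetical c_space and c_time, the sum-of-products and "transport" covariances, and the small grid the checks run over. The separability check uses the fact that a covariance which factors as $C_S(s_1, s_2) C_T(t_1, t_2)$ must satisfy $C(s_1,t_1,s_2,t_2) C(s_0,t_0,s_0,t_0) = C(s_1,t_0,s_2,t_0) C(s_0,t_1,s_0,t_2)$ for any reference point $(s_0,t_0)$.

```python
import math
from itertools import product

def c_space(s1, s2, scale=1.0):
    # Assumed purely spatial covariance: exponential decay in distance.
    return math.exp(-abs(s1 - s2) / scale)

def c_time(t1, t2, scale=1.0):
    # Assumed purely temporal covariance: squared-exponential in lag.
    return math.exp(-((t1 - t2) / scale) ** 2)

def cov_separable(s1, t1, s2, t2):
    # Separable: a product of a spatial and a temporal covariance.
    return c_space(s1, s2) * c_time(t1, t2)

def cov_sum(s1, t1, s2, t2):
    # Sum of two separable covariances with different scales.
    return (c_space(s1, s2, 1.0) * c_time(t1, t2, 1.0)
            + c_space(s1, s2, 0.5) * c_time(t1, t2, 2.0))

def cov_transport(s1, t1, s2, t2, v=1.0):
    # Covariance of a fixed spatial pattern drifting at velocity v,
    # i.e. Z(s, t) = W(s - v * t) with an exponential covariance for W.
    return math.exp(-abs((s1 - s2) - v * (t1 - t2)))

# Small grid of (s, t) point-instants to run the checks on.
GRID = list(product([0.0, 1.0, 2.0], [0.0, 1.0, 2.0]))

def is_fully_symmetric(cov, points=GRID, tol=1e-12):
    # Full symmetry: swapping the two time arguments changes nothing.
    return all(abs(cov(s1, t1, s2, t2) - cov(s1, t2, s2, t1)) <= tol
               for (s1, t1) in points for (s2, t2) in points)

def is_separable(cov, points=GRID, tol=1e-12):
    # Factorization check against a reference point (s0, t0).
    s0, t0 = points[0]
    ref = cov(s0, t0, s0, t0)
    return all(abs(cov(s1, t1, s2, t2) * ref
                   - cov(s1, t0, s2, t0) * cov(s0, t1, s0, t2)) <= tol
               for (s1, t1) in points for (s2, t2) in points)

if __name__ == "__main__":
    for name, cov in [("separable", cov_separable),
                      ("sum of separables", cov_sum),
                      ("transport", cov_transport)]:
        print(name, is_fully_symmetric(cov), is_separable(cov))
```

On this grid the separable covariance passes both checks, the sum of two separable covariances is fully symmetric without being separable, and the drifting-pattern covariance fails both, which is the sense in which full symmetry is the weaker restriction.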
Kieran Healy, Data Visualization: A Practical Introduction
Anyone who has looked at my professional writings will have noticed that my data visualizations are neither fancy nor even attractive, and they never go beyond basic R graphics. This is because I have never learned any other system for statistical visualization. And I've not done that because I'm lazy, and have little visual sense anyway. This book is the best guide I've seen to (1) learning the widely-used, and generally handsome, ggplot library in R, (2) learning the "grammar of graphics" principles on which it is based, and (3) learning the underlying psychological principles which make some graphics better or worse visualizations than others. (This is not to be confused with learning the maxims or even the tacit taste of a particular designer, even one of genius.) The writing is great, the examples are interesting, well-chosen and complete, and the presumptions about how much R, or statistics, you know coming in are minimal. I wish something like this had existed long ago, and I'm tempted, after reading it, to totally re-do the figures in my book. (Aside to my editor: I am not going to totally re-do the figures in my book.) I strongly recommend it, and will be urging it on my graduate students for the foreseeable future.
ObLinkage: The book is online, pretty much.
ObDisclaimer: Kieran and I have been saying good things about each other's blogs since the High Bronze Age of the Internet. But I paid good cash money for my copy, and have no stake in the success of this book.
Anna Lee Huber, Mortal Arts
More historical-mystery mind candy, this time flavored by the (dismal) history of early 19th century psychiatry. (Huber is pretty good, though not perfect, at avoiding anachronistic language, so nobody says "psychiatry" in the novel.)
Norman Cohn, The Pursuit of the Millennium: Revolutionary Millenarians and Mystical Anarchists of the Middle Ages
I vividly remember finding a used copy of this in the UW-Madison student bookstore when I began graduate school, in the fall of 1993, and having my mind blown by reading it that fall*. Coming back to it now, I find it still fascinating and convincing, and it does an excellent job of tracing millenarian movements among the poor in Latinate Europe from the fall of Rome through the Reformation. (There are a few bits where he gets a bit psychoanalytic, but the first edition was published in 1957.) If I no longer find it mind-blowing, that's in large part because reading it sparked an enduring interest in millenarianism, and so I've long since absorbed what then (you should forgive the expression) came as a revelation.
The most controversial part of the book, I think, is the conclusion, where Cohn makes it very clear that he thinks there is a great deal of similarity, if not actual continuity, between his "revolutionary millenarians and mystical anarchists" and 20th century political extremism, both of the Fascist and the Communist variety. He hesitates --- wisely, I think --- over whether this is just a similarity, or there is an actual thread of historical continuity; but I think his case for the similarity is sound.
*: I was supposed to be having my mind blown by Sakurai. In retrospect, this incident sums up both why I was not a very good graduate student, and why I will never be a great scientist.

Posted at January 31, 2019 23:59 | permanent link