June 29, 2011

Knights, Muddy Boots, and Contagion; or, Social Influence Gets Medieval

Three papers have appeared recently, critiquing methods which people have been using to try to establish social influence or social contagion: "The Spread of Evidence-Poor Medicine through Flawed Social Network Analysis" (arxiv:1007.2876) by Russell Lyons; "The Unfriending Problem" by Hans Noel and Brendan Nyhan (arxiv:1009.3243); and "Homophily and Contagion Are Generically Confounded in Observational Social Network Studies" (arxiv:1004.4704, blogged about here) by Andrew Thomas and myself. All three were of course inspired by the works of Nicholas Christakis, James Fowler, and collaborators. This has led to a certain amount of chatter online, including rash statements about how social influence may not exist after all. That last is silly: to revert to my favorite example of accent, there is a reason that my Pittsburgh-raised neighbors say "yard" differently than my friends from Cambridge, and it's not the difference between drinking from the Monongahela and drinking from the Charles. Similarly, the reason my first impulse when faced with a causal inference problem is to write out a graphical model and block indirect paths, rather than tattooing counterfactual numbers in invisible ink on my experimental subjects, is the influence of my teachers. (Said differently: culture happens.) So, since we know social influence exists and matters, the question is how best to study it.
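(For concreteness, here is a toy simulation of the sort of confounding at issue. It is emphatically not the model from our paper, nor anyone's analysis of the Framingham data; the latent trait, the tie-formation rule, and all the numbers are invented purely for illustration. One unobserved trait shapes both who befriends whom and what everyone does, and a naive lagged regression then reports "influence" that was never put in.)

```python
# Toy illustration only: latent homophily masquerading as contagion.
import numpy as np

rng = np.random.default_rng(0)
n = 500

x = rng.normal(size=n)                          # latent trait, hidden from the analyst
# Homophilous ties: i and j are more likely to be friends when their traits are close.
gap = np.abs(x[:, None] - x[None, :])
ties = (rng.random((n, n)) < np.exp(-2.0 * gap)) & ~np.eye(n, dtype=bool)
ties = ties | ties.T                            # make friendship symmetric

# Behavior at two times depends only on one's own trait plus noise -- no influence at all.
y0 = x + rng.normal(scale=0.5, size=n)
y1 = x + rng.normal(scale=0.5, size=n)

# The naive check for contagion: regress my behavior now on my friends' past behavior,
# "controlling" for my own past behavior.
deg = ties.sum(axis=1)
keep = deg > 0
peer_y0 = (ties[keep] @ y0) / deg[keep]         # friends' average past behavior
design = np.column_stack([np.ones(keep.sum()), y0[keep], peer_y0])
coef, *_ = np.linalg.lstsq(design, y1[keep], rcond=None)
print("apparent 'influence' coefficient:", round(coef[2], 3))  # positive, despite no influence
```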

Fortunately, one consequence of this recent outbreak of drama is a very long and thoughtful message from Tom Snijders to the SOCNET mailing list. Since there is a public archive, I do not think it is out of line to quote parts of it, though I would recommend that anyone interested in the subject (as the saying goes) read the whole thing:

What struck me most in the paper by Lyons ... are the following two points. The argument for social influence proposed by Christakis and Fowler (C&F) that earlier I used to find most impressive, i.e., the greater effect of incoming than of outgoing ties, was countered: the difference is not significant and there are other interpretations of such a difference, if it exists; and the model used for analysis is itself not coherent. This implies that C&F's claims of having found evidence for social influence on several outcome variables, which they already had toned down to some extent after earlier criticism, have to be still further attenuated. However, they do deserve a lot of credit for having put this topic on the agenda in an imaginative and innovative way. Science advances through trial and error and through discussion. Bravo for the imagination and braveness of Nick Christakis and James Fowler.

...Our everyday experience is that social influence is a strong and basic aspect of our social life. Economists have found it necessary to find proof of this through experimental means, arguing (Manski) that other proofs are impossible. Sociologists tend to take its existence for granted and are inclined to study the "how" rather than the "whether". The arguments for the confoundedness of influence and homophilous selection of social influence (Shalizi & Thomas Section 2.1) seem irrefutable. Studying social influence experimentally, so that homophily can be ruled out by design, therefore is very important and Sinan Aral has listed in his message a couple of great contributions made by him and others in this domain. However, I believe that we should not restrict ourselves here to experiments. Humans (but I do not wish to exclude animals or corporate actors) are purposive, wish to influence and to be influenced, and much of what we do is related to achieve positions in networks that enable us to influence and to be influenced in ways that seem desirable to us. Selecting our ties to others, changing our behaviour, and attempting to have an influence on what others do, all are inseparable parts of our daily life, and also of our attempts to be who we wish to be. This cannot be studied by experimental assignment of ties or of exchanges alone: such a restriction would amount to throwing away the child (purposeful selection of ties) with the bathwater (strict requirements of causal inference).

The logical consequence of this is that we are stuck with imperfect methods. Lyons argues as though only perfect methods are acceptable, and while applauding such lofty ideals I still believe that we should accept imperfection, in life as in science. Progress is made by discussion and improvement of imperfections, not by their eradication.

A weakness and limitation of the methods used by C&F for analysing social influence in the Framingham data was that, to say it briefly, these were methods and not generative models. Their methods had the aim to be sensitive to outcomes that would be unlikely if there were no influence at all (a sensitivity refuted by Lyons), but they did not propose credible models expressing the operation of influence and that could be used, e.g., to simulate influence processes. The telltale sign that their methods did not use generative models is that in their models for analysis the egos are independent, after conditioning on current and lagged covariates; whereas the definition of social influence is that individuals are not independent....

Snijders goes on, very properly, to talk about the models he and his collaborators have been developing for quite a few years now (e.g.), which can separate influence from homophily under certain assumptions, and to aptly cite Fisher's dictum that the way to get causal conclusions from observational studies is to "Make your theories elaborate" --- not give up. Lyons's counsels of perfection and despair are "words of a knight riding in shining armour high above the fray, not of somebody who honours the muddy boots of the practical researcher". (Again, if this sounds interesting, read the full message.) I agree with pretty much everything Snijders says, but feel like adding a few extra points.

  1. It is of course legitimate to make modeling assumptions, but one then needs to support those assumptions with considerations other than their convenience to the modeler. I see far too many papers where people say "we assume such and such", get results, and don't try to check whether their assumptions have any basis in reality (or, if not, how far astray that might be taking them). Of course the support for assumptions may be partial or imperfect, might have to derive in some measure from different data sources or even from analogy, etc., through all the usual complications of actual science. But if the assumptions are important enough to make, then it seems to me they are important enough to try to check. (And no, being a Bayesian doesn't get you out of this.)
  2. As we say in our paper, I suspect that much more could be done with the partial-identification or bounds approach Manski advocates. The bounds approach also seems more scientifically satisfying than many sensitivity analyses, which make almost as many restrictive and unchecked assumptions as the original models. Often it seems that this is all that scientists or policy-makers would actually want anyway, and so the fact that we cannot get complete identification would not be so very bad. I wish people smarter than myself would attack this for social influence. (A toy illustration of the worst-case bounds such an approach starts from appears just after this list.)
  3. It would be very regrettable if people came away from this thinking that social network studies are somehow especially problematic. On the one hand, as shown in Sec. 3 of our paper, when social influence and homophily are both present, individual-level causal inference which ignores the network is itself confounded, perhaps massively. (I've been worrying about this for a while.) But the combination of social influence and homophily would seem to be the default condition for actual social assemblages, while individual-level studies from (e.g.) survey data have become the default mode of doing social science.
    On the other and more positive side, we have, it seems to me, lots of examples of successfully pursuing scientific, causal knowledge in fields where experimentation is even harder than in sociology, such as astronomy and geology. Perhaps explaining the clustering of behavior in social networks is fundamentally harder than explaining the clustering of earthquakes, but we're even more at the mercy of observation in seismology than in sociology.
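To make the second point a bit more concrete: here is a toy version of the no-assumptions ("worst-case") bounds from which Manski's partial-identification program starts, for a binary exposure and an outcome known to lie in [0,1]. This is only the generic textbook calculation, run on made-up data; the network-aware version I am wishing for would have to do considerably more work.

```python
# Toy Manski-style worst-case bounds on E[Y(1)] - E[Y(0)] for a binary exposure T
# and an outcome Y known to lie in [0, 1]. No assumptions about how T came about:
# the unobserved potential outcomes are simply replaced by the worst (0) and
# best (1) values they could possibly take.
import numpy as np

def manski_bounds(y, t):
    y, t = np.asarray(y, float), np.asarray(t, bool)
    p = t.mean()                          # P(T = 1)
    m1, m0 = y[t].mean(), y[~t].mean()    # E[Y | T = 1], E[Y | T = 0]
    # E[Y(1)] is observed only for the exposed; bound the rest by 0 and 1.
    ey1_lo, ey1_hi = m1 * p, m1 * p + (1 - p)
    # Likewise E[Y(0)] is observed only for the unexposed.
    ey0_lo, ey0_hi = m0 * (1 - p), m0 * (1 - p) + p
    # Interval always has width 1 here, but it always contains the truth.
    return ey1_lo - ey0_hi, ey1_hi - ey0_lo

# Made-up data: did a friend adopt (t), and did ego adopt (y)?
rng = np.random.default_rng(1)
t = rng.random(1000) < 0.4
y = (rng.random(1000) < np.where(t, 0.6, 0.3)).astype(float)
print(manski_bounds(y, t))   # an interval for the effect, not a point estimate
```

Adding assumptions one is actually willing to defend (monotone responses, instruments, and so forth) shrinks the interval; that middle ground between point identification and despair is exactly what the item above is asking someone to work out for social influence.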

Manual trackback: Slate; A Fine Theorem

Networks; Enigmas of Chance

Posted at June 29, 2011 13:24 | permanent link
