The Bactra Review: Occasional and eclectic book reviews by Cosma Shalizi   167


by Howard S. Becker

University of Chicago Press, 2017

Tell Us Another Story About Data, Great-Uncle Howie

Becker has been a presence in American sociology since the early 1950s, when he was a graduate student at Chicago; this is a somewhat testamentary book, about the different kinds of evidence sociologists use to support their ideas (or undermine each other's ideas), and their strengths and weaknesses. The rhetorical mode is Nestorian: Great-Uncle Howie telling stories from the old days (including, disarmingly, some of what he now sees to be his own mistakes). His specific stories are largely about sociology, but there is nothing which wouldn't, mutatis mutandis, apply to any other social science.

Becker has three big points: it's vital to really know how your data are generated; there are advantages to being able to revise the kind of data you're gathering as you go along; and ethnographic fieldwork can start without any sort of theoretical pre-conceptions. I fully endorse the first, partially endorse the second, and have to take issue with the third.

I. Knowing Where the Data Came From

Our data, whether qualitative or quantitative, only provide (good) evidence for or against our ideas to the extent they actually measure what we want them to measure. Thus it is imperative to actually understand the measurement process behind the data, probing it carefully for systematic flaws and artifacts. When those are identified, we need to remove them where possible, and trace the distortions they may be introducing when they cannot be removed. The idea that errors in measurement are just random noise, which will cancel out in aggregate, is itself a conclusion we need to empirically check, not a presupposition we can make for free.
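To make the last point concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than anything from Becker's book: the true value, the noise level, the size of the bias, and the function name `mean_reading` are all made up. It compares the average of many readings under purely random error with the average under the same random error plus a constant systematic shift.

```python
import random
import statistics


def mean_reading(n, true_value=10.0, noise_sd=2.0, bias=0.0, seed=0):
    """Average of n simulated readings of a quantity whose true value is known.

    Each reading is true_value + bias + Gaussian noise. All the numbers
    here are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    readings = [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return statistics.fmean(readings)


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        random_only = mean_reading(n)
        with_bias = mean_reading(n, bias=1.5)
        print(f"n={n:>7}  random error only: {random_only:.3f}  "
              f"plus systematic error: {with_bias:.3f}")
```

As n grows, the first average settles near the true value, while the second settles near the true value plus the bias. More data shrinks the random part of the error but leaves the systematic part untouched, which is why "it all cancels out" has to be checked rather than assumed.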

Gathering data about social phenomena is, itself, usually a social process, involving social interactions both inside the data-gathering organization and with the objects of research. This means that many issues relating to the sociology of work, presentation of self, etc., are much more relevant to social measurement than to measurements in the natural sciences. Indeed, as Becker emphasizes, the causes of errors in social measurement are often social phenomena of interest in their own right. Becker is so fond of this point, in fact, that he neglects the importance of using what one learns about those phenomena to go back to better answer the question one started with. (To borrow one of his examples, when physicists find their experiments interfered with by extraneous influences, they not only carefully investigate those influences, they also figure out how to eliminate them or compensate for them, so as to actually study the phenomena of original interest!) But I imagine Becker would accept this addition quite cheerfully.

All this, needless to say, I whole-heartedly endorse. I would be very happy to teach big chunks of this in my statistics classes, particularly to undergraduates. I am more optimistic about what can be done by way of technical corrections, and especially by finding Manski-style bounds on the effects of errors, than Becker seems to be[1]. But this is a difference of emphasis.
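As a rough sketch of what a bound of this kind can look like, take the simplest worst case: we observe the share of records carrying some label, and we are willing to assume only that at most a given share of all records carry the wrong label. The true share then lies within that error share of the observed one. The function names and the rates below are hypothetical, and this is a simplified worst-case calculation in the spirit of such bounds, not a reproduction of any particular published estimator.

```python
def rate_bounds(observed_rate, max_error_share):
    """Worst-case bounds on a true rate.

    Assumes (hypothetically) that at most `max_error_share` of all records
    are mislabeled, in either direction, and nothing else about the errors.
    """
    if not (0.0 <= observed_rate <= 1.0 and 0.0 <= max_error_share <= 1.0):
        raise ValueError("rates and error shares must lie in [0, 1]")
    lower = max(0.0, observed_rate - max_error_share)
    upper = min(1.0, observed_rate + max_error_share)
    return lower, upper


def ordering_survives(rate_a, rate_b, max_error_share):
    """True if 'group A's rate exceeds group B's' holds even in the worst case."""
    lower_a, _ = rate_bounds(rate_a, max_error_share)
    _, upper_b = rate_bounds(rate_b, max_error_share)
    return lower_a > upper_b


def breaking_error_share(rate_a, rate_b):
    """Error share at which the two worst-case intervals first touch.

    Each interval widens by the error share on both sides, so they meet
    when the error share equals half the observed gap.
    """
    return abs(rate_a - rate_b) / 2


if __name__ == "__main__":
    # Made-up observed rates for two groups.
    a, b = 0.30, 0.20
    for err in (0.02, 0.06):
        print(f"error share {err:.2f}: A in {rate_bounds(a, err)}, "
              f"B in {rate_bounds(b, err)}, "
              f"ordering survives: {ordering_survives(a, b, err)}")
    print("ordering can break once the error share reaches about",
          round(breaking_error_share(a, b), 3))
```

The useful output is the breaking point: it turns "the records are flawed" into a question about whether the flaws are plausibly that large, which is something further study of the recording process could answer.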

II. Iterate

Becker has some very shrewd things to say about the advantages of modes of research where you can easily change the kind of measurements you do, or the kind of questions you're asking, in light of what you find out during an earlier stage of the investigation. He wrongly, however, identifies this sort of research with ethnographic fieldwork, as opposed to systematic collection of quantitative data from a population. It is true that the Census, like other official statistical agencies, has a very slow response time, and there are (good) reasons for this, many of which Becker goes into. But those of our colleagues who deal with on-line data collection really can, and do, change things around in just the sort of way Becker talks about, even though they are getting the kind of data we'd otherwise look to the Census or the General Social Survey for. There are other issues with on-line data collection (I'll get back to this below), but I think Becker's real point needs to be re-framed, in terms of how quickly, and cheaply, the data-gathering process can be re-directed, rather than the fieldwork-vs-quantitative divide.

(A minor, but related, point is that Becker repeatedly says social scientists just cannot do experiments, with no argument or elaboration. Without getting into what is possible in situations of great power inequalities [prisons, the military, Facebook, randomized-controlled trials in the Third World, etc.], this just isn't so. There are whole how-to books about survey experiments, and some of the most interesting recent work on the sociology of culture rests on clever, careful experiments.)

III. Slaves of a Defunct Theorist

Finally, Becker expounds some truly naive ideas about ethnographic fieldwork, presenting it as just talking to people and noticing what's interesting, innocent of any theory. But this just ignores both what more sophisticated thinkers have had to say about the status of such reports from the field, and the way the field-worker goes into the field with theoretical, or at least theory-ish, ideas already.

On the first point, I will just re-cycle the great Dan Sperber in his little book On Anthropological Knowledge (now available as a free PDF from the author). When Becker says that the medical students he observed had a shared student culture that revolved around "medical responsibility" and "clinical experience" (pp. 182ff), he is certainly not just relaying, verbatim, what they told him. (Even if he was, why believe them?) Rather, those claims are his interpretation of a vast number of other things which they did say and do. This interpretation is something like a summary, and something like a paraphrase, and something like an explication, or providing the missing premise(s) that turns an enthymeme into a valid argument. It is a claim on the order of "the medical students say and do a lot of things which all make sense if the students all share certain values, values which you, my audience, will understand if I call them 'medical responsibility' and 'clinical experience'; they must have come to hold these values because of the shared influence of a common culture". Fleshed out like this, it becomes clear that this is an ambitious claim, one loaded with lots of theoretical ideas ("common culture"; actions are explained by "values" rather than e.g. incentives or habits). It is also clear that alternative interpretations (different summary-paraphrases) are at least possible, and indeed multiple interpretations might all be equally valid. (For example, different audiences might find different aspects of the students' words and deeds mysterious.) The sort of tabulation of who said what to whom that Becker offers on pp. 185--187 isn't really getting at the issue of whether this is the best interpretation, or even an accurate one, or whether this is really explanatory.

To the second point, while Becker likes to present fieldwork as beginning with just hanging out in a social environment, waiting for interesting things to appear so that he can then try to sort them out, the fact is that the fieldworker doesn't go into the new social environment innocent of theoretical ideas. Becker had, after all, completed a Ph.D. degree in which he was extensively drilled in certain general ideas about how social groups worked, including the idea that there are discrete social groups, and a correspondence between such groups and "cultures", which include shared values, which motivate and explain the actions of group members. Indeed he says (p. 183) that his notion of medical student culture "mirrored a classic understanding of culture I'd acquired in graduate school, William Graham Sumner's description of folkways (1906) as solutions, collectively arrived at, to persisting problems that a group confronted". But he doesn't seem to appreciate how this undermines his claims to have employed a "Buffonian" approach, with "no initial hypotheses, no ideas we were going to 'test' against the data we were about to start collecting".

To illustrate with a parable, consider the evil-mirror-universe version of Becker, who studied economics rather than sociology at Chicago, and then went to do fieldwork at the same medical school. (Cf. the contrast between The Moral Economy of the Peasant and The Rational Peasant.) Evil-econ-mirror Becker would have started looking around, also without a specific hypothesis to test, but working with the idea that behavior is best explained in terms of the incentives, choices and information available to individuals, who are all trying to selfishly get as much for themselves as they can, and in equilibrium with other such individuals. I offer this alternative not because I think those ideas are right (they aren't), or even better than 1950s-vintage American social theory (a pox on both houses), or even to suggest that there is no way to decide between the interpretations of Becker and evil-econ-mirror Becker. For all I know, evil-econ-mirror-Becker would have found that trying to interpret the students in "rational"-choice terms would have been so contorted and hopeless that he gave up those ideas (at least for this problem), and settled on "cultural values" instead. All I am saying is that even if Becker started without a definite idea to test, he certainly started with a notion of the kind of idea he was going to end up with.

The Future

There is one last area where I have to record a disappointment with the book, which is that it completely neglects the possibilities for gathering data about social behavior on-line. In principle, much of this falls under the heading Becker does consider, of records generated by organized institutions in the course of their work, but his examples, and his imagination for this, are all for public-sector institutions (police forces, medical examiners' offices, schools), or, within the private sector, stuff like pay and piece-work manufacturing output records. Social life still generates all those records, of course, but now also huge volumes of electronic, machine-readable records, which are just crying out for analysis, which of course is happening all around us. If such analysis neglects the real, perhaps eternal, issues of social measurement which Becker goes over, then its value will be severely compromised (which I fear is too often the case). It is, of course, rather absurd of me to complain that someone who has been doing sociology since the early 1950s isn't up-to-date about the latest technological possibilities of data collection --- but honestly I wish I could read what he'd have to say.

[1]: For example, drawing on Timmermans's study of how medical examiners assign causes of death, Becker suggests that what gets recorded in official statistics as "suicide" is substantially different from what sociologists studying suicide-as-such want to know. Assuming Becker (and Timmermans, who I haven't read but now want to) are right, there is indeed a problem. But my own impulse would be to start calculating how big the gaps between suicide_official and suicide_sociological would have to be to change any conclusions, and to want more detailed studies of cause-of-death statistics, to try to put quantitative bounds on the gap between the two. (All of this might of course be subject to various local or systematic variations, which might also need investigation.)

x+223 pages, bibliography, index,
In print as a hardback (ISBN 978-0-226-46623-1) and a paperback (ISBN 978-0-226-46637-8)


Drafted 3 November 2017, revised 13 December 2017