Goldthorpe helpfully summarizes each chapter in a proposition at its head, as follows:
[P]opulations in this technical sense could, substantively, be of quite different kinds. They could be human or other animal populations, but also populations of, say, molecules or galaxies. The common feature of such populations was that, while their individual elements were subject to considerable variability and might appear, at least in some respects, indeterminate in their states and behaviour, they could nonetheless exhibit aggregate-level regularities of a probabilistic kind.
The aims of a science dealing with such pluralistic subjects of study --- or, that is, of what could be called a 'population science' --- were then twofold. The initial aim was to investigate, and to establish, the probabilistic regularities that characterise a particular population, or its appropriately defined subpopulations....
However, Neyman also made it clear that once population regularities had been empirically established, the further aim of a population science had to be that of determining the processes or 'mechanisms' which in their operation at the individual level actually produced these regularities. And since the regularities --- the explananda of a population science --- were probabilistic, the mechanisms that would need to be envisaged would be ones that, rather than being entirely grounded in deterministic laws, incorporated chance. [pp. 7--8, Goldthorpe's emphasis]
This may give the impression the book is all abstract argument, which is not the case at all; there are many concrete illustrations of why he thinks his ideas are better than alternatives, some drawn from his own career (particularly on social stratification and the transmission of inequality), but also from other areas of sociology.
I like where Goldthorpe's heart is at, but find his defense of status quo practice in sociology (nos. 7 and 8, mostly) unpersuasive. As a methodological individualist, he very sensibly wants to explain social phenomena in terms of the actions, and interactions, of individuals, especially actions which are (at least) subjectively rational (no. 9). So far so good; this is just having your head screwed on right, as agreed to by everyone from Karl Popper and Jon Elster through Raymond Boudon and Peter Hedström to Manuel DeLanda [*]. But then you'd want to model interacting individuals, and ideally you'd want to compare those models to data on individuals' actions and interactions.
What Goldthorpe instead defends is running regressions on survey data, so both the data and the statistical models bear only very complicated, indirect and lossy relationships to the phenomena described in the kind of theory Goldthorpe (very correctly) advocates. Sociologists of his convictions should almost exclusively build, and compare to data [**], agent-based models --- and some do. [***] Goldthorpe's right that cross-sectional survey data on individuals' attributes are the most accurate and representative kind of social data we've got. But lower-quality data on individuals' actions and interactions might still lead to better inferences for the kind of models Goldthorpe should want. Consider, for example, the sort of narrative-relational data Roberto Franzosi extracts from newspapers. This gives measurements of who did what to whom when, and (perhaps) why, and so is a lot more directly aligned with the sort of models Goldthorpe ought to want (because those models, in turn, are more directly aligned with the sort of theory and explanations he wants). There are issues about sampling bias and coding schemes, etc., for such data, and maybe those drawbacks generally outweigh the advantages, but that can hardly be decided a priori. Alternatively, we might seek new methodology, to figure out how to do statistical inference for ABMs based on our surveys.
(That methodology doesn't exist yet, but I will geek out about it nonetheless. On the one hand, this might actually create a useful role for the usual sociologists' regressions, as auxiliary models in indirect inference for the ABMs. [But then they might be replaced by totally random features.] On the other hand, I worry that trying to do inference on interactive dynamical models from cross-sectional distributions of individual outcomes would run smack into serious identification problems. [Multiple Markov processes can have the same invariant distribution.] But I shall draw these speculations to a close, and return to Goldthorpe's book.)
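To make the identification worry concrete, here is a minimal sketch, assuming two hypothetical two-state Markov chains. The names STICKY and FICKLE and their transition probabilities are made up for illustration; they come from neither Goldthorpe nor any data set. The two chains have very different dynamics but the same invariant distribution, so a cross-sectional snapshot of states alone could not tell them apart.

```python
# Two hypothetical two-state Markov chains (illustrative numbers only).
# Rows are current states, columns are next states; each row sums to 1.
STICKY = [[0.9, 0.1],
          [0.1, 0.9]]   # individuals mostly stay where they are
FICKLE = [[0.5, 0.5],
          [0.5, 0.5]]   # individuals forget their state every step


def step(dist, P):
    """Advance a distribution one step: row vector dist times matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def invariant_distribution(P, iters=2000):
    """Approximate the invariant distribution by repeated stepping.

    Assumes the chain is ergodic, which holds for both examples here.
    """
    dist = [1.0] + [0.0] * (len(P) - 1)  # start with everyone in state 0
    for _ in range(iters):
        dist = step(dist, P)
    return dist


pi_sticky = invariant_distribution(STICKY)
pi_fickle = invariant_distribution(FICKLE)

print("invariant distribution, sticky chain:", pi_sticky)
print("invariant distribution, fickle chain:", pi_fickle)
print("probability of staying put:", STICKY[0][0], "vs", FICKLE[0][0])
```

Both chains settle on an even split between the two states, yet the chance of an individual staying put differs sharply between them. Telling them apart needs data on transitions, that is, on who did what and when, which is the kind of data the cross-sectional survey does not supply.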
As I said, I think lots of this is right-headed, and that it would be good for other sociologists to adopt these positions. I also think that, as I indicated, if Goldthorpe just followed his own ideas a bit more, he'd end up in an interesting and intellectually productive place. It may be that some of his readers in sociology will take this next step. This is, self-consciously, an old warrior's reflections on battles past (and tales of even older warriors few now remember) --- but I have always been fond of such works.
*: At least, DeLanda should agree if we sprinkled in some talk of assemblages; see chapters 1 and 2 of his A New Philosophy of Society (mini-review at the link above). Whether he actually does, I wouldn't presume to say. ^
**: Like Goldthorpe, I am taking it for granted here that we want to compare our models to data (for estimation, for hypothesis testing, for model-checking, and even for prediction). That being said, an important part of developed sciences is playing around with analyzing deliberately stylized and simplified models which aren't brought to the data and shouldn't be --- their point is to explore the consequences of assumptions, and build understanding for more complicated situations. (I don't think Goldthorpe would disagree.) If theoretical sociology actually existed as a sub-discipline, subjecting data-free ABMs to analysis would be an important topic for it. ^
***: Since, of course, ABMs are just interacting Markov processes, in some cases their microstructure won't matter and they can be usefully aggregated to compartment models, as in demography. This will, naturally, simplify both the probabilistic analysis of the models and their statistical inference. ^