February 28, 2010

Books to Read While the Algae Grow in Your Fur, February 2010

Eric D. Kolaczyk, Statistical Analysis of Network Data: Methods and Models
This is the best available textbook on the subject. (I say this with all due respect for Wasserman and Faust, which was published sixteen years ago.)
Chapter 1 gives examples of networks, emphasizing that many non-social assemblages are networks, or have networks embedded in them, and can be profitably studied as such; this is story-telling and pretty pictures. Chapter 2 is background, divided into graph theory and graph algorithms (aimed at statisticians), and the essentials of probability and statistical inference (aimed at computer scientists). Chapter 3 deals with data collection (what do we measure? how do we gather the data? how do we organize it?) and visualization (how do we make those pretty pictures?). Chapter 4 covers descriptive statistics for networks, including ideas about partitioning networks into more-or-less distinct components, a.k.a. "community discovery". Both chapters 3 and 4 have terminal sections on what to do with time-varying networks; these are much less detailed than the rest, because we don't really know what to do with time-varying networks yet.
Chapter 5 deals with the fact that we generally do not have access to complete networks, but rather to samples of them. Inference from samples to larger assemblages (here, the complete network) is a fundamental statistical problem; depending on how the sample was collected, direct extrapolation from the sample to the whole can be quite accurate or highly misleading. Kolaczyk properly begins by reviewing the techniques used for sample inference in population surveys, such as Horvitz-Thompson estimation, which try to compensate for the biases introduced by the sampling scheme; he then turns to the most common sorts of network sampling methods, and gives some examples of how to incorporate the sampling into inferences. This is an area where much more needs to be done, but it's absolutely fundamental, and I'm most pleased to see it appear here.
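To make the idea of compensating for the sampling scheme concrete, here is a minimal sketch of a Horvitz-Thompson estimate of the number of edges in a network. It assumes one particular design, induced-subgraph sampling in which each node is kept independently with a known probability p, so that an edge is observed only when both endpoints are kept, i.e. with probability p squared. The function names and the toy ring graph are illustrative assumptions, not taken from the book.

```python
import random


def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total by weighting each sampled value by 1/pi_i."""
    return sum(y / pi for y, pi in zip(values, inclusion_probs))


def induced_subgraph_sample(edges, nodes, p, rng):
    """Keep each node independently with probability p; return the induced edges."""
    kept = {v for v in nodes if rng.random() < p}
    return [(u, v) for (u, v) in edges if u in kept and v in kept]


def estimate_edge_count(sampled_edges, p):
    """Each edge is observed with probability p * p under this (assumed) design."""
    n_obs = len(sampled_edges)
    return horvitz_thompson_total([1] * n_obs, [p * p] * n_obs)


# Toy example: a ring of 1000 nodes, which has exactly 1000 edges.
nodes = list(range(1000))
edges = [(i, (i + 1) % 1000) for i in nodes]

rng = random.Random(42)
sampled = induced_subgraph_sample(edges, nodes, 0.3, rng)

naive = len(sampled)                        # raw count from the sample, biased low
corrected = estimate_edge_count(sampled, 0.3)  # reweighted by inclusion probability
print(naive, corrected)
```

The raw count from the sample badly understates the true edge total, while the reweighted figure is unbiased under the assumed design. Different sampling schemes (snowball, traceroute-like, and so on) need different inclusion probabilities, which is where the real difficulty lies.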
Chapter 6 considers probabilistic models of network structure and their statistical inference, mostly through the method of maximum likelihood. It begins with the classical Erdős-Rényi (-Rapoport-Solomonoff) random graph model and some of its immediate generalizations; the theory here is exceedingly pretty, but of course it never fits anything in the real world. It then turns to small-world (Watts-Strogatz) models, and to preferential-attachment and duplication models (introduced by Price, re-introduced by Barabási and Albert owing to ignorance of the literature), including the particular duplication model due to Wiuf et al. which can be estimated by maximum likelihood (as we've seen). The last part of the chapter discusses exponential-family random graph models, which are a fascinating topic I will post more about soon. Chapter 7 is on inferring network structure from partial measurements, including link prediction, inference of phylogenetic trees, and inference of flow- or message-passing networks from traffic measurements ("network tomography"). There could have been a bit more integration between these two chapters, but there could stand to be more integration in the literature, too.
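For the simplest model mentioned in chapter 6, maximum likelihood is easy to show in a few lines. In the version of the random graph model where each of the n(n-1)/2 possible edges is present independently with probability p, the maximum-likelihood estimate of p is just the observed fraction of possible edges. The sketch below simulates such a graph and recovers p; the function names and parameter values are illustrative assumptions, not the book's notation.

```python
import random
from itertools import combinations


def sample_erdos_renyi(n, p, rng):
    """Draw an undirected graph on n nodes; each pair is linked with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def mle_edge_probability(n, edges):
    """Maximum-likelihood estimate of p: observed edges over possible edges."""
    possible = n * (n - 1) // 2
    return len(edges) / possible


rng = random.Random(0)
edges = sample_erdos_renyi(200, 0.05, rng)
p_hat = mle_edge_probability(200, edges)
print(p_hat)
```

The other models in the chapter do not generally reduce to a one-line estimator like this, which is part of why their inference takes up so much more space.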
Chapter 8 looks at processes taking place on networks, divided between predicting random fields on networks, and modeling dynamical processes on them. For the first, Kolaczyk emphasizes Markov random fields (including the Hammersley-Clifford-[Griffeath-Grimmett-Preston-et-alii] theorem) and kernel regression. The only kind of dynamic process on networks treated in any detail is epidemic modeling; as usual, this is because much, much more remains to be done. Chapter 9 looks at statistical models of traffic on networks, some of them going back more than half a century in the economic geography literature. Finally, chapter 10 is really more of an appendix, sketching the basic formalism of graphical models, and indicating how it connects both to Markov random fields and to exponential-family random graphs.
The material is up-to-date, the explanations are clear, the graphics are good, and the examples are interesting, covering social networks, biochemistry and molecular biology, neuroscience and telecommunications with about equal comfort. I would have no hesitation at all in using this for a class of first- or second-year graduate students, plan to use parts of it next time I teach 462, and can warmly recommend it for self-study. It should become a standard work.
(Amusingly, Powell's currently recommends that people who buy Kolaczyk also get Jenny Davidson's Breeding [which I'm still reading], and vice versa. This tells me that (i) not many people other than me have bought either book from them, and (ii) they need to make their data-mining algorithms a bit more outlier-resistant.)
Dog Soldiers
J. Random British Army squad vs. werewolves in deepest, darkest Scotland. Recommended by Carrie Vaughn.
Intelligence, season 2
I like where they took the story (though I have special reasons to be amused by the involvement of Caribbean financiers), and am sad the series got canceled.
The Last Winter
Decent horror movie about Arctic isolation and global warming. Suffers towards the end from showing too much of the bogey. (ROT-13'd spoilers: Fcrpgeny pnevobh whfg nera'g gung fpnel; naq V xrcg guvaxvat bs Nhqra, gubhtu gung'f cebonoyl vqvbflapengvp.)
Dexter 3
Few things are quite so restorative when facing the winter blahs as a well-made TV show that understands the true meaning and importance of friendship and family ties.
Previously: 2. Subsequently: 4, 5--8
Marshall G. S. Hodgson, Rethinking World History: Essays on Europe, Islam and World History
Hodgson was a historian of Islam at the University of Chicago, best known for his monumental and fantastic Venture of Islam (I, II, III), which was an attempt to tell the story of "conscience and history in a world civilization". Both the "world" and the "civilization" part are important: Hodgson was one of those historians who breaks the world into civilizations, but didn't think of them as distinct organisms or similar weirdness; rather as complexes of very broadly-distributed but also very involving literate traditions. Moreover, the "world" part mattered a lot too: he constantly kept in view the fact that civilizations were never isolated from each other, and their interactions were vital to how they developed, particularly to "Islamicate" civilization, which for a long time occupied the central position in the "Afro-Eurasian Oecumene". The whole of it was an effort to see the history of Islam as part of world history, and to see world history itself objectively. He also tried very hard to inhabit and convey the moral universe of the people he wrote about; this was partly about historical understanding and partly about his own earnest Quaker conscience.
Hodgson spent many, many years working on a world history, which was left in an even more fragmentary state than The Venture of Islam at the time of his death; an unpublishable mess. Rethinking World History is a compilation of fragments of this manuscript and selections from The Venture, along with some journal papers and letters. The product is an excellent epitome of Hodgson's more general and theoretical ideas about history and historiography: the central role of Islam in world history and the broad course of Islamicate civilization; the nature of tradition and the very broad, diffuse complexes of traditions that constitute civilizations, and the way all traditions constantly change; the errors of then-conventional "orientalist" scholarship; the sheer unprecedented weirdness of the modern "technical" age; the need to crush Eurocentrism if we're to understand history (and in particular the "optical illusion" which makes us think there's a "western civilization" going from ancient Greece through Rome to medieval western Europe and modern European states and their off-shoots); and finally the fundamental unity of human history, and how that manifested itself over time.
There is also an introduction by the editor, one Edmund Burke III, which is partly helpful, but also oddly dismissive of Hodgson. However, this dismissal just takes the form of saying that Hodgson is "culturalist" and doesn't acknowledge Immanuel Wallerstein (of all people!) and the more dodgy sort of Marxist; Burke doesn't even mention a single material error or omission these supposed flaws lead Hodgson into. While I appreciate Burke's work in pulling together the book, I wish he'd thought harder when writing his introduction.

Books to Read While the Algae Grow in Your Fur; Scientifiction and Fantastica; Pleasures of Detection, Portraits of Crime; Enigmas of Chance; Networks; Writing for Antiquity; Islam; The Great Transformation

Posted at February 28, 2010 23:59

Three-Toed Sloth