Last summer, I was ~~conned into arranging~~ happily
volunteered to organize a small session on complex networks at the Joint
Statistical Meeting, and naturally invited people whose only real connection
was that they study networks and I found their work interesting. I am
immensely pleased to be able to say that one outcome of this is a new joint
paper between two of the speakers, who I dare say would not have collaborated
if I hadn't brought them together in Seattle:

- Jörg Reichardt and Douglas R. White, "Role models for complex networks", arxiv:0708.0958
*Abstract*: We present a framework for automatically decomposing ("block-modeling") the functional classes of agents within a complex network. These classes are represented by the nodes of an image graph ("block model") depicting the main patterns of connectivity and thus functional roles in the network. Using a first principles approach, we derive a measure for the fit of a network to any given image graph allowing objective hypothesis testing. From the properties of an optimal fit, we derive how to find the best fitting image graph directly from the network and present a criterion to avoid overfitting. The method can handle both two-mode and one-mode data, directed and undirected as well as weighted networks and allows for different types of links to be dealt with simultaneously. It is non-parametric and computationally efficient. The concepts of structural equivalence and modularity are found as special cases of our approach. We apply our method to the world trade network and analyze the roles individual countries play in the global economy.

Roughly speaking, the idea behind the (horribly-named) "generalized block
models" of networks is that the nodes (people, countries, companies, proteins)
can be broken up into a certain number of discrete classes, or blocks. All of
the nodes in a given block have the same kind of pattern of connectivity,
meaning that they connect to other nodes in other blocks, or within the block
in the same kind of ways. Put like this, it sounds (like many network
concepts) viciously self-referential, but then so does
page-rank. Still, its resemblance to some aspects
of structuralist
anthropology
is no
accident, as many sorts of kinship-groups and castes are similarly defined
by how they relate to other kinship-groups and castes, which in turn are
defined by their relations. [**Update**: see below.] The
conventional ethnographic approach to figuring out such groupings and their
relationships might be caricatured as a combination of asking your native
informants how they see the matter, combined with more or less inspired
guess-work on the part of the ethnographer. This does not work so well when
dealing with, say, protein networks, since it is hard to find a native
informant among the proteins. Even with social networks, it is not at all
obvious that the members understand all the different roles people play in the
network (though they may think they do), so block-modelers generally rely less
on self-presentations, and more on trial-and-error efforts to find blocks that
fit the connectivity patterns.

Rather than floating
off into the empyrean, what Jörg and Doug have done here is to develop
an algorithm for automatically extracting a set of roles and their
inter-relations from the observable ties in the network. While
their *idea* of structurally equivalent roles comes from mathematical
theories of social structure (see below), the *algorithm* grows out of
earlier work by Jörg and Stefan Bornholdt on decomposing networks into
nearly-independent communities or modules
(cond-mat/0402349
and cond-mat/0603718),
itself part of a recent burst
of work on that problem by statistical physicists.

What Jörg and Stefan realized is that the problem of finding modules
could be mapped on to a problem in statistical mechanics, called
the Potts model.
Imagine that each node can have one of a certain number of colors. When two
nodes share the same color, they are in the same module or community. The
Potts model also has nodes with multiple colors, which can interact in one of
two ways. Two nodes can have an attractive ("ferromagnetic") interaction,
which means it is energetically favorable for them to have the same color; or
they can have a repulsive ("antiferromagnetic") interaction, which means it is
energetically favorable for them to have *different* colors. The
problem then is to find the assignment of colors to nodes which gives the best
over-all value of the energy. This is hard to do exactly, because different
interactions can push the same node in different directions
(cf.), but one can quite
rapidly find configurations which come close to the minimum, through the magic
of simulated
annealing. (If you play around
with this Java
applet, cooling it slowly, you get a bit of a sense of how that works.)

In the original modularity papers, what Jörg and Stefan did was introduce an attractive interaction between two nodes if they were more tightly linked to each other than average, and a repulsive interaction if they were less tightly linked. (See their papers for a more exact statement.) This let them very quickly discover significant modular structure in very large networks. It also nicely adapts to more information-theoretic notions of "connection".

In the new paper, they employ essentially the same idea. Given a "role model" of blocks and their connections, links, or lacks of links, which follow what one expects from the model are energetically rewards; those which deviate from their assigned roles are energetically dis-favored. Optimizing the total energy then gives the best-fitting assignment of nodes to roles. (Community discovery re-emerges as a special case, when each block ideally connects only to itself.) This still leaves the problem of discovering the roles and their relations from the data; here they use what I can only describe as a very clever trick, which I will not attempt to explain in this space.

As an illustration, they apply this new method to analyzing a part of the international flow of commodities. Amusing though it is to see patterns of combined and uneven development pop out of adjacency matrices, I keep wondering what could be done with this technique and good data on brains...

Incidentally, while Jörg is a statistical physicist, Doug has been studying social networks, and doing anthropological fieldwork, for longer than I've been alive; so I'd very much like to know whether their paper would be a black dot or a white one in the graph showing the near-disconnection of the two approaches to networks.

**Update**, that afternoon: Just to clarify, block modeling,
and the notions of structural and regular equivalence underlying it, are
autonomous developments of sociology and anthropology in the 1970s and 1980s,
and they *do* in fact grow out of structuralist anthropology. In Doug's
words (from an e-mail which he was kind enough to let me quote):

The first definition that I know of for structural equivalence, which is what we measure, was in an MA thesis by Fran&ccdeil;ois Lorrain, 1968, Ecole des Hautes Etudes en Sciences Sociales, Paris, under Guilbaud and Lévi Strauss, showing up an English version in Parts I and II, 1968 and 1969 MS, Harvard, "Tools for the Formal Study of Networks." and in Lorrain's Harvard PhD thesis (1969). From there it travelled to Lorrain and Harrison C. White's 1971 article while Lorrain was a graduate student at Harvard ["Structural Equivalence in Social Networks", Journal of Mathematical Sociology1: 49--80], then to H. White et al under the name of blockmodeling... and on through a long series of sociological publications and those in related fields, only a few of which we cite in our paper. ** Lorrain was working on a more general algorithm of for "correlative classes" defined not in their own terms, or in terms of attributes, but in terms of relations that connect their elements, a concept found in Ossowski (1963, Class Structure in the Social Consciousness, Routledge and Kagan Paul). Now called "regular equivalence" as in my 1983 paper with Karl Reitz (a term linked to earlier work of Claude Flament), algorithms for computing regular equivalence (or "correlative") classes were also developed in the fields of logic and computer science (e.g.. by Higgens (1971), working with groupoids and category theory, similar to the work on Lorrain, and again by Stark (1972 in automata theory) under the name of dynamic logic (Harel et al. 2002). Mark and Masuch's (2003) article in Social Networks provides a review and further formalization.Significantly, the weightings used in Reichardt and my Potts' method for finding role model structures in a network of multiple relations can be adjusted to identify regular equivalence of positions.

So, never let it be said that structuralist anthropology
was *completely* barren of fruit; it's just the productive part had to
do with social organization and not with culture —
as Ernest Gellner used to say, who you
can marry, as opposed to what to wear to the wedding.

Posted at August 22, 2007 12:10 | permanent link