Stochastic Block Models

27 Feb 2017 16:30

Block models originated in the study of social networks: the idea was to divide the nodes of the network into distinct sets, or "blocks", where all nodes in the same block had the same pattern of connection to nodes in other blocks. This was originally a rather rigid notion, lending itself to some clever abstract algebra. The stochastic block model weakens this idea: the probability of an edge between two nodes just depends on which blocks they are in, and is independent across edges (given the assignment of nodes to blocks). They have proved to be a useful statistical model for community discovery, and, with infinitely many blocks, are part of the machinery of graph limits.

(If we get to see the assignment of nodes to blocks, this is an example of an exponential-family random graph model. If we do not, which is the usual case and certainly the one of most interest for community discovery, the distribution is a mixture of ERGMs.)