Stochastic Block Models
29 Sep 2024 15:07
Block models originated in the study of social networks: the idea was to divide the nodes of the network into distinct sets, or "blocks", where all nodes in the same block had the same pattern of connection to nodes in other blocks. This was originally a rather rigid notion, lending itself to some clever abstract algebra. The stochastic block model weakens this idea: the probability of an edge between two nodes just depends on which blocks they are in, and is independent across edges (given the assignment of nodes to blocks). They have proved to be a useful statistical model for community discovery, and, with infinitely many blocks, are part of the machinery of graph limits.
(If we get to see the assignment of nodes to blocks, this is an example of an exponential-family random graph model. If we do not, which is the usual case and certainly the one of most interest for community discovery, the distribution is a mixture of ERGMs.)
- Recommended, big pictures:
- Peter J. Bickel and Aiyou Chen, "A nonparametric view of network models and Newman-Girvan and other modularities", Proceedings of the National Academy of Sciences (USA) 106 (2009): 21068--21073 [See under Graph Limits and Infinite Exchangeable Arrays]
- Tom A. B. Snijders and Krzysztof Nowicki, "Estimation and Prediction for Stochastic Blockmodels for Graphs with Latent Block Structure", Journal of Classification 14 (1997): 75--100 [A deservedly widely-cited paper, but too often cited as though it introduced SBMs, which is clearly wrong if you read the paper; see below]
- Recommended, historically important:
- Stephen E. Fienberg and Stanley S. Wasserman, "Categorical Data Analysis of Single Sociometric Relations", Sociological Methodology 12 (1981): 156--192
- Stephen E. Fienberg, Michael M. Meyer and Stanley S. Wasserman, "Statistical Analysis of Multiple Sociometric Relations", Journal of the American Statistical Association 80 (1985): 51--67
- Paul W. Holland, Kathryn Blackmond Laskey and Samuel Leinhardt, "Stochastic blockmodels: First steps", Social Networks 5 (1983): 109--137
- Recommended, close-ups:
- Christopher Aicher, Abigail Z. Jacobs, Aaron Clauset, "Adapting the Stochastic Block Model to Edge-Weighted Networks", arxiv:1305.5782
- Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg and Eric P. Xing, "Mixed membership stochastic blockmodels", arxiv:0705.4485
- Arash A. Amini, Aiyou Chen, Peter J. Bickel, Elizaveta Levina, "Pseudo-likelihood methods for community detection in large sparse networks", Annals of Statistics 41 (2013): 2097--2122, arxiv:1207.2340
- Peter Bickel, David Choi, Xiangyu Chang, and Hai Zhang, "Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels", Annals of Statistics 41 (2013): 1922--1943, arxiv:1207.0865
- Alain Celisse, Jean-Jacques Daudin, and Laurent Pierre, "Consistency of maximum-likelihood and variational estimators in the stochastic block model", Electronic Journal of Statistics 6 (2012): 1847--1899
- Kehui Chen, Jing Lei, "Network Cross-Validation for Determining the Number of Communities in Network Data", arxiv:1411.1715
- David S. Choi, Patrick J. Wolfe, Edoardo M. Airoldi, "Stochastic blockmodels with growing number of classes", Biometrika 99 (2012): 273--284, arxiv:1011.4644
- J.-J. Daudin, F. Picard and S. Robin, "A Mixture Model for Random Graphs", Statistics and Computing 18 (2008): 173--183
- Aurelien Decelle, Florent Krzakala, Cristopher Moore and Lenka Zdeborova, "Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications", Physical Review E 84 (2011): 066106, arxiv:1109.3041
- Wenjie Fu, Le Song, Eric P. Xing, "A State-Space Mixed Membership Blockmodel for Dynamic Network Tomography", arxiv:0901.0135
- Mark S. Handcock, Adrian E. Raftery and Jeremy Tantrum, "Model-Based Clustering for Social Networks" Journal of the Royal Statistical Society A 170 (2007): 301--354 [PDF preprint]
- Jake M. Hofman, Chris H. Wiggins, "A Bayesian Approach to Network Modularity", arxiv:0709.3512 [For "Bayesian", read "smoothed maximum likelihood". But nonetheless: cool.]
- Brian Karrer, M. E. J. Newman, "Stochastic blockmodels and community structure in networks", Physical Review 83 (2011): 016107, arxiv:1008.3926
- Vince Lyzinski, Daniel Sussman, Minh Tang, Avanti Athreya, Carey Priebe, "Perfect Clustering for Stochastic Blockmodel Graphs via Adjacency Spectral Embedding", arxiv:1310.0532
- Elchanan Mossel, Joe Neeman, Allan Sly, "Stochastic Block Models and Reconstruction", arxiv:1202.1499
- Jörg Reichardt and Douglas R. White, "Role models for complex networks", arxiv:0708.0958 [Discussion]
- Purnamrita Sarkar, Peter J. Bickel, "Role of Normalization in Spectral Clustering for Stochastic Blockmodels", arxiv:1310.1495
- Daniel L. Sussman, Minh Tang, Donniell E. Fishkind, Carey E. Priebe, "A consistent dot product embedding for stochastic blockmodel graphs", arxiv:1108.2228
- Yaojia Zhu, Xiaoran Yan, Cristopher Moore, "Oriented and Degree-generated Block Models: Generating and Inferring Communities with Inhomogeneous Degree Distributions", arxiv:1205.7009
- Modesty forbids me to recommend;
- Xiaoran Yan, Jacob E. Jensen, Florent Krzakala, Cristopher Moore, CRS, Lenka Zdeborova, Pan Zhang and Yaojia Zhu, "Model Selection for Degree-corrected Block Models", arxiv:1207.3994
- To read:
- Christopher Aicher, Abigail Z. Jacobs, Aaron Clauset, "Learning Latent Block Structure in Weighted Networks", arxiv:1404.0431
- Arash A. Amini, Elizaveta Levina, "On semidefinite relaxations for the block model", arxiv:1406.5647
- Christian Borgs, Jennifer Chayes, Julia Gaudio, Samantha Petti, Subhabrata Sen, "A large deviation principle for block models", arxiv:2007.14508
- Antoine Channarond, Jean-Jacques Daudin, Stéphane Robin, "Classification and estimation in the Stochastic Block Model based on the empirical degrees", Electronic Journal of Statistics 6 (2012): 2574--2601, arxiv:1110.6517
- Donniell E. Fishkind, Daniel L. Sussman, Minh Tang, Joshua T. Vogelstein, Carey E. Priebe, "Consistent adjacency-spectral partitioning for the stochastic block model when the model parameters are unknown", arxiv:1205.0309
- Qirong Ho, Le Song, Eric Xing, "Evolving Cluster Mixed-Membership Blockmodel for Time-Evolving Networks", AISTATS 2011
- Qirong Ho, Ankur Parikh, Le Song, Eric Xing, "Multiscale Community Blockmodel for Network Exploration", AISTATS 2011
- Pierre Latouche, Etienne Birmelé, and Christophe Ambroise, "Overlapping stochastic block models with application to the French political blogosphere", Annals of Applied Statistics 5 (2011): 309--336, arxiv:0910.2098
- Jing Lei, "A Goodness-of-fit Test for Stochastic Block Models", arxiv:1412.4857
- Jing Lei, Alessandro Rinaldo, "Consistency of Spectral Clustering in Sparse Stochastic Block Models", Annals of Statistics 43 (2015): 215--237, arxiv:1312.2050
- Jing Lei, Lingxue Zhu, "A Generic Sample Splitting Approach for Refined Community Recovery in Stochastic Block Models", arxiv:1411.1469
- Aaron F. McDaid, Thomas Brendan Murphy, Nial Friel, Neil J Hurley, "Clustering in networks with the collapsed Stochastic Block Model", arxiv:1203.3083
- Tiago P. Peixoto
- "Parsimonious Module Inference in Large Networks", Physical Review Letters 110 (2013): 148701, arxiv:1212.4794
- "Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models", arxiv:1310.4378
- "Hierarchical block structures and high-resolution model selection in large networks", arxiv:1310.4377
- Lijun Peng, Luis Carvalho, "Bayesian Degree-Corrected Stochastic Block Models for Community Detection", arxiv:1309.4796
- Karl Rohe, Sourav Chatterjee, Bin Yu, "Spectral clustering and the high-dimensional Stochastic Block Model", Annals of Statistics 39 (2011): 1878--1915, arxiv:1007.1684
- Karl Rohe, Tai Qin, Haoyang Fan, "The Highest Dimensional Stochastic Blockmodel with a Regularized Estimator", arxiv:1206.2380
- Mahdi Shafiei, Hugh Chipman, "Mixed-Membership Stochastic Block-Models for Transactional Networks", arxiv:1010.1437
- Kevin S. Xu, Alfred O. Hero III, "Dynamic stochastic blockmodels: Statistical models for time-evolving networks", arxiv:1304.5974
- Pan Zhang, Florent Krzakala, Jörg Reichardt, Lenka Zdeborová, "Comparative Study for Inference of Hidden Classes in Stochastic Block Models", arxiv:1207.2328