Analysis of Network Data
24 Dec 2008 10:24
That is, of data in the form of networks --- I don't (as such) care about packet flow or other aspects of computer networks...
Community discovery is an important sub-topic.
See also: Complex networks; Social networks; Statistics in general; Statistics of structured data
- Recommended:
- Edo Airoldi, David M. Blei, Stephen E. Fienberg, Anna Goldenberg, Eric P. Xing and Alice X. Zheng (eds.), Statistical Network Analysis: Models, Issues, and New Directions [Disclaimer: contains one of my papers.]
- Aaron Clauset and Cristopher Moore, "Accuracy and Scaling Phenomena in Internet Mapping", cond-mat/0410059 = Physical Review Letters 94 (2005): 018701
- Aaron Clauset, Cristopher Moore and M. E. J. Newman, "Structural Inference of Hierarchies in Networks", physics/0610051
- Linton C. Freeman and Douglas R. White (1993), "Using Galois Lattices to Represent Network Data", Sociological Methodology 23: 127--146 [PDF reprint]
- Diego Garlaschelli and Maria I. Loffredo, "Maximum likelihood: extracting unbiased information from complex networks", cond-mat/0609015 [This is a much-needed corrective to the physics literature, but it makes it sound as though exponential families of random graphs were invented in 2004, and as though the authors were the first to apply maximum likelihood to network analysis. I'm sure, however, that these are inadvertent lapses. Definitely worth reading as a first glimpse of how to do parameter estimation correctly. Thanks to Dave Feldman for pointing it out to me.]
- Krista Gile and Mark S. Handcock, "Model-based Assessment of the Impact of Missing Data on Inference for Networks" [Working Paper 66, Center for Statistics and the Social Sciences, University of Washington (2006). PDF preprint.]
- Mark S. Handcock, David R. Hunter, Carter T. Butts, Steven M. Goodreau, and Martina Morris (eds.), "Statistical Modeling of Social Networks with 'statnet'", special volume (24) of the Journal of Statistical Software (2008)
- Steve Hanneke and Eric Xing, "Discrete Temporal Models for Social Networks", in Airoldi et al. (eds.) above [Extending exponential-family random graph models to dynamic networks. A very cool paper, making me extra proud to have taught Steve stochastic processes. PDF preprint]
- Peter D. Hoff, Adrian E. Raftery and Mark S. Handcock, "Latent Space Approaches to Social Network Analysis", Journal of the American Statistical Association 97 (2002): 1090--1098 [PDF preprint]
- David R. Hunter, Steven M. Goodreau and Mark S. Handcock, "Goodness of Fit of Social Network Models", Journal of the American Statistical Association 103 (2008): 248--258 [PDF]
- David R. Hunter and Mark S. Handcock, "Inference in curved exponential family models for networks", Journal of Computational and Graphical Statistics 15 (2006): 565--583 [PDF preprint]
- Roger Th. A. J. Leenders
- Structure and Influence: Statistical Models for the Dynamics of Actor Attributes, Network Structure and Their Interdependence [Review forthcoming]
- "Modeling social influence through network autocorrelation: constructing the weight matrix", Social Networks 24 (2002): 21--47 [Basically, part of chapter 3 of his Structure and Influence --- mostly section 3.3. PDF preprint]
- Manuel Middendorf, Etay Ziv and Chris Wiggins, "Inferring Network Mechanisms: The Drosophila melanogaster Protein Interaction Network", q-bio.QM/0408010 [Machine learning meets complex networks: specifically, learning decision trees to accurately classify networks by the process which grew them. Neat.]
- M. E. J. Newman, Steven H. Strogatz and Duncan J. Watts, "Random graphs with arbitrary degree distributions and their applications", Physical Review E 64 (2001): 026118 = cond-mat/0007235 [Though they don't quite put it this way, these methods are very naturally employed to generate surrogate network data, which keeps the degree distribution of the original but is otherwise randomized.]
- Jörg Reichardt and Douglas R. White, "Role models for complex networks", arxiv:0708.0958
- Purnamrita Sarkar and Andrew W. Moore, "Dynamic Social Network Analysis using Latent Space Models", forthcoming in Advances in Neural Information Processing Systems 18 (NIPS 2005) [Abstract, link to PDF]
- John Scott, Social Network Analysis: A Handbook [Short introductory text. Good, but heavy on the sociology and light on the math.]
- Carsten Wiuf, Markus Brameier, Oskar Hagberg and Michael P. H. Stumpf, "A likelihood approach to analysis of network data", Proceedings of the National Academy of Sciences (USA) 103 (2006): 7566--7570 [My comments. Shorter: A nice piece of work, though limited to what they call "duplication attachment" models, a limitation which is not really made clear by the abstract.]
- Douglas R. White and Vincent Duquenne, eds. (1996), special issue on "Social Network and Discrete Structure Analysis", Social Networks 18: 169--318
- To read:
- Alexandre H. Abdo and A. P. S. de Moura, "Clustering as a measure of the local topology of networks", physics/0605235 ["... clustering coefficient ... insufficient [for] describing the local topology of very simple networks. ... an extension, the clustering profile. We show, both conceptually and through applications to well studied networks, that this measure is a more complete and robust measure of clustering. It imposes stringent constraints on theoretical growth models, specially on aspects of the network structure that play a central role in dynamics on networks. ... richer perspective [on] hierarchy, small-worlds and clusterization."]
- Aris Anagnostopoulos, Ravi Kumar and Mohammad Mahdian, "Influence and Correlation in Social Networks", in KDD 2008 [Thanks to Dr. Mahdian for a preprint]
- Pierre Baldi et al., Modeling the Internet and the Web: Probabilistic Methods and Algorithms
- Kim Baskerville and Maya Paczuski, "Subgraph ensembles and motif discovery using an alternative heuristic for graph isomorphism", Physical Review E 74 (2006): 051903
- Johannes Berg and Michael Lässig
- "Correlated random networks," cond-mat/0205589 = Physical Review Letters 89 (2002): 228701 [Exponential families of random graphs again]
- "Bayesian analysis of biological networks: clusters, motifs, cross-species correlations", q-bio.MN/0609050
- Stephen P. Borgatti, Kathleen M. Carley and David Krackhardt, "On the robustness of centrality measures under conditions of imperfect data", Social Networks 28 (2006): 124--136
- Andrea Capocci, G. Caldarelli and P. De Los Rios, "Quantitative description and modeling of real networks," cond-mat/0206336
- Peter J. Carrington, John Scott and Stanley Wasserman (eds.), Models and Methods in Social Network Analysis [Blurb]
- Vittoria Colizza, Alessandro Flammini, M. Angeles Serrano, Alessandro Vespignani, "Detecting rich-club ordering in complex networks", physics/0602134
- Luciano da F. Costa, Francisco A. Rodrigues, Gonzalo Travieso and P. R. Villas Boas, "Characterization of complex networks: A survey of measurements", cond-mat/0505185
- Jacob G. Foster, David V. Foster, Peter Grassberger and Maya Paczuski, "Link likelihoods in random networks with fixed and partially fixed degree sequence", cond-mat/0610446
- O. Frank and D. Strauss, "Markov graphs", Journal of the American Statistical Association 81 (1986): 832--842
- Mark S. Handcock, "Assessing degeneracy in statistical models of social networks", CSSS working paper 39 (2003)
- Mark S. Handcock and Krista Gile, "Modeling Social Networks with Sampled or Missing Data", working paper 75 (2007)
- Robert A. Hanneman and Mark Riddle, Introduction to Social Network Methods [Online textbook, looks good.]
- P. W. Holland and S. Leinhardt, "An exponential family of probability distributions for directed graphs", Journal of the American Statistical Association 76 (1981): 33--65
- Petter Holme, "Local symmetries in complex networks", cond-mat/0608695
- H. Jeong, Zoltan Neda and A.-L. Barabasi, "Measuring preferential attachment for evolving networks," cond-mat/0104131
- Rui Jiang, Zhidong Tu, Ting Chen and Fengzhu Sun, "Network motif identification in stochastic networks", Proceedings of the National Academy of Sciences (USA) 103 (2006): 9404--9409
- Eric D. Kolaczyk, David B. Chua, Marc Barthelemy, "Co-Betweenness: A Pairwise Notion of Centrality", arxiv:0709.3420
- Gueorgi Kossinets and Duncan J. Watts, "Empirical Analysis of an Evolving Social Network", Science 311 (2006): 88--90
- Vassilis Kostakos, Eamonn O'Neill, Alan Penn, "Brief encounter networks", arxiv:0709.0223 [Networks defined by brief transactions, rather than persistent ties.]
- Matthieu Latapy, "Theory and Practice of Triangle Problems in Very Large (Sparse (Power-Law)) Graphs", cs.DS/0609116 [Time- and space-efficiency of different algorithms for finding, counting and listing triangles]
- Matthieu Latapy and Clemence Magnien, "Measuring Fundamental Properties of Real-World Complex Networks", cs.NI/0609115 [How asymptotic are we?]
- Sang Hoon Lee, Pan-Jun Kim, and Hawoong Jeong, "Statistical properties of sampled networks", cond-mat/0505232
- Yoshiharu Maeno, Yukio Ohsawa, "Node discovery problem for a social network", arxiv:0710.4975
- Philippa Pattison, Algebraic Models for Social Networks [Blurb]
- Leonid Peshkin, "Structure induction by lossless graph compression", cs.DS/0703132
- Camille Roth, "Measuring Generalized Preferential Attachment in Dynamic Social Networks", nlin.AO/0507021 [Applies more generally than to social networks]
- J. Saramaki, M. Kivela, J.-P. Onnela, K. Kaski and J. Kertesz, "Generalizations of the clustering coefficient to weighted complex networks", cond-mat/0608670
- M. Angeles Serrano, Marian Boguna, Romualdo Pastor-Satorras, "Correlations in weighted networks", cond-mat/0609029
- John Skvoretz, Thomas J. Fararo and Filip Agneessens, "Advances in biased net theory: definitions, derivations, and estimations", Social Networks 26 (2004): 113--139
- T. A. B. Snijders, "Markov chain Monte Carlo estimation of exponential random graph models", Journal of Social Structure 2 (2002)
- B. Söderberg, "General formalism for inhomogeneous random graphs", Physical Review E 66 (2002): 066121 [Apparently a rediscovery of exponential family/Markov random graphs]
- Statnet [Interesting methods for fitting reasonable exponential-family models to network data. Or at least, they sounded very cool when I heard Martina Morris talk about them.]
- D. Strauss, "On a general class of models for interaction", SIAM Review 28 (1986): 513--527 [Apparently an early paper on Markov random graphs]
- Michael P. H. Stumpf, P. J. Ingram, I. Nouvel and Carsten Wiuf, "Statistical model selection methods applied to biological networks", Transactions in Computational Systems Biology forthcoming (2005) = q-bio.MN/0506013
- Michael P. H. Stumpf and Carsten Wiuf, "Sampling properties of random graphs: the degree distribution", cond-mat/0507345 = Physical Review E 72 (2005): 036118
- Michael P. H. Stumpf, Carsten Wiuf and Robert M. May, "Subnets of scale-free networks are not scale-free: Sampling properties of networks", PNAS 102 (2005): 4221--4224
- Fabien Viger, Alain Barrat, Luca Dall'Asta, Cun-Hui Zhang, Eric D. Kolaczyk, "Network Inference from TraceRoute Measurements: Internet Topology `Species'", cs.NI/0510007
- S. Wasserman and K. Faust, Social Network Analysis
- S. Wasserman and P. Pattison, "Logit models and logistic regression for social networks: I. An introduction to Markov random graphs and p*", Psychometrika 61 (1996): 401--426
- Sebastian Weber, Markus Porto, "Generation of arbitrarily two-point correlated random networks", arxiv:0708.4161
- Hal Whitehead, Analyzing Animal Societies: Quantitative Methods for Vertebrate Social Analysis [Blurb]
- To write:
- CRS, "Homophily, Contagion, Confounding: Pick Any Three"
- CRS, "Indirect Inference of Network Growth Models"
- CRS and Shawn Mankad, "Statistical Properties of Aggregated Random Graphs"
