Analysis of Network Data

11 Aug 2021 14:42

That is, of data in the form of networks --- I don't (as such) care about packet flow or other aspects of computer networks...

Things I wish I knew how to do: Bootstrap a network, non-parametrically. (The model with a fixed degree sequence is a start, but what's the equivalent of the block bootstraps used for time series, which preserve dependence? [Update, 2017: see my paper with Alden Green.]) Cross-validate on networks. (You could say that link prediction is leave-one-out CV, but how about k-fold CV? [Update, 2016: see the papers by Chen and Lei, and especially by Dabbs and Junker.]) Estimate a distribution over networks by somehow smoothing an adjacency matrix. Compare networks to say if they came from the same distribution. [Update, 2015: see my paper with Dena Asta.] These may or may not be aspects of a single problem.
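To make the "fixed degree sequence" starting point concrete, here is a minimal sketch in Python of resampling a network by degree-preserving double-edge swaps. It assumes an undirected simple graph given as a list of node pairs, and the function names (degree_preserving_rewire, degree_sequence) are made up for illustration. Repeated swaps only approximate a draw from the graphs sharing the observed degrees, and this is the crude null model, not the dependence-preserving bootstrap wished for above.

```python
import random
from collections import Counter


def degree_sequence(edges):
    """Count how many edges touch each node."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def degree_preserving_rewire(edges, n_swaps, seed=None, max_tries_per_swap=100):
    """Rewire an undirected simple graph without changing any node's degree.

    Assumes `edges` has at least two edges, no self-loops, no duplicate
    edges, and node labels that can be sorted. Each accepted swap replaces
    edges a-b and c-d with a-d and c-b.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done, tries = 0, 0
    while done < n_swaps and tries < n_swaps * max_tries_per_swap:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        if a == d or c == b:
            continue  # the swap would create a self-loop
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set or new1 == new2:
            continue  # the swap would create a duplicate edge
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.add(new1)
        edge_set.add(new2)
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list


# Toy example: an 8-node ring with two chords.
observed = [(i, (i + 1) % 8) for i in range(8)] + [(0, 4), (1, 5)]
resampled = degree_preserving_rewire(observed, n_swaps=20, seed=1)
```

Calling this many times with different seeds gives a reference distribution for any network statistic under the fixed-degree null. It keeps the degrees exactly and destroys essentially every other kind of dependence in the graph, which is why it is only a start.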

Community discovery is an important sub-topic, and I like exponential family random graph models, stochastic block models and graph limits enough to give them their own notebooks. Many of the entries under graphical models are about figuring out the network of interaction between random variables from patterns of dependence across those variables.

Although many of the relevant papers appear in the journal Social Networks, published by Elsevier, a company known to also publish advertising disguised as peer-reviewed scientific journals (e.g., The Australasian Journal of Bone and Joint Medicine), I know of no particular reason to believe that their findings are actually meretricious propaganda on behalf of a paying client. It would, however, be better if the community would shift to a journal whose publisher did not pollute the process of scientific communication whenever it was profitable to do so.

See also: Complex networks; Community discovery; Exponential families of random graph models; Graph Theory; Graph Sampling Algorithms; Graph Spectra; Homophily vs. influence; Inferring networks from non-network data; Joint modeling of texts and networks; Network comparison; Political networks; Power laws (for questions about "scale-free" networks); Relational learning; Social networks; Statistics in general; Statistics of structured data; Visualizing network data

  • CRS, "Indirect Inference of Network Growth Models"
  • CRS and Shawn Mankad, "Statistical Properties of Aggregated Random Graphs"
  • Co-conspirators to be named later + CRS, "Smoothing Adjacency Matrices" [if we can figure out how to do it!]
  • Co-conspirators to be named later + CRS, "Network Comparisons"
