Analysis of Network Data

17 Apr 2024 00:08

That is, of data in the form of networks --- I don't (as such) care about packet flow or other aspects of computer networks...

Things I wish I knew how to do: Bootstrap a network, non-parametrically. (The model with a fixed degree sequence is a start, but what's the equivalent of the block bootstraps used for time series, which preserve dependence? [Update, 2017: see my paper with Alden Green.]) Do cross-validation on networks. (You could say that link prediction is leave-one-out CV, but how about k-fold CV? [Update, 2016: see the papers by Chen and Lei, and (especially) by Dabbs and Junker.]) Estimate a distribution over networks by somehow smoothing an adjacency matrix. [Update, 2016: see Lawrence Wang's thesis.] Compare networks to say whether they came from the same distribution. [Update, 2015: see my paper with Dena Asta.] --- These may or may not be aspects of a single problem.
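
For concreteness, here is a minimal sketch of the "fixed degree sequence" starting point mentioned above: resampling a network by repeated double edge swaps, which leave every node's degree unchanged. It is only an illustration, not the method of any of the papers cited. The function names (`degree_preserving_resample`, `degrees`), the `n_swaps` and `max_tries_per_swap` parameters, and the assumption that the input is a simple undirected graph given as a list of node pairs are all hypothetical choices made for this example. It does not claim to sample uniformly from graphs with that degree sequence, and it does nothing to preserve dependence beyond degrees.

```python
import random


def degrees(edges):
    """Map each node to its degree in an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def degree_preserving_resample(edges, n_swaps, seed=None, max_tries_per_swap=100):
    """Return a rewired copy of `edges` with the same degree sequence.

    Assumes (hypothetically) a simple undirected graph: no self-loops and
    no repeated edges in the input. Each accepted step replaces edges
    (a, b) and (c, d) with (a, d) and (c, b).
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]  # canonical orientation
    edge_set = set(edge_list)
    done, tries = 0, 0
    max_tries = n_swaps * max_tries_per_swap  # guard against infinite loops
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both ways of pairing the endpoints
        if a == d or c == b:
            continue  # the swap would create a self-loop
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 == new2 or new1 in edge_set or new2 in edge_set:
            continue  # the swap would create a repeated edge
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.update((new1, new2))
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list
```

Calling `degree_preserving_resample(edges, n_swaps=10 * len(edges), seed=0)` repeatedly with different seeds gives a crude reference distribution for a network statistic; how many swaps are enough for the chain to mix is left open here.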

Community discovery is an important sub-topic, and I like exponential family random graph models, stochastic block models and graph limits enough to give them their own notebooks. Many of the entries under graphical models are about figuring out the network of interaction between random variables from patterns of dependence across those variables.

--- Although many of the relevant papers appear in the journal Social Networks, published by Elsevier, a company known to also publish advertising disguised as peer-reviewed scientific journals (e.g., The Australasian Journal of Bone and Joint Medicine), I know of no particular reason to believe that their findings are actually meretricious propaganda on behalf of a paying client. It would, however, be better if the community would shift to a journal whose publisher did not pollute the process of scientific communication whenever it was profitable to do so.