Notebooks

Graph Spectra

02 Aug 2019 10:26

Yet Another Inadequate Placeholder: notes and references on the eigenvalue and eigenvector spectra of graphs, or matrices derived from graphs, and their applications, with links to community discovery being of particular interest. (Since this is a new notebook, lots of the relevant references are still scattered elsewhere.) A stub of a stab for explaining what it's about follows.

As you know, Babette, the spectrum of a matrix is the (multi-)set of its eigenvalues; the spectral properties of a matrix refer to these eigenvalues and associated eigenvectors. There are several matrices which can be defined for a graph or network, and so several objects which could be considered the graph's spectrum. The field of spectral graph theory as a whole, is mostly concerned with the spectrum of the Laplacian matrix, which requires a little elucidation.

In ordinary Euclidean space, when we say that some field "diffuses" over the space, we mean that the total quantity is conserved, and the rate of change of its density at any one point is proportional to the sum of the density's second derivatives over space: \[ \frac{\partial\rho}{\partial t}(x,y,z,t) = - \kappa \left( \frac{\partial^2 \rho}{\partial x^2}(x,y,z,t) + \frac{\partial^2 \rho}{\partial y^2}(x,y,z,t) + \frac{\partial^2 \rho}{\partial z^2}(x,y,z,t) \right) \] As Haldane said once, this is the mathematical way of saying "it oozes". The sum of the squared second partial derivatives on the right hand side is the "Laplacian" operator, written $\nabla^2$. This notion generalizes to other manifolds, as the Laplace-Beltrami operator, and the corresponding equation would express a conservative oozing along the manifold. It doesn't make sense to take derivatives when space consists of nodes in a graph, but one can take differences between neighboring points, and even second differences (like a second derivative), and the matrix which does is the graph's Laplacian. In symbols, if we write \( A \) for the adjacency matrix and \( D \) for the diagonal matrix giving each node's degree, then the Laplacian of an (undirected) graph is \[ L = D^{-1}(D-A) \] and this is the generator of a continuous-time random walk on the graph. Alternately, one considers the continuous-time diffusion equation, \[ \frac{d\rho}{dt} = -\kappa L \rho \] or its discrete-time analog, \[ \rho_{t+h} = (I-h\kappa L)\rho_t \] (Physically, the factor $\kappa$ expresses how fast the diffusion happens in real, clock time. By re-scaling time, however, one can always set $\kappa$ or $h\kappa$ to 1, so I'll do so.)

The conservation of the field $\rho$ is expressed by the fact that a constant vector, say the all-ones vector, is always an eigenvector of $L$, with eigenvalue $0$. (Equivalently, $I-L$'s leading eigenvalue is always $1$, and it, too, has the all-ones vector as an eigenvector.) The number of zero eigenvalues counts the number of connected components of the graph. (The eigenvectors are piece-wise constant on each component.) To keep things simple, I'll assume that there is only one connected component.

All of the eigenvalues of $L$ are non-negative, so we can number them $0=\lambda_0 < \lambda_1 \leq \lambda_2 \ldots \leq \lambda_n$. What do we learn from the rest of the eigen-spectrum? The eignevector with the first positive eigenvalue has both positive and negative entries (because it's orthogonal to the all-ones vector), and in fact divides the graph into two parts. More specifically, it divides the graph into two parts across which diffusion is very slow. To see this, consider an eigenvector expansion of some arbitrary initial state: \[ \rho_0 = \sum_{i=0}^{n-1}{a_i v_i} \] Now we evolve it forward in time: \[ \rho_t = \sum_{i=0}^{n-1}{a_i (I-L)^t v_i} \] Notice that every eigenvector of $L$ is also an eigenvector of $I-L$, with a simple relation between the eigenvalues: \[ (I-L) v_i = v_i - Lv_i = v_i - \lambda_i v_i = (1-\lambda_i) v_i \] So \[ \rho_t = \sum_{i=0}^{n-1}{a_i (1-\lambda_i)^t v_i} \] As $t \rightarrow \infty$, diffusion washes away any inequalities in the field, because the contribution of every eigenvector except $ v_0 $ , the all-ones vector, gets exponentially suppressed $(1-\lambda_i)^t \rightarrow 0$ . But the slowest decay mode is $ v_1 $ --- that's the pattern over the graph which is least affected by diffusion, and survives the longest.

See also: Analysis of network data; Complex networks; Epidemic models; Graph limits; Graph theory (general); Math I ought to learn; Topology and Synchronization


Notebooks: