Statistics with Structured Data

27 Feb 2017 16:30

A lot of statistical theory and methods are developed for fairly unstructured data, where each data-point is just that: a single unarticulated point in some space. What should we do when observations have complicated internal structure?

One option: break each structure into a collection of dependent unstructured observations, i.e., treat each observation as a realization of a non-trivial stochastic process. Examples: multivariate analysis (including its grown-up version, graphical models), time series, spatial statistics, network data analysis. Time series and spatial statistics are much better developed than network statistics, not least because the dependency structures there are much simpler. Directed networks give us essentially arbitrary binary relations; hypergraphs give us arbitrary relational structures. This threatens (or promises) to bring up all sorts of issues from logic.
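
To make the point about relations concrete, here is a minimal sketch, not drawn from any particular library, of a directed network stored as a set of ordered pairs and checked for the usual properties of a binary relation. The node names, the toy edges, and all function names are hypothetical, chosen only for illustration. The hyperedges are assumed here to be ordered tuples, so that edges of arity k form a k-ary relation; ordinary hypergraphs with unordered hyperedges would use sets instead.

```python
# Hypothetical toy data: a directed network on three nodes,
# stored as a binary relation (a set of ordered pairs).
nodes = {"a", "b", "c"}
edges = {("a", "b"), ("b", "c"), ("a", "c")}


def is_reflexive(rel, domain):
    """True if every element of the domain is related to itself."""
    return all((x, x) in rel for x in domain)


def is_symmetric(rel):
    """True if every edge has its reverse in the relation."""
    return all((y, x) in rel for (x, y) in rel)


def is_transitive(rel):
    """True if x->y and y->w together imply x->w."""
    return all(
        (x, w) in rel
        for (x, y) in rel
        for (z, w) in rel
        if y == z
    )


# Assumption: hyperedges are ordered tuples of any length, so the
# hyperedges of length k together form a k-ary relation on the nodes.
hyperedges = {("a", "b", "c"), ("a", "c"), ("b",)}


def relations_by_arity(tuples):
    """Group hyperedges by arity: {k: set of k-tuples}."""
    grouped = {}
    for t in tuples:
        grouped.setdefault(len(t), set()).add(t)
    return grouped


if __name__ == "__main__":
    print(is_reflexive(edges, nodes))
    print(is_symmetric(edges))
    print(is_transitive(edges))
    print(relations_by_arity(hyperedges))
```

Nothing in the representation constrains which pairs or tuples may appear, which is the sense in which the dependency structure is "essentially arbitrary", in contrast to the fixed ordering of a time series or the lattice of a spatial grid.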

Can one do inference on the relational structure of complex observations? How? (Grammatical inference, for instance? Community discovery?)

See also: Data Mining; Machine Learning, Statistical Inference and Induction; Statistics on Manifolds