Chaos, Complexity and Inference: 2009 Syllabus
See this earlier post
or the course homepage for
more. There is an RSS feed for this page, so you
can follow updates, which will generally be posted after lectures.
- Final exam
- take-home, due 12 May
- Lecture 28, (Thursday, 30 April):
Agents 4: A real-world example of agents on networks
- Notes
- Readings: Hedstrom and Aberg, "Quantitative research, agent-based
modelling and theories of the social", ch. 6 (pp. 114--144) in
Hedstrom, Dissecting the Social [PDF]; (*) Guttorp, ch. 4
- Lecture 27, (Tuesday, 28 April):
Networks 5: Community Discovery
- No slides
- Readings: Clauset, Moore and Newman, arxiv:0811.0484 (code);
Girvan and Newman, arxiv:cond-mat/0112110;
Hofman and Wiggins, arxiv:0709.3512;
Newman, arxiv:physics/0605087;
Reichardt and Bornholdt, arxiv:cond-mat/0603718;
Reichardt and White, arxiv:0708.0958 [discussion]
- Lecture 26, (Thursday, 23 April):
Complex networks 4: inference for network models
- Slides
- Reading: Hanneke and Xing, "Discrete Temporal Models of Social Networks"
[PDF];
Handcock and Jones, "Likelihood-based inference for stochastic models of sexual network formation"
[PDF];
Hunter, Goodreau and Handcock, "Goodness of Fit of Social Network Models"
[PDF];
Middendorf, Ziv and Wiggins, "Inferring Network Mechanisms:
The Drosophila melanogaster Protein Interaction
Network", arxiv:q-bio/0408010;
Newman, "Structure and Function", sections IV B and V;
Newman, Strogatz and Watts, "Random graphs with arbitrary degree distributions and their applications", arxiv:cond-mat/0007235;
Wiuf, Brameier, Hagberg and Stumpf, "A likelihood approach to analysis of network data", Proceedings of the National Academy of Sciences (USA) 103 (2006): 7566--7570 [discussion]
- Lecture 25, (Tuesday, 21 April):
Agents 3: social complexity
- Slides
- Reading: Flake, ch. 17; Miller and Page, chs. 10--11; Krugman, ch. 6;
Salganik, Dodds and Watts,
"Experimental study of inequality and unpredictability in an artificial cultural market", Science 311 (2006):854--856; (*) Skyrms and Pemantle, "A Dynamic Model of Social Network Formation", arxiv:math.PR/0404101
- Lecture 24, (Thursday, 9 April):
Complex networks 3: contagion on networks
- Slides
- Reading: Guttorp, sec. 2.11; Newman, "Structure and Function", sec. VIII;
Davis, Trapman, Leirs, Begon and Heesterbeek, "The abundance threshold for plague as a critical percolation phenomenon", Nature 454 (2008): 634--637;
Bell, Maiden, Munoz-Solomando and Reddy, "'Mind control experiences' on the Internet: Implications for the psychiatric diagnosis of delusions", Psychopathology 39 (2006): 87--91 [PDF];
Smith and Novella, "HIV Denial in the Internet Era", PLoS Medicine 4:8 (2007): e256;
(*) Kenah and Robins, "Second look at the spread of epidemics on networks", arxiv:q-bio.QM/0610057
- Lecture 23, (Tuesday, 7 April):
Agents 2: collective phenomena and self-organization
- Slides
- Reading: Flake, ch. 16; Miller and Page, ch. 9; Krugman, introduction and ch. 1; Healy, Walking to School; Perlstein, The
Meaning of Box 722
- Lecture 22, (Thursday, 2 April):
Agent-based models 1
- Slides
- Reading: Miller and Page, chs. 6 and 7; Flake, ch. 12
- Lecture 21, (Tuesday, 31 March):
Complex networks 2: growth models
- Slides
- Reading: Newman, "Structure and function", sec. VII
- Lecture 20, (Thursday, 26 March):
Complex networks 1: basics, network properties
- Slides
- Definition of networks and basic terms. Pictures, emphasizing sex and
death. Bipartite networks and the Galois lattice construction. The small
world problem. What makes a network complex? The Erdős-Rényi model: first
properties and weaknesses (a quick numerical check appears below).
- Watts, "The 'New' Science of Networks", Annual Review of Sociology 30 (2004): 243--270
- Newman, "The Structure and Function of Complex Networks", SIAM
Review 45 (2003): 167--256, arxiv:cond-mat/0303516
(through sec. VI, but skipping or skimming IV B and V)
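- A quick numerical check of the Erdős-Rényi model's first property, in R.
This is an illustrative sketch of mine, not part of the course materials; the
point is that degrees in G(n,p) are approximately Poisson with mean (n-1)p.

        n <- 2000; p <- 3 / n                 # sparse regime: mean degree about 3
        A <- matrix(0L, n, n)
        A[upper.tri(A)] <- rbinom(n * (n - 1) / 2, 1, p)  # each pair linked independently
        A <- A + t(A)                         # symmetrize: the graph is undirected
        deg <- rowSums(A)
        obs <- as.numeric(table(factor(deg, levels = 0:10))) / n
        round(rbind(observed = obs, poisson = dpois(0:10, (n - 1) * p)), 3)
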
- Lecture 19, (Tuesday, 24 March):
Inference from simulations 2: direct and indirect estimation
- Slides
and R
- The method of simulated moments. Example with the logistic map. Issues
with the MSM. The trade-off between tractable models and good models. The
indirect inference trick: don't match moments, match a bad-but-tractable model.
Examples of indirect inference. Convergence properties.
- Reading: Smith, "Indirect Inference"; Kendall et al., "Population
Cycles in the Pine Looper Moth: Dynamical Tests of Mechanistic Hypotheses";
(*) Gourieroux, Monfort and Renault, "Indirect Inference"
[JSTOR]
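- A toy R illustration of the method of simulated moments for the logistic
map (hedged: the moment vector, search interval and optimizer are my own
arbitrary choices, not the lecture's; "common random numbers" keep the
simulated objective smooth in the parameter):

        logistic.traj <- function(r, n, transient = 100) {  # x[t+1] = 4 r x[t] (1 - x[t])
          x <- runif(1)
          for (t in 1:transient) x <- 4 * r * x * (1 - x)   # let the transient die away
          out <- numeric(n); out[1] <- x
          for (t in 2:n) out[t] <- 4 * r * out[t-1] * (1 - out[t-1])
          out
        }
        sim.moments <- function(r, n, reps = 20) {
          set.seed(17)                        # common random numbers across r values
          rowMeans(replicate(reps, { x <- logistic.traj(r, n)
                                     c(mean(x), var(x), cor(x[-1], x[-n])) }))
        }
        data <- logistic.traj(r = 0.9, n = 1000)
        target <- c(mean(data), var(data), cor(data[-1], data[-1000]))
        msm <- function(r) sum((sim.moments(r, 1000) - target)^2)
        optimize(msm, lower = 0.75, upper = 1)$minimum      # should land near 0.9
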
- Lecture 18, (Thursday, 19 March):
Inference from simulations 1
- The virtues and limits of simulations, especially stochastic
simulations. Active testing: deliberately trying to break your model to gain
(or lose!) confidence in it. Bootstrapping once again.
- Reading: ch. 5 and appendix B of Miller and Page; Miller, "Active Nonlinear Tests (ANTs) of Complex Simulation Models"
- Lecture 17, (Tuesday, 17 March):
Inference in general: error statistics and severe testing
- Slides
- Errors and reliable inference. Statistics as principled argument: mostly
arguments that certain errors are (or are not) absent or small, unless we're
very unlucky. Kinds of errors in statistical inferences. Construction of
reliable hypotheses: confidence sets, "probably approximately correct", more.
Severe testing. Null hypotheses. Testing for mis-specification, especially
by looking at residuals. The two chief world systems.
- Reading: Mayo and Cox, "Frequentist statistics as a theory of inductive inference",
arxiv:math.ST/0610846;
Mayo and Spanos, "Methodology in Practice: Statistical Misspecification
Testing" [PDF];
idem, "Severe Testing as a Basic Concept in a Neyman-Pearson Philosophy of Induction" [journal];
Spanos, "Curve-Fitting, the Reliability of Inductive Inference and the Error-Statistical Approach" [journal]
- Lecture 16, (Thursday, 5 March):
Heavy-tailed distributions 4: Comparing models
- Slides, R
- Goodness-of-fit tests: Glivenko-Cantelli, "the fundamental theorem
of statistics"; the Kolmgorov-Smirnov test; Monte Carlo version of the K-S
test when parameters are estimated from data; general remarks on
goodness-of-fit. Relative distribution methods for comparing distributions:
their virtues; using relative distributions to check power laws; HTTP
file size example. Vuong's likelihood-ratio test for model selection.
The flight of the albatross. Zombie-hunting.
- Readings: Clauset et al. continued; Handcock and Morris, "Relative Distribution Methods";
Freckleton and Sutherland, "Do power laws imply self-regulation?"; idem, "Do in-hospital waiting lists show
self-regulation?"
(thanks to Nick Watkins for pointing out those papers); Edwards et al.,
"Revisiting Lévy flight search patterns
of wandering albatrosses, bumblebees and
deer";
(*) Vuong, "Likelihood Ratio Tests for Model Selection and Non-Nested
Hypotheses"
- Lecture 15 (Tuesday, 3 March):
Heavy-tailed distributions 3: Estimation
- Slides
- The right way to do it: maximum likelihood estimation for power laws
(a short R version follows the readings); properties of the MLE.
Nonparametric bootstrapping for error estimation. The
wrong way to do it: linear regression on a log-log plot; why it should never be
used. Determining the scaling range. Examples with city sizes.
Non-parametric density estimation, and special considerations for heavy-tailed
distributions.
- Readings: Clauset, Shalizi and Newman, "Power law distributions in
empirical
data", arxiv:0706.1062; (*)
Markovitch and
Krieger, "Nonparametric
estimation of long-tailed density functions and its application to the analysis
of World Wide Web traffic"
- Lecture 14, (Thursday, 26 February):
Heavy-tailed distributions 2: how they arise
- Slides
- "How the distributions got their tails". Deflating explanations:
measuring ex instead of x;
measuring x1/a instead of x; the central
limit theorem for multiplicative terms. Everyday explanations: exponential
growth from random times; why isn't Acoma, N.M., not the largest city in the
US?; the Yule-Simon mechanism for random growth. Exciting and mysterious
explanations: why physicists expect everything to follow a Gaussian, except at
phase transitions; self-organized criticality; a skeptical note.
- Readings: Newman, "Power laws", section IV; Krugman, ch. 3 and the last part of ch. 8; Mitzenmacher, "A Brief History of Generative Models for Power Law and Lognormal Distributions"; video of Mitzenmacher giving a talk on this material; Sornette, "Mechanism for Powerlaws without
Self-Organization", arxiv:cond-mat/0110426
- Lecture 13, (Tuesday, 24 February):
Heavy-tailed distributions 1: what they are
- Slides and R
- Highly skewed and heavy tailed distributions: definitions. Real-data
examples: words, cities, money, papers, phone calls, the Internet, earthquakes,
wars, terrorism, solar flares ("die, puny humans!"). Models: the pure power
laws (Pareto and Zipf distributions). Properties of power laws: scale-freedom,
80-20, order statistics. Inequality of condition in America. Modifications:
generalized Pareto, Yule-Simon. Non-power-law distributions: truncated power
laws, log-normal, stretched exponential/Weibull distribution.
- Newman, "Power laws, Pareto distributions and Zipf's
law", arxiv:cond-mat/0412004
(through section III)
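- A quick check of the 80-20 property in R; the exponent 2.16 is my choice,
because it makes the 80/20 rule hold exactly for a pure Pareto (a sketch,
not course code):

        alpha <- 2.16
        x <- runif(1e6)^(-1 / (alpha - 1))   # Pareto sample via inverse CDF, xmin = 1
        x <- sort(x, decreasing = TRUE)
        share <- cumsum(x) / sum(x)          # share held by the top k individuals
        share[0.2 * length(x)]               # the top 20% hold about 80% of the total
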
- Lecture 12, (Thursday, 19 February):
Self-organization 2
- Slides
- Readings: Shalizi, Klinkner and Haslinger, "Quantifying Self-Organization
with Optimal
Predictors", arxiv:nlin.AO/0409024; Shalizi,
Haslinger, Rouquier, Klinkner and Moore, "Automatic Filters for the Detection
of Coherent Structure in Spatiotemporal
Systems", arxiv:nlin.CG/0508001
- Lecture 11, (Tuesday, 17 February):
Cellular automata 2: excitable media; mechanisms and abstractions
- Slides
- The heart as an assemblage of excitable elements. Excitable media as an
abstract mechanism. Philosophical excursus on "abstraction" and "mechanism".
Examples of excitable media. Chemical oscillators: Belousov-Zhabotinsky type.
Chemical oscillators: Turing type. City growth as Turing morphogenesis. Slime
mold morphogenesis. The spiral ice-canyons of Mars.
Cellular automata models of excitability: Greenberg-Hastings, cyclic CA. Some
math for these models.
- Readings: Fisch, Gravner and Griffeath, "Threshold-range scaling of excitable
cellular automata" [PDF]; idem, "Cyclic Cellular Automata in Two Dimensions" [zipped PostScript];
Greenberg and Hastings, "Spatial Patterns for Discrete Models of Diffusion
in Excitable Media" [JSTOR];
Griffeath, "Self-Organization of Random Cellular Automata: four snapshot"
[zipped
PostScript]; Krugman, ch. 2 and first part of ch. 8; ch. 4 of Guttorp
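- A minimal Greenberg-Hastings medium in R (the simplest version: one excited
and one refractory state, von Neumann neighbors, torus boundary; a sketch of
mine, not the papers' code):

        shift <- function(m, dr, dc)          # circular shift of a matrix
          m[(seq_len(nrow(m)) - 1 + dr) %% nrow(m) + 1,
            (seq_len(ncol(m)) - 1 + dc) %% ncol(m) + 1]
        gh.step <- function(s) {              # 0 quiescent, 1 excited, 2 refractory
          fire <- (shift(s, 1, 0) == 1) | (shift(s, -1, 0) == 1) |
                  (shift(s, 0, 1) == 1) | (shift(s, 0, -1) == 1)
          s.new <- s
          s.new[s == 1] <- 2                  # excited cells become refractory
          s.new[s == 2] <- 0                  # refractory cells recover
          s.new[s == 0 & fire] <- 1           # quiescent cells next to excitement fire
          s.new
        }
        s <- matrix(sample(0:2, 100^2, TRUE, prob = c(0.9, 0.05, 0.05)), 100, 100)
        for (i in 1:200) s <- gh.step(s)
        image(s, col = c("white", "red", "grey"))  # spiral waves typically appear
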
- Lecture 10, (Thursday, 12 February): Cellular automata 1
- Slides
- What cellular automata are; basic elements. Examples: diffusion-limited
aggregation, lattice gases, majority vote, Life. Mechanical
self-reproduction.
- Readings: Flake, ch. 15; Miller and Page, ch. 8
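- Life itself fits in a few lines of R, reusing the torus-shift idea from the
excitable-media sketch above (illustrative only):

        life.step <- function(s) {            # s is a 0/1 matrix on a torus
          n <- 0 * s
          for (dr in -1:1) for (dc in -1:1)
            if (dr != 0 || dc != 0)           # sum the eight neighbors
              n <- n + s[(seq_len(nrow(s)) - 1 + dr) %% nrow(s) + 1,
                         (seq_len(ncol(s)) - 1 + dc) %% ncol(s) + 1]
          1 * ((n == 3) | (s == 1 & n == 2))  # birth on 3, survival on 2 or 3
        }
        s <- matrix(rbinom(50^2, 1, 0.3), 50, 50)
        for (i in 1:100) s <- life.step(s)
        image(s, col = c("white", "black"))
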
- Lecture 9, (Tuesday, 10 February): Self-organization 1, some examples
- Slides
- Levels of description; micro/macro distinctions, aggregation. Emergence of
higher levels from lower levels; how to be a good reductionist.
Self-organization. Excursus into thermodynamic entropy and the second law of
thermodynamics. How self-organization is compatible with the second law. Some
examples from fluids. Classroom demo with fluids.
- Reading: Miller and Page, ch. 4
- Office of Charles and Ray Eames, Powers of Ten, narration by Philip Morrison
- Lecture 8, (Thursday, 5 February): Randomness and determinism
- Slides
- Data compression and description length. Algorithmic information content
as the length of the shortest effective description. Random sequences are
incompressible; incompressible sequences have all the testable properties of
randomness. "Any signal distinguishable from noise is insufficiently
compressed." Algorithmic information content is uncomputable. Dynamical
systems as sources of randomness; Brudno's theorem connecting algorithmic
information content to entropy rate and so to Lyapunov exponents. "Any
determinism distinguishable from randomness is insufficiently
complicated."
- Side-note: Algorithmic Information Content and Marginal Entropies
- Readings: Flake, ch. 14; Smith, ch. 11; Poincaré, "Chance", from Science and Method [PDF]
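- A crude R experiment in the spirit of the lecture, with gzip standing in
for the uncomputable ideal compressor (my own toy, so take it as a sketch):

        n <- 1e5
        coin <- paste(sample(c("0", "1"), n, replace = TRUE), collapse = "")
        x <- runif(1); symb <- character(n)
        for (t in 1:n) { x <- 4 * x * (1 - x); symb[t] <- if (x > 0.5) "1" else "0" }
        symb <- paste(symb, collapse = "")
        periodic <- paste(rep("01", n / 2), collapse = "")
        length(memCompress(coin, "gzip"))     # baseline: genuinely random bits
        length(memCompress(symb, "gzip"))     # about the same: deterministic chaos
                                              # passes the compression test for randomness
        length(memCompress(periodic, "gzip")) # far smaller: regularity compresses
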
- Lecture 7 (Tuesday, 3 February): Information Theory
- Slides
- Entropy of random variables, interpretations (randomness, variability,
uncertainty, coding). Mutual information, statistical independence, Markov
properties. Relative entropy, a.k.a. Kullback-Leibler divergence. Relative
entropy measures how easy it is to tell two distributions apart: statistical
uses in sampling, large deviations, estimation (via Fisher information).
Divergence is, up to an additive entropy term, the negative expected log
likelihood; maximum likelihood thus tries to minimize divergence (a toy R
version of the definitions appears below). Entropy rates measure dynamical
randomness; connection
between metric entropy rate, topological entropy rate, and Lyapunov exponents.
Asymptotic equipartition; convergence of log likelihoods to their expected
values.
- Reading: Feldman, "Information Theory" [PDF]
- M.C. Hawking, "Entropy",
from Fear of a Black Hole
[lyrics; mp3 (radio-safe Brief History of
Rhyme version)]
- Ray and Charles
Eames, A
Communications Primer
(via Britta Gustafson)
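- The basic definitions in R, nothing beyond the formulas (a sketch for
discrete distributions; p and q are made-up examples):

        entropy <- function(p) -sum(p[p > 0] * log2(p[p > 0]))        # in bits
        kl <- function(p, q) sum(p[p > 0] * log2(p[p > 0] / q[p > 0]))
        p <- c(0.5, 0.25, 0.25)
        q <- rep(1/3, 3)
        entropy(p)     # 1.5 bits
        kl(p, q)       # positive; zero only when p and q match
        kl(q, p)       # note the asymmetry: divergence is not a metric
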
- Lecture 6 (Thursday, 29 January): Statistical Inference for Discrete
Random Sequences
- Slides
- Inference for Markov chains: the likelihood function; the maximum
likelihood estimate of the transition matrix; its convergence; MLE for
parameterized Markov chains. Error estimation and hypothesis testing for
Markov chains. Parametric bootstrapping. Higher-order Markov chains. Random
functions of Markov chains; stochastic finite automata and their inference.
Variable length Markov chains. CSSR.
- Handout: Maximum Likelihood Estimation
for Markov Chains
- Readings: Guttorp, 2.7--2.9 and 2.12
(I, II, III);
Smith, ch. 10; Foulkes, "A Class of Machines Which Determine the Statistical Structure of
a Sequence of Characters" [PDF]; (*) Smith, "The Maintenance of Uncertainty" [PDF]
- Lecture 5 (Tuesday, 27 January): Symbolic Dynamics
- Slides
- Discrete coarse-grainings of the continuous state space. Symbol sequences
determined by initial conditions. The shift map; moving the complexity from
the update rule to the space. Motivation: "Continuous math is hard; let's go
discretize." Refinement of partitions under dynamics. Generating partitions;
discretizing continuous dynamics without loss of information. Symbolic
dynamics as discrete stochastic processes; the full logistic map as the
Bernoulli process. Allowed and forbidden words. The topological entropy rate;
how many things can your process do? Formal languages as tools for describing
discrete sequences ("we have ways of making your dynamics talk"). Regular
expressions and their grammars. Regular expressions and their machines
("circles and arrows and a paragraph on the back of each one"). Examples of
languages and machines from the logistic map. Finite-type vs. sofic
processes.
- Handout: More on the topological entropy rate
- Readings: Daw, Finney and Tracy, "A review of symbolic analysis of experimental data" [reprint]
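- A small R check that the binary partition at 1/2 makes the full logistic
map look like the Bernoulli process: count the distinct words of each length
and estimate the topological entropy rate (a sketch; finite data can only
undercount the allowed words):

        n <- 1e5
        x <- numeric(n); x[1] <- runif(1)
        for (t in 2:n) x[t] <- 4 * x[t-1] * (1 - x[t-1])
        s <- as.integer(x > 0.5)              # coarse-grain with {[0,.5), [.5,1]}
        words <- function(s, k)               # distinct length-k words observed
          length(unique(apply(embed(s, k), 1, paste, collapse = "")))
        sapply(1:10, function(k) log2(words(s, k)) / k)  # about 1 bit per symbol:
                                                         # every word is allowed
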
- Lecture 4 (Thursday, 22 January): Attractor Reconstruction and Prediction
- Slides,
lorenz-generator.pl
- "Geometry from a time series": how "time-delay embedding" and Takens's
theorem let us reconstruct attractors without knowing equations. An example
with the Lorenz system. Attractor reconstruction as a form of inference.
Change of coordinates, change of observables. Choice of reconstruction
settings, false nearest neighbor method. Prediction in the reconstructed state
space. Nearest neighbor prediction; an example with the Lorenz system.
Picking predictive settings via cross-validation.
- Side-notes: "Smooth change of coordinates"; Nonlinear predictors
- Readings: Kantz and Schreiber, chs. 3--4; Smith, chs. 7--9.
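- Nearest-neighbor prediction in a reconstructed state space, in bare-bones R
(fixed embedding settings, one-step-ahead prediction, no cross-validation;
the logistic map stands in for the Lorenz example):

        nn.predict <- function(x, dim = 3, lag = 1) {
          m <- (dim - 1) * lag + 1                       # span of one delay vector
          E <- embed(x, m)[, seq(1, m, by = lag), drop = FALSE]
          now <- E[nrow(E), ]                            # the present state
          past <- E[-nrow(E), , drop = FALSE]            # states whose successors we know
          i <- which.min(rowSums(sweep(past, 2, now)^2)) # nearest past neighbor
          x[i + m]                                       # predict its successor
        }
        x <- numeric(1000); x[1] <- runif(1)
        for (t in 2:1000) x[t] <- 4 * x[t-1] * (1 - x[t-1])
        c(predicted = nn.predict(x[1:999]), actual = x[1000])
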
- Lecture 3 (Tuesday, 20 January): Attractors and Mixing
- Slides and R
- Attractors as generalizations of fixed points and limit cycles. Geometry
of attractors: where do trajectories go? The Hénon map, our first
multi-dimensional dynamical system. Multiple dimensions and memory.
Stretching and folding again. Stable and unstable directions. The Lorenz
equations, our first continuous-time dynamical system. Lyapunov exponents to
measure stability and instability. Probability on attractors. Invariant distributions again. Subtleties of convergence to the invariant distribution. Mixing
and decay of correlations. A central limit theorem for mixing processes.
CLT behavior of the logistic map (simulated in R below).
- Readings: Flake, ch. 11; Miller and Page, chs. 1--3; Smith, chs. 4--6.
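- The last point is cheap to see numerically; an R sketch (the invariant mean
of the full logistic map is 1/2, so that is the centering):

        traj.mean <- function(n) {            # time average of one run
          x <- runif(1); s <- 0
          for (t in 1:n) { x <- 4 * x * (1 - x); s <- s + x }
          s / n
        }
        m <- replicate(2000, traj.mean(1000))
        qqnorm(sqrt(1000) * (m - 0.5))        # near-straight: Gaussian fluctuations
        qqline(sqrt(1000) * (m - 0.5))        # from purely deterministic dynamics
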
- Lecture 2 (Thursday, 15 January): More dynamics
- Slides and R
- Stability of fixed points and cycles. Bifurcations; bifurcation diagrams;
the period-doubling route to chaos; periodic windows within chaos. Devaney's
definition of chaos; how it implies sensitive dependence; stretching and
folding. The Arnold cat map. Intermittency. More ergodicity: the evolution
of densities and the Frobenius-Perron operator; time averages; the world's
simplest ergodic theorem.
- Readings: Flake, ch. 10 and sec. 11.1; Guttorp, ch. 1; Smith,
chs. 1--3.
- The Arnold Cat Map Movie (starring Marlowe the Cat, directed by Evelyn Sander)
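- The bifurcation diagram takes a dozen lines of R (a sketch; transient and
sample lengths are arbitrary):

        rs <- seq(2.8, 4, length.out = 800)
        plot(NULL, xlim = range(rs), ylim = c(0, 1), xlab = "r", ylab = "x")
        for (r in rs) {
          x <- runif(1)
          for (t in 1:300) x <- r * x * (1 - x)  # let the transient die away
          for (t in 1:100) {                     # then plot what the orbit visits
            x <- r * x * (1 - x)
            points(r, x, pch = ".")
          }
        }
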
- Lecture 1 (Tuesday, 13 January): Introduction to the course; starting with
dynamics
- Slides and R
- Administrivia. What is a model? What is a simulation? What is a
dynamical system? What is chaos? First acquaintance with the logistic map.
Fixed points and cycles. Exploring some properties of chaos.
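- Sensitive dependence in a few lines of R (a sketch):

        logistic <- function(x0, n, r = 4) {
          x <- numeric(n); x[1] <- x0
          for (t in 2:n) x[t] <- r * x[t-1] * (1 - x[t-1])
          x
        }
        a <- logistic(0.3, 50)
        b <- logistic(0.3 + 1e-10, 50)           # perturb the tenth decimal place
        plot(abs(a - b), log = "y", type = "l",  # straight on a log scale: the
             xlab = "t", ylab = "separation")    # gap grows exponentially
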
Corrupting the Young;
Complexity;
Enigmas of Chance
Posted at June 16, 2009 09:24 | permanent link