Links, Pleading to be Dumped
Attention conservation notice: Yet more cleaning out of
to-be-blogged bookmarks, with links of a more technical nature than
last time. Contains log-rolling promotion of work by
friends, acquaintances, and senior colleagues.
Pleas for Attention
Wolfgang
Beirl raises
an interesting question in statistical mechanics: what is "the
current state-of-the-art if one needs to distinguish a weak 1st order phase
transition from a 2nd order transition with lattice simulations?" (This is
presumably unrelated to
Wolfgang's diabolical
puzzle-picture.)
Maxim Raginsky's new blog, The Information Structuralist. Jon Wilkins's new blog,
Lost in Transcription.
Jennifer Jacquet's long-running blog, Guilty Planet.
Larry Wasserman has started a
new wiki for inequalities in
statistics and machine learning; I contributed an entry
on Markov's
inequality. Relatedly:
Larry's lecture notes for
intermediate statistics, starting with Vapnik-Chervonenkis theory. (It
really does make more sense that way.)
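For the record, the inequality itself is the one-line bound: for a non-negative random variable \(X\) and any \(a > 0\),
\[
\Pr(X \geq a) \;\leq\; \frac{\mathbb{E}[X]}{a}.
\]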
Pleas for Connection
Sharad Goel on birds
of a feather shopping together, on the basis of a data set that sounds really
quite incredible. "It's perhaps tempting to conclude from these results that
shopping is contagious .... Though there is probably some truth to that claim,
establishing such is neither our objective nor justified from our analysis."
(Thank you!)
Mark Liberman on
the Wason selection
task. There is, I feel, something quite deep here for ideas that connect the
meaning of words to their use, or, more operationally, test whether someone
understands a concept by their ability to use it; but I'm not feeling equal to
articulating this.
What it's like being
a bipolar writer.
What it's like being
a schizophrenic
neuroscientist (the latter via Mind Hacks).
Pleas for Correction
The
Phantom of Heilbronn, in which the combined police forces of Europe spend
years chasing a female serial killer, known solely from DNA evidence, only to
find that it's all down to contaminated cotton swabs from a single supplier.
Draw your own morals for data mining and the national surveillance state. (Via
arsyed on delicious.)
Herbert Simon
and Paul
Samuelson take turns, back in 1962,
beating
up on Milton Friedman's "Methodology of Positive Economics", an essay whose
exquisite awfulness is matched only by its malign influence. (This is a very
large scan of a xerox copy, from the CMU
library's online
collection of Simon's personal files.) Back in July, Robert
Solow testified
before Congress on "Building a Science of Economics for the Real World"
(via Daniel McDonald).
To put it in "shorter Solow" form: I helped invent macroeconomics, and let me
assure you that this was not what we had in mind. Related,
James Morley on
DSGEs (via Brad DeLong).
Pleas for Scholarly Attention
This brings us to the paper-link-dump portion of the program.
- James K. Galbraith, Olivier Giovannoni and Ann J. Russo, "The
Fed's Real Reaction Function: Monetary Policy, Inflation, Unemployment,
Inequality — and Presidential Politics", University of Texas Inequality
Project working
paper 42, 2007
- A crucial posit of the kind of models Solow and Morley complain about above
is that the central bank acts as a benevolent (and
far-sighted) central
planner. Concretely, they generally assume that the central bank follows
some version of the Taylor
Rule, which basically says "keep both the rate of inflation and the rate of
real economic growth steady". What Galbraith et al. do is look at what
actually predicts the Fed's actions. The Taylor Rule works much less
well, it turns out, than the assumption that Fed policy is a tool of class and
partisan struggle. It would amuse me greatly to see what happens in something
like the Kydland-Prescott
model with this reaction function.
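For readers who haven't met it, the rule is just a linear feedback formula; in Taylor's original 1993 calibration (my gloss here, not Galbraith et al.'s empirical specification),
\[
i_t = \pi_t + r^{*} + 0.5\,(\pi_t - \pi^{*}) + 0.5\,(y_t - \bar{y}_t),
\]
where \(i_t\) is the target nominal interest rate, \(\pi_t\) is inflation, \(\pi^{*}\) the inflation target, \(r^{*}\) the equilibrium real rate, and \(y_t - \bar{y}_t\) the output gap, with Taylor taking \(\pi^{*} = r^{*} = 2\%\).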
- Aapo Hyvärinen, Kun Zhang, Shohei Shimizu, Patrik O. Hoyer, "Estimation of a Structural Vector Autoregression Model Using Non-Gaussianity", Journal of Machine Learning Research 11
(2010): 1709--1731
- The Galbraith et al. paper, like a great deal of modern
macroeconometrics, uses a structural vector autoregression. The usual ways of
estimating such models have a number of drawbacks — oh, I'll just turn it
over to the abstract. "Analysis of causal effects between continuous-valued
variables typically uses either autoregressive models or structural equation
models with instantaneous effects. Estimation of Gaussian, linear structural
equation models poses serious identifiability problems, which is why it was
recently proposed to use non-Gaussian models. Here, we show how to combine the
non-Gaussian instantaneous model with autoregressive models. This is
effectively what is called a structural vector autoregression (SVAR) model, and
thus our work contributes to the long-standing problem of how to estimate
SVAR's. We show that such a non-Gaussian model is identifiable without prior
knowledge of network structure. We propose computationally efficient methods
for estimating the model, as well as methods to assess the significance of the
causal influences. The model is successfully applied on financial and brain
imaging data." (Disclaimer: Patrik is an acquaintance.)
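To make the idea a bit more concrete, here is a minimal sketch of the two-step logic in the non-Gaussian case: fit the autoregressive part by least squares, then un-mix the residuals with ICA to recover the instantaneous effects. This is my own toy illustration (using scikit-learn's FastICA), not the authors' estimator, which differs in important details.

```python
# Toy sketch: estimate a structural VAR with non-Gaussian shocks in two
# steps -- least squares for the lagged part, ICA for the instantaneous part.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
T, d = 5000, 2

# True model: x_t = A x_{t-1} + B e_t, with independent Laplace shocks e_t.
A = np.array([[0.5, 0.1],
              [0.0, 0.4]])      # lagged (autoregressive) effects
B = np.array([[1.0, 0.0],
              [0.8, 1.0]])      # instantaneous mixing of the shocks

e = rng.laplace(size=(T, d))    # non-Gaussian structural shocks
x = np.zeros((T, d))
for t in range(1, T):
    x[t] = A @ x[t - 1] + B @ e[t]

# Step 1: least-squares fit of the reduced-form VAR(1).
X_past, X_now = x[:-1], x[1:]
coef, *_ = np.linalg.lstsq(X_past, X_now, rcond=None)
A_hat = coef.T                  # so that x_t is approximately A_hat @ x_{t-1}
resid = X_now - X_past @ coef   # reduced-form residuals

# Step 2: ICA on the residuals recovers the instantaneous mixing B,
# up to permutation and scaling, because the shocks are non-Gaussian.
ica = FastICA(n_components=d, random_state=0)
ica.fit(resid)
B_hat = ica.mixing_

print("lagged effects:\n", np.round(A_hat, 2))
print("instantaneous mixing (up to permutation/scale):\n", np.round(B_hat, 2))
```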
- Bharath K. Sriperumbudur, Arthur Gretton, Kenji Fukumizu, Bernhard
Schölkopf, Gert R.G. Lanckriet, "Hilbert Space Embeddings and Metrics on
Probability
Measures", Journal
of Machine Learning Research
11 (2010): 1517--1561
- There's been a lot of work
recently on representing probability distributions as points in Hilbert
spaces, because really, who doesn't love a
Hilbert space? (One
can see this as both the long-run recognition
that Wahba was on to something
profound when she realized
that splines
became much more
comprehensible
in reproducing-kernel
Hilbert spaces, and the influence of
the kernel trick
itself.) But there are multiple ways to do this, and it would be nicest if we
could choose a representation which has useful probabilistic properties
--- distance in the Hilbert space should be zero only when the distributions
are the same, and for many purposes it would be even better if the distance in
the Hilbert space "metrized" weak
convergence, a.k.a. convergence in distribution. This paper gives
comprehensible criteria for these properties to hold in a lot of important
domains.
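The simplest such distance is the one between the kernel mean embeddings themselves, usually called the maximum mean discrepancy. A back-of-the-envelope implementation with a Gaussian kernel (my illustration; the bandwidth and sample sizes are arbitrary) looks like this:

```python
# Empirical (biased) maximum mean discrepancy between two samples,
# i.e. the distance between their kernel mean embeddings in the RKHS.
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """Gram matrix k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd(X, Y, bandwidth=1.0):
    """Empirical MMD between samples X and Y (rows are observations)."""
    Kxx = gaussian_kernel(X, X, bandwidth).mean()
    Kyy = gaussian_kernel(Y, Y, bandwidth).mean()
    Kxy = gaussian_kernel(X, Y, bandwidth).mean()
    return np.sqrt(max(Kxx + Kyy - 2.0 * Kxy, 0.0))

rng = np.random.default_rng(0)
same = mmd(rng.normal(size=(500, 1)), rng.normal(size=(500, 1)))
diff = mmd(rng.normal(size=(500, 1)), rng.normal(loc=1.0, size=(500, 1)))
print(f"same distribution: {same:.3f}, shifted distribution: {diff:.3f}")
```

Since the Gaussian kernel is characteristic, this distance is zero exactly when the two distributions coincide, which is the first of the properties mentioned above.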
- Robert Haslinger, Gordon Pipa and Emery Brown, "Discrete Time Rescaling
Theorem: Determining Goodness of Fit for Discrete Time Statistical Models of
Neural Spiking", Neural
Computation 22 (2010): 2477--2506
- A broad principle in statistics is that if you have found the right model,
whatever the model can't account for should look completely structureless.
One expression of this is the bit
of folklore
in information
theory that an optimally compressed signal is indistinguishable from pure
noise (i.e.,
a Bernoulli
process with p=0.5). Another manifestation is residual checking in
regression models: to the extent there are patterns in your residuals, you are
missing systematic effects. One can make out a good case that this is a better
way of comparing models than just asking which has smaller residuals.
For example, Aris Spanos argues
(Philosophy of
Science 74 (2007):
1046--1066; PDF
preprint) that looking for small residuals might well lead one to prefer a
Ptolemaic model for the motion of Mars to that of Kepler, but the Ptolemaic
residuals are highly systematic, while Kepler's are not.
- Getting this idea into a usable form for a particular kind of data requires
knowing what "structureless noise" means in that context.
For point processes,
"structureless noise" is
a homogeneous Poisson
process, where events occur at a constant rate per unit time, and nothing
ever alters the rate. If you have another sort of point process, and you know
the intensity function, you can use that to transform the original point
process into something that looks just like a homogeneous Poisson process, by
"time-rescaling" --- you stretch out the distance between points when the
intensity is high, and squeeze them together where the intensity is low, to
achieve a constant density of
points. (Details.)
This forms the basis for
a very cute
goodness-of-fit test for point processes, but only in continuous time. As
you may have noticed, actual continuous-time observations are rather
scarce; we almost always have data with a finite time resolution. The usual
tactic has been to hope that the time bins are small enough that we can pretend
our observations are in continuous time, i.e., to ignore the issue. This paper
shows how to make the same trick work in discrete time, with really minimal
modifications. (Disclaimer: Rob is an old friend and frequent
collaborator, and two of the co-authors on the original time-rescaling paper
are senior faculty in my department.)
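As a toy version of the continuous-time trick (the discrete-time correction is the paper's contribution and is not reproduced here): simulate an inhomogeneous Poisson process with a known intensity, rescale each inter-event interval by the integrated intensity, and check that the rescaled intervals look Exponential(1). The functions and numbers below are mine, purely for illustration.

```python
# Continuous-time time-rescaling check: if lambda(t) is the true intensity,
# the rescaled intervals tau_k = integral of lambda between successive
# events are i.i.d. Exponential(1).
import numpy as np
from scipy import stats
from scipy.integrate import quad

rng = np.random.default_rng(0)

def intensity(t):
    """A known, smoothly varying intensity (events per unit time)."""
    return 5.0 + 4.0 * np.sin(2 * np.pi * t / 10.0)

# Simulate an inhomogeneous Poisson process on [0, 100] by thinning.
T_max, lam_max = 100.0, 9.0
candidates = np.cumsum(rng.exponential(1.0 / lam_max, size=2000))
candidates = candidates[candidates < T_max]
keep = rng.uniform(size=candidates.size) < intensity(candidates) / lam_max
events = candidates[keep]

# Time-rescale: integrate the intensity between successive events.
rescaled = np.array([quad(intensity, a, b)[0]
                     for a, b in zip(np.r_[0.0, events[:-1]], events)])

# Under the correct intensity the rescaled intervals are Exponential(1);
# a KS test (or a Q-Q plot) turns this into a goodness-of-fit check.
print(stats.kstest(rescaled, "expon"))
```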
And now, back to work.
Manual trackback: Beyond Microfoundations
Linkage;
Enigmas of Chance;
The Dismal Science;
Minds, Brains, and
Neurons;
Physics;
Networks;
Commit a Social Science;
Incestuous Amplification
Posted at September 04, 2010 11:05 | permanent link