Attention conservation notice: I have no taste.
This monograph addresses the problem of "real-time" curve fitting in the presence of noise, from the computational and statistical viewpoints. Specifically, we examine the problem of nonlinear regression where observations $ \{Y_n: n= 1, 2, \ldots \} $ are made on a time series whose mean-value function $ \{ F_n(\theta) \} $ is known except for a finite number of parameters $ (\theta_1, \theta_2, \ldots, \theta_p) = \theta^\prime $. We want to estimate this parameter. In contrast to the traditional formulation, we imagine the data arriving in temporal succession. We require that the estimation be carried out in real time so that, at each instant, the parameter estimate fully reflects all of the currently available data. The conventional methods of least-squares and maximum-likelihood estimation ... are inapplicable [because] ... the systems of normal equations that must be solved ... are generally so complex that it is impractical to try to solve them again and again as each new datum arrives.... Consequently, we are led to consider estimators of the "differential correction" type ... defined recursively. The $ (n+1) $st estimate (based on the first $ n $ observations) is defined in terms of the $ n $th by an equation of the form \[ t_{n+1} = t_n + a_n[Y_n - F_n(t_n)] \] where $ a_n $ is a suitably chosen sequence of "smoothing" vectors.
Accordingly, most of the book is about coming up with ways of designing the $ a_n $ to ensure consistency, i.e., $ t_n \rightarrow \theta $ (in some sense), especially $ a_n $ sequences which are themselves very fast to compute. (It's not all time series, though: section 7.8 sketches applying the idea to experiments and estimating response surfaces.)
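To make the recursion concrete, here is a minimal sketch of such an estimator (mine, not the authors'): a scalar nonlinear model with the gradient-type gain $ a_n = (c/n) F_n^\prime(t_n) $, the classic Robbins-Monro choice, plus projection of $ t_n $ onto an interval where $ \theta $ is assumed to lie, which keeps the early, large-gain steps from running away. The model, the gain constant, and every number below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative setup (not from the book): scalar parameter theta,
# mean-value function F_n(theta) = exp(-theta * x_n), additive noise.
theta_true = 0.7
N = 10_000
x = rng.uniform(0.0, 3.0, size=N)
Y = np.exp(-theta_true * x) + 0.1 * rng.standard_normal(N)

def F(theta, xn):            # mean-value function
    return np.exp(-theta * xn)

def dF(theta, xn):           # derivative of F in theta
    return -xn * np.exp(-theta * xn)

# Recursive estimator t_{n+1} = t_n + a_n [Y_n - F_n(t_n)], with the
# gradient-type gain a_n = (c/n) * F_n'(t_n): a Robbins-Monro /
# stochastic-gradient choice, so sum a_n = infinity, sum a_n^2 < infinity.
# Projecting t back onto [0.1, 3.0] (assuming theta is known to lie there)
# is a standard trick to stabilize the first few, large-gain updates.
t, c = 1.5, 5.0
for n in range(1, N + 1):
    a_n = (c / n) * dF(t, x[n - 1])
    t = t + a_n * (Y[n - 1] - F(t, x[n - 1]))
    t = min(max(t, 0.1), 3.0)                 # projection step

print(f"theta = {theta_true}, recursive estimate after {N} points: {t:.3f}")
```

Each update touches only the newest observation, so the cost per datum is constant, which is the whole point of the "real-time" formulation.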
Update, next day: added a link to Simon's comment on "continuity of approximation", and deleted an excessive "very". 4 September: replaced Simon link with one which should work outside CMU, fixed an embarrassing typo.
Books to Read While the Algae Grow in Your Fur; Scientifiction and Fantastica; Enigmas of Chance; Writing for Antiquity; Tales of Our Ancestors; Philosophy; The Dismal Science; Physics; Networks; Pleasures of Detection, Portraits of Crime; The Beloved Republic; Afghanistan and Central Asia
Posted at August 31, 2015 23:59 | permanent link
For the first time, I will be teaching a section of the course which is the prerequisite for my spring advanced data analysis class. This is an introduction to linear regression modeling for our third-year undergrads, and others from related majors; my section currently has eighty students. Course materials, if you have some perverse desire to read them, will be posted on the class homepage twice a week.
This course is the first one in our undergraduate sequence where the students have to bring together probability, statistical theory, and analysis of actual data. I have mixed feelings about doing this through linear models. On the one hand, my experience of applied problems is that there are really very few situations where the "usual" linear model assumptions can be maintained in good conscience. On the other hand, I suspect it is usually easier to teach people the more general ideas if they've thoroughly learned a concrete special case first; and, perhaps more importantly, whatever the merits of (e.g.) Box-Cox transformations might actually be, it's the sort of thing people will expect statistics majors to know...
Addendum, later that night: I should have made it clear in the first place that my syllabus is, up through the second exam, ripped off borrowed with gratitude from Rebecca Nugent, who has taught 401 outstandingly for many years.
Update, since people have asked for it, links here (see the course page for the source files for lectures):
Posted at August 31, 2015 13:52 | permanent link
Attention conservation notice: Facile moral philosophy, loosely tied to experimental sociology.
Via I forget who, Darius Kazemi explaining "How I Won the Lottery". The whole thing absolutely must be watched from beginning to end.
Kazemi is, of course, absolutely correct in every particular. What he says in his talk about art goes also for science and scholarship. Effort, ability, networking — these can, maybe, get you more tickets. But success is, ultimately, chance.
I say this not just because it resonates with my personal experience, but because of actual experimental evidence. In a series of very ingenious experiments, Matthew Salganik, Peter Dodds and Duncan Watts have constructed "artificial cultural markets" — music download sites where they could manipulate how (if at all) previous consumers' choices fed into the choices of those who came later. In one setting, for example, people saw songs listed in order of decreasing popularity, but each visitor to the website was randomly assigned to one of a number of sub-populations, and saw popularity only within that sub-population. Simplifying somewhat (read the papers!), what Salganik et al. showed is that while there is some correlation in popularity across the different experimental sub-populations, it is quite weak. Moreover, as in the real world, the distribution of popularity is ridiculously heavy-tailed (and skewed to the right): the same song can end up dominating the charts or just scraping by, depending entirely on accidents of chance (or experimental design).
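To see how little it takes to get this, here is a toy cumulative-advantage simulation. It is emphatically not Salganik et al.'s actual design, just a sketch of the mechanism, and every number in it is invented: identical listeners in several isolated "worlds" pick songs with probability proportional to intrinsic appeal times current popularity.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy cumulative-advantage simulation -- NOT the Salganik et al. design;
# all the numbers here are invented for illustration.
n_worlds, n_songs, n_listeners = 8, 50, 5000
appeal = rng.uniform(0.5, 1.5, size=n_songs)  # modest "real" quality differences

downloads = np.ones((n_worlds, n_songs))      # start every song with one download
for w in range(n_worlds):
    for _ in range(n_listeners):
        # choice probability = intrinsic appeal x current popularity in this world
        weights = appeal * downloads[w]
        song = rng.choice(n_songs, p=weights / weights.sum())
        downloads[w, song] += 1

# Same songs, isolated worlds: how similar do the final charts look?
ranks = downloads.argsort(axis=1).argsort(axis=1)   # ordinal popularity ranks
rho = np.corrcoef(ranks[0], ranks[1])[0, 1]         # Spearman correlation, worlds 0 vs. 1
share = downloads.max(axis=1) / downloads.sum(axis=1)
print(f"rank correlation of popularity across two worlds: {rho:.2f}")
print("top song's market share in each world:", np.round(share, 2))
```

Run it a few times: the charts disagree from world to world, and a handful of songs soak up a wildly disproportionate share of downloads, even though the underlying differences in "appeal" are mild.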
In other words: lottery tickets.
If one has been successful, it is very tempting to think that one deserves it, that this is somehow reward for merit, that one is somehow better than those who did not succeed and were not rewarded. The moral to take from Kazemi, and from Salganik et al., is that while those who have won the lottery are more likely to have done something to get multiple tickets than those who haven't, they are intrinsically no better than many losers. How, then, those who find themselves holding winning tickets should act is another matter, but at the least they oughtn't to delude themselves about the source of their good fortune.
Posted at August 04, 2015 23:11 | permanent link