Machine Learning, Statistical Inference and Induction

15 Aug 2023 09:15

There's a place where AI, statistics and epistemology-methodology converge, or want to anyhow. "Machine learning" is the AI label: how do we make a machine that can find and learn the regularities in a data set? (If the data set is really, really big, and we care mostly about making practically valuable predictions, this becomes data mining, or "knowledge discovery in databases," KDD.) The statisticians ask very similar questions about model-fitting and hypothesis-testing. The epistemologists are mired in the problem of induction, and "inference to the best explanation" (a phrase, I am told by Kenny Easwaran, coined by Gilbert Harman; link below). The fields overlap in the most crazy-quilt and arbitrary way: I've heard university librarians arguing over whether specific books should go to the engineering or the philosophy library, for instance.

The connection to neuroscience and cognitive science is plain: how on Earth do human beings, and other critters, actually learn? Given that there are many different strategies, which ones do organisms use, and why, and are they good ones? (It's entirely possible that we've gotten locked in to inefficient learning strategies; then the question becomes whether or not they can be improved.) Studying learning by organisms lets us test theories of learning-in-the-abstract, and vice versa: if we had, say, a good proof that a certain learning scheme simply would not work, we'd know that animals don't use it.

One fairly strong result seems to be that tabulae rasae don't work: you've got to give the machine/baby/scientist some hints, or restrict the field of possible hypotheses initially, or you'll never get anywhere. This was at least implicit in Hume, and I believe the other classical empiricists as well, but they don't seem to have been restrictive enough to account for the way we actually do learn. Natural selection is the obvious candidate for having restricted our hypothesis-set, and for having designed our learning mechanisms.
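A toy illustration of why the blank slate gets nowhere, under assumptions that are entirely mine and not drawn from any particular result: three binary inputs, a hypothetical "true" rule that just copies the first bit, five labeled examples, and a restricted hypothesis class consisting of the single-bit rules and their negations. The sketch counts, for each unseen input, how many hypotheses consistent with the data predict 1.

```python
from itertools import product

# All 8 possible inputs over three binary features.
inputs = list(product([0, 1], repeat=3))

# Hypothetical "true" rule, chosen only for illustration: copy the first bit.
target = {x: x[0] for x in inputs}

# See labels for the first five inputs; the remaining three are unseen.
train = {x: target[x] for x in inputs[:5]}
unseen = inputs[5:]

def consistent_votes(hypotheses, train, unseen):
    """Keep the hypotheses that fit the training data, then count,
    for each unseen input, how many of them predict 1."""
    consistent = [h for h in hypotheses
                  if all(h[x] == y for x, y in train.items())]
    votes = {x: sum(h[x] for h in consistent) for x in unseen}
    return len(consistent), votes

# Blank slate: every one of the 2**8 = 256 Boolean functions is allowed.
all_functions = [dict(zip(inputs, outputs))
                 for outputs in product([0, 1], repeat=len(inputs))]

# Restricted class (an assumed prior "hint"): output equals one input bit,
# or the negation of one input bit. Six hypotheses in total.
single_bit = [{x: x[i] ^ flip for x in inputs}
              for i in range(3) for flip in (0, 1)]

n_all, votes_all = consistent_votes(all_functions, train, unseen)
n_restricted, votes_restricted = consistent_votes(single_bit, train, unseen)

print("unrestricted:", n_all, "consistent hypotheses; votes for 1:", votes_all)
print("restricted:  ", n_restricted, "consistent hypotheses; votes for 1:", votes_restricted)
```

With no restriction, eight hypotheses survive the data and they split evenly, four to four, on every unseen input, so the data say nothing about what comes next. With the restricted class, exactly one hypothesis survives and it commits to a prediction everywhere. Of course that only helps if the restriction happens to be a good one, which is the whole problem.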

My positivist temperament can hardly help being pleased by this "attempt to introduce the experimental method of reasoning into moral subjects," which, as data mining, has massive industrial applications. My real interest in this isn't, for once, philosophical. Instead, I want to be able to quantify, or at the very least characterize, self-organization, which means I need a good way of automatically finding patterns or regularities in data sets. For someone who's got the computational mechanics gospel, this means "inferring statistical complexity," and that means the automated construction of abstract-machine or formal-language models of data sets. (Alternatively: figuring out how natural things compute.) And doing that well means addressing all the issues people in these areas address, so I figure I ought to just steal from them.

See also: Causality; collective cognition; clustering; conformal prediction; ensemble methods; grammatical inference; graphical models; learning in games; learning theory; the minimum description length principle; model selection; neural nets; Occam's razor; recommender systems and collaborative filtering; scientific thinking; sequential decision-making; statistics with structured data; time series; universal prediction algorithms. All of these now get their own notebooks; other topics also need to be spun off from this one.