Notebooks

Grammatical Inference

Last update: 07 Dec 2024 23:58
First version: 29 September 2001; major expansion, 6 April 2009

Meaing: inferring the rules of a formal language (its grammar) from examples, positive or negative. I'm mostly interested in the positive case, since I want to describe physical processes as though they were formal languages or (what is equivalent) automata.

--- A result which has recently come to vex me is the fact that even finite automata are, in the computational learning theory sense, hard to learn. It's widely believed, but not proved, that common cryptographic systems are hard to crack, meaning that there are no efficient, polynomial-time algorithms for them (unless you have the key, of course...). It turns out that the ability to learn polynomially-sized deterministic finite automata in polynomial time would imply the ability to defeat RSA in polynomial time, which is a pretty good indication that it can't be done. (See Kearns and Vazirani for a very nice discussion.) Initial results were about uniform distributions over words, but it turns out further that this holds even when the distribution of words is generated by a stochastic automaton of the same form.

This is extremely annoying to me, because I want to learn stochastic automata from time series, and to do so in polynomial time (if not better). One possibility is that the time-complexity is just bad, in the worst case, and there is nothing to be done about this. (I fear this might be true.) This would not, however, address the scaling of prediction error and confidence with sample size, which is really what interests me. The other possibility is that I am interested in a rather different set-up than this, one where it's crucial that, as $n$ grows, we see the continuation of a single sample path from a fixed automaton. (In the language, every allowed word is the prefix of infinitely many other allowed words.) Or: the experiment can go on forever. (In fact I'm often interested in the situation where the experiment could have been running forever, so every word is the suffix of infinitely many words.) I think, though I am not sure, that the ability to infer these languages quickly would not cause cryptographic horrors, because I think this restriction breaks the crypto-to-DFA mapping.


Notebooks: