Notebooks

## Transducers

03 Mar 2004 16:20

The basic idea of a transducer is that it turns one sort of quantity, its inputs, into another, its outputs. The general case to consider is one where both inputs and outputs are time-series, and the current output is a function not just of the current input but of the whole history of previous inputs (that is, the output is a "functional" of the input series). One way of representing this is to say that the transducer itself has internal or hidden states. The output is a function of the state of the transducer, and that hidden state is in turn a function of the input history. We don't have to do things this way, but it can potentially give us a better handle on the structure of the transduction process, and especially on the way the transducer stores and processes information --- what (to anthropomorphize a little) it cares about in the input and remembers about the past.
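To make the two representations concrete, here is a toy sketch (my own illustration, not anything from the literature): the running parity of a bit stream, written once as a functional of the whole input history and once with a one-bit hidden state that summarizes everything the functional needs.

```python
def parity_from_history(inputs):
    # Output at time t is a functional of the entire history x_1..x_t.
    return [sum(inputs[:t + 1]) % 2 for t in range(len(inputs))]

def parity_from_state(inputs):
    # Same transducer, hidden-state form: the state is a function of the
    # input history, and the output is a function of the state.
    state, out = 0, []
    for x in inputs:
        state = (state + x) % 2
        out.append(state)
    return out

# The two views agree on every input sequence.
assert parity_from_history([1, 0, 1, 1]) == parity_from_state([1, 0, 1, 1]) == [1, 1, 0, 1]
```

The point of the second version is exactly the "better handle" mentioned above: the one-bit state makes it obvious that this transducer remembers nothing about the past except the parity.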

Finite-state transducers are a computer science idea; computer scientists also call them "sequential machines," though I don't see why that name wouldn't also apply to many other things they study. In this case both the input and the output series consist of sequences of symbols from (possibly distinct) finite alphabets. Moreover there are only a finite number of internal states. The two main formalisms (which can be shown to be equivalent) differ only in whether the next output is a function of the current state alone (Moore machines), or is a function of the current state and the next input (Mealy machines). While you can describe a huge chunk of electronic technology using this formalism, CS textbooks are curiously reluctant to give it much attention.
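The two formalisms, and their equivalence, can be shown on a toy example (my own, chosen for brevity): an edge detector over bits, which outputs 1 exactly when the input changes. The Mealy version reads the output off state-plus-input; the Moore version folds the "did it change?" flag into an enlarged state, which is the standard Mealy-to-Moore construction.

```python
def mealy_edge(bits, state=0):
    # Mealy machine: output = f(current state, next input).
    # The state just remembers the previous input bit.
    out = []
    for b in bits:
        out.append(1 if b != state else 0)
        state = b  # next state = current input
    return out

def moore_edge(bits, state=(0, 0)):
    # Moore machine: output = f(state) alone, so the state must carry
    # both the last input and the change flag: state = (last input, flag).
    out = []
    for b in bits:
        state = (b, 1 if b != state[0] else 0)  # transition first...
        out.append(state[1])                    # ...then read output off the state
    return out

# Both machines compute the same transduction.
assert mealy_edge([0, 1, 1, 0, 1]) == moore_edge([0, 1, 1, 0, 1]) == [0, 1, 0, 1, 1]
```

Note the price of the Moore form: the state set doubles, since each old state splits according to the output it must emit.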

Part of my dissertation involved working the computational-mechanics voodoo on transducers. Just as in the case of time series, one can construct optimal, minimal models of the transduction process from joint data on the inputs and outputs. These models tell us directly about the causal/computational structure of the process, admittedly in a somewhat abstract form; they are also, statistically speaking, minimal sufficient statistics which can be recursively estimated. I fondly imagine that there are big chunks of biology these things could help us understand, such as neural coding and cellular signal transduction.
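A crude sketch of the core idea, not the actual construction from the dissertation (the function and parameter names here are all hypothetical, and a real reconstruction would use a statistical test rather than a fixed tolerance): estimate the next-output distribution conditional on each length-k input history from joint data, then merge histories whose conditional distributions agree. The resulting groups play the role of the minimal predictive states.

```python
from collections import Counter, defaultdict

def causal_states(inputs, outputs, k=2, tol=0.05):
    """Toy grouping step: cluster length-k input histories by their
    empirical next-output distribution, within tolerance tol."""
    # Empirical counts of P(output_t | last k inputs).
    counts = defaultdict(Counter)
    for t in range(k, len(inputs)):
        hist = tuple(inputs[t - k:t])
        counts[hist][outputs[t]] += 1
    dists = {h: {o: c / sum(cnt.values()) for o, c in cnt.items()}
             for h, cnt in counts.items()}
    # Merge histories whose distributions agree within tol.
    states = []  # each entry: (representative distribution, member histories)
    for h, d in dists.items():
        for rep, members in states:
            keys = set(rep) | set(d)
            if all(abs(rep.get(o, 0) - d.get(o, 0)) <= tol for o in keys):
                members.append(h)
                break
        else:
            states.append((d, [h]))
    return states

# A transducer that just echoes the previous input: with k = 2, the four
# possible histories collapse into two predictive states, one per last symbol.
inputs = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0]
outputs = [0] + inputs[:-1]
assert len(causal_states(inputs, outputs, k=2)) == 2
```

The echo example shows the sense in which these states are minimal: the first bit of each history is predictively irrelevant, so histories differing only in it get merged.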

Things to figure out: What is the work on inferring internal states (in continuous systems) from inputs? From inputs and outputs? Applications of grammatical inference.

Recommended (needs organizing/subdivision):
• John Carroll and Darrell Long, Theory of Finite Automata [ch. 7 discusses transducers]
• J. H. Conway, Regular Algebra and Finite Machines
• J. Hartmanis and R. E. Stearns, Algebraic Structure Theory of Sequential Machines
• Fred Rieke, David Warland, Rob de Ruyter van Steveninck, and William Bialek, Spikes: Exploring the Neural Code [Review: Cells That Go ping, or, The Value of the Three-Bit Spike]
• Wesley Salmon, Scientific Explanation and the Causal Structure of the World [Salmon's "statistical relevance basis" is essentially the optimal model for memoryless transduction, though he doesn't put it like that at all]
• Yoram Singer, "Adaptive Mixtures of Probabilistic Transducers", Neural Computation 9 (1997): 1711--1733
• Naftali Tishby, Fernando C. Pereira and William Bialek, "The Information Bottleneck Method," physics/0004057
• Norbert Wiener, Nonlinear Problems in Random Theory [Technique for estimating transducer functionals which does not involve hidden states. Mathematically demanding but rewarding; see Spikes for a simplified presentation.]
• Wei Biao Wu, "Nonlinear system theory: Another look at dependence", Proceedings of the National Academy of Sciences 102 (2005): 14150--14154 ["we introduce previously undescribed dependence measures for stationary causal processes. Our physical and predictive dependence measures quantify the degree of dependence of outputs on inputs in physical systems. The proposed dependence measures provide a natural framework for a limit theory for stationary processes. In particular, under conditions with quite simple forms, we present limit theorems for partial sums, empirical processes, and kernel density estimates. The conditions are mild and easily verifiable because they are directly related to the data-generating mechanisms."]