Cosma's Notebooks: Transducers
http://bactra.org/notebooks/2004/03/03#transducers
The basic idea of a transducer is that it turns one sort of quantity, its
inputs, into another, its outputs. The general case to consider is one where
both inputs and outputs are time series, and the current output is a function
not just of the current input but of the whole history of previous inputs (so
that the output is a "functional" of the input series). One way of representing
this is to say that the transducer itself has internal or hidden states. The
output is a function of the state of the transducer, and that hidden state is
in turn a function of the input history. We don't <em>have</em> to do things
this way, but it can potentially give us a better handle on the structure of
the transduction process, and especially on the way the transducer stores and
processes information --- what (to anthropomorphize a little) it cares about in
the input and remembers about the past.
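<P>The state picture can be cashed out in a few lines of code. This is a toy sketch, not anyone's actual formalism; the class and the exponentially-weighted-average example are purely illustrative:

```python
# Minimal sketch of a transducer in state form: the hidden state
# condenses the input history, and the output is read off the state.
class Transducer:
    def __init__(self, initial_state, update, readout):
        self.state = initial_state
        self.update = update      # (state, input) -> next state
        self.readout = readout    # state -> output

    def step(self, x):
        self.state = self.update(self.state, x)
        return self.readout(self.state)

# Illustrative example: an exponentially weighted moving average.
# A single real-valued state summarizes the entire input history.
ewma = Transducer(0.0,
                  update=lambda s, x: 0.9 * s + 0.1 * x,
                  readout=lambda s: s)
outputs = [ewma.step(x) for x in [1.0, 1.0, 1.0]]
# outputs climb toward 1.0: 0.1, 0.19, 0.271, ...
```

The point of the representation is visible even here: everything the transducer "remembers" about the past is whatever survives in the state variable.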
<P><em>Finite-state transducers</em> are a computer science idea; computer
scientists also call them "sequential machines," though I don't see why that
name wouldn't apply equally well to many other things they study. In this
case both the input and the
output series consist of sequences of symbols from (possibly distinct) finite
alphabets. Moreover there are only a finite number of internal states. The
two main formalisms (which can be shown to be equivalent) differ only on
whether the next output is a function of the current state alone, or is a
function of the current state and the next input. While you can describe a
huge chunk of electronic technology using this formalism, CS textbooks are
curiously reluctant to give it much attention.
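<P>The two formalisms are usually called Moore machines (output a function of the current state alone) and Mealy machines (output a function of the current state and the next input). A hedged sketch of both on the same toy task, computing the running parity of a bit stream; the task and all the names are illustrative, not taken from any particular textbook:

```python
# Moore machine: emit an output determined by the state just entered.
def run_moore(state_out, trans, q0, inputs):
    q, out = q0, []
    for x in inputs:
        q = trans[(q, x)]
        out.append(state_out[q])
    return out

# Mealy machine: emit an output determined by the (state, input) pair.
def run_mealy(emit, trans, q0, inputs):
    q, out = q0, []
    for x in inputs:
        out.append(emit[(q, x)])
        q = trans[(q, x)]
    return out

# Running parity of the bits seen so far; two states suffice.
trans = {("even", 0): "even", ("even", 1): "odd",
         ("odd", 0): "odd", ("odd", 1): "even"}
emit = {(q, x): 1 if trans[(q, x)] == "odd" else 0 for (q, x) in trans}

moore_out = run_moore({"even": 0, "odd": 1}, trans, "even", [1, 1, 0, 1])
mealy_out = run_mealy(emit, trans, "even", [1, 1, 0, 1])
# Both give the parity sequence [1, 0, 0, 1].
```

The equivalence of the two formalisms shows up concretely: the Mealy emission table was built directly from the Moore machine's transitions, and the output sequences agree.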
<P>Part of my <a href="../thesis/">dissertation</a> involved working
the <a href="computational-mechanics.html">computational-mechanics</a> voodoo
on transducers. Just as in the case of time series, one can construct optimal,
minimal models of the transduction process from joint data on the inputs and
outputs. These models tell us directly about the causal/computational
structure of the process, admittedly in a somewhat abstract form; they are
also, statistically speaking,
minimal <a href="sufficient-statistics.html">sufficient statistics</a> which
can be recursively estimated. I fondly imagine that there are big chunks of
biology these things could help us understand, such
as <a href="neural-coding.html">neural coding</a>
and <a href="signal-transduction.html">cellular signal transduction</a>.
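<P>The memoryless special case makes the construction easy to show concretely: the optimal predictive states are just the classes of inputs that induce the same conditional distribution over outputs. A sketch, with a made-up noisy channel standing in for the transducer:

```python
# Hedged illustration of the optimal model for *memoryless*
# transduction: partition inputs by their conditional output
# distribution P(output | input).  Inputs in the same cell are
# predictively equivalent, and the partition is a minimal
# sufficient statistic for the output.  The channel is invented
# purely for illustration.
from collections import defaultdict

channel = {  # P(output | input) for a toy noisy channel
    "a": {0: 0.9, 1: 0.1},
    "b": {0: 0.9, 1: 0.1},   # same distribution as "a"
    "c": {0: 0.2, 1: 0.8},
}

cells = defaultdict(list)
for x, dist in channel.items():
    signature = tuple(sorted(dist.items()))  # hashable fingerprint of P(.|x)
    cells[signature].append(x)

causal_states = sorted(sorted(cell) for cell in cells.values())
# "a" and "b" collapse into one state: [['a', 'b'], ['c']]
```

With memory, the same merging idea applies, but to conditional distributions over future outputs given joint input-output histories rather than single inputs, which is what makes the general construction harder.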
<P><em>Things to figure out:</em> What is the work on inferring internal states
(in continuous systems) from inputs? From inputs and outputs? Applications
of <a href="grammatical-inference.html">grammatical inference</a>.
<P>See also:
<a href="control.html">Control Theory</a>
<ul>Recommended (needs organizing/subdivision):
<li>John Carroll and Darrell Long, <cite>Theory of Finite
Automata</cite> [ch. 7 discusses transducers]
<li>J. H. Conway, <cite><a href="http://bactra.org/weblog/algae-2016-02.html#conway">Regular Algebra and Finite Machines</a></cite>
<li>J. Hartmanis and R. E. Stearns, <cite>Algebraic Structure Theory of
Sequential Machines</cite>
<li>Fred Rieke, David Warland, Rob de Ruyter van Steveninck, and
William Bialek, <cite>Spikes: Exploring the Neural Code</cite> [<a
href="../reviews/spikes/">Review: Cells That Go <em>ping,</em> or, The Value of
the Three-Bit Spike</a>]
<li>Wesley Salmon, <cite>Scientific Explanation and the Causal
Structure of the World</cite> [Salmon's "statistical relevance basis" is
essentially the optimal model for memoryless transduction, though he doesn't
put it like that at all]
<li>Yoram Singer, "Adaptive Mixtures of Probabilistic Transducers", <a
href="http://neco.mitpress.org/cgi/content/abstract/9/8/1711"><cite>Neural
Computation</cite> <strong>9</strong> (1997): 1711--1733</a>
<li>Naftali Tishby, Fernando C. Pereira and William Bialek, "The
Information Bottleneck Method," <a
href="http://arxiv.org/abs/physics/0004057">physics/0004057</a>
<li><a href="wiener.html">Norbert Wiener</a>, <cite>Nonlinear Problems
in Random Theory</cite> [Technique for estimating transducer functionals which
does not involve hidden states. Mathematically demanding but rewarding; see
<cite>Spikes</cite> for a simplified presentation.]
<li>Wei Biao Wu, "Nonlinear system theory: Another look at
dependence", <a
href="http://dx.doi.org/10.1073/pnas.0506715102"><cite>Proceedings of the
National Academy of Sciences</cite> <strong>102</strong> (2005):
14150--14154</a> ["we introduce previously undescribed dependence measures for
stationary causal processes. Our physical and predictive dependence measures
quantify the degree of dependence of outputs on inputs in physical systems. The
proposed dependence measures provide a natural framework for a limit theory for
stationary processes. In particular, under conditions with quite simple forms,
we present limit theorems for partial sums, empirical processes, and kernel
density estimates. The conditions are mild and easily verifiable because they
are directly related to the data-generating mechanisms."]
</ul>
<ul>Modesty forbids me to recommend:
<li>CRS and James P. Crutchfield, "Information Bottlenecks, Causal States, and Statistical Relevance Bases: How to Represent Relevant Information in Memoryless Transduction", <a href="https://doi.org/10.1142/S0219525902000481"><cite>Advances in Complex Systems</cite> <strong>5</strong> (2002): 91--95</a>, <a href="http://arxiv.org/abs/nlin/0006025">arxiv:nlin/0006025</a>
</ul>
<ul>To read:
<li>C. Lee Giles, B. G. Horne and T. Lin, "Learning a Class of
Large Finite State Machines with a Recurrent Neural Network,"
Technical Report UMIACS-TR-94-94, Institute for Advanced Computer Studies,
University of Maryland-College Park (on-line through NCSTRL)
<li>André Kempe, "Look-Back and Look-Ahead in the Conversion of
Hidden Markov Models into Finite State Transducers" <a
href="http://arxiv.org/abs/cmp-lg/9802001">cmp-lg/9802001</a>
<li>Wolfgang Maass and Eduardo D. Sontag, "Neural Systems as Nonlinear
Filters," <cite>Neural Computation</cite> <strong>12</strong> (2000):
1743--1772
<li>Mehryar Mohri, "Compact Representations by Finite-State
Transducers," <a href="http://arxiv.org/abs/cmp-lg/9407003">cmp-lg/9407003</a>
<li>Christian W. Omlin and C. Lee Giles, "Extraction of Rules
from Discrete-Time Recurrent Neural Networks," Technical Report
UMIACS-TR-95-94, Institute for Advanced Computer Studies,
University of Maryland-College Park (on-line through NCSTRL)
<li>M. Pawlak, Z. Hasiewicz and P. Wachel, "On Nonparametric
Identification of Wiener
Systems", <a href="http://dx.doi.org/10.1109/TSP.2006.885684"><cite>IEEE
Transactions on Signal Processing</cite> <strong>55</strong> (2007):
482--492</a>
<li>David Pico and Francisco Casacuberta, "Some Statistical-Estimation
Methods for Stochastic Finite-State Transducers", <a href="http://dx.doi.org/10.1023/A:1010880113956"><cite>Machine
Learning</cite> <strong>44</strong> (2001): 121--141</a>
<li>Martin Schetzen, <cite>The Volterra and Wiener Theories of
Nonlinear Systems</cite>
<li>Wojciech Skut, "Incremental Construction of Minimal Acyclic
Sequential Transducers from Unsorted Data", <a
href="http://arxiv.org/abs/cs.CL/0408026">cs.CL/0408026</a>
<li>Peter N. Yianilos, <cite>Topics in Computational Hidden State
Modeling,</cite> Ph.D. Dissertation, CS Dept., Princeton 1997 (on-line
through NCSTRL)
</ul>