CSSR has been implemented and successfully tested on a range of different discrete time series and sequential data streams. This page provides the full source code, released under the GNU Public License, documentation, and links to scientific papers using the CSSR algorithm.

The theory behind the algorithm was developed by Cosma Shalizi and Kristina Klinkner, under the sponsorship of James Crutchfield. The code for this implementation was written by Kristina Klinkner. For full credits, see the comments to the code, and the paper "Blind Construction of Optimal Nonlinear Recursive Predictors for Discrete Sequences".

CSSR-v0.1.1.tar.gz (268k) is the complete
C++ source code for CSSR versions 0.1.1, plus documentation and license. The
last update was on **2 May 2008**.

As mentioned, CSSR is released under the GNU Public License. You may want to read the Copyright, license and warranty statement (plain text, 16k) before downloading. (A copy is included in the distribution.) Please note especially that CSSR is provided with ABSOLUTELY NO WARRANTY.

The ReadMe file is available in gzipped
Postscript, PDF,
and HTML. (All these versions, plus plain ASCII, are
included in the distribution.) The documentation was last updated on
**2 May 2008**. The ReadMe gives a brief overview of what
CSSR does and how it works, describes usage, offers some hints on parameters,
and explains the known issues with this implementation of the basic
algorithm.

There is a thorough explanation of the theory behind CSSR in the paper "Blind Construction of Optimal Nonlinear Recursive Predictors for Discrete Sequences" (Cosma Rohilla Shalizi and Kristina Lisa Shalizi, pp. 504--511 in Max Chickering and Joseph Halpern (eds.), Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (Arlington, Virginia: AUAI Press, 2004 = arxiv:cs.LG/0406011). For reasons of space, that paper omits certain details, which can be found in the technical report "An Algorithm for Pattern Discovery in Time Series" (Cosma Rohilla Shalizi, Kristina Lisa Shalizi and James P. Crutchfield, Santa Fe Institute Working Paper 02-10-060 = arxiv:cs.LG/0210025).

The following instructions are for Unix-based systems, and are included in the ReadMe file.

Download CSSR-v0.1.1.tar.gz. When gunzipped and untarred, this will produce a directory called CSSR-v0.1.1, containing all the necessary header and source code files, a copy of the documentation, the release license, and a make file. Running make inside that directory will produce an executable, which should be moved to someplace in command path. On most Unix systems, the following sequence of commands will create the executable and put it in the your bin directory, usually part of your command path.

gunzip CSSR-v0.1.1.tar.gz

tar xvf CSSR-v0.1.1.tar

cd CSSR-v0.1.1

make

cp CSSR ~/bin/

The code has been successfully compiled with gcc 3.1, 3.3 and 4.0 on Macintosh OS X (10.2, 10.3 and 10.4, respectively), with gcc 3.2 on Linux (Red Hat 9), and with Microsoft Visual C++ on Windows 98 and Windows XP. (If using Visual C++, be sure to set the project type to Windows Console Application.) If you get CSSR to work on a different system, please let us know.

On some systems, compilation may produce warnings about escape sequences or the use of deprecated headers. These can be safely ignored.

As mentioned, the principal paper on CSSR is "Blind Construction of Optimal Nonlinear Recursive Predictors for Discrete Sequences", arxiv:cs.LG/0406011. This describes the algorithm and its theoretical basis in detail, gives results on time complexity and on the convergence on the reconstructed model, and compares its performance to other approaches, including the EM algorithm and cross-validation. Some details omitted from the "Blind Construction" paper can be found in the preliminary technical report, "An Algorithm for Pattern Discovery in Time Series". The latest version at arxiv.org, arxiv:cs.LG/0210025, supersedes all earlier drafts.

**If you use CSSR in a scientific publication**, please cite
the "Blind Construction" paper. A sample BibTeX entry:

@inproceedings{CSSR-UAI-2004,

author = "Cosma Rohilla Shalizi and Kristina Lisa Klinkner",

title = "Blind Construction of Optimal Nonlinear Recursive Predictors for Discrete Sequences",

editor = "Max Chickering and Joseph Y. Halpern",

booktitle = "Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI 2004)",

publisher = "AUAI Press",

address = "Arlington, Virginia",

year = 2004,

pages = "504--511",

url = "http://arxiv.org/abs/cs.LG/0406011"}

Alphabetical by author.

- Fabio Boschetti
- "Mapping the complexity of ecological models", Ecological Complexity forthcoming (2007) [PDF]
- Nicolas Brodu
- "Quantifying the Effect of Learning on Recurrent Spiking Neurons", forthcoming in International Joint Conference on Neural Networks [IJCNN] 2007 [PDF]
- Jean-Philippe Cointet, Emmanuel Faure and Camille Roth
- "Intertemporal topic correlations in online media", in Proceedings of the International Conference on Weblogs and Social Media [ICWSM] (Boulder, CO, USA: 2007) [PDF]
- Davis S. Friedlander, Isanu Chattopadhayay, Asok Ray, Shashi Phoha and Noah Jacobson
- "Anomaly Prediction in Mechanical System Using Symbolic Dynamics", in Proceedings of the 2003 American Control Conference, Denver, CO, 4--6 June 2003 (2003)
- David S. Friedlander, Shashi Phoha and Richard Brooks
- "Determination of Vehicle Behavior based on Distributed Sensor Network Data", in Franklin T. Luk (ed.), Advanced Signal Processing Algorithms, Architectures, and Implementations XIII (Bellingham, WA: SPIE, 2003)
- Kristina Lisa Klinkner, Cosma Rohilla Shalizi and Marcelo F. Camperi
- "Measuring Shared Information and Coordinated Activity in Neuronal Networks", pp. 667--674 in Yair Weiss, Bernhard Sch¨lkopf and John C. Platt (eds.), Advances in Neural Information Processing Systems 18 [NIPS 2005] (Cambridge, MA: MIT Press, 2006) = arxiv:q-bio.NC/0506009
- Dmitry Nerukh, George Karvounis and Robert C. Glen
- "Complexity of classical dynamics of molecular systems.
I. Methodology", Journal of Chemical Physics
**117**(2002): 9611--9617 [PDF] - "Complexity of classical dynamics of molecular systems. II. Finite
statistical complexity of a water--Na+ system", Journal of Chemical
Physics
**117**(2002): 9618--9622 [PDF] - "Quantifying the Complexity of Chaos in Multibasin Multidimensional
Dynamics of Molecular Systems", Complexity
**10:2**(2004): 40--46 [PDF] - "Dynamic Complexity of Chaotic Transitions in High-Dimensional Classical Dynamics: Leu-Enkephalin Folding", in Computational Life Sciences II, pp. 129--140 (Berlin: Springer Verlag, 2006) [PDF]
- Muntsa Padró and Lluís Padró
- "Applying a Finite Automata Acquisition Algorithm to Named Entity Recognition", in Proceedings of 5th International Workshop on Finite-State Methods and Natural Language Processing [FSMNLP'05] (2005) [PDF]
- "A Named Entity Recognition System Based on a Finite Automata Acquisition Algorithm", Procesamiento del Lenguaje Natural
**35**(2005): 319--326 [PDF] - "Approaching Sequential NLP Tasks with an Automata Acquisition Algorithm", in Proceedings of International Conference on Recent Advances in NLP [RANLP'05] (2005) [PDF]
- "ME-CSSR: an Extension of CSSR using Maximum Entropy Models", in Proceedings of Finite State Methods for Natural Language Processing (FSMNLP) 2007 [PDF]
- "Studying CSSR Algorithm Applicability on NLP Tasks", Procesamiento del Lenguaje Natural
**39**(2007): 89--96 [PDF] - Joongwoo Brian Park, Jeong Won Lee, Jae-Suk Yang, Hang-Hyun Jo and Hie-Tae Moon
- "Complexity analysis of the stock market", Physica A
**379**(2007): 179--187 = arxiv:physics/0607283 - Asok Ray
- "Symbolic dynamic analysis of complex systems for anomaly
detection", Signal Processing
**84**(2004): 1115--1130 - Cosma Rohilla Shalizi, Marcelo F. Camperi and Kristina Lisa Klinkner
- "Discovering Functional Communities in Dynamical Networks", pp. 140--157 in Edo Airoldi et al. (eds.), Statistical Network Analysis: Models, Issues and New Directions [proceedings of an ICML 2006 workshop] (New York: Springer-Verlag, 2007) = arxiv:q-bio.NC/0609008
- Dowman P. Varn and James P. Crutchfield
- "From Finite to Infinite Range Order via Annealing: The Causal Architecture
of Deformation Faulting in Annealed Close-Packed Crystals", Physics
Letters A
**324**(2004): 299--307 = arxiv:cond-mat/0307296

Please write us to let us know if you use CSSR in an application (especially
a paper), compile it on a new platform, modify it, wish to be kept informed of
future developments, or have a bug or other problem to report. However, before
sending bug reports, please read the documentation carefully, and if possible
the paper too, so that you're sure what you're experiencing *is* a bug.
Since CSSR is released under the Gnu Public License, feel free to write your
own fix for the bug, and tell us about it!

Created 7 August 2003 by CRS