I'd known about this book for quite some time, and browsed in it long ago, but never actually read it until this year. It's a really incredible piece of work.
Propp set out to identify the basic elements of the plots of Russian fairy tales, working at a level of abstraction where "it does not matter whether a dragon kidnaps a princess or whether a devil makes off with either a priest's or a peasant's daughter". He came up with 31 such "functions". Just listing them (chapter 3) has a certain folkloric quality:
At this point, Propp observes, the tale can more or less begin over again, with the transition from the first "move" to the second being initiated by a new act of villainy, typically "Ivan's brothers steal his prize, and throw him into a chasm" ($* A$). This leads to $C$-$G$ again.
Each abstract function has, naturally, a great many more concrete sub-types (e.g., seven distinct variants of pursuit, ranging from $Pr^1$, "the pursuer flies after the hero", to $Pr^7$, "He tries to gnaw through the tree in which the hero is taking refuge").
Based on extensive study of the corpus of Russian fairytales, Propp claims that the initial functions, designated by Greek letters, are less essential than the ones designated by Roman letters. In fact, in what I take to be the central finding of the book (ch. IX, sec. D, pp. 104--105), he claims that all the tales in the corpus belong to four, and only four, categories:
As he remarks after making these claims,
To the variable scheme \[ ABC\uparrow DEFG \frac{HJIK\downarrow Pr-Rs o L}{LMJNK\downarrow Pr-Rs} Q Ex TUW \] are subject all the tales of our material: moves with $H-I$ develop according to the upper branch; moves with $M-N$ develop according to the lower branch; moves with both pairs first follow the upper part and then, without coming to an end, develop following the lower offshoot; moves without either $H-I$ or $M-N$ develop by bypassing the distinctive elements of each. [p. 105]
What I find so astonishing here is that this is a formal grammar, though propounded many years before that notion emerged in linguistics, logic and computer science. Specifically, it is a formal grammar which generates fairytale plots. Propp realized this, and used the schema to create new fairy tales [1] (unfortunately, not recorded). A basic principle of formal language theory is that a schema which generates all and only the valid strings of a language can also be used to recognize whether a string belongs to that language; Propp implicitly grasped this, and argued on this basis that some non-fairy-tales in his corpus were more properly classed with the fairy tales.
It's especially noteworthy to me that Propp's schema is a regular grammar, i.e., at the lowest level of the Chomsky hierarchy. These correspond to the regular expressions familiar to programmers, to finite-state machines, and to (functions of) Markov chains. The production rules would be something like \[ \begin{eqnarray*} Story & \rightarrow & Act1 Act2 Act3\\ Act1 & \rightarrow & ABC\uparrow DEFG\\ Act2 &\rightarrow & (Struggle | 0) (Task | 0)\\ Act3 & \rightarrow & Q Ex TUW\\ Struggle & \rightarrow & HJIK\downarrow Pr-Rs o L\\ Task & \rightarrow & LMJNK\downarrow Pr-Rs\\ \end{eqnarray*} \] using $|$ as usual to represent alternatives, and $0$ to represent a null story element. There would, then, have to be further production rules where abstract villainy, pursuit, marking of the hero, etc., are differentiated into their more concrete types.
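To make the correspondence with regular expressions concrete, here is a minimal Python sketch that compiles the coarse production rules into a single regex and uses it as a recognizer. The encoding is an assumption of the sketch, not anything in Propp: plots are written as space-separated symbols, $Pr$-$Rs$ is split into the two tokens `Pr` and `Rs`, and the helper names (`seq`, `optional`, `is_move`) are invented for illustration. The further rules for concrete sub-types are left out.

```python
import re

# Terminal symbols of the coarse grammar, as space-separated tokens.
# (Assumed encoding: "↑" is departure, "↓" is return, Pr-Rs is two tokens.)
ACT1 = "A B C ↑ D E F G"
STRUGGLE = "H J I K ↓ Pr Rs o L"
TASK = "L M J N K ↓ Pr Rs"
ACT3 = "Q Ex T U W"


def seq(symbols: str) -> str:
    """Regex fragment matching the given symbols in order, each followed by a space."""
    return "".join(re.escape(s) + " " for s in symbols.split())


def optional(fragment: str) -> str:
    """Regex fragment for (fragment | 0): the fragment, or nothing at all."""
    return f"(?:{fragment})?"


# Story -> Act1 (Struggle | 0) (Task | 0) Act3
MOVE = re.compile(seq(ACT1) + optional(seq(STRUGGLE)) + optional(seq(TASK)) + seq(ACT3))


def is_move(plot: str) -> bool:
    """True if the space-separated plot is generated by the coarse grammar."""
    normalized = " ".join(plot.split()) + " "  # every token ends in exactly one space
    return MOVE.fullmatch(normalized) is not None
```

Because every token is matched together with its trailing space, `E` cannot be confused with `Ex`. A plot that runs the task branch before the struggle branch is rejected, as the quoted scheme requires.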
In a pure regular grammar, which choice gets made at each application of a production rule is totally independent of the choices made at every other application of a rule. (This is because regular languages are a sub-type of "context free" languages, and is what gives both kinds of language their madlibs flavor.) Propp is at some pains to argue (pp. 109--113) that this is very, very nearly true of fairytales. The exceptions are few enough that they could, I think, be handled within the finite-state, regular-grammar framework, by expanding the set of non-terminal symbols a little.
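The independence of choices can be seen in a small Python sketch. At the coarse level there are only two binary choices (struggle or not, task or not), and because neither constrains the other, the derivations are just their Cartesian product, which gives Propp's four kinds of move. The list encoding and the name `moves` are assumptions of the sketch, and the concrete sub-types of each function are ignored.

```python
from itertools import product

# Symbol sequences of the coarse grammar (assumed encoding: one token per function,
# with Pr-Rs split into "Pr" and "Rs").
ACT1 = ["A", "B", "C", "↑", "D", "E", "F", "G"]
STRUGGLE = ["H", "J", "I", "K", "↓", "Pr", "Rs", "o", "L"]
TASK = ["L", "M", "J", "N", "K", "↓", "Pr", "Rs"]
ACT3 = ["Q", "Ex", "T", "U", "W"]


def moves():
    """Yield ((with_struggle, with_task), symbols) for every derivation."""
    # The two optional branches are chosen independently of each other.
    for with_struggle, with_task in product([True, False], repeat=2):
        middle = (STRUGGLE if with_struggle else []) + (TASK if with_task else [])
        yield (with_struggle, with_task), ACT1 + middle + ACT3


all_moves = dict(moves())

for (with_struggle, with_task), symbols in all_moves.items():
    print(f"struggle={with_struggle!s:5} task={with_task!s:5}", " ".join(symbols))
```

Handling Propp's few exceptions would mean making some later choice depend on an earlier one, which in this style amounts to splitting a branch into several variants, i.e., adding non-terminals.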
To sum up, Propp did grammatical induction on fairytales by hand, in 1928, and came up with a regular language.
Naturally, I have questions.
I am sure that folklorists must have tackled questions like this, and I would very much appreciate pointers to the literature.
[1] Propp, pp. 111--112, says that his conclusions
"may also be verified experimentally", and immediately goes on:
It is possible to artificially create new plots of an unlimited
number. All of these plots will reflect the basic scheme, while they may not
resemble one another. In order to create a tale artificially, one may take any
$A$, then one of the possible $B$'s then a $C\uparrow$, followed by absolutely
any $D$, then an $E$, then one of the possible $F$'s, then any $G$, and so on.
In doing this, any element may be dropped (except possibly for $A$ or $a$), or
repeated three times, or repeated in various forms. If one then distributes
functions according to the dramatis personae of the tale's supply or by
following one's own taste, these schemes come alive and become tales. Of
course, one must also keep motivations, connections, and other auxiliary
elements in mind.
Unfortunately, Propp provides no samples of tales generated in this manner. If
they have survived, it would be very interesting to read them. (Dundes,
in his introduction to the 2nd English edition, mentions "programm[ing] a computer" to do this, but I haven't tracked
down that reference: Alan Dundes, "On Computers and Folklore", Western
Folklore 24 (1965): 185--189.)
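Propp's recipe is mechanical enough to sketch in a few lines of Python. This is only an illustration under stated assumptions, not a reconstruction of anything Propp or Dundes did: it uses just the functions named in the passage ($A$ through $G$), ignores the choice among concrete sub-types, and the probabilities `p_drop` and `p_triple` are arbitrary placeholders.

```python
import random

# The functions named in the quoted recipe, in order (assumed token encoding).
SCHEME = ["A", "B", "C", "↑", "D", "E", "F", "G"]


def make_plot(rng: random.Random, p_drop: float = 0.2, p_triple: float = 0.1) -> list:
    """Walk the scheme in order; drop, keep, or treble each function at random.

    p_drop and p_triple are hypothetical placeholders, not estimates from any corpus.
    """
    plot = []
    for symbol in SCHEME:
        roll = rng.random()
        # Any element may be dropped, except the initial villainy A.
        if symbol != "A" and roll < p_drop:
            continue
        # Otherwise it appears once, or is "repeated three times".
        copies = 3 if roll > 1 - p_triple else 1
        plot.extend([symbol] * copies)
    return plot


rng = random.Random(1928)  # seeded only so that runs are repeatable
for _ in range(3):
    print(" ".join(make_plot(rng)))
```

Turning such a string into a tale still requires everything Propp lists afterwards: assigning the functions to dramatis personae and supplying motivations and connections.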
The application of these conclusions to folk creation naturally
requires great caution. The psychology of the storyteller and the psychology
of his creative work as a part of the over-all psychology of creation must be
studied independently. But it is possible to assume that the basic, vivid
moments of our essentially very simple scheme also play the psychological role
of a kind of root.
Again, this cries out for follow-up study, which may well have been done.
184 pp., bibliography, index
As Morfologija skazki, Leningrad, 1928; translated by Svatava Pirkova-Jakobson, Indiana University Press, 1958; second edition, revised by Louis A. Wagner and with an introduction by Alan Dundes, Austin: University of Texas Press, 1968, ISBN 9780292783768