Linguistics

Last update: 19 May 2025 17:50
First version: Before 13 March 1995

Yet Another Inadequate Placeholder.

Things I want to learn more about: statistical language processing; pragmatics; semantics; "functional grammar".

Agent-based models of language change warrant their own notebook.

Query on the reliability of historical linguistics. A large part of historical linguistics consists of reconstructing languages which have left no written records, by means of extant or recorded descendants. The paradigm, as it were, is the reconstruction of proto-Indo-European from the recorded Indo-European languages. Accompanying such reconstructions, historical linguists also postulate regular rules for how the sounds in words in the ancestral language changed into different sounds in corresponding words in the descendant languages; similarly for other features of the language, like grammatical rules, conjugations, etc. (You could simply think of these as correspondence rules between the extant languages, without necessarily invoking an ancestor, if you liked, though the ancestor is a very natural hypothesis.) Now, obviously, I'm not competent to critique any of this, but I would like to know if the reliability of linguists at performing such reconstructions, and discovering correspondences, has ever been systematically tested. One test would be to give linguists corpora from related languages whose common ancestor is well-known, and see how well they could reconstruct that ancestor. (E.g., give them the modern Romance languages, and see how close they get to Latin.) Alternately, we could give them samples from languages which are actually unrelated, but tell them they are all connected, and see if they nonetheless come up with regular sound-change patterns and so forth. Has anyone ever done anything like these tests?

Update, 29 March 2005: John O'Neil writes to tell me that both the tests I describe above are, in fact, common exercises in graduate classes in historical and comparative linguistics! He doesn't know of any statistical studies on this kind of thing, however. Also, I am ashamed to learn that the immediate ancestor of the extant Romance languages was not, in fact, literary Latin but "proto-Romance", which had already, e.g., lost noun declensions. (Ashamed, because I should have known that.) I also should take this opportunity to stress that I am not skeptical about the reliability of mainstream historical linguistics in general, just curious if we can quantify that reliability, and about how general ideas about error and the growth of knowledge apply here.

Update, 20 September 2007: Brendan Shean points me to a very neat project on doing actual statistical inference for sound-change rules, and ultimately for linguistic phylogenetic trees. See Bouchard-Côté et al. below.
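
As a rough illustration of how the first test in the query above might be scored, here is a minimal sketch, assuming we have a list of reconstructed word forms aligned with the attested ancestral forms. It uses plain Levenshtein (edit) distance, normalized by word length, as the measure of "how close they get". That measure is my assumption for illustration, not a method taken from any of the works listed here, and the function names and the toy word forms are hypothetical placeholders rather than real linguistic data.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-symbol insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def mean_normalized_distance(reconstructed, attested):
    """Average per-word edit distance, each scaled by the longer form's
    length, so 0.0 means every reconstruction matches exactly."""
    if not reconstructed or len(reconstructed) != len(attested):
        raise ValueError("need two non-empty, equally long word lists")
    scores = [edit_distance(r, a) / max(len(r), len(a), 1)
              for r, a in zip(reconstructed, attested)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy placeholder strings (NOT real reconstructions or attested forms).
    toy_reconstructed = ["noke", "lakte"]
    toy_attested = ["nokte", "lakte"]
    print(mean_normalized_distance(toy_reconstructed, toy_attested))
```

A real study would presumably want a phonetically informed distance (so that substituting similar sounds costs less than substituting dissimilar ones) and some baseline for how well chance or naive guessing does; this sketch only shows the bookkeeping.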

William H. Calvin and Derek Bickerton, Lingua ex Machina: Reconciling Darwin and Chomsky with the Human Brain
Noam Chomsky
- "A Review of B. F. Skinner's Verbal Behavior," Language 35 (1959): 26--58 [online]
- Syntactic Structures
Randy Allen Harris, The Linguistics Wars [I have read the first edition and recommend it unreservedly, and there is now a second which I am eager to get into]
Zellig Harris, Language and Information [Interesting old review by Bruce Nevin. My comments.]
Ray Jackendoff, Foundations of Language: Brain, Meaning, Grammar, Evolution [Review by Andrew Carstairs-McCarthy in American Scientist; my review: The Object-Oriented Turn in Generative Grammar]
LanguageLog
Mark Liberman and Geoffrey K. Pullum, Far from the Madding Gerund: And Other Dispatches from Language Log
Steven Pinker, The Language Instinct
Steven Pinker and Ray Jackendoff, "The Faculty of Language: What's Special about It?", Cognition 95 (2005): 201--236 [preprint]
Dan Sperber and Deirdre Wilson, Relevance: Communication and Cognition

Steven Abney, "Statistical Methods and Linguistics," in Judith Klavans and Philip Resnik (eds.), The Balancing Act: Combining Symbolic and Statistical Approaches to Language (1996) [PDF; Abney's other papers]
Alexandre Bouchard-Côté, Percy Liang, Thomas Griffiths, and Dan Klein, "A Probabilistic Approach to Diachronic Phonology", Conference on Empirical Methods in Natural Language Processing (EMNLP) 2007 [free PDF, slides]
Catherine Emmott, Narrative Comprehension: A Discourse Perspective
John Goldsmith, review of Bruce Nevin (ed.), The Legacy of Zellig Harris, in Language 81 (2005): 719--736 [PDF. Recommended as an interesting introduction to Harris. Makes the important connection to the minimum description length principle. Thanks to Prof. Goldsmith for letting me know about his paper.]
John McWhorter, Word on the Street
Neil Mercer, Words and Minds: How We Use Language to Think Together
Thomas B. Pepinsky, "On Whorfian Socioeconomics", SSRN/33123347
Fernando Pereira, "Formal grammar and information theory: together again?", Philosophical Transactions of the Royal Society 358 (2000): 1239--1253 [PDF preprint; commentary from Mark Liberman]
Geoffrey K. Pullum
- The Great Eskimo Vocabulary Hoax, and Other Essays
- "Ideology, Power, and Linguistic Theory" [PDF]

Stephen G. Alter, Darwinism and the Linguistic Image: Language, Race, and Natural Theology in the Nineteenth Century
N. Asher and A. Lascarides, Logics of Conversation
R. Harald Baayen, Analyzing Linguistic Data: A Practical Introduction to Statistics Using R
Mark C. Baker, The Atoms of Language: The Mind's Hidden Rules of Grammar
Robert F. Barsky, Zellig Harris: From American Linguistics to Socialist Zionism
Derek Bickerton
Diane Blakemore, Relevance and Linguistic Meaning: The Semantics and Pragmatics of Discourse Markers
Andreas Blume, "A Learning-Efficiency Explanation of Structure in Language", Theory and Decision 57 (2004): 265--285
Robert Andrew Blust, 101 Problems and Solutions in Historical Linguistics: A Workbook
Rens Bod, Beyond Grammar: An Experience-Based Theory of Language [Free online]
Rens Bod, Jennifer Hay and Stefanie Jannedy (eds.), Probabilistic Linguistics
Ted Briscoe (ed.), Linguistic Evolution Through Language Acquisition: Formal and Computational Models
Penelope Brown and Stephen C. Levinson, Politeness: Some Universals in Language Usage
Joan Bybee, Language, Usage and Cognition
Nick Chater, Alexander Clark, John A. Goldsmith, and Amy Perfors, Empiricism and Language Learnability
Gennaro Chierchia, Meaning and Grammar: An Introduction to Semantics
Morten H. Christiansen and Nick Chater, Creating Language: Integrating Evolution, Acquisition, and Processing
Herbert H. Clark, Using Language
Anthony Corbeill, Sexing the World: Grammatical Gender and Biological Sex in Ancient Rome
Ewa Dabrowska, Language, Mind, and Brain: Some Psychological and Neurological Constraints on Theories of Grammar
T. Deacon
Lukasz Debowski, "Hilberg's Law and Its Links with Guiraud's Law", cs.CL/0507022 ["Hilberg (1990) supposed that finite-order excess entropy of a random human text is proportional to the square root of the text length. Assuming that Hilberg's hypothesis is true, we derive Guiraud's law, which states that the number of word types in a text is greater than proportional to the square root of the text length. Our derivation is based on some mathematical conjecture in coding theory and on several experiments suggesting that words can be defined approximately as the nonterminals of the shortest context-free grammar for the text."]
Peter Ford Dominey, "From Sensorimotor Sequence to Grammatical Construction: Evidence from Simulation and Neurophysiology", Adaptive Behavior 13 (2005): 347--361 [Very cool, if it's right: "... describes a functional trajectory from sensorimotor sequence learning to the learning of grammatical constructions in language. ... review of the functional neurophysiology of the cortex and basal ganglia ... as background for a neural network model of this system in sensorimotor sequence learning. Sequential behavior ... defined in terms of serial, temporal and abstract structure. The resulting neuro-computational framework ... account[s] for observed sequence learning .... framework naturally extends to grammatical constructions as form-to-meaning mappings. Predictions ... concerning parallels in language and cognitive sequence processing are tested against behavioral and neurophysiological observations in humans, resulting in a refinement of the allocation of model functions to subdivisions of Broca's area. From a functional perspective this analysis will provide insight into the relation between the coding structure in human languages, and constraints derived from the underlying neurophysiological computational mechanisms." PDF preprint]
Umberto Eco, The Search for the Perfect Language
N. J. Enfield, Linguistic Epidemiology: Semantics and Grammar of Language Contact in Mainland Southeast Asia
Peter Gärdenfors, The Geometry of Meaning: Semantics Based on Conceptual Spaces
Adele Goldberg
- Constructions at Work: The Nature of Generalization in Language
- Explain Me This: Creativity, Competition, and the Partial Productivity of Constructions
John A. Goldsmith and Bernard Laks, Battle in the Mind Fields
Arthur C. Graesser, Keith K. Millis and Rolf A. Zwaan, "Discourse Comprehension," Annual Review of Psychology 48 (1997): 163--189
Simon J. Greenhill, Chieh-Hsi Wu, Xia Hua, Michael Dunn, Stephen C. Levinson, and Russell D. Gray, "Evolutionary dynamics of language systems", Proceedings of the National Academy of Sciences (USA) 114 (2017): E8822--E8829
Maria Teresa Guasti, Language Acquisition: The Growth of Grammar
Patricia Hanna and Bernard Harrison, Word and World: Practice and the Foundations of Language
Zellig Harris
- "A Theory of Language Structure", American Philosophical Quarterly 13 (1976): 237--255 [JSTOR]
- "Grammar on Mathematical Principles", Journal of Linguistics 14 (1978): 1--20 [JSTOR]
- "The Structure of Science Information", Journal of Biomedical Informatics 35 (2002): 215--221
Arturo Hernandez, Ping Li and Brian MacWhinney, "The emergence of competing modules in bilingualism", Trends in Cognitive Sciences 9 (2005): 220--225
Kathy Hirsh-Pasek and Roberta Michnick Golinkoff, The Origins of Grammar: Evidence from Early Language Comprehension
John C. L. Ingram, Neurolinguistics: An Introduction to Spoken Language Processing and its Disorders
Ray Jackendoff, A User's Guide to Thought and Meaning
Edward L. Keenan and Lawrence S. Moss, Mathematical Structures in Languages
Dan Klein and Christopher D. Manning, "Natural language grammar induction with a generative constituent-context model", Pattern Recognition 38 (2005): 1407--1419
Chris Knight et al. (eds.), The Evolutionary Emergence of Language: Social Function and the Origins of Linguistic Form
Paul Kroger, Analyzing Grammar: An Introduction
Patricia K. Kuhl, "Early Language Acquisition: Cracking the Speech Code", Nature Reviews Neuroscience 5 (2004): 831--843
Stephen C. Levinson, Presumptive Meanings: The Theory of Generalized Conversational Implicature
John Arthur Lucy, Grammatical Categories and Cognition: A Case Study of the Linguistic Relativity Hypothesis
Margaret Masterman, Language, Cohesion and Form
James D. McCawley, Everything that Linguists Have Always Wanted to Know about Logic --- but Were Ashamed to Ask
Janet L. McDonald, "Language Acquisition: The Acquisition of Linguistic Structure in Normal and Special Populations", Annual Review of Psychology 48 (1997): 215--241
Bob McMurray, "Defusing the Childhood Vocabulary Explosion", Science 317 (2007): 631
John McWhorter, The Power of Babel
Takashi Morita, Hiroki Koda, "Superregular grammars do not provide additional explanatory power but allow for a compact analysis of animal song", arxiv:1811.02507
Adilson E. Motter, Alessandro P. S. de Moura, Ying-Cheng Lai, and Partha Dasgupta, "Topology of the conceptual network of language," Physical Review E 65 (2002): 065102(R), cond-mat/0206530
Andrea Moro, The Boundaries of Babel: The Brain and the Enigma of Impossible Languages
Salikoko S. Mufwene, The Ecology of Language Evolution [Review by Danny Yee]
Frederick J. Newmeyer, Language Form and Language Function
Frederick J. Newmeyer and Laurel B. Preston (eds.), Measuring Grammatical Complexity
Johanna Nichols, Linguistic Diversity in Time and Space [In the words of a correspondent: "looked at a number of features of languages throughout the world, and argued that their distribution correlates to each other and to a possible initial migration of humans around the world"]
Partha Niyogi, The Computational Nature of Language Learning and Evolution
Prashant Parikh, The Use of Language
Steven Pinker
- The Stuff of Thought
- Words and Rules
Geoffrey K. Pullum and Barbara C. Scholz
- "Empirical assessment of stimulus poverty arguments", The Linguistic Review 19 (2002): 9--50
- "Contrasting applications of logic in natural language syntactic description" in Petr Hajek, Luis Valdes-Villanueva, and Dag Westerstahl (eds.), Logic, Methodology and Philosophy of Science: Proceedings of the Twelfth International Congress, pp. 481--503 [pdf]
Geoffrey K. Pullum and James Rogers, "Animal Pattern-Learning Experiments: Some Mathematical Background" [PDF preprint]
Friedemann Pulvermuller, The Neuroscience of Language: On Brain Circuits of Words and Serial Order
Christian Ramiro, Mahesh Srinivasan, Barbara C. Malt, and Yang Xu, "Algorithms in the historical emergence of word senses", Proceedings of the National Academy of Sciences (USA) 115 (2018): 2323--2328
Nikolaus Ritt, Selfish Sounds and Linguistic Evolution: A Darwinian Approach to Language Change
David Rose, "A Systemic Functional Approach to Language Evolution", Cambridge Archaeological Journal 16 (2006): 73--96
Deb Roy, "Grounding words in perception and action: computational insights", Trends in Cognitive Sciences 9 (2005): 389--396 [I heard Roy talk about his work at the "predictive knowledge" workshop at ICML 2005; it seemed very cool, but left me wanting details...]
P. Thomas Schoenemann, "Syntax as an Emergent Characteristic of the Evolution of Semantic Complexity", Minds and Machines 9 (1999): 309--346
Thomas Schürmann, Peter Grassberger, "The predictability of letters in written English", Fractals 4 (1996): 1--5, arxiv:0710.4516 [Shades of Zellig Harris]
Ann Senghas, Sotaro Kita, and Asli Özyürek, "Children Creating Core Properties of Language: Evidence from an Emerging Sign Language in Nicaragua", Science 305 (2004): 1779--1782
John Taylor, Cognitive Grammar
Geoff Thompson, Introducing Functional Grammar
Michael Tomasello, Constructing a Language: A Usage-Based Theory of Language Acquisition
Peter Trudgill, Sociolinguistic Typology: Social Determinants of Linguistic Complexity
Deirdre Wilson and Dan Sperber, Meaning and Relevance
Florian Wolf and Edward Gibson, Coherence in Natural Language: Data Structures and Applications ["The biggest step forward" in discourse research "since Aristotle" --- Mark Liberman]
Damian H. Zanette, "Demographic growth and the distribution of language sizes", arxiv:0710.1511