Linguistics
03 Apr 2024 09:33
Yet Another Inadequate Placeholder.
Things I want to learn more about: statistical language processing; pragmatics; semantics; "functional grammar".
Agent-based models of language change warrant their own notebook.
Query on the reliability of historical linguistics. A large part of historical linguistics consists of reconstructing languages which have left no written records, by means of extant or recorded descendants. The paradigm, as it were, is the reconstruction of proto-Indo-European from the recorded Indo-European languages. Accompanying such reconstructions, historical linguists also postulate regular rules for how the sounds in words in the ancestral language changed into different sounds in corresponding words in the descendant languages; similarly for other features of the language, like grammatical rules, conjugations, etc. (You could simply think of these as correspondence rules between the extant languages, without necessarily invoking an ancestor, if you liked, though the ancestor is a very natural hypothesis.) Now, obviously, I'm not competent to critique any of this, but I would like to know if the reliability of linguists at performing such reconstructions, and discovering correspondences, has ever been systematically tested. One test would be to give linguists corpora from related languages whose common ancestor is well known, and see how well they could reconstruct that ancestor. (E.g., give them the modern Romance languages, and see how close they get to Latin.) Alternatively, we could give them samples from languages which are actually unrelated, but tell them they are all connected, and see if they nonetheless come up with regular sound-change patterns and so forth. Has anyone ever done anything like these tests?
Update, 29 March 2005: John O'Neil writes to tell me that both the tests I describe above are, in fact, common exercises in graduate classes in historical and comparative linguistics! He doesn't know of any statistical studies on this kind of thing, however. Also, I am ashamed to learn that the immediate ancestor of the extant Romance languages was not, in fact, literary Latin but "proto-Romance", which had already, e.g., lost noun declensions. (Ashamed, because I should have known that.) I also should take this opportunity to stress that I am not skeptical about the reliability of mainstream historical linguistics in general, just curious if we can quantify that reliability, and about how general ideas about error and the growth of knowledge apply here.
Update, 20 September 2007: Brendan Shean points me to a very neat project on doing actual statistical inference for sound-change rules, and ultimately for linguistic phylogenetic trees. See Bouchard-Côté et al. below.
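One crude way to put a number on the first test in the query above is to compare each reconstructed form with the attested ancestral form by edit distance, normalized by length, and then average over the word list. What follows is only a minimal Python sketch under loud assumptions: the function names (`levenshtein`, `normalized_distance`, `score_reconstructions`) are hypothetical, the "reconstructed" forms are invented placeholders rather than anyone's actual reconstructions, and plain orthographic strings stand in for the phonemic transcriptions a real evaluation would need. It is not the method of Bouchard-Côté et al.

```python
from statistics import mean


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length, so the result lies in [0, 1]."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def score_reconstructions(pairs):
    """Mean normalized distance over (reconstructed, attested) pairs."""
    return mean(normalized_distance(r, a) for r, a in pairs)


# ASSUMPTION: the first member of each pair is a made-up placeholder, used
# purely for illustration; the second is an attested Latin form.
toy_pairs = [
    ("nokte", "noctem"),
    ("okto", "octo"),
    ("faktu", "factum"),
]

if __name__ == "__main__":
    print(score_reconstructions(toy_pairs))
```

A score of 0 would mean every form was recovered exactly, and larger values mean worse reconstructions. A more serious version would presumably weight substitutions by phonetic similarity rather than treating all symbol changes alike.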
See also: Analogy and Metaphor; Cognitive Science; Collective Cognition; Grammatical Inference; Narratives; Rhetoric; Semiotics; Structuralism; Text Mining
- Recommended, big picture:
- William H. Calvin and Derek Bickerton, Lingua ex Machina: Reconciling Darwin and Chomsky with the Human Brain
- Noam Chomsky
- "A Review of B. F. Skinner's Verbal Behavior," Language 35 (1959): 26--58 [online]
- Syntactic Structures
- Randy Allen Harris, The Linguistics Wars [I have read the first edition and recommend it unreservedly, and there is now a second which I am eager to get into]
- Zellig Harris, Language and Information [Interesting old review by Bruce Nevin. My comments.]
- Ray Jackendoff, Foundations of Language: Brain, Meaning, Grammar, Evolution [Review by Andrew Carstairs-McCarthy in American Scientist; my review: The Object-Oriented Turn in Generative Grammar]
- LanguageLog
- Mark Liberman and Geoffrey K. Pullum, Far from the Madding Gerund: And Other Dispatches from Language Log
- Steven Pinker, The Language Instinct
- Steven Pinker and Ray Jackendoff, "The Faculty of Language: What's Special about It?", Cognition 95 (2005): 201--236 [preprint]
- Dan Sperber and Deirdre Wilson, Relevance: Communication and Cognition
- Recommended, close-ups (very misc.):
- Steven Abney, "Statistical Methods and Linguistics," in Judith Klavans and Philip Resnik (eds.), The Balancing Act: Combining Symbolic and Statistical Approaches to Language (1996) [PDF; Abney's other papers]
- Alexandre Bouchard-Côté, Percy Liang, Thomas Griffiths, and Dan Klein, "A Probabilistic Approach to Diachronic Phonology", Conference on Empirical Methods in Natural Language Processing (EMNLP) 2007 [free PDF, slides]
- Catherine Emmott, Narrative Comprehension: A Discourse Perspective
- John Goldsmith, review of Bruce Nevin (ed.), The Legacy of Zellig Harris, in Language 81 (2005): 719--736 [PDF. Recommended as an interesting introduction to Harris. Makes the important connection to the minimum description length principle. Thanks to Prof. Goldsmith for letting me know about his paper.]
- John McWhorter, Word on the Street
- Neil Mercer, Words and Minds: How We Use Language to Think Together
- Thomas B. Pepinsky, "On Whorfian Socioeconomics", SSRN/33123347
- Fernando Pereira, "Formal grammar and information theory: together again?", Philosophical Transactions of the Royal Society 358 (2000): 1239--1253 [PDF preprint; commentary from Mark Liberman]
- Geoffrey K. Pullum
- The Great Eskimo Vocabulary Hoax, and Other Essays
- "Ideology, Power, and Linguistic Theory" [PDF]
- To read:
- Stephen G. Alter, Darwinism and the Linguistic Image: Language, Race, and Natural Theology in the Nineteenth Century
- N. Asher and A. Lascarides, Logics of Conversation
- R. Harald Baayen, Analyzing Linguistic Data: A Practical Introduction to Statistics Using R
- Mark C. Baker, The Atoms of Language: The Mind's Hidden Rules of Grammar
- Robert F. Barsky, Zellig Harris: From American Linguistics to Socialist Zionism
- Derek Bickerton
- Diane Blakemore, Relevance and Linguistic Meaning: The Semantics and Pragmatics of Discourse Markers
- Andreas Blume, "A Learning-Efficiency Explanation of Structure in Language", Theory and Decision 57 (2004): 265--285
- Robert Andrew Blust, 101 Problems and Solutions in Historical Linguistics: A Workbook
- Rens Bod, Beyond Grammar: An Experience-based theory of language [Free online]
- Rens Bod, Jennifer Hay and Stefanie Jannedy (eds.), Probabilistic Linguistics
- Ted Briscoe (ed.), Linguistic Evolution Through Language Acquisition: Formal and Computational Models
- Penelope Brown and Stephen C. Levinson, Politeness: Some universals in language usage
- Joan Bybee, Language, Usage and Cognition
- Nick Chater, Alexander Clark, John A. Goldsmith, and Amy Perfors, Empiricism and Language Learnability
- Gennaro Chierchia, Meaning and Grammar: An Introduction to Semantics
- Morten H. Christiansen and Nick Chater, Creating Language: Integrating Evolution, Acquisition, and Processing
- Herbert H. Clark, Using Language
- Anthony Corbeill, Sexing the World: Grammatical Gender and Biological Sex in Ancient Rome
- Ewa Dabrowska, Language, Mind, and Brain: Some Psychological and Neurological Constraints on Theories of Grammar
- T. Deacon
- Lukasz Debowski, "Hilberg's Law and Its Links with Guiraud's Law", cs.CL/0507022 ["Hilberg (1990) supposed that finite-order excess entropy of a random human text is proportional to the square root of the text length. Assuming that Hilberg's hypothesis is true, we derive Guiraud's law, which states that the number of word types in a text is greater than proportional to the square root of the text length. Our derivation is based on some mathematical conjecture in coding theory and on several experiments suggesting that words can be defined approximately as the nonterminals of the shortest context-free grammar for the text."]
- Peter Ford Dominey, "From Sensorimotor Sequence to Grammatical Construction: Evidence from Simulation and Neurophysiology", Adaptive Behavior 13 (2005): 347--361 [Very cool, if it's right: "... describes a functional trajectory from sensorimotor sequence learning to the learning of grammatical constructions in language. ... review of the functional neurophysiology of the cortex and basal ganglia ... as background for a neural network model of this system in sensorimotor sequence learning. Sequential behavior ... defined in terms of serial, temporal and abstract structure. The resulting neuro-computational framework ... account[s] for observed sequence learning .... framework naturally extends to grammatical constructions as form-to-meaning mappings. Predictions ... concerning parallels in language and cognitive sequence processing are tested against behavioral and neurophysiological observations in humans, resulting in a refinement of the allocation of model functions to subdivisions of Broca's area. From a functional perspective this analysis will provide insight into the relation between the coding structure in human languages, and constraints derived from the underlying neurophysiological computational mechanisms." PDF preprint]
- Umberto Eco, The Search for the Perfect Language
- N. J. Enfield, Linguistic Epidemiology: Semantics and Grammar of Language Contact in Mainland Southeast Asia
- Peter Gärdenfors, The Geometry of Meaning: Semantics Based on Conceptual Spaces
- Adele Goldberg, Constructions at Work: The Nature of Generalization in Language
- John A. Goldsmith and Bernard Laks, Battle in the Mind Fields
- Arthur C. Graesser, Keith K. Millis and Rolf A. Zwaan, "Discourse Comprehension," Annual Review of Psychology 48 (1997): 163--189
- Simon J. Greenhill, Chieh-Hsi Wu, Xia Hua, Michael Dunn, Stephen C. Levinson, and Russell D. Gray, "Evolutionary dynamics of language systems", Proceedings of the National Academy of Sciences (USA) 114 (2017): E8822--E8829
- Maria Teresa Guasti, Language Acquisition: The Growth of Grammar
- Patricia Hanna and Bernard Harrison, Word and World: Practice and the Foundations of Language
- Zellig Harris
- "A Theory of Language Structure", American Philosophical Quarterly 13 (1976): 237--255 [JSTOR]
- "Grammar on Mathematical Principles", Journal of Linguistics 14 (1978): 1--20 [JSTOR]
- "The Structure of Science Information", Journal of Biomedical Informatics 35 (2002): 215--221
- Arturo Hernandez, Ping Li and Brian MacWhinney, "The emergence of competing modules in bilingualism", Trends in Cognitive Sciences 9 (2005): 220--225
- Kathy Hirsh-Pasek and Roberta Michnick Golinkoff, The Origins of Grammar: Evidence from Early Language Comprehension
- John C. L. Ingram, Neurolinguistics: An Introduction to Spoken Language Processing and its Disorders
- Ray Jackendoff, A User's Guide to Thought and Meaning
- Edward L. Keenan and Lawrence S. Moss, Mathematical Structures in Languages
- Dan Klein and Christopher D. Manning, "Natural language grammar induction with a generative constituent-context model", Pattern Recognition 38 (2005): 1407--1419
- Chris Knight et al. (eds.), The Evolutionary Emergence of Language: Social Function and the Origins of Linguistic Form
- Paul Kroeger, Analyzing Grammar: An Introduction
- Patricia K. Kuhl, "Early Language Acquisition: Cracking the Speech Code", Nature Reviews Neuroscience 5 (2004): 831--843
- Stephen C. Levinson, Presumptive Meanings: The Theory of Generalized Conversational Implicature
- John Arthur Lucy, Grammatical Categories and Cognition: A Case Study of the Linguistic Relativity Hypothesis
- Margaret Masterman, Language, Cohesion and Form
- James D. McCawley, Everything that Linguists Have Always Wanted to Know about Logic --- but Were Ashamed to Ask
- Janet L. McDonald, "Language Acquisition: The Acquisition of Linguistic Structure in Normal and Special Populations", Annual Review of Psychology 48 (1997): 215--241
- Bob McMurray, "Defusing the Childhood Vocabulary Explosion", Science 317 (2007): 631
- John McWhorter, The Power of Babel
- Takashi Morita, Hiroki Koda, "Superregular grammars do not provide additional explanatory power but allow for a compact analysis of animal song", arxiv:1811.02507
- Adilson E. Motter, Alessandro P. S. de Moura, Ying-Cheng Lai, and Partha Dasgupta, "Topology of the conceptual network of language," Physical Review E 65 (2002): 065102(R), cond-mat/0206530
- Andrea Moro, The Boundaries of Babel: The Brain and the Enigma of Impossible Languages
- Salikoko S. Mufwene, The Ecology of Language Evolution [Review by Danny Yee]
- Frederick J. Newmeyer, Language Form and Language Function
- Frederick J. Newmeyer and Laurel B. Preston (eds.), Measuring Grammatical Complexity
- Johanna Nichols, Linguistic Diversity in Time and Space [In the words of a correspondent: "looked at a number of features of languages throughout the world, and argued that their distribution correlates to each other and to a possible initial migration of humans around the world"]
- Partha Niyogi, The Computational Nature of Language Learning and Evolution
- Prashant Parikh, The Use of Language
- Steven Pinker
- The Stuff of Thought
- Words and Rules
- Geoffrey K. Pullum and Barbara C. Scholz
- "Empirical assessment of stimulus poverty arguments", The Linguistic Review 19 (2002): 9--50
- "Contrasting applications of logic in natural language syntactic description" in Petr Hajek, Luis Valdes-Villanueva, and Dag Westerstahl (eds.), Logic, Methodology and Philosophy of Science: Proceedings of the Twelfth International Congress, pp. 481--503 [pdf]
- Geoffrey K. Pullum and James Rogers, "Animal Pattern-Learning Experiments: Some Mathematical Background" [PDF preprint]
- Friedemann Pulvermuller, The Neuroscience of Language: On Brain Circuits of Words and Serial Order
- Christian Ramiro, Mahesh Srinivasan, Barbara C. Malt, and Yang Xu, "Algorithms in the historical emergence of word senses", Proceedings of the National Academy of Sciences (USA) 115 (2018): 2323--2328
- Nikolaus Ritt, Selfish Sounds and Linguistic Evolution: A Darwinian Approach to Language Change
- David Rose, "A Systemic Functional Approach to Language Evolution", Cambridge Archaeological Journal 16 (2006): 73--96
- Deb Roy, "Grounding words in perception and action: computational insights", Trends in Cognitive Sciences 9 (2005): 389--396 [I heard Roy talk about his work at the "predictive knowledge" workshop at ICML 2005; it seemed very cool, but left me wanting details...]
- P. Thomas Schoenemann, "Syntax as an Emergent Characteristic of the Evolution of Semantic Complexity", Minds and Machines 9 (1999): 309--346
- Thomas Schürmann, Peter Grassberger, "The predictability of letters in written English", Fractals 4 (1996): 1--5, arxiv:0710.4516 [Shades of Zellig Harris]
- Ann Senghas, Sotaro Kita, and Asli Özyürek, "Children Creating Core Properties of Language: Evidence from an Emerging Sign Language in Nicaragua", Science 305 (2004): 1779--1782
- John Taylor, Cognitive Grammar
- Geoff Thompson, Introducing Functional Grammar
- Michael Tomasello, Constructing a Language: A Usage-Based Theory of Language Acquisition
- Peter Trudgill, Sociolinguistic Typology: Social Determinants of Linguistic Complexity
- Deirdre Wilson and Dan Sperber, Meaning and Relevance
- Florian Wolf and Edward Gibson, Coherence in Natural Language: Data Structures and Applications ["The biggest step forward" in discourse research "since Aristotle" --- Mark Liberman]
- Damian H. Zanette, "Demographic growth and the distribution of language sizes", arxiv:0710.1511