Ling 235: Quantitative and Probabilistic Explanation in Linguistics |
Course Syllabus |
(updated 2005/01/04) |
This is a tentative syllabus and is subject to change (hit reload!).
Date |
Topic |
Out |
Due |
Week 1 |
|
|
|
Wednesday, 5 Jan 05 |
An Introduction and an Example. |
|
|
Linguistics: What motivates probabilistic approaches and statistical methodology in linguistics? Problems of categoricity. The greater explanatory power of probabilistic models. Some examples.
Statistics: Exploratory Data Analysis (EDA). Introduction to SPSS. A case study on a sociolinguistics dataset.
Supplemental readings: Steven Abney. 1996. Statistical Methods and Linguistics. In Judith Klavans and Philip Resnik (eds.), The Balancing Act: Combining Symbolic and Statistical Approaches to Language. The MIT Press. Christopher D. Manning. 2002. Probabilistic Syntax. In Rens Bod, Jennifer Hay, and Stefanie Jannedy (eds.), Probabilistic Linguistics, MIT Press, 2003. |
|
|
|
Week 2 |
|
|
|
Monday, 10 Jan 05 |
Basic concepts in probability and the idea of building probabilistic models for linguistic explanation. |
HW #1 |
|
Statistics: The sociolinguistics example continued: model building in SPSS (building a VARBRUL/logistic regression model of the data). Probability intro: counting, basic probability laws, maximum likelihood; discrete distributions; joint and conditional probability. Hypothesis tests.
Supplemental readings: John Goldsmith. 2001. Probability for linguists (available as Microsoft Word or converted to HTML); or Christopher Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. Chapter 2, pp. 39-54, 60-68, 72-76; or John A. Rice. 1995. Mathematical Statistics and Data Analysis. 2nd edition. Duxbury Press. |
|
|
|
Wednesday, 12 Jan 05 |
Active vs. passive variation. Modeling the choice with logistic regression (a.k.a. VARBRUL). |
|
|
Linguistics: E. Judith Weiner and William Labov. 1983. Constraints on the agentless passive. Journal of Linguistics 19: 29-58.
Statistics: Finish up intro on model building in SPSS. Data visualization: scatterplots. A tiny bit about logistic regression models. |
|
|
|
Week 3 |
|
|
|
Monday, 17 Jan 05 |
Martin Luther King Day - no class |
|
|
|
|
|
|
Wednesday, 19 Jan 05 |
Domain minimization, contingency table statistics. Grammatical weight and ambiguity avoidance. |
HW #2 |
HW #1 |
Linguistics: Wasow, Thomas. 2002. Postverbal Behavior. CSLI Publications. Chapter 2.
Statistics: More on contingency tables. Independence. Hypothesis testing redux. The chi-squared test. |
|
|
|
Week 4 (and 5) |
|
|
|
Monday, 24 Jan 05 |
Statistics on contingency tables and linguistic parallelism |
|
|
Linguistics: Parallelism in coordination (Roger’s handout).
Statistics: Fisher's exact test. Likelihood ratios: log odds ratios and the G² test. Samples and statistical inference, estimating parameters, the method of maximum likelihood, maximum likelihood for multinomial cell probabilities.
Supplemental readings: Frazier, Lyn, Alan Munn, and Charles Clifton. 2000. Processing coordinate structures. Journal of Psycholinguistic Research. |
|
|
|
Wednesday, 26 Jan 05 |
Probabilistic grammars. Constructing models and examining their goodness of fit. Comparing models. |
HW #3 |
HW #2 |
Linguistics: Suppes, Patrick. 1970. Probabilistic grammars for natural languages. Synthese 22: 95-116.
Supplemental readings: Roland and Jurafsky. Verb Sense and Verb Subcategorization Probabilities. CUNY 1998. |
|
|
|
Week 6 |
|
|
|
Monday, 31 Jan 05 |
Rest of Suppes discussion and linear regression models. |
HW #4 |
HW #3 |
Statistics: Mean, median, and variance. Linear regression: simple and multiple linear regression. |
|
|
|
Wednesday, 2 Feb 05 |
Gradience in grammaticality. Magnitude estimation. |
|
|
Linguistics: Magnitude estimation for linguistic data. Bard, Ellen Gurman, Dan Robertson, and Antonella Sorace. 1996. Magnitude Estimation of Linguistic Acceptability. Language 72: 32-68.
Supplemental readings: Sorace, Antonella. 2000. Gradients in auxiliary selection with intransitive verbs. Language 76: 859-890. Keller, Frank and Antonella Sorace. 2003. Gradient Auxiliary Selection and Impersonal Passivization in German: An Experimental Investigation. Journal of Linguistics 39:1, 57-108. Keller, Frank and Ash Asudeh. 2001. Constraints on Linguistic Coreference: Structural vs. Pragmatic Factors. In Johanna D. Moore and Keith Stenning, eds., Proceedings of the 23rd Annual Conference of the Cognitive Science Society, 483-488. |
|
Talk to Roger or Chris about final project! |
|
Week 7 |
|
|
|
Monday, 7 Feb 05 |
Conditional probabilistic syntax. Determining systemic choices: Optimality Theory and Stochastic Optimality Theory |
HW #5 |
HW #4 |
Linguistics: Christopher D. Manning. 2002. Probabilistic Syntax. In Rens Bod, Jennifer Hay, and Stefanie Jannedy (eds.), Probabilistic Linguistics, MIT Press, 2003. Section 8.5.
Statistics: Stochastic optimality theory. Boersma and Hayes intro. Boersma, How we learn. Paul Boersma. 1999. Optimality-Theoretic learning in the Praat program. IFA Proceedings 23: 17-35 (ROA 380). |
|
|
|
Wednesday, 9 Feb 05 |
Argument realization and Stochastic Optimality Theory. |
|
|
Linguistics: Joan Bresnan, Shipra Dingare, and Christopher D. Manning. Soft Constraints Mirror Hard Constraints: Voice and Person in English and Lummi. Proceedings of the LFG01 Conference, pp. 13-32.
Supplemental readings: Joan Bresnan and Tatiana Nikitina. 2003. "On the Gradience of the Dative Alternation". Draft of May 7, 2003. |
|
|
|
Week 8 |
|
|
|
Monday, 14 Feb 05 |
Logistic regression models of systemic choice |
HW #6 |
HW #5 |
Statistics: Logistic regression. Sankoff, D. 1988. Variable rules. In U. Ammon,
Fred L. Ramsey and Daniel W. Schafer. 1997. The Statistical Sleuth: A Course in Methods of Data Analysis. Belmont, CA: Duxbury Press, chapter 20, pp. 564-583.
Supplemental readings: Labov, William. 1969. Contraction, deletion and inherent variability of the English copula. Language 45, 715-62, extract. |
|
|
|
Wednesday, 16 Feb 05 |
Logistic regression models in linguistics reprise |
|
|
Linguistics: Arnold, Jennifer, Thomas Wasow, Ash Asudeh, and Peter Alrenga. 2004. Avoiding Attachment Ambiguities: The Role of Constituent Ordering. Journal of Memory and Language 51.1: 55-70.
Supplemental readings: Lohse, Barbara, John Hawkins, and Thomas Wasow. 2004. Processing Domains in English Verb-Particle Constructions. Language 80.2: 238-261. |
|
|
|
Week 9 |
|
|
|
Monday, 21 Feb 05 |
More on logistic regression. Interaction effects. |
|
HW #6 |
Linguistics: Roland, Douglas, Jeffrey L. Elman, and Victor S. Ferreira (in press). Why is that? Structural prediction and ambiguity resolution in a very large corpus of English sentences. Cognition.
Supplemental readings: Temperley, David. 2003. Ambiguity Avoidance in English Relative Clauses. Language 79: 464-84. Race, D. S. & MacDonald, M. C. (2003). The use of "that" in the production and comprehension of object relative clauses. Proceedings of the 25th Annual Meeting of the Cognitive Science Society. |
|
|
|
Wednesday, 23 Feb 05 |
Constraint interactions. Classification accuracy. Evaluating model fit. |
|
Project outline |
Linguistics and statistics: Robert Sigley. 2003. The importance of interaction effects. Language Variation and Change. |
|
|
|
Week 10 |
|
|
|
Monday, 28 Feb 05 |
Model comparisons: stochastic OT and logistic regression. |
|
|
Linguistics: Gerhard Jäger and Anette Rosenbach. 2004. The winner takes it all - almost. Cumulativity in grammatical variation, manuscript, University of
Supplemental readings: Anette Rosenbach. Aspects of iconicity and economy in the choice between the s-genitive and the of-genitive in English. To appear in B. Mondorf and G. Rohdenburg (eds), Determinants of Grammatical Variation in English. Mouton de Gruyter. Altenberg. The Genitive v. the of-Construction. |
|
|
|
Wednesday, 2 Mar 05 |
Model comparisons: stochastic OT and logistic regression. Decision tree or so-called "analogic" models. |
|
|
Linguistics: Ernestus, Mirjam Theresia Constantia, and R. Harald Baayen. 2003. Predicting the Unpredictable: Interpreting Neutralized Segments in Dutch. Language 79(1). |
|
|
|
Week 11 (i.e., we won't get to this!) |
|
|
|
Monday, 7 Mar 05 |
More model comparisons. |
|
|
Linguistics: Sarah Benor and Roger Levy. 2004. The Chicken or the Egg?
A probabilistic analysis of English binomials. Draft. http://www.stanford.edu/~rog/papers/binomials.pdf |
|
|
|
Wednesday, 9 Mar 05 |
Wrap up. |
|
|
|
|
|
|
The End |
|
Final paper |