The primacy of graded grammaticality
Markus Bader & Jana Häussler
Konstanz & Potsdam
KogWis 2010
Potsdam, 05.10.2010
Bader/Häussler (Konstanz/Potsdam) The primacy of graded grammaticality KogWis 2010 1 / 19

Introduction: Aims and Questions
General aim: Developing a model of linguistic judgments in order to
- establish a firm basis for linguistic theory
- contribute to the ongoing debate about grammar and language use
Specific questions:
- How do gradient ratings of grammaticality relate to binary grammaticality judgments?
- How does grammaticality relate to frequency of usage?

Experimental Material: Ditransitive verbs
Empirical domain of our investigations: Ditransitive verbs in German
(1) ... dass er dem Mann ein Buch schickte.
    that he.nom the.dat man a.acc book sent
    '... that he sent a book to the man.'
Advantage of ditransitive verbs: Argument alternations that are subject to verb-specific restrictions in a gradual way
- Optionality of the dative object
- Compatibility with the so-called bekommen passive

Experimental Material: Optionality of dative object
Dropping the dative object:
(2) ... dass er dem Mann ein Buch schickte.
    that he.nom the.dat man a.acc book sent
    '... that he sent a book.'
(3) ?... dass er dem Mann ein Buch anvertraute.
    that he.nom the.dat man a.acc book entrusted
    '... that he entrusted a book.'
Experimental results and corpus counts (Bader & Häussler, submitted): The option of omitting the dative object is a gradient, verb-specific property.

Experimental Material: Bekommen passive
Bekommen passive: the dative object becomes the subject of bekommen ('to get')
(4) ... dass der Mann das Buch geschickt bekam.
    that the.nom man the.acc book sent got
    '... that the man was sent the book.'
(5) ?... dass der Mann das Buch gestohlen bekam.
    that the.nom man the.acc book stolen got
    '... that the man was stolen the book.'
Linguistic literature: bekommen passive sentences with verbs like stehlen are often presented as fully grammatical (without a ? or a *).
Experimental results and corpus counts (Bader & Häussler, submitted): verbs like stehlen are not fully acceptable in the bekommen passive.

Experimental Material: Regular passive
Regular passive: unrestricted with regard to ditransitive verbs as considered here
(6) ... dass dem Mann das Buch geschickt wurde.
    that the.dat man the.nom book sent was
    '... that the book was sent to the man.'
(7) ... dass dem Mann das Buch gestohlen wurde.
    that the.dat man the.nom book stolen was
    '... that the book was stolen from the man.'

Experimental Material: Summary
120 verbs, each in two sentences, for a total of 240 sentences
3 × 2 design:
- Structure (active / regular passive / bekommen passive)
- Number of arguments (3 vs. 2)
(8) Active
    dass der Vermieter letztes Jahr (dem Sohn) das Haus vererbte.
    that the landlord last year the son the house left
    'that the landlord left the house to the son last year.'
(9) Regular passive
    dass dem Sohn letztes Jahr (von dem Vermieter) das Haus vererbt wurde.
    that the son last year by the landlord the house left was
    'that the house was left to the son last year (by the landlord).'
(10) Bekommen passive
    dass der Sohn letztes Jahr (von dem Vermieter) das Haus vererbt bekam.
    that the son last year by the landlord the house left got
    'that the son was left the house last year (by the landlord).'

Experiments 1 and 2: Procedure
Experiment 1: Magnitude Estimation (ME)
- First, a reference item is presented, to which the participant assigns an arbitrary numeric value (> 0).
- All further items are judged in proportion to the reference item on a continuous numerical scale.
- Each individual data point is divided by the reference value, and the resulting ratio is log-transformed.
Experiment 2: Speeded Grammaticality Judgments (SGJ)
- Word-by-word presentation in the middle of the screen
- Presentation time for each word: ca. 300-400 ms
- End-of-sentence judgments with a deadline of 2000 ms

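The ME normalization described for Experiment 1 (divide each rating by the reference value, then log-transform the ratio) can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name `normalize_me` is hypothetical, and the base of the logarithm (base 10 here) is an assumption, since the slides do not state it.

```python
import math


def normalize_me(raw_rating, reference_value):
    """Return the log-transformed ratio of a rating to the reference value.

    Assumption: base-10 logarithm; the slides do not specify the base.
    """
    # Magnitude estimation requires strictly positive values.
    if not (raw_rating > 0 and reference_value > 0):
        raise ValueError("ratings and reference value must be positive")
    return math.log10(raw_rating / reference_value)


# Hypothetical ratings from a participant who gave the reference item 100.
scores = [normalize_me(r, 100) for r in (50, 100, 200)]
```

A rating equal to the reference value maps to 0, ratings above it to positive scores, and ratings below it to negative scores, which is why the ME scores reported in the slides cluster around zero.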
Experiments 1 and 2: Results
Table: Mean percentages of judgments "grammatical" (standard errors by subjects).

           Active       Regular passive   Bekommen passive
3 Args.    88 (2.3)     92 (1.5)          81 (2.9)
2 Args.    77 (2.8)     94 (1.1)          76 (3.2)

Table: Mean ME scores (standard errors by subjects).

           Active        Regular passive   Bekommen passive
3 Args.    .28 (.038)    .26 (.035)        .23 (.034)
2 Args.    .24 (.042)    .31 (.041)        .18 (.042)

Experiment 2: Verb-specific variability in grammaticality
Figure: Rank-ordered distribution of mean percentages of grammatical judgments for the 120 verbs used in Experiment 2. Panels: Active, Regular passive, Bekommen passive, each with 3 and 2 arguments; x-axis: Rank.

From gradient to binary judgments: Correlations
- All 720 data points (120 verbs in 6 conditions): Kendall's τ = 0.42
- 120 data points (verbs) per condition: Kendall's τ from 0.19 to 0.55
Figure: SGJ results plotted against ME results. Panels: Active, Regular Passive, Recipient Passive, each with 2 and 3 arguments; x-axis: ME scores; y-axis: SGJ (% correct).

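For readers unfamiliar with Kendall's τ, the rank correlation used above can be computed by counting concordant and discordant pairs. The sketch below implements the tie-corrected variant (τ-b); which variant the authors used is not stated in the slides, so that choice is an assumption, and the example data are invented for illustration.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of numbers."""
    concordant = discordant = tied_x = tied_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: ignored by tau-b
            if dx == 0:
                tied_x += 1
            elif dy == 0:
                tied_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + tied_x)
                      * (concordant + discordant + tied_y))
    if denom == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denom


# Hypothetical per-verb ME scores and SGJ percentages (not the study's data).
me_scores = [0.10, 0.25, 0.31, 0.05]
sgj_percent = [70, 95, 90, 60]
tau = kendall_tau_b(me_scores, sgj_percent)
```

The pairwise loop is quadratic in the number of items, which is unproblematic for 120 or 720 data points.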
From gradient to binary judgments: Regression model
Do gradient grammaticality scores predict binary judgments?
Logistic regression with mixed-effect modeling:
- results of Experiment 2 (SGJ2) as predicted variable
- results of Experiment 1 (ME) as predictor variable
- participants and items as random effects
Results of logistic regression:
- ME scores are a highly significant predictor of SGJ results
- Somers' C = 0.82 (n = 8640)

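The C statistic reported above is the concordance index: across all pairs consisting of one "grammatical" and one "ungrammatical" response, the proportion in which the model assigns the higher predicted probability to the "grammatical" one (ties count as one half). The following sketch computes it from predicted probabilities and observed binary outcomes. The function name and the example values are hypothetical, and fitting the mixed-effects model itself is outside the scope of this sketch.

```python
def concordance_index(predicted, observed):
    """Concordance index C for predicted probabilities and 0/1 outcomes."""
    concordant = tied = pairs = 0
    for i in range(len(observed)):
        for j in range(i + 1, len(observed)):
            if observed[i] == observed[j]:
                continue  # only pairs with different outcomes are informative
            pairs += 1
            # Identify which prediction belongs to the positive outcome.
            if observed[i] > observed[j]:
                p_pos, p_neg = predicted[i], predicted[j]
            else:
                p_pos, p_neg = predicted[j], predicted[i]
            if p_pos > p_neg:
                concordant += 1
            elif p_pos == p_neg:
                tied += 1
    if pairs == 0:
        raise ValueError("need at least one positive and one negative outcome")
    return (concordant + 0.5 * tied) / pairs


# Hypothetical fitted probabilities and observed judgments (1 = grammatical).
fitted = [0.95, 0.80, 0.60, 0.40]
judged = [1, 1, 0, 1]
c = concordance_index(fitted, judged)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination, so the reported 0.82 indicates that the ME scores separate the two response categories well.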
From gradient to binary judgments: Model fit
Figure: Observed and fitted SGJ results plotted against observed ME results (R² = 0.94). Series: observed, predicted; x-axis: ME score. Values printed in the plot: 18, 39, 80, 123, 444, 194, 256, 950, 2169, 4367.

The relationship between grammaticality and frequency
Grammaticality and frequency: corpus details
Can the experimental results be reduced to corpus-derived frequency measures?
The deWaC corpus described in Baroni et al. (2009) was analyzed:
- The deWaC corpus is a very large corpus of German built by web crawling.
- It contains 1,278,177,539 tokens of text tagged for part of speech.
Various verb-specific frequency measures were derived from the deWaC corpus, including:
- p(dative object): the probability that a ditransitive verb occurs with an overt dative object
- bigram ratio: bigram frequency for a verb (participle + auxiliary) divided by the verb's lemma frequency

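The two frequency measures defined above are simple ratios of corpus counts. The sketch below shows one way to compute them; it assumes that p(dative object) is estimated as a relative frequency, which the slides do not spell out, and all function names and counts are hypothetical rather than taken from the deWaC analysis.

```python
def p_dative_object(n_with_dative, n_occurrences):
    """Relative frequency of a verb occurring with an overt dative object."""
    if not n_occurrences > 0:
        raise ValueError("verb must occur at least once")
    return n_with_dative / n_occurrences


def bigram_ratio(bigram_frequency, lemma_frequency):
    """Frequency of the participle + auxiliary bigram per lemma occurrence."""
    if not lemma_frequency > 0:
        raise ValueError("lemma frequency must be positive")
    return bigram_frequency / lemma_frequency


# Hypothetical counts for a single verb.
p_dat = p_dative_object(n_with_dative=30, n_occurrences=120)
ratio = bigram_ratio(bigram_frequency=5, lemma_frequency=1000)
```

Dividing by lemma frequency normalizes for how common the verb is overall, so the bigram ratio reflects how readily the verb enters a given passive construction rather than how frequent it is in general.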
Grammaticality and frequency: correlations
Table: Rank correlations (Kendall's tau) between experimental grammaticality scores (SGJ) and different frequency measures.

                    Active             Regular passive    Bekommen passive
                    3 Args   2 Args    3 Args   2 Args    3 Args   2 Args
p(dative object)    .09      -.32**    .07      .03       -.03     .10
Bigram ratios       -.17**   -.08      -.02     .10       .23**    .36**

Summary: The grammaticality-frequency correlations are far from perfect:
- High grammaticality despite low frequency occurs often
- High frequency despite low grammaticality occurs rarely
Conclusion: Frequency cannot predict grammaticality

From grammaticality to language use
Hypothesis: Grammaticality determines language use, not the other way round.
The probability of a sentence $s_n$ can be modeled as follows:
(11) $p(s_n) = f(\text{grammaticality}[s_n], \text{real-world context}[s_n], \text{linguistic context}[s_n], \text{performance}[s_n])$
Here, we consider only two factors:
- grammaticality: estimated from our experiment
- real-world context: approximated by overall verb frequency
The remaining two factors are left out:
- performance: not relevant for our sentences
- linguistic context: relevant, but not yet coded

From grammaticality to language use
Figure: Bigram frequency plotted against verb frequency (upper row) and against experimental grammaticality scores (lower row). Panels: Active, Regular passive, Bekommen passive; axes: Bigram frequency, Lemma frequency.

From grammaticality to language use
Table: Results of Poisson regression with bigram frequency as predicted variable and either grammaticality alone, verb frequency alone, or grammaticality and verb frequency together.

                              Active             Regular passive    Bekommen passive
Null deviance                 5701182            959280             19741
                              Reduction   R²     Reduction   R²     Reduction   R²
Grammaticality                2505        .00    8016        .00    5907        .19
Frequency                     5492666     .95    734056      .57    3567        .12
Grammaticality & Frequency    5493190     .95    734365      .56    10508       .47

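To make the "null deviance" and "reduction" rows concrete, the sketch below fits a Poisson regression with a single predictor by Newton's method and reports how much the predictor reduces the deviance relative to an intercept-only model. It is an illustration under stated assumptions, not the authors' analysis: the function names and toy data are hypothetical, only one predictor is supported, and the R² column is not reproduced because the slides do not define how it was computed.

```python
import math


def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))."""
    total = 0.0
    for y_i, mu_i in zip(y, mu):
        # The y * log(y / mu) term is defined as 0 when y is 0.
        log_term = y_i * math.log(y_i / mu_i) if y_i > 0 else 0.0
        total += log_term - (y_i - mu_i)
    return 2.0 * total


def fit_poisson(x, y, iterations=50):
    """Fit log(mu) = b0 + b1 * x by Newton's method (one predictor)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the null model
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * x_i) for x_i in x]
        # Gradient of the log-likelihood.
        g0 = sum(y_i - m for y_i, m in zip(y, mu))
        g1 = sum((y_i - m) * x_i for x_i, y_i, m in zip(x, y, mu))
        # Negative Hessian, a 2 x 2 matrix [[h00, h01], [h01, h11]].
        h00 = sum(mu)
        h01 = sum(m * x_i for x_i, m in zip(x, mu))
        h11 = sum(m * x_i * x_i for x_i, m in zip(x, mu))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def deviance_reduction(x, y):
    """Return (null deviance, reduction achieved by adding predictor x)."""
    mean_y = sum(y) / len(y)
    null_dev = poisson_deviance(y, [mean_y] * len(y))
    b0, b1 = fit_poisson(x, y)
    fitted = [math.exp(b0 + b1 * x_i) for x_i in x]
    return null_dev, null_dev - poisson_deviance(y, fitted)


# Toy data (hypothetical): counts that double with each unit of the predictor.
null_dev, reduction = deviance_reduction([0, 1, 2, 3], [1, 2, 4, 8])
```

With real corpus counts, a predictor such as lemma frequency would normally be log-transformed before fitting, since exponentiating large raw values overflows; that preprocessing step is likewise an assumption about how such an analysis would be set up.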
Conclusion
- Binary grammaticality judgments can be derived directly from gradient judgments.
- Grammaticality is not determined by frequency but is rather among the factors determining frequency.

References
Bader, M. & Häussler, J. (submitted). Frequency and grammaticality: A case study on ditransitive verbs in German. Manuscript submitted for publication, University of Konstanz.
Baroni, M., Bernardini, S., Ferraresi, A. & Zanchetta, E. (2009). The WaCky Wide Web: A collection of very large linguistically processed web-crawled corpora. Language Resources and Evaluation 43, 209-226.