Predicting verb production in argument structure constructions
Afra Alishahi, Ad Backus
ICLC-13, Newcastle, 22 July 2015
Argument structures [Goldberg et al., 2004]
  Emergent abstract constructions
  Generalizations over particular verb usages
  Verb-centered categories
Verbs within argument structures
  In natural language categories, some items are more readily accessible than others [Higgins, 1996; Tversky & Kahneman, 1973]
  The same holds for verbs within constructions: some verbs are learned earlier, come to mind first, and are produced more frequently [Goldberg et al., 2004; Ellis & Ferreira-Junior, 2009]
  What affects this mental organization?
Verbs within argument structures
Suggestions for verbs in constructions:
  Distributional factors
    a) frequency [Goldberg et al., 2006; Theakston et al., 2004, etc.]
    b) association strength [Gries & Wulff, 2009]
  Semantics [Ninio, 1999; Theakston et al., 2004, etc.]
More examples from related literature:
  Utterance-final frequency / salience [Naigles & Hoff-Ginsberg, 1998]
  Diversity of syntactic environment [Naigles & Hoff-Ginsberg, 1998]
  Phonetic form [McDonald et al., 1993]
  Word length [McDonald et al., 1993]
Input-related determinants of construction learning [Ellis, O'Donnell, & Römer, 2014a, 2014b]
Determinants of learning:
  (1) verb frequency
  (2) strength of association between verb and construction (ΔP)
  (3) semantic centrality
Both L1 and L2 speakers
Design: Experiments
  "we are going to show you 20 phrases with gaps in them and ask you to spend one minute for each of them entering all the words you might use to fill the gap" [Ellis et al., 2014a, p. 76]
  he ___ across the...
  it ___ of the...
Design: Experiments
  he ___ across the...

  Participant 1   Participant 2   Participant 3
  went            ran             ran
  came            came            looked
  ran             jumped          leaned

  Verb   Frequency
  run    3
  come   2
  look   1
  lean   1
  jump   1
  go     1
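The tally on this slide can be reproduced with a few lines of Python. This is only a sketch: the `LEMMAS` lookup and the `tally_verbs` helper are hypothetical names introduced here, and counting each lemma at most once per participant is an assumption rather than a documented detail of the original procedure.

```python
from collections import Counter

# Hypothetical lemma lookup covering only the example responses;
# a real analysis would use a proper lemmatizer.
LEMMAS = {
    "went": "go", "ran": "run", "came": "come",
    "looked": "look", "leaned": "lean", "jumped": "jump",
}

def tally_verbs(responses):
    """Count how many participants produced each verb lemma."""
    counts = Counter()
    for participant in responses:
        # Assumption: each participant contributes at most once per lemma.
        counts.update({LEMMAS.get(word, word) for word in participant})
    return counts

responses = [
    ["went", "came", "ran"],      # Participant 1
    ["ran", "came", "jumped"],    # Participant 2
    ["ran", "looked", "leaned"],  # Participant 3
]
verb_frequency = tally_verbs(responses)
print(verb_frequency.most_common())
```

With the three example participants, `run` is counted three times, `come` twice, and the remaining lemmas once each, matching the table above.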
Design
  Variables from corpus analysis: Frequency, Association, Centrality
  Variable from experiments: Frequency of use
  Items: Construction (Constr1, Constr2, ...) and Verb (Verb1, Verb2, ...)
1. Frequency
  Frequency of verbs in a certain argument structure construction
  E.g., prepositional dative (transfer) construction: He ___ it to someone.

  give   1000
  show   150
  send   50
  lend   10
  ...
2. Association strength
  How strong is the association between a verb and a construction?
  Construction X: give 100, other verbs 120; other constructions 500
  Construction Y: give 100, other verbs 900; other constructions 500
2. Association strength
  How strong is the association between a verb and a construction?
  $\Delta P(\text{construction} \to \text{verb}) = \frac{a}{a+b} - \frac{c}{c+d}$

                Construction X   Other constructions
  give          a                c
  other verbs   b                d
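A short numeric sketch may help make ΔP concrete. The function name `delta_p` is introduced here for illustration. The values of a and b echo the counts shown for Constructions X and Y on the preceding slide, while c and d are hypothetical, since the slides do not give a full contingency table.

```python
def delta_p(a, b, c, d):
    """Delta P (construction -> verb): P(verb | construction) - P(verb | other constructions).

    a: verb in the construction        b: other verbs in the construction
    c: verb in other constructions     d: other verbs in other constructions
    """
    return a / (a + b) - c / (c + d)

# Hypothetical counts for "give" and other verbs outside the construction.
c, d = 500, 9500

dp_x = delta_p(100, 120, c, d)  # Construction X: give 100, other verbs 120
dp_y = delta_p(100, 900, c, d)  # Construction Y: give 100, other verbs 900

print(f"Construction X: {dp_x:.3f}")
print(f"Construction Y: {dp_y:.3f}")
```

Under these assumed counts, the verb has the same joint frequency (100) in both constructions, but ΔP is roughly 0.40 for Construction X and 0.05 for Construction Y, because the verb accounts for a much larger share of X.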
3. Meaning centrality How central, or prototypical, is the verb meaning for a construction? [Ellis et al., 2014a]
Overview
Main finding [Ellis et al., 2014a,b]
Frequency of verb production in a construction is affected by:
  verb frequency in this construction
  association strength between the verb and the construction
  centrality of verb meaning
1. The original experimental design has certain disadvantages.
2. The set of predictors may not be the best one.
Design: disadvantages
Corpus analysis
  Idea: predict speakers' linguistic output from their input
  Individual differences?
Experimental setup
  Speakers produce verbs in a specific order
  Order reflects preferences?
Predictors: critical overview
Verb-construction joint frequency
  Verb marginal (overall) frequency also important? [Ambridge et al., 2015]
  Cotext-free and cotextual entrenchment [Schmid & Küchenhoff, 2013]
Association strength ΔP (construction → verb)
  Two measures in the same model: ΔP and joint frequency
  There are also alternative measures: Attraction [Schmid, 2000]
Verb semantic centrality
  Often confounded with frequency; its effect is questioned in the acquisition literature [Theakston et al., 2004]
Overview
Computational model
  The bear gives you the ball!
  Daddy's coming home!
  Grandma sent you some cookies.
  John passed you the ball!
  Mr. Rich donated us a thousand dollars.
Computational model
  Grandma sent you some cookies
  The bear gives you the ball
  Mr. Rich donated us a thousand dollars
  Daddy's coming home
  John passed you the ball

  Predicate meaning: cause to receive
  Number of arguments: 3
  Word order: X verb Y Z
  Argument meanings: {human}; {human}; {object}
  Argument roles: Giver; Recipient; Theme
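The feature bundle on this slide can be pictured as a small record type. The sketch below is only an illustration: the class name `Frame` and its field names are hypothetical, chosen to mirror the slide's labels, and say nothing about how the model is actually implemented.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Frame:
    """One usage event, described by the features listed on the slide."""
    predicate_meaning: str
    num_arguments: int
    word_order: str
    argument_meanings: Tuple[str, ...]
    argument_roles: Tuple[str, ...]

# Feature values copied from the slide (ditransitive transfer usages
# such as "The bear gives you the ball").
frame = Frame(
    predicate_meaning="cause to receive",
    num_arguments=3,
    word_order="X verb Y Z",
    argument_meanings=("human", "human", "object"),
    argument_roles=("Giver", "Recipient", "Theme"),
)
print(frame)
```

Making the record immutable (`frozen=True`) is a design choice of this sketch: frames can then be counted or stored in sets without accidental modification.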
Computational model Ditransitive transfer construction Daddy's coming home
Computational model Ditransitive transfer construction... Resultative construction
Computational model Ditransitive transfer construction... Resultative construction Meine Schwester lieh mir Geld. (My sister lent me some money.)
Computational model Ditransitive transfer construction... Resultative construction Das Geld gab ich meiner Mutter. (I gave the money to my mother.)
Computational model [Figure: learner input streams labeled L1, L2, and mixed]
Elicited production task
  He ___ across ARG2
  She ___ ARG2 ARG3
  It ___ on ARG2
  50 constructions (patterns) in total
Outline
1. Simulate the original experiments of [Ellis et al., 2014a,b].
2. Address the methodological issues.
3. Seek a better set of predictors.
1. Replication
Overview
Simulation 1: replicating L1
  Predictor     Coefficient β   p-value
  frequency     0.41            <.001
  association   0.43            <.001
  centrality    0.11            .003
Simulation 1: replicating L2
  Predictor     Coefficient β   p-value
  frequency     0.41            <.001
  association   0.49            <.001
  centrality    0.08            .133
2. Methodological issues
Methodological improvements
Corpus analysis
  Idea: predict speakers' linguistic output from their input
  Individual differences? Analyzing individual input data.
Experimental setup
  Speakers produce verbs in a specific order
  Order reflects preferences?
Design
  Variables from individual input analysis (replacing corpus analysis): Frequency, Association, Centrality
  Variable from experiments: Frequency of use
  Items: Construction (Constr1, Constr2, ...) and Verb (Verb1, Verb2, ...)
Methodological improvements
Corpus analysis
  Idea: predict speakers' linguistic output from their input
  Individual differences? Analyzing individual input data.
Experimental setup
  Speakers produce verbs in a specific order
  Order reflects preferences? Probability of production of each verb.
Design: Experiments
  he ___ across the...

  Participant 1   Participant 2   Participant 3
  went 0.7        ran 0.6         ran 0.4
  came 0.2        came 0.3        looked 0.4
  ran 0.1         jumped 0.1      leaned 0.2

  Verb   Frequency
  run    3
  come   2
  look   1
  lean   1
  jump   1
  go     1
Methodological improvements
  Variables from individual input analysis: Frequency, Association, Centrality
  Variable from experiments: Probability (replacing Frequency of use)
  Items: Construction (Constr1, Constr2, ...) and Verb (Verb1, Verb2, ...)
Overview
Simulation 2: improving method
  Predictor     Coefficient β   β in Simulation 1 (L1)   Significance
  frequency     0.23            0.41                     ***
  association   0.21            0.43                     ***
  centrality    0.06            0.11                     ***
3. Refining the model
A. Which frequency counts?
A. Joint verb-construction frequency
  He ___ it to someone.
  give   1000
  show   150
  send   50
  lend   10
  ...
B. Absolute verb frequency
  be     1,000,000
  have   500,000
  do     250,000
  say    200,000
  ...
Both types of frequency may be important [Schmid, 2010; Ambridge et al., 2015]
B. Which association measure?
  Joint verb-construction frequency
  ΔP contingency
  Attraction
There is some support for all of these measures [Ellis, 2006; Divjak, 2008; Schmid & Küchenhoff, 2013; Blumenthal-Dramé, 2012]
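The three candidate measures can be compared side by side on one contingency table. The following is a sketch under stated assumptions: the cell labels a, b, c, d follow the ΔP slide; Attraction is taken here to be the verb's percentage share of all tokens of the construction, which is how Schmid's (2000) measure is usually described and should be checked against the source; the two verbs and all counts are hypothetical.

```python
def association_measures(a, b, c, d):
    """Return the three candidate measures for one verb-construction pair.

    a: verb in the construction        b: other verbs in the construction
    c: verb in other constructions     d: other verbs in other constructions
    """
    return {
        "joint_frequency": a,
        "delta_p": a / (a + b) - c / (c + d),
        # Assumed definition: share of the construction's tokens, in percent.
        "attraction": 100 * a / (a + b),
    }

# Hypothetical counts: verb1 is frequent everywhere, verb2 is rarer
# overall but used almost exclusively in this construction.
counts = {
    "verb1": (100, 900, 5000, 94000),
    "verb2": (60, 940, 10, 98990),
}
scores = {verb: association_measures(*cells) for verb, cells in counts.items()}

for verb, measures in scores.items():
    print(verb, {name: round(value, 4) for name, value in measures.items()})
```

With these counts, joint frequency and Attraction both rank verb1 above verb2, while ΔP ranks verb2 higher because verb1 is also common outside the construction. Within a single construction, Attraction as defined here is just joint frequency divided by a constant, so the two can only diverge when verbs are compared across constructions.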
Overview
Model comparison
Best model
  The original prediction model ranked rather low (7 out of 12)
  The best prediction model includes:
    Marginal verb frequency
    Joint verb-construction frequency
    Attraction
    Semantic centrality

  Predictor      Coefficient β   Significance
  F (marginal)   0.19            ***
  F (joint)      0.11            ***
  Attraction     0.26            ***
  Centrality     0.01
Particular constructions
Conclusions
  Our model to a certain extent replicates the original experimental results.
  The independent effect of marginal verb frequency supports the distinction between cotext-free and cotextual entrenchment.
  The simultaneous use of two association measures may be justified for large data sets, but not for individual constructions.
  Attraction is the best predictor in our data.
  The impact of centrality is low in our data, and even lower after refining the model.
  Form-based 'constructions' may not be the best unit for such an analysis.
References
Alishahi, A., & Stevenson, S. (2008). A computational model for early argument structure acquisition. Cognitive Science, 32(5), 789-834.
Ambridge, B., Kidd, E., Rowland, C. F., & Theakston, A. L. (2015). The ubiquity of frequency effects in first language acquisition. Journal of Child Language, 42(2), 239-273.
Ellis, N. C., O'Donnell, M. B., & Römer, U. (2014). The processing of verb-argument constructions is sensitive to form, function, frequency, contingency, and prototypicality. Cognitive Linguistics, 25(1), 55-98.
Goldberg, A. E. (1995). Constructions: A Construction Grammar Approach to Argument Structure.
Matusevych, Y., Alishahi, A., & Backus, A. (2015). Distributional determinants of learning argument structure constructions in first and second language. In Proceedings of CogSci-2015.
Matusevych, Y., Alishahi, A., & Backus, A. (n.d.). The impact of first and second language exposure on learning second language constructions. Manuscript submitted for publication.
Schmid, H.-J. (2015). A framework for understanding linguistic entrenchment and its psychological foundations in memory and automatization. In Entrenchment, memory and automaticity. The psychology of linguistic knowledge and language learning.
Theakston, A. L., Lieven, E. V., Pine, J. M., & Rowland, C. F. (2004). Semantic generality, input frequency and the acquisition of syntax. Journal of Child Language, 31(1), 61-99.
Learning scenario
  L1 exposure → Mixed L1 + L2 exposure → Test
Representing language knowledge
1. Distribution
  Input properties: distribution of verbs within a certain construction
  Open task [Ellis et al., 2014]
2. Proficiency score
  Input properties: proficiency score for verbs within a certain construction
  Closed task [Goldschneider & DeKeyser, 2001]
Evaluating language knowledge
1. Distribution (elicited production)
  Giver verb Recipient Theme
  The bear ___ you the ball!
  Verb production probability
2. Proficiency score (comprehension)
  Giver verb Recipient Theme
  The bear gives you the ball! ?
  Verb comprehension score
Formal model
1. Find the most likely construction for a given frame.
2. For this, use prior and conditional probability.
3. Prior probability = entrenchment.
4. Conditional probability = similarity in terms of each feature.
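The four steps above can be sketched as a naive-Bayes-style choice among constructions. This is a minimal illustration, not the authors' implementation: the `Construction` class, the add-one smoothing with an assumed `n_values` per feature, and the two toy constructions are all hypothetical, and the sketch does not model the option of creating a new construction.

```python
import math
from collections import Counter, defaultdict

class Construction:
    """A group of frames with per-feature value counts."""
    def __init__(self):
        self.n_frames = 0
        self.feature_counts = defaultdict(Counter)

    def add(self, frame):
        self.n_frames += 1
        for feature, value in frame.items():
            self.feature_counts[feature][value] += 1

def log_prior(construction, total_frames):
    # Step 3: prior = entrenchment, here the share of frames seen so far.
    return math.log(construction.n_frames / total_frames)

def log_likelihood(construction, frame, smoothing=1.0, n_values=10):
    # Step 4: per-feature similarity, multiplied across features.
    # Smoothing constants are assumptions of this sketch.
    total = 0.0
    for feature, value in frame.items():
        count = construction.feature_counts[feature][value]
        total += math.log(
            (count + smoothing) / (construction.n_frames + smoothing * n_values)
        )
    return total

def best_construction(constructions, frame):
    # Steps 1-2: pick the construction maximizing prior * likelihood.
    total_frames = sum(c.n_frames for c in constructions.values())
    return max(
        constructions,
        key=lambda name: log_prior(constructions[name], total_frames)
        + log_likelihood(constructions[name], frame),
    )

# Toy inventory with hypothetical names and feature values.
constructions = {"ditransitive": Construction(), "intransitive": Construction()}
for _ in range(3):
    constructions["ditransitive"].add(
        {"predicate_meaning": "cause to receive", "num_arguments": 3, "word_order": "X verb Y Z"}
    )
constructions["intransitive"].add(
    {"predicate_meaning": "move", "num_arguments": 1, "word_order": "X verb"}
)

# A test frame with the verb (and its meaning) left out.
test_frame = {"num_arguments": 3, "word_order": "X verb Y Z"}
winner = best_construction(constructions, test_frame)
print(winner)
```

For the verbless test frame, the ditransitive construction wins on both counts: it is more entrenched (three of four frames) and matches both observed features.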
An example frame
  I ate a tuna sandwich.