Psych 156A/ Ling 150: Psychology of Language Learning


Lecture 6: Words III - Grammatical Categories

Announcements
- Lecture notes from last time corrected & posted (there was an error in one of the slides on recall and precision).
- Pick up HW1.
- Be working on HW2 and the review questions for words.

Grammatical Categorization
Computational Problem: Identify the grammatical category of a word (such as noun, verb, adjective, preposition, etc.). This tells you how the word is used in the language, and it allows you to recognize other words that belong to the same category, since they will be used the same way.

Examples of different categories in English:

noun = goblin, kitten, king, girl
Examples of how nouns are used:
- I like that goblin.
- Kittens are adorable.
- A king said that no girls would ever solve the Labyrinth.

verb = like, are, said, solve, stand
Examples of how verbs are used:
- I like that goblin.
- Kittens are adorable.
- A king said that no girls would ever solve the Labyrinth.
- Sarah was standing very close to him.

adjective = silly, adorable, brave, close
Examples of how adjectives are used:
- I like the silliest goblin.
- Kittens are so adorable.
- The king said that only brave girls would solve the Labyrinth.
- Sarah was standing very close to him.

preposition = near, through, to
Examples of how prepositions are used:
- I like the goblin near the king's throne.
- The king said that no girls would get through the Labyrinth.
- Sarah was standing very close to him.

The same reasoning extends to novel words:
- This is a DAX. → DAX = noun
- He is SIBing. → SIB = verb
- He is very BAV. → BAV = adjective
- He should sit GAR the other dax. → GAR = preposition

Categorization: How?
How might children initially learn which categories words belong to?

Idea 1: Deriving Categories from Semantic Information = Semantic Bootstrapping Hypothesis (Pinker 1984)
Children can initially determine a word's category by observing what kind of entity in the world it refers to:
- object, substance = noun (goblins, glitter)
- property = adjective (shiny, stinky)
- action = verb (steal, sing)
The word's meaning is then linked to innate grammatical category knowledge (nouns are objects/substances, verbs are actions, adjectives are properties).

Semantic Bootstrapping Hypothesis: Problem
Mapping rules are not perfect.
- Not all action-like words are verbs: "bouncy" and "a kick" have action-like meanings, but they're not verbs.
- Not all property-like words are adjectives: "is shining" and "it glitters" seem to be referring to properties, but these aren't adjectives.

Idea 2: Distributional Learning
Children can initially determine a word's category by observing the linguistic environments in which words appear:
- Kittens are adorable. → Noun
- Sarah was standing very close to him. → Verb
- I like the silliest goblin. → Adjective
- The king said that no girls would get through the Labyrinth. → Preposition

Are children sensitive to distributional information?
- Children are sensitive to the distributional properties of their native language when they're born (Shi, Werker, & Morgan 1999).
- 15- to 16-month-old German infants can determine that novel words are nouns, based on the distributional information around the novel words (Höhle et al. 2004).
- 18-month-old English-learning infants can track distributional information like "is ...-ing" to signal that a word is a verb (Santelmann & Jusczyk 1998).

Is distributional information enough?
How do we know, for child-directed speech (the linguistic data children actually encounter):
(1) what distributional information children should pay attention to?
(2) whether the available distributional information will actually categorize words correctly?

What data should children pay attention to?
"...the question is how the learner is to know which environments are important and which should be ignored. Distributional analyses that consider all the possible relations among words in a corpus of sentences would be computationally unmanageable at best, and impossible at worst."

One idea: local contexts. "...by showing that local contexts are informative, these findings suggested a solution to the problem of there being too many possible environments to keep track of: focusing on local contexts might be sufficient."

Frequent Frames
Idea: What categorization information is available if children track frequent frames?
Frequent frame: X __ Y, where X and Y are words that frame another word and appear frequently in the child's linguistic environment.

Examples: the __ is, can __ him
- the king is / the goblin is / the girl is
- can trick him / can help him / can hug him

Samples of Child-Directed Speech
Data representing the child's linguistic environment: 6 corpora of child-directed speech from the CHILDES database, which contains transcriptions of parents interacting with their children.
(Corpus (sg.), corpora (pl.) = a collection of data; from Latin for "body": a body of data.)
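As a concrete illustration, frame tracking is just counting word pairs with one slot between them. Here is a minimal sketch, assuming utterances arrive already segmented into words (the toy utterances are illustrative, not from CHILDES):

```python
# A minimal sketch of frequent-frame counting. A frame "X __ Y" is two
# words with exactly one word between them; the most frequent frames
# become the learner's frame set.
from collections import Counter

def count_frames(utterances):
    """Count every X __ Y frame in a list of utterances."""
    frames = Counter()
    for utterance in utterances:
        words = utterance.lower().split()
        for i in range(len(words) - 2):
            frames[(words[i], words[i + 2])] += 1
    return frames

utterances = [
    "the king is here",
    "the goblin is gone",
    "the girl is clever",
]
frames = count_frames(utterances)
print(frames[("the", "is")])   # 3: the __ is framed king, goblin, and girl
print(frames.most_common(1))   # [(('the', 'is'), 3)]
```

In Mintz (2003), the frequent set was simply the 45 most frequent frames, i.e. `frames.most_common(45)` in this sketch.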

Defining "Frequent"
Definition of "frequent" for frequent frames: frames appearing a certain number of times in a corpus.
"The principles guiding inclusion in the set of frequent frames were that frames should occur frequently enough to be noticeable, and that they should also occur enough to include a variety of intervening words to be categorized together... a pilot analysis with a randomly chosen corpus, Peter, determined that the 45 most frequent frames satisfied these goals and provided good categorization."
Set of frequent frames = the 45 most frequent frames.

Example of deciding which frames were frequent:

Frame - how often it occurred in the corpus
(1) the __ is: 600 times
(2) a __ is: 580 times
(3) she __ it: 450 times
...
(45) they __ him: 200 times
(46) we __ have: 199 times
Frames (1)-(45) are considered frequent; frame (46) and below are not.

Testing the Categorization Ability of Frequent Frames
Determining the success of frequent frames: try out frequent frames on a corpus of child-directed speech.

Frame (1): the __ is
Transcript: "the radio is in the way...", "but the doll is...", "and the teddy is..."
radio, doll, teddy are placed into the same category by the __ is.

Frame (13): you __ it
Transcript: "you draw it so that he can see it", "you dropped it on purpose!", "so he hit you with it"
draw, dropped, with are placed into the same category by you __ it.

Precision = (# of words identified correctly as Category within frame) / (# of words identified as Category within frame)
Recall = (# of words identified correctly as Category within frame) / (# of words that should have been identified as Category)

Determining the success of frequent frames

Frame: you __ it
Category: draw, dropped, with (most similar to Verb, so compare to Verb)

Precision:
- # of words correctly identified as Verb = 2 (draw, dropped)
- # of words identified as Verb = 3 (draw, dropped, with)
- Precision for you __ it = 2/3

Recall:
- # of words correctly identified as Verb = 2 (draw, dropped)
- # of words that should be identified as Verb = all verbs in the corpus (play, sit, draw, dropped, ran, kicked, ...) = 100 here
- Recall = 2/100 (a much smaller number)

Some actual frequent frame results

Frame: you __ it
Category includes: put, want, do, see, take, turn, taking, said, sure, lost, like, leave, got, find, throw, threw, think, sing, reach, picked, get, dropped, seen, lose, know, knocked, hold, help, had, gave, found, fit, enjoy, eat, chose, catch, with, wind, wear, use, took, told, throwing, stick, share, sang, roll, ride, recognize, reading, ran, pulled, pull, press, pouring, pick, on, need, move, manage, make, load, liked, lift, licking, let, left, hit, hear, give, flapped, fix, finished, drop, driving, done, did, cut, crashed, change, calling, bring, break, because, banged
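The precision and recall calculation for a single frame can be sketched in a few lines. The gold labels below are illustrative, assuming a toy corpus with six verbs, not the actual corpus counts:

```python
# A minimal sketch of per-frame precision/recall, assuming we know each
# word's true category (the verb list here is hypothetical).
def precision_recall(cluster, gold_category_words):
    """Score one frame's word cluster against the full set of words
    that truly belong to the category."""
    correct = cluster & gold_category_words
    precision = len(correct) / len(cluster)
    recall = len(correct) / len(gold_category_words)
    return precision, recall

# The frame you __ it groups these words together:
cluster = {"draw", "dropped", "with"}
# All verbs in the (hypothetical) corpus:
verbs = {"draw", "dropped", "play", "sit", "ran", "kicked"}

p, r = precision_recall(cluster, verbs)
print(p)  # precision = 2/3: draw and dropped are verbs, with is not
print(r)  # recall = 2/6: only two of the corpus's verbs were captured
```

This makes the asymmetry in the results concrete: precision divides by the (small) cluster, while recall divides by every word in the corpus that belongs to the category.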

Frame: the __ is
Category includes: moon, sun, truck, smoke, kitty, fish, dog, baby, tray, radio, powder, paper, man, lock, lipstick, lamb, kangaroo, juice, ice, flower, elbow, egg, door, donkey, doggie, crumb, cord, clip, chicken, bug, brush, book, blanket, Mommy

How successful frequent frames were
Precision: above 90% for all corpora (high) = very good!
Interpretation: When a frequent frame clustered words together into a category, they usually did belong together. (Nouns were put together, verbs were put together, etc.)

Recall: around 10% for all corpora (very low) = maybe not as good.
Interpretation: A frequent frame made lots of little clusters, rather than clustering all the words of a category together. (So there were lots of Noun-ish clusters, lots of Verb-ish clusters, etc.)

Getting better recall
How could we form just one category each of Verb, Noun, etc.?
Observation: Many frames overlap in the words they identify.
- the __ is: dog, cat, king, girl
- the __ was: dog, cat, king, teddy
- a __ is: dog, goblin, king, girl
- that __ is: cat, goblin, king, teddy

What about putting clusters together that have a certain number of words in common?

Getting better recall
Observation: Many frames overlap in the words they identify.

Merging the __ is with the __ was (they share dog, cat, and king) gives:
- the __ is/was: dog, cat, king, girl, teddy
- a __ is: dog, goblin, king, girl
- that __ is: cat, goblin, king, teddy

Merging the __ is/was with a __ is (they share dog, king, and girl) gives:
- the/a __ is/was: dog, cat, goblin, king, girl, teddy
- that __ is: cat, goblin, king, teddy
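This overlap-based merging can be sketched as repeatedly unioning any two clusters that share enough words. The threshold of 3 shared words below is illustrative, not the value used in the actual analysis:

```python
# A minimal sketch of overlap-based cluster merging: any two clusters
# sharing at least `min_shared` words are merged, until no more merges
# are possible. The threshold is a hypothetical choice.
def merge_clusters(clusters, min_shared=2):
    """Repeatedly merge any two clusters sharing min_shared+ words."""
    clusters = [set(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if len(clusters[i] & clusters[j]) >= min_shared:
                    clusters[i] |= clusters.pop(j)  # union, drop the second
                    merged = True
                    break
            if merged:
                break
    return clusters

frame_clusters = [
    {"dog", "cat", "king", "girl"},      # the __ is
    {"dog", "cat", "king", "teddy"},     # the __ was
    {"dog", "goblin", "king", "girl"},   # a __ is
    {"cat", "goblin", "king", "teddy"},  # that __ is
]
result = merge_clusters(frame_clusters, min_shared=3)
print(result)  # one merged cluster containing all six words
```

Run on the four toy frame clusters above, this collapses them into a single Noun-like cluster {dog, cat, goblin, king, girl, teddy}.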

Merging the/a __ is/was with that __ is (they share cat, goblin, king, and teddy) gives one cluster:
- the/a/that __ is/was: dog, cat, goblin, king, girl, teddy

With this kind of cluster merging:
Recall goes up to 91% (very high) = very good!
Precision stays above 90% (very high) = very good!

Recap
- Frequent frames are non-adjacent co-occurring words with one word in between them (ex: the __ is).
- They are likely to be information young children are able to track, based on experimental studies.
- When tested on realistic child-directed speech, frequent frames do very well at grouping words into clusters that are very similar to actual grammatical categories like Noun and Verb.
- Frequent frames could be a very good strategy for children to use.

Wang & Mintz 2008: Simulating children using frequent frames
"...the frequent frame analysis procedure proposed by Mintz (2003) was not intended as a model of acquisition, but rather as a demonstration of the information contained in frequent frames in child-directed speech... Mintz (2003) did not address the question of whether an actual learner could detect and use frequent frames to categorize words."
"This paper addresses this question with the investigation of a computational model of frequent frame detection that incorporates more psychologically plausible assumptions about the memor[y] resources of learners."

Computational model: a program that simulates the mental processes occurring in a child. This requires knowing what the input and output are, and then testing the algorithms that can take the given input and transform it into the desired output.

Wang & Mintz (2008): Considering Children's Limitations

Considerations:
(1) Children possess limited memory and cognitive capacity, and cannot track all the occurrences of all the frames in a corpus.
(2) Retention is not perfect: infrequent frames may be forgotten.

The Model's Operation:
(1) Only 150 frame types (and their frequencies) are held in memory.
(2) Forgetting function: frames that have not been encountered recently are less likely to stay in memory than frames that have been encountered recently.

How the model processes input:
(1) The child encounters an utterance (e.g. "You read the story to mommy.").
(2) The child segments the utterance into frames:
- you X the (from "You read the")
- read X story (from "read the story")
- the X to (from "the story to")
- story X mommy (from "story to mommy")
Frames: you __ the, read __ story, the __ to, story __ mommy

Processing Step 1 (you __ the):
If memory is not full, a newly-encountered frame is added to memory and its initial activation is set to 1.
- you __ the: 1.0

The forgetting function is simulated by the activation of each frame in memory decreasing by 0.0075 after each processing step:
- you __ the: 0.9925

Processing Step 2 (read __ story):
When a new frame is encountered, the updating depends on whether the memory is already full. If it is not full and the frame has not been encountered before, the new frame is added to memory with activation 1.
- read __ story: 1.0
- you __ the: 0.9925

After the forgetting function applies:
- read __ story: 0.9925
- you __ the: 0.9850

Processing Step 3 (the __ to): the new frame is added with activation 1.
- the __ to: 1.0
- read __ story: 0.9925
- you __ the: 0.9850

After the forgetting function applies:
- the __ to: 0.9925
- read __ story: 0.9850
- you __ the: 0.9775

Processing Step 4 (story __ mommy): the new frame is added with activation 1.
- story __ mommy: 1.0
- the __ to: 0.9925
- read __ story: 0.9850
- you __ the: 0.9775

After the forgetting function applies:
- story __ mommy: 0.9925
- the __ to: 0.9850
- read __ story: 0.9775
- you __ the: 0.9700

Processing Step 5 (you __ the):
If the frame is already in memory because it was encountered before, the activation for that frame increases by 1.
- story __ mommy: 0.9925
- the __ to: 0.9850
- read __ story: 0.9775
- you __ the: 0.9700 → 1.9700

After the forgetting function applies:
- you __ the: 1.9625
- story __ mommy: 0.9850
- the __ to: 0.9775
- read __ story: 0.9700

Eventually, since the memory only holds 150 frames, the memory will become full. For example, after processing step 200:
- story __ mommy: 4.6925
- the __ to: 3.9850
- read __ story: 3.9700
- you __ the: 2.6925
- ...
- she __ him: 0.9850
- we __ it: 0.7500

At this point, if a frame not already in memory is encountered, it replaces the frame with the least activation, as long as that activation is less than 1.0.

Processing step 201 (because __ said): we __ it has the least activation (0.7500 < 1.0), so it is replaced by the new frame.
- story __ mommy: 4.6925
- the __ to: 3.9850
- read __ story: 3.9700
- you __ the: 2.6925
- because __ said: 1.0000
- she __ him: 0.9850

Eventually, however, all the frames in memory will have been encountered often enough that their activations are greater than 1. For example, after processing step 5000:
- story __ mommy: 9.6925
- the __ to: 8.9850
- read __ story: 8.9700
- you __ the: 5.6925
- we __ her: 3.9700
- she __ him: 2.9850

Processing step 5001 (because __ him): no change is made to memory, since the new frame's activation of 1 would be less than that of the least active frame already in memory. The forgetting function is then invoked.
- story __ mommy: 9.6850
- the __ to: 8.9775
- read __ story: 8.9625
- you __ the: 5.6850
- we __ her: 3.9625
- she __ him: 2.9775

Wang & Mintz (2008): How the model did
Using the same corpora for input as Mintz (2003) (6 from CHILDES: Anne, Aran, Eve, Naomi, Nina, Peter), the model's precision was above 0.93 for all six corpora. This is very good! When the model decided a word belonged in a particular category (Verb, Noun, etc.), it usually did.

Wang & Mintz (2008): Conclusions
"...our model demonstrates very effective categorization of words. Even with limited and imperfect memory, the learning algorithm can identify highly informative contexts after processing a relatively small number of utterances, thus yield[ing] a high accuracy of word categorization. It also provides evidence that frames are a robust cue for categorizing words."
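The memory mechanism walked through above can be sketched compactly. This is a minimal reconstruction using the parameters stated in the slides (150 frame slots, decay of 0.0075 per processing step, +1 activation on re-encounter, replacement of the weakest frame only when its activation is below 1.0); the exact bookkeeping in Wang & Mintz's implementation may differ:

```python
# A minimal sketch of the frame-memory model described above.
DECAY = 0.0075
MEMORY_SIZE = 150

def frames_of(utterance):
    """Segment an utterance into its X __ Y frames."""
    words = utterance.lower().split()
    return [(words[i], words[i + 2]) for i in range(len(words) - 2)]

def process_frame(memory, frame):
    """One processing step: update activations for a newly heard frame."""
    if frame in memory:
        memory[frame] += 1.0                  # re-encountered: boost by 1
    elif len(memory) < MEMORY_SIZE:
        memory[frame] = 1.0                   # room left: add with activation 1
    else:
        weakest = min(memory, key=memory.get)
        if memory[weakest] < 1.0:             # replace only a weak frame
            del memory[weakest]
            memory[frame] = 1.0
    for f in memory:                          # forgetting function
        memory[f] -= DECAY
    return memory

memory = {}
for frame in frames_of("you read the story to mommy"):
    process_frame(memory, frame)
print(round(memory[("you", "the")], 4))  # 0.97: added at step 1, decayed 4 times
```

Tracing the example utterance through this sketch reproduces the activations in the walkthrough: after four steps, you __ the sits at 0.97 and story __ mommy at 0.9925.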

Wang & Mintz (2008): Recap
While Mintz (2003) showed that frequent frame information is useful for categorization, it did not demonstrate that children - who have constraints like limited memory and cognitive processing power - would be able to use this information effectively. Wang & Mintz (2008) showed that a model using frequent frames in a psychologically plausible way (that is, a way that children might actually identify and use frequent frames) achieved the same success at identifying the grammatical categories of words.

Questions?