Explaining lexical frequency effects: a critique and an alternative account


Márton Sóskuthy
University of York
29 May 2014

Outline
- Introduction
- Exposure
- Independence
- Conclusion

Introduction
- lexical frequency effects (e.g. Schuchardt 1972; Bybee 2001; Pierrehumbert 2001): lexical frequency (partly) determines the speed at which lexical items undergo sound changes
- in this talk: only looking at phonetically gradient sound change
- rapidly growing literature
- many different explanations & findings

Introduction
- different sources for frequency effects:
  - exposure (Pierrehumbert, 2001): high-frequency items erode faster
  - independence (Bybee, 2001): high-frequency items are more independent, so more resistant to change
  - ...

Introduction
- problem: many different explanations; predictions not clear and not sufficiently distinct
- main goal: clarify the predictions of the exposure-based and independence-based accounts
- solution: computational and mathematical modelling


Exposure: Prediction
- high-frequency forms are exposed to phonetic biases more often, so they change faster
- Pierrehumbert (2002): "[... H]igh frequency words are affected more because they are produced more often and so more memories of them in their lenited form accrue, once the lenition gets underway."

Exposure: Prediction
- apparent prediction: positive linear frequency effect (e.g. Pierrehumbert 2001)
[Figure: snapshot of sound change; x-axis: frequency (0 to 100), y-axis: phonetic dimension]

Exposure: Simulation architecture
- modelling framework: simplified version of Pierrehumbert (2001)
- each word has a separate representation
- looking at how those representations evolve as a function of frequency
- parametric (prototype-based) representations
- the results hold for the original model as well

Exposure: Simulation architecture
[Figure: production-perception loop with four stages: category representations, sampling, biases, feedback. Each panel plots density against VOT (ms).]

Exposure: Simulation architecture
- in exemplar models:
  - memory activation of exemplars decreases with time
  - memory activations summed (MASS) track frequency
- current model also includes MASS
- words with higher MASS (i.e. high-frequency categories) are more resistant to incoming stimuli
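The link between MASS and frequency can be illustrated with a minimal sketch. It assumes, purely for illustration, that every stored activation is multiplied by a fixed factor `decay` once per time unit and that a word receives `freq` new tokens per time unit; the function name, `decay` and `n_steps` are hypothetical and are not taken from the slides.

```python
def steady_state_mass(freq, decay=0.9, n_steps=200):
    """Summed memory activation for a word heard `freq` times per time unit."""
    mass = 0.0
    for _ in range(n_steps):
        # ASSUMPTION: old exemplars fade by `decay`, while this time unit's
        # tokens enter memory at full strength (activation 1 each).
        mass = decay * mass + freq
    return mass


mass_low = steady_state_mass(1)
mass_high = steady_state_mass(100)
print(mass_low, mass_high)
```

Under this assumption the sum converges to `freq / (1 - decay)`, so a word that is 100 times more frequent ends up with a MASS that is 100 times larger.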

Exposure: Simulation architecture
- a single phonetic dimension
- each word initialised at 0
- 20,000 simulated word categories
- word frequency varies between 1 and 100 per time unit
- constant positive bias
- 500 time units for each word
- looking at means for each word
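A setup of this kind could be sketched roughly as follows. This is not the author's code: the update rule (each token moves the word mean by its deviation divided by MASS), the proportionality `mass = tau * freq`, and the values of `bias`, `noise_sd` and `tau` are all assumptions made for illustration, and the run uses 4,000 words at two frequencies rather than 20,000 words across the full range.

```python
import numpy as np


def simulate_exposure(freqs, n_steps=500, bias=0.1, noise_sd=0.2, tau=10.0, seed=0):
    """Return each word's mean after n_steps time units (all words start at 0)."""
    rng = np.random.default_rng(seed)
    freqs = np.asarray(freqs, dtype=float)
    mass = tau * freqs  # ASSUMPTION: steady-state MASS is proportional to frequency
    means = np.zeros_like(freqs)
    for _ in range(n_steps):
        # A word with frequency f receives f tokens in this time unit. Each token
        # is produced at mean + bias + noise and shifts the mean by
        # (bias + noise) / mass, so the f shifts add up to a normal variate with:
        step_mean = freqs * bias / mass             # = bias / tau, f cancels
        step_sd = np.sqrt(freqs) * noise_sd / mass  # shrinks as 1 / sqrt(f)
        means += rng.normal(step_mean, step_sd)
    return means


freqs = np.repeat([1, 100], 2000)  # 2,000 rare and 2,000 frequent words
means = simulate_exposure(freqs)
low, high = means[freqs == 1], means[freqs == 100]
print("group means:", low.mean(), high.mean())
print("group spreads:", low.std(), high.std())
```

With these illustrative parameters both groups end up centred on the same value (`bias / tau * n_steps`, here 5), while the spread of the low-frequency group is roughly ten times that of the high-frequency group.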

Exposure: Results
[Figure: word frequency vs word means after 500 time units, showing observed values and predictions from the mathematical model; x-axis: frequency (0 to 100), y-axis: phonetic dimension (4.4 to 5.4)]

Exposure: Results
1. frequency has no effect on the expected values of the means
   - the effect of MASS cancels out the effect of exposure
2. the mean values vary more for low-frequency words
   - consequence of the algebra of random variables...
   - this is an empirically testable prediction!
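Both results can be checked on paper under one simple set of assumptions that the slides do not spell out: each token shifts the word mean by $(b + \varepsilon)/M$, with a constant bias $b$, independent noise $\varepsilon$ of variance $\sigma^2$, and a MASS $M = \tau f$ proportional to the word's frequency $f$ ($\tau$ is a hypothetical memory constant). After $T$ time units the word has received $fT$ tokens, so:

```latex
\mu_T = \sum_{i=1}^{fT} \frac{b + \varepsilon_i}{\tau f}, \qquad
\mathrm{E}[\mu_T] = fT \cdot \frac{b}{\tau f} = \frac{bT}{\tau}, \qquad
\mathrm{Var}[\mu_T] = fT \cdot \frac{\sigma^2}{\tau^2 f^2} = \frac{\sigma^2 T}{\tau^2 f}
```

Frequency cancels in the expectation (more tokens, but each counts for less) and survives only in the denominator of the variance, which is why rare words scatter more widely around the same expected value.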


Independence: Prediction
- instances of the same sound category in different words are not completely independent
- high-frequency forms are more independent than low-frequency forms (cf. Bybee 2001)
  - they resist analogical change more
  - they can also stray further from the rest of the category?

Independence: Simulation architecture
- modelling framework: simplified version of Pierrehumbert (2002)
- each word represented separately
- category representation, bias and update same as previously
- sampling is different: samples from a weighted mixture distribution
  - frequent words: heavier weight
  - target word: heavier weight

Independence: Simulation architecture
- same as before, except: only half of the words are affected by the bias
  - otherwise, results would be identical to the previous simulation
  - similar to e.g. /u/-fronting: tube biased; cool not biased
- 100 word categories (frequencies: Zipf distribution)
- simulation repeated 100 times with same parameters
- looking at one word from each frequency bin
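One way to picture the mixture sampling is the sketch below. It is an illustrative reconstruction, not the author's implementation: the weighting scheme (each word weighted by its frequency, with the target word's own weight multiplied by a hypothetical `target_boost`), the Zipf-like frequencies `100 / rank`, the synchronous update within each time unit, and all parameter values are assumptions. For a reproducible comparison the usage example biases every second word instead of a random half.

```python
import numpy as np


def simulate_independence(n_words=100, n_steps=500, bias=0.1, noise_sd=0.2,
                          tau=10.0, target_boost=20.0, biased=None, seed=0):
    """Return (freqs, biased, means) after n_steps time units."""
    rng = np.random.default_rng(seed)
    ranks = np.arange(1, n_words + 1)
    # ASSUMPTION: Zipf-like token counts per time unit, from 100 down to 1.
    freqs = np.maximum(1, np.round(100 / ranks)).astype(int)
    if biased is None:
        biased = rng.permutation(n_words) < n_words // 2  # a random half
    biased = np.asarray(biased, dtype=bool)
    # Row w holds the mixture weights used when word w is produced: every word
    # is weighted by its frequency, and the target word gets an extra boost.
    weights = np.tile(freqs.astype(float), (n_words, 1))
    weights[np.diag_indices(n_words)] *= target_boost
    weights /= weights.sum(axis=1, keepdims=True)
    means = np.zeros(n_words)
    for _ in range(n_steps):
        new_means = means.copy()
        for w in range(n_words):
            n = freqs[w]
            # Draw the mixture component behind each of the n tokens.
            counts = rng.multinomial(n, weights[w])
            token_mean = counts @ means / n + rng.normal(0.0, noise_sd / np.sqrt(n))
            if biased[w]:
                token_mean += bias
            # n tokens, each weighted 1 / (tau * n): MASS again cancels exposure.
            new_means[w] += (token_mean - means[w]) / tau
        means = new_means
    return freqs, biased, means


mask = np.arange(100) % 2 == 0  # words 0, 2, 4, ... carry the bias
freqs, biased, means = simulate_independence(biased=mask, seed=1)
rare = freqs == 1
print("biased:   top word", means[0], "| rare words", means[biased & rare].mean())
print("unbiased: top word", means[1], "| rare words", means[~biased & rare].mean())
```

In this sketch the most frequent biased word ends up ahead of the rare biased words, and the most frequent unbiased word ends up behind the rare unbiased words, because a heavily self-weighted word is pulled less strongly towards the rest of the category.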

Independence: Results
[Figure: word frequency vs word means (pooled results from 200 simulations); two panels: "words with phonetic bias" and "words without phonetic bias"; x-axis: frequency (0 to 100), y-axis: phonetic dimension (1 to 5)]

Independence: Results
1. frequency has a positive effect for biased words (tube-type)
2. frequency has a negative effect for non-biased words (cool-type)
3. the mean values vary more for low-frequency words in both groups
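The direction of the first two effects can be motivated with a short calculation under one possible parameterisation, which is an assumption rather than something stated in the slides: when word $w$ is produced, every word $j$ enters the mixture with weight $f_j$, except that the target's own weight is multiplied by a hypothetical boost $k > 1$; MASS is $\tau f_w$ as in the exposure case. Writing $F = \sum_j f_j$, $b_w \in \{0, b\}$ for the word's bias, and $G$ for the frequency-weighted mean of all words, the expected change of $\mu_w$ per time unit is:

```latex
\mathrm{E}[\Delta\mu_w] = \frac{c_w\,(G - \mu_w) + b_w}{\tau}, \qquad
c_w = \frac{F}{F + (k-1)\,f_w}, \qquad
G = \frac{1}{F}\sum_j f_j\,\mu_j
% once all words drift at a common rate v per time unit:
\mu_w - G = \frac{b_w - v\tau}{c_w}, \qquad 0 < v\tau < b
```

Because $c_w$ falls as $f_w$ grows, a biased word ($b_w = b$) sits further ahead of the category the more frequent it is, while an unbiased word ($b_w = 0$) lags further behind the more frequent it is.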


Conclusion
- careful modelling is crucial to unpack the predictions of theories of sound change
- exposure:
  - no effect on expected value of mean
  - the evolution of infrequent words is less predictable
- independence:
  - positive frequency effect in trigger environment
  - negative frequency effect in elsewhere environment
  - the evolution of infrequent words is less predictable

References

Bybee, J. L. (2001). Phonology and language use. Cambridge University Press, Cambridge.

Jurafsky, D., Bell, A., Gregory, M., and Raymond, W. D. (2001). Probabilistic relations between words: Evidence from reduction in lexical production. In Bybee, J. L. and Hopper, P., editors, Frequency and the emergence of linguistic structure, pages 229-254. John Benjamins, Amsterdam.

Phillips, B. S. (2006). Word Frequency and Lexical Diffusion. Palgrave Macmillan, Basingstoke, UK & New York, NY.

Pierrehumbert, J. B. (2001). Exemplar dynamics: Word frequency, lenition, and contrast. In Bybee, J. L. and Hopper, P., editors, Frequency and the emergence of linguistic structure, pages 137-157. John Benjamins, Amsterdam.

Pierrehumbert, J. B. (2002). Word-specific phonetics. In Gussenhoven, C. and Warner, N., editors, Laboratory phonology, Vol. VII. Mouton de Gruyter, Berlin.

Schuchardt, H. (1885/1972). On sound laws: Against the Neogrammarians. In Vennemann, T. and Wilbur, T. H., editors, Schuchardt, the Neogrammarians, and the transformational theory of phonological change (Linguistische Forschungen 26), pages 29-72. Athenäum, Frankfurt am Main.

Conclusion: Simulation architecture for independence simulations
[Figure: production-perception loop with four stages: category representations, sampling, biases, feedback. Each panel plots density against VOT (ms).]