
Speech Perception. NACS 642, 01 April 2009

[Figure: sound waves described by power/amplitude and frequency; simple waves summed into a complex wave]

Tonotopic Organization

Speech...

Source-Filter Model
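In standard source-filter terms (this formulation is textbook acoustics, added here for reference rather than taken from the slides), the output spectrum is the glottal source shaped by the vocal-tract filter and lip radiation:

```latex
Y(f) = S(f) \cdot T(f) \cdot R(f)
```

where S(f) is the glottal source spectrum, T(f) is the vocal-tract transfer function (its peaks are the formants), and R(f) is the radiation characteristic at the lips.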

[Figure: spectrogram, frequency vs. time]

Stop Consonants: [p b t d k g]

Fricatives: [θ ð f v s z ʃ ʒ]

The Problem of Speech Perception

Hypothesized Representational Format

How do we get from here to there?

The simplest theory. Hypothesis: there is a one-to-one relationship between pieces of acoustic information and the segmental information stored in our heads.

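If the simplest theory were right, perception would reduce to table lookup: one acoustic cue in, one segment out. A minimal sketch of that idea in Python; the cue labels here are hypothetical placeholders, invented purely for illustration:

```python
# Hypothetical one-to-one lookup from acoustic cue to stored segment.
ACOUSTIC_TO_SEGMENT = {
    "burst+short_VOT": "d",   # cue names are illustrative, not real features
    "burst+long_VOT": "t",
    "frication+voicing": "z",
}

def perceive(cue: str) -> str:
    # One cue in, one segment out; no context or speaker information needed.
    return ACOUSTIC_TO_SEGMENT[cue]

print(perceive("burst+long_VOT"))  # -> "t"
```

The next observation shows why this cannot be the whole story.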


Different Acoustic Input: Same percept!

[Figure: vowel space, front-back by high-low]


Peterson & Barney (1952)

Obscured by phonetic context and speaker differences...

A simple one-to-one mapping between acoustic cue and phoneme doesn't seem to exist...

From vibrations in the ear to abstractions in the brain: sounds to words. A continuously varying waveform, carrying information on multiple time and frequency scales, must be encoded, and then decoded to make contact with the long-term linguistic representations in memory.

sincetherearenowordboundarysignsinspokenlanguagethedifficultywefeelinreading andunderstandingtheaboveparagraphprovidesasimpleillustrationofoneofthemaind ifficultieswehavetoovercomeinordertounderstandspeechratherthananeatlyseparat edsequenceofletterstringscorrespondingtothephonologicalformofwordsthespeech signalisacontinuousstreamofsoundsthatrepresentthephonologicalformsofwordsin additionthesoundsofneighboringwordsoftenoverlapwhichmakestheproblemofident ifyingwordboundariesevenharder

Why speech perception should not work:
- Linearity: no straightforward mapping between stretches of sound and phonemes.
- Invariance: no (obvious) invariant features identify a given phoneme in all contexts.
- Perceptual constancy: we reliably identify speech despite tremendous variation across speakers (pitch, rate, accent, affect...).
(Halle and Stevens 1962; Chomsky and Miller 1963)

The acoustic input varies across speakers, phonetic context, rate, etc.; the stored representations are stable across speakers, phonetic context, rate, etc. What set of perceptual/neural mechanisms mediates the mapping between acoustic input and long-term memory representations?

The Problem of Speech Perception: [+voiced], [+continuant]. What's involved in this mapping?

The Problem of Speech Perception [Figure: three waveforms of roughly 2.6 s, 3.2 s, and 5.6 s; amplitude vs. time]

Questions cognitive neuroscience can help answer:
1. What is the nature of stored mental representations?
2. What types of mechanisms are involved in mapping from acoustics to memory?
3. What brain areas are implicated in the perception of speech?

Levels of Representation
Acoustics: variation in air pressure; analog input to the auditory system.
Phonetics: language-specific categorization of different acoustic tokens (phonetic tokens); discriminability of different acoustic tokens relatively preserved.
Phonology: abstract symbolic representations; fine-grained distinctions irrelevant; all-or-nothing category membership (phonemes).
English: [pʰat] 'pot' vs. [spat] 'spot', a single phoneme /p/. Hindi: [pʰəl] 'fruit' vs. [pəl] 'moment', two phonemes /pʰ/ and /p/.

Phonetic Categories
Map acoustic tokens into a multidimensional space. There may still be speech-specific processing, but representations are not discrete, abstract, etc.: fine phonetic detail is stored. [Figure: overlapping clouds of 't' and 'd' tokens in phonetic space]
(Dennis Klatt, Stephen Goldinger, Peter Jusczyk, Jessica Maye, Keith Johnson)

Voice Onset Time: 'The dot' (short VOT) vs. 'The tot' (long VOT).

Voice Onset Time [Figure: histogram of the number of tokens produced for /da/ and /ta/; [d] tokens cluster at short VOTs, [t] tokens at long VOTs; x-axis: VOT in 10 ms bins from 0 to 120 ms]


Categorical Perception. Discrimination task: [da] (VOT 20 ms) vs. [ta] (VOT 80 ms): same or different?

Categorical Perception. Identification task: [da] (VOT 20 ms): /d/ or /t/?
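A classic way to formalize categorical perception is the Haskins labeling model: listeners discriminate two stimuli only insofar as they label them differently, so discrimination can be predicted from the identification function. A sketch in Python; the logistic identification curve, its 30 ms boundary, and its slope are assumptions for illustration, not values from the slides:

```python
import numpy as np

# Idealized identification function: probability of labeling a stimulus /t/
# as a function of VOT, modeled as a logistic around an assumed 30 ms boundary.
def p_t(vot_ms, boundary=30.0, slope=0.4):
    return 1.0 / (1.0 + np.exp(-slope * (vot_ms - boundary)))

# Haskins-style prediction for ABX discrimination of stimuli a and b:
# P(correct) = 1/2 + (p_a - p_b)^2 / 2, i.e., chance unless the labels differ.
def predicted_abx(vot_a, vot_b):
    pa, pb = p_t(vot_a), p_t(vot_b)
    return 0.5 + 0.5 * (pa - pb) ** 2

# Equal 20 ms acoustic steps along the continuum: predicted discrimination
# stays near chance within a category and peaks at the category boundary.
for a in range(0, 80, 10):
    print(f"{a:2d} vs {a + 20:3d} ms: P(correct) = {predicted_abx(a, a + 20):.2f}")
```

This reproduces the signature pattern: cross-boundary pairs are easy to discriminate, while equally spaced within-category pairs sit near chance.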

Voice Onset Time: identification and identification RT (from Phillips et al. 2000).

MMN = Mismatch Negativity
An ERP (event-related potential) component that reflects sensory discrimination. It is elicited by repeated presentation of a sound stimulus (standard) which is occasionally changed into a different sound (deviant): X X X X X X Y X X X X X Y X X X X Y. It is elicited pre-attentively!
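For concreteness, here is a minimal Python sketch of the oddball paradigm that elicits the MMN; the deviant probability and the spacing constraint are illustrative assumptions, not parameters from any of the studies discussed:

```python
import random

def oddball_sequence(standard, deviant, n_trials=30, p_deviant=0.15,
                     min_gap=3, seed=0):
    """Mostly `standard` tokens with occasional `deviant` tokens, keeping at
    least `min_gap` standards after each deviant (a typical oddball constraint)."""
    rng = random.Random(seed)
    seq, since_deviant = [], min_gap
    for _ in range(n_trials):
        if since_deviant >= min_gap and rng.random() < p_deviant:
            seq.append(deviant)
            since_deviant = 0
        else:
            seq.append(standard)
            since_deviant += 1
    return seq

# e.g. X X X Y X X X X X Y X X ...
print(" ".join(oddball_sequence("X", "Y")))
```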

[Figure: ERP difference wave = standard - deviant; N1 (N100) marked. Notice: negative voltage is plotted up, positive down. From Näätänen (1999)]

N1 (N100): an obligatory ERP component that reflects sensory encoding of auditory stimulus attributes.

Discriminability of phones by VOT
- Behavioral level: categorical perception (established)
- Electrophysiological level: MMN?

Looking at VOT: [dæ] vs. [tæ]
- Behavioral data
- EEG: N1 (sensory encoding)
- EEG: MMN (sensory discrimination)

Sharma & Dorman (1999), behavioral experiment. Discrimination: AX task; performance at chance level.

Sharma & Dorman (1999), MMN experiment: 30-50 ms and 60-80 ms pairs.

Levels of Representation
Acoustics: variation in air pressure; analog input to the auditory system.
Phonetics: language-specific categorization of different acoustic tokens; discriminability of different acoustic tokens relatively preserved.
Phonology: abstract symbolic representations; fine-grained distinctions irrelevant; all-or-nothing category membership.

Questions: What kinds of representation is the MMN sensitive to? Acoustic? Phonetic? Phonemic? How can we be sure it's not just acoustics?

Potential problem. How can we be sure it's not just acoustics? There seems to be a difference between the 30-50 ms and the 60-80 ms MMN responses; but what if this difference has nothing to do with the phonetic category people perceive? Could it be that there is something special about the 30-50 ms gap, for instance?

Perception of VOT: identification and identification RT (from Phillips et al. 2000).

Potential problem. How can we be sure it's not just acoustics? Could it be that there is something special about the 30-50 ms gap? If chinchillas can show the same categorical perception behavior on the VOT continuum, this response is probably not based on phonetics.

Potential problem. Neuroscience evidence: VOTs below 30 ms and above 60 ms are encoded by different neuronal populations in the mammalian auditory system than VOTs in the 30-60 ms range.

Potential problem. Could it be that there is something special about the 30-50 ms gap, for instance? There is, apparently. How can we be sure it's not just acoustics? With these results alone, we can't.

Suggestions? Can we come up with ways to test whether the MMN response is sensitive to the phonetic and phonological levels of representation? Requirement: a many-to-one ratio (X X X X Y).

Look at sounds that are phonemic in one language but not in the other: Näätänen et al. (1997).

Näätänen et al. (1997): looking for language-dependent memory traces for sounds. Vowels: Finnish vs. Estonian.

Vowels varying only in F2. [Figure: F2 values of the stimulus vowels, including the extra Estonian vowel]

[Figure: MMN = standard - deviant; negative voltage plotted up, positive down. From Näätänen (1999)]

Pure tones with frequency = F2.

F2 pure tones vs. vowels: the vowel MMN shows a nonmonotonic increase with a drop; the pure-tone MMN shows a linear increase with no drop.

Vowels, Finns vs. Estonians: the Finns show a drop; the Estonians show no drop.

Finns vs. Estonians: MMN peak amplitude at Fz. The Finns (blue) show a drop; the Estonians (purple) do not.

MEG data, dipole model: the same drop appears.

Conclusions
Tone and vowel data are dissimilar for Finnish speakers, even though what's being varied in the two conditions is exactly the same acoustic quantity.
- Finns judge a vowel with an F2 of 1,311 Hz to be a very bad instance of /ö/.
- Estonians have a vowel /õ/, and judge a vowel with an F2 of 1,311 Hz to be a good instance of that /õ/ vowel.
- The Estonian vowel MMN data is more in line with the Finnish tone data.

Any problems? Are you convinced? Does this show that the MMN is indeed sensitive to phonemic categories? Could these results be explained on a purely acoustic basis? Could these results be explained on a purely phonetic basis?

Levels of Representation
Acoustics: variation in air pressure; analog input to the auditory system.
Phonetics: language-specific categorization of different acoustic tokens; discriminability of different acoustic tokens relatively preserved.
Phonology: abstract symbolic representations; fine-grained distinctions irrelevant; all-or-nothing category membership.

Phillips et al. (2000). Question: is the MMN sensitive to phonological categories? (Abstract symbolic representations; fine-grained distinctions irrelevant; all-or-nothing category membership.)

Phillips et al. (2000)
Template of the MMN design: X X X X X X Y X X X X X Y X X X X Y
Sharma & Dorman (1999), VOT values: 30 30 30 30 30 50 30 30 30 30 30 50 30 30 and 60 60 60 60 60 80 60 60 60 60 60 80 60 60. This gives a many-to-one ratio at all levels.
Let's instead build the many-to-one ratio only at the phonological level.
Phillips et al. (2000), VOT values: 8 16 0 24 16 48 0 24 16 0 24 8 64 16 8 56 0

Perception of VOT: identification and identification RT (from Phillips et al. 2000).

Many-to-one only at the phonological level
Phillips et al. (2000), VOT values:
A (acoustics): 8 16 0 24 16 48 0 24 16 0 24 8 64 16 8 56 0
P (phonology): D D D D D T D D D D D D T D D T D
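A minimal sketch of this design logic in Python. The category boundary is placed near 30 ms here; the exact boundary value is an assumption for illustration, motivated by the identification function shown earlier:

```python
# VOT values (ms) from the Phillips et al. (2000) excerpt above.
vots = [8, 16, 0, 24, 16, 48, 0, 24, 16, 0, 24, 8, 64, 16, 8, 56, 0]

# Map each token to a phonological category via an assumed ~30 ms boundary.
labels = ["T" if v >= 30 else "D" for v in vots]
print(" ".join(labels))  # D D D D D T D D D D D D T D D T D

# Many-to-one holds only at the phonological level: many distinct acoustic
# values, but just two categories, with /d/ as the clear standard.
print(f"{len(set(vots))} distinct VOT values -> {len(set(labels))} categories")
```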

Results, Exp. 1.

What if not phonological categories, but... What if the results are not due to phonological categories but to something as prosaic as the VOT difference between adjacent sounds? From standard to standard, the VOT difference could span 0 to 24 ms (mean 12); from standard to deviant, it could range from 14 to 72 ms (mean 40). How can we address this?

Exp. 2, Acoustics: add 20 ms of VOT to all sounds, so that the relative distances between them stay the same but the proportion of sounds falling on each side of the category boundary changes. There are no longer many-to-one relations at the phonological level.
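A quick check of this manipulation, continuing the sketch above (same assumed 30 ms boundary): a constant 20 ms shift leaves all adjacent VOT differences untouched while destroying the many-to-one category structure.

```python
vots = [8, 16, 0, 24, 16, 48, 0, 24, 16, 0, 24, 8, 64, 16, 8, 56, 0]
shifted = [v + 20 for v in vots]  # Exp. 2: add 20 ms to every stimulus

# Relative distances between adjacent stimuli are unchanged by the shift...
assert [b - a for a, b in zip(vots, vots[1:])] == \
       [b - a for a, b in zip(shifted, shifted[1:])]

# ...but the assumed ~30 ms boundary now splits the stimuli into a mix of
# categories, so no many-to-one standard remains at the phonological level.
for name, seq in [("Exp. 1", vots), ("Exp. 2", shifted)]:
    labels = ["T" if v >= 30 else "D" for v in seq]
    print(name, " ".join(labels))
```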


No MMN for the acoustic condition.

Phillips et al. (2000). Conclusion: the MMN here is driven by phonological category membership, not acoustics.

Questions: Are you convinced? Can we be sure this result does not stem from acoustics? What about phonetic categories?

No Abstract Categories
You simply map acoustic tokens into a multidimensional space. There may still be speech-specific processing, but representations are not discrete, abstract, etc.: fine phonetic detail is stored. [Figure: overlapping clouds of 't' and 'd' tokens in phonetic space]
(Dennis Klatt, Stephen Goldinger, Peter Jusczyk, Jessica Maye, Keith Johnson)

VOT Distribution [Figure: histogram of VOT frequencies; x-axis: voice onset time in 10 ms bins from 5 to 145+ ms; y-axis: frequency]

Do We Even Have Categories?
"Perhaps we should not even be asking if infants have well-formed phonetic categories, separated by boundaries, but rather if any language users do. In other words, the very concept of categories, and even more so of boundaries, needs to be reconsidered... we have no evidence that boundaries exist in the natural world, or any account of how or why they may have evolved by natural selection. To extend to them any degree of psychological reality is unsupportable, and deleterious to efforts to understand how phonetic structure is indeed instantiated and retrieved from the speech signal." Nittrouer (2001)

A Phonetic Explanation for Phillips et al. (2000)
The MMN could be induced by sampling from the statistical distribution of phonetic tokens. There is no need to rely on abstract phonological categories if this is how we conceive of the phonetic space: sampling from, or mapping into, a different distribution could itself elicit the MMN. [Figure: standards drawn from the 'd' cloud; a deviant landing in the 't' cloud elicits the MMN]
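One way to make this alternative concrete: treat /d/-like and /t/-like tokens as samples from two VOT distributions and ask how far a deviant falls from the distribution of recent tokens; no category labels are needed anywhere. The distribution means and SDs below are illustrative assumptions, not fitted values:

```python
import random

rng = random.Random(1)

# Assumed VOT distributions (ms): a short-VOT 'd'-like cloud and a long-VOT
# 't'-like cloud; crucially, no category labels are attached to either.
def d_token(): return rng.gauss(10, 8)
def t_token(): return rng.gauss(70, 15)

standards = [d_token() for _ in range(50)]  # stream of standard tokens
deviant = t_token()                         # one token from the other cloud

# The deviant is "surprising" simply because it lies far outside the running
# distribution of recent tokens; distributional novelty alone could drive MMN.
mean = sum(standards) / len(standards)
sd = (sum((x - mean) ** 2 for x in standards) / len(standards)) ** 0.5
print(f"deviant VOT is {(deviant - mean) / sd:.1f} SDs from the standards")
```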

Kazanina et al. (2006)

Dupoux et al. (1999). Phonotactics seems to influence how people perceive speech sounds. Look at these Japanese loanwords: [examples on slide]

Dupoux et al. (1999). Japanese has a restricted syllable inventory compared to languages such as English and French: V, CV, CV+nasal, CVQ (Q = the first half of a geminate consonant).

Dupoux et al. (1999). Look at these Japanese loanwords: is the inserted vowel due to production, perception, or orthography?

Dupoux et al. (1999), Exp. 1. Use unambiguous stimuli and manipulate native language. The hypothesis is that vowel epenthesis is a perceptual phenomenon: when presented with items like 'ebzo', native French speakers should be fine, but Japanese speakers should report hearing a [u] sound.
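A toy sketch of the perceptual-repair idea: if Japanese phonotactics only licenses (C)V, (C)V+nasal, and (C)VQ syllables, illegal consonant clusters get repaired with an epenthetic [u]. This is a crude illustration of the hypothesis, not Dupoux et al.'s stimuli or procedure:

```python
VOWELS = set("aeiou")

def epenthesize(word: str) -> str:
    """Insert an epenthetic 'u' wherever a consonant is followed by another
    consonant or ends the word (a final nasal is allowed, as in Japanese)."""
    out = []
    for i, ch in enumerate(word):
        out.append(ch)
        nxt = word[i + 1] if i + 1 < len(word) else None
        is_consonant = ch not in VOWELS
        illegal_cluster = is_consonant and nxt is not None and nxt not in VOWELS
        illegal_final = is_consonant and nxt is None and ch != "n"
        if (illegal_cluster and ch != "n") or illegal_final:
            out.append("u")  # the vowel Japanese listeners report hearing
    return "".join(out)

print(epenthesize("ebzo"))  # -> "ebuzo", matching the reported percept
```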

Dupoux et al. (1999), Exp. 1. A Japanese speaker recorded pseudowords of the structure VCuCV; the middle [u] was spliced out to varying degrees (from virtually erased to only slightly reduced). Subjects heard the stimuli and reported whether or not they heard a [u].


Dupoux et al. (1999), Exp. 1. Japanese participants reported many more [u]s than French speakers when there was little or no [u] information in the signal. BUT: the talker was a Japanese speaker; could there be a coarticulation cue in the preceding consonant?

Dupoux et al. (1999), Exp. 1. A Japanese speaker and a coarticulation cue in the preceding consonant? [u] is often reduced or devoiced in Japanese, so Japanese listeners might be extra sensitive to subtle coarticulation cues indicating [u].

Dupoux et al. (1999), Exp. 2. A Japanese speaker and coarticulation cues in the preceding consonant? Get a French speaker! And since Japanese listeners might be extra sensitive to subtle coarticulation cues indicating [u], have the French speaker articulate true VCCVs as well as VCuCV items. The rest is the same as in Exp. 1.


Dupoux et al. (1999), Exp. 2. Even with no coarticulation cue, Japanese speakers reported hearing [u] in VCCV nonwords.

How Early? (ERPs)

Dehaene-Lambertz et al. (2000): 164 ms, 315 ms, 531 ms.


A quick word on cortical connectivity in speech perception...

Geschwind Model

Hickok & Poeppel (2007)

To wrap up...
1. Speech perception involves a complex mapping between acoustic input and long-term memory.
2. Cognitive neuroscience methods can be used to ascertain the representational nature of speech segments.
3. These methods also help us understand how the brain encodes speech representations.
4. Auditory cortex seems to store speech segments in phonemic form (at least in addition to phonetic representations).