Simulations of Feedback and Feedforward Control in Stuttering

Oren Civier (1) and Frank H. Guenther (1,2,3)

(1) Department of Cognitive and Neural Systems, Boston University.
(2) Speech Communication Group, Research Laboratory of Electronics, Massachusetts Institute of Technology.
(3) Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital.

University, 29th June - 2nd July 2005

Abstract

Simulation of speech production using a neurologically impaired model reveals patterns similar to stuttering: 1) a high frequency of stutters on initial sounds; 2) enhancement of fluency by exposure to white noise; 3) enhancement of fluency by reducing the rate of speech. The results support the notion that stuttering may be due in part to weakening of the pathways involved in feedforward control of well-practiced speech sounds and the consequent dominance of auditory feedback control.

Introduction

Dysfunction of auditory feedback has long been suspected as a source of stuttering (Fairbanks, 1954), mainly due to the apparent effect of delayed auditory feedback in alleviating stuttering (Kalinowski, Armson, Roland-Mieszkowski, Stuart, & Gracco, 1993) and the low incidence of stuttering among the deaf (Van Riper, 1971). Neilson & Neilson (1987) modeled feedback control and concluded that speech production is too fast to be controlled solely by feedback. The intrinsic delay between a motor command and its auditory consequences would render such a system unstable, especially during rapid transitions (e.g., consonants). Therefore, fluent speakers must rely primarily on feedforward control, which is independent of auditory feedback.

The hypothesis investigated here is that, due to weakened feedforward control projections, some people who stutter may rely too heavily on feedback control (Max, Guenther, Gracco, Ghosh, & Wallace, 2004). When the feedback instabilities create too large an auditory error (the difference between the expected and produced auditory signal), the system performs a "reset" of the current syllable production, resulting in a part-word repetition. Our hypothesis is consistent with neurological evidence that adults who stutter show abnormalities in the white matter pathways underlying the orofacial area of the left hemisphere primary motor cortex (Sommer, Koch, Paulus, Weiller, & Buchel, 2002). Damage to these pathways may compromise the feedforward command from premotor to primary motor areas.

DIVA: A neural model of speech production

DIVA is a biologically plausible neural network model capable of simulating the production and development of fluent speech (e.g., Guenther, Hampson, & Johnson, 1998). It combines mathematical descriptions of underlying commands, cerebral and cerebellar neural substrates corresponding to the model's components, and computer simulations controlling an articulatory synthesizer. In the model (schematically represented in Figure 1), cells in the motor cortex generate the overall motor command, M(t), for producing a speech sound. M(t) is a combination of a feedforward command (i.e., a command that was previously learned for producing the sound and does not rely on auditory feedback for its execution) and a feedback command (created by comparing actual auditory feedback to a learned auditory target, then correcting for any mismatch; see Note 1):

\[ \dot{M}(t) = \alpha_{ff}\,\dot{M}_{feedforward}(t) + \alpha_{fb}\,\dot{M}_{feedback}(t), \qquad \alpha_{ff} + \alpha_{fb} = 1 \]

where $\alpha_{ff}$ and $\alpha_{fb}$ are the weights given to feedforward and feedback control, and $\dot{M}_{feedforward}(t)$ and $\dot{M}_{feedback}(t)$ are the feedforward and feedback commands, respectively.

Figure 1 - Schematic of the DIVA model.

The stages of learning in the model are as follows:

1. Tune the feedback control subsystem during babbling (self-generated speech sounds).
2. Learn an auditory target (formant trajectory ranges) when a new sound sample is presented.
3. Learn a feedforward command for the sound by practicing its production.

In effect, the feedback control subsystem tunes the feedforward command with practice. Once an accurate feedforward command is learned, there will be little or no auditory error, and thus the feedback control subsystem will no longer play a major role. However, if the feedforward projections are weak (e.g., in some people who stutter), the feedback controller will always be engaged and can cause instabilities, particularly during rapid speech. Our hypothesis is in keeping with the fact that the onset of stuttering occurs in early childhood, when this transition from feedback to feedforward control takes place.

Note 1. The full DIVA model also includes somatosensory feedback control. It is not addressed here for the sake of simplicity. For more information regarding the full DIVA model, see Guenther, Ghosh, & Nieto-Castanon (2003).
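The weighted combination of feedforward and feedback commands can be illustrated with a toy one-dimensional tracking loop. The sketch below is not the DIVA implementation: the function names, the 50 ms feedback delay, the feedback gain of 50, and the ramp-shaped target are all assumptions chosen for illustration. It only suggests why, with the same delayed feedback loop, a feedback-dominant weighting can produce larger tracking errors around a rapid transition than a feedforward-dominant one.

```python
import numpy as np


def make_target(dt=0.001, duration=1.0, ramp_start=0.1, ramp_len=0.1):
    """Hypothetical target: a rapid 0 -> 1 transition (stand-in for a formant change)."""
    t = np.arange(0.0, duration, dt)
    return np.clip((t - ramp_start) / ramp_len, 0.0, 1.0)


def simulate_tracking(alpha_ff, target, dt=0.001, delay_s=0.05, fb_gain=50.0):
    """Integrate dM/dt = alpha_ff * ff_rate + alpha_fb * fb_rate with Euler steps.

    ff_rate is the exact rate of change of the target (an idealized learned
    feedforward command); fb_rate corrects the error observed delay_s earlier.
    All parameter values are illustrative assumptions, not values from the paper.
    """
    alpha_fb = 1.0 - alpha_ff                  # weights sum to 1, as in the equation
    delay = int(round(delay_s / dt))           # feedback delay in samples
    target_rate = np.gradient(target, dt)      # idealized feedforward rate
    m = np.empty_like(target)
    m[0] = target[0]
    for k in range(1, len(target)):
        j = max(k - 1 - delay, 0)              # index of the delayed observation
        fb_rate = fb_gain * (target[j] - m[j])
        ff_rate = target_rate[k - 1]
        m[k] = m[k - 1] + dt * (alpha_ff * ff_rate + alpha_fb * fb_rate)
    return m


target = make_target()
# Feedforward-dominant weighting (alpha_ff = 0.75) vs. feedback-dominant (alpha_ff = 0.25).
err_normal = np.abs(target - simulate_tracking(0.75, target))
err_stutter = np.abs(target - simulate_tracking(0.25, target))
print("max error, feedforward dominant:", err_normal.max())
print("max error, feedback dominant:   ", err_stutter.max())
```

With these assumed values, the feedback-dominant loop has an effective gain-delay product of 0.75 x 50 x 0.05 = 1.875, above the pi/2 stability limit of the simple delayed loop x'(t) = -k x(t - d), whereas the feedforward-dominant loop (0.625) stays below it. Other gains or delays would shift that boundary.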

Simulations

Normal vs. stuttering versions of the model

We simulated speech production of the utterance "good dog" with weakened feedforward projections by setting the feedforward weight $\alpha_{ff} = 0.25$ and the feedback weight $\alpha_{fb} = 0.75$ (the stuttering version), and compared it with a simulation with normal feedforward projections, setting $\alpha_{ff} = 0.75$ and $\alpha_{fb} = 0.25$ (the normal version). From the simulation results (Figure 2), it is evident that dominance of feedback control in the stuttering version causes instabilities (large auditory errors due to delayed feedback processing) that increase the chance of stutters.

[Figure 2: two columns, "Normal version - feedforward dominant" and "Stuttering version - feedback dominant", each with panels I-V. Horizontal axis: time (msec), with the start of word phonation marked. Vertical axes: formant freq. (Hz) for F1, F2, F3; pressure; error (Hz); # of stutters.]

Figure 2 - Normal vs. stuttering simulations.

I. Vocal tract configurations. The images were created from the Maeda articulator model, which is used by DIVA to synthesize the acoustic signal (Maeda, 1990). The Maeda model has 7 degrees of freedom: jaw (1), tongue (3), lips (2) and larynx height (1). Each frame shows the vocal tract configuration at a specific moment in time (corresponding approximately to the phonemes "d", "g", and "aa").

II. Auditory target region and actual trajectory. The solid lines follow the 3 formant frequencies of the produced waveform. Dashed lines represent the 3 auditory target regions. When one of the formant frequencies is outside the target region (i.e., the solid line is not between the dashed lines), an auditory error is generated.

III. Glottal pressure. The glottal pressure is the main indicator of vocal intensity, which modulates the auditory feedback.

IV. Auditory error. The sum of the absolute auditory errors for the 3 formants. Notice that when glottal pressure is 0, no auditory error exists since there is no voicing.

V. Stutter distribution. We assume that, due to noise in the system, the exact stutter position is not deterministic. For every 1/1 of a second, t, $P(\mathrm{stutter}(t)) = \varepsilon \cdot \mathrm{Error}(t)$, where $\varepsilon$ is a parameter set to $1 \times 10^{-5}$ in the simulations. The histogram shows how many stutters on average occur on each sound in 37 repetitions of "good dog". Sounds where the auditory error is greater, particularly at the beginning of words, are more likely to be stuttered.
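The stutter distribution rule, P(stutter(t)) = ε · Error(t), can be sketched as independent random draws per time step. This is a minimal sketch under stated assumptions: the function names, the per-step error array, the sound labels, and the fixed random seed are all hypothetical, and the sketch does not model the syllable "reset" that follows a stutter in the simulations described here.

```python
import numpy as np


def stutter_probability(error_hz, epsilon=1e-5):
    """Per-step stutter probability, epsilon * error, clipped to a valid probability."""
    return np.clip(epsilon * np.asarray(error_hz, dtype=float), 0.0, 1.0)


def count_stutters(error_hz, sound_labels, repetitions, epsilon=1e-5, seed=0):
    """Count sampled stutters per sound over repeated productions.

    error_hz[i] is the summed auditory error at time step i, and
    sound_labels[i] names the sound being produced at that step.
    """
    rng = np.random.default_rng(seed)
    p = stutter_probability(error_hz, epsilon)
    labels = np.asarray(sound_labels)
    counts = {}
    for _ in range(repetitions):
        hits = rng.random(p.shape) < p       # one Bernoulli draw per time step
        for label in labels[hits]:
            key = str(label)
            counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical error trace: large error on the initial sound, none afterwards.
toy_error = [400.0, 350.0, 0.0, 0.0]
toy_labels = ["g", "g", "o", "d"]
print(count_stutters(toy_error, toy_labels, repetitions=1000))
```

Because the probability is proportional to the error, time steps with zero error (for example, when there is no voicing) can never be counted as stutters in this sketch.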

Since the movement of the articulators into their initial positions normally takes place before word phonation starts (0-30 ms), auditory feedback is not available at that time. For individuals with weak feedforward projections, the articulators will not reach their appropriate positions without auditory feedback. Thus, when phonation starts, the system will detect an auditory error and try to correct it using auditory feedback. Unfortunately, the correction can lead to instabilities (as described above), especially during sharp formant transitions. This can explain the higher frequency of stuttering on the initial sound or syllable of the word (where over 90% of stutters occur), especially if it is a consonant (Bloodstein, 1995). Figure 3 demonstrates a part-word repetition on the initial sound of the word "good" produced by the stuttering version.

[Figure 3: acoustic waveforms with sound labels. Horizontal axis: time (msec).]

Figure 3 - Acoustic waveforms produced by the stuttering version of the model. Notice 3 repetitions of the initial sound "g" in the stuttering version.

Stuttering version with white noise

Since most subjects in white noise experiments report hearing themselves (even with loud noise presented binaurally through earphones), Bloodstein (1995) concluded that the noise acts as a distraction and not as a mere masker. Here we simulate an increased noise level by reducing feedback dominance (Figure 4). We assume that the distraction of noisy auditory feedback prevents focusing on feedback control and forces control using primarily feedforward commands. Our assumption follows Van Riper's (1971) suggestion that distraction from the auditory feedback shifts attention to other forms of control (due to competition between control channels).

[Figure 4: noise level; feedforward and feedback balance; # of stutters in 37 repetitions of "good dog".]

Figure 4 - Stuttering version with white noise.
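One way to read "simulating increased noise level by reducing feedback dominance" is as a mapping from noise level to the feedforward/feedback weights. The source does not state the mapping it used, so everything in the sketch below is an assumption: the normalized noise level in [0, 1], the linear interpolation, the function name, and the endpoints, which simply reuse the stuttering-version and normal-version weights.

```python
def balance_for_noise(noise_level, alpha_fb_quiet=0.75, alpha_fb_loud=0.25):
    """Return (alpha_ff, alpha_fb) for a normalized noise level in [0, 1].

    Hypothetical linear mapping: no noise keeps the feedback-dominant
    (stuttering) balance; the loudest noise reaches the feedforward-dominant
    (normal) balance. The two weights always sum to 1.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must lie in [0, 1]")
    alpha_fb = alpha_fb_quiet + noise_level * (alpha_fb_loud - alpha_fb_quiet)
    return 1.0 - alpha_fb, alpha_fb


for level in (0.0, 0.5, 1.0):
    alpha_ff, alpha_fb = balance_for_noise(level)
    print(f"noise={level:.1f}  alpha_ff={alpha_ff:.2f}  alpha_fb={alpha_fb:.2f}")
```

Any monotone mapping would express the same idea; the linear form is chosen only because it is the simplest.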
In Figure 5, the simulation results (in blue) are compared with the results of two very similar white noise experiments (in red) by Maraist & Hutton (1957) and Adams & Hutchinson (1974). Both involved ~15 subjects and increasing noise intensities. In both simulation and experiments, there is an approximately linear increase in fluency with the level of noise.

Figure 5 - Stuttering with white noise. Simulation compared with experiments.

In the white noise simulation, louder white noise enforces a feedforward/feedback balance more similar to that of the normal version. Consequently, fewer stutters are generated. We propose that the same mechanisms may act to enhance fluency in the white noise experiments described above.

Stuttering version at a slower rate

When slowing down their speaking rate, people who stutter tend either to reduce articulatory rate or to insert pauses between words (depending on the experiment design). Here we simulate reduced articulatory rate (induced by timed word production) by stretching the formant trajectories in time (Figure 6).

[Figure 6: two panels, "normal speed - 1/2 second" and "slow speed - 1 second". Horizontal axis: time (msec). Vertical axes: formant freq. (Hz); glottal pressure; # of stutters in 65 repetitions.]

Figure 6 - Stuttering version at normal vs. slow speech rate.
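Stretching a formant trajectory in time can be sketched as resampling it on a longer time axis. The sketch below is illustrative only: the function names, the 1 ms sampling step, and the three-point F1 trajectory are hypothetical, and linear interpolation is an assumed choice rather than the method used in the simulations. It shows that doubling the duration halves the steepest formant slope.

```python
import numpy as np


def stretch_trajectory(times_ms, values_hz, factor, step_ms=1.0):
    """Resample a trajectory so that it lasts `factor` times as long.

    times_ms must start at 0 and be increasing. The stretched trajectory at
    time t takes the original value at time t / factor (linear interpolation).
    """
    times_ms = np.asarray(times_ms, dtype=float)
    values_hz = np.asarray(values_hz, dtype=float)
    n_samples = int(round(times_ms[-1] * factor / step_ms)) + 1
    new_times = np.arange(n_samples) * step_ms
    new_values = np.interp(new_times / factor, times_ms, values_hz)
    return new_times, new_values


def steepest_slope(times_ms, values_hz):
    """Largest absolute rate of change, in Hz per ms."""
    return float(np.max(np.abs(np.diff(values_hz) / np.diff(times_ms))))


# Hypothetical F1 trajectory with one sharp transition.
times = [0.0, 100.0, 200.0]
f1 = [300.0, 500.0, 400.0]
normal_t, normal_f1 = stretch_trajectory(times, f1, factor=1.0)
slow_t, slow_f1 = stretch_trajectory(times, f1, factor=2.0)
print("steepest slope, normal rate:", steepest_slope(normal_t, normal_f1))
print("steepest slope, half rate:  ", steepest_slope(slow_t, slow_f1))
```

Gentler slopes mean the target moves less during one feedback delay, which is the sense in which a slower rate gives a delayed feedback controller an easier job.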

When the stuttering version is simulated at a reduced rate, the formant transitions are less sharp. Consequently, the feedback delays are less of a problem and fewer stutters are generated. We predict that the same mechanisms act to enhance fluency in the two comparable timed word production experiments: Perkins, Bell, Johnson, & Stocks (1979), who had 19 subjects read aloud 1 word per 2 seconds, and Adams, Lewis, & Besozzi (1973), who instructed 15 subjects to read 1 word per second. In both simulation and experiments, slowing down cuts stuttering by approximately half.

Conclusions

The simulation results account for several experimental findings regarding stuttering, and therefore support the hypothesis that dominance of feedback control due to weakened feedforward projections is a possible source of stuttering. In the simulations of the normal vs. stuttering versions, we showed that an overemphasis on feedback control results in stuttering-like behavior. Moreover, since auditory feedback control is useless before phonation starts, stuttering is more likely to occur on the initial sound of a word. In the white noise simulation, shifting emphasis from feedback to feedforward control enhances fluency. Finally, we demonstrated that slowing down articulation can also enhance fluency, by creating better conditions for feedback control.

Acknowledgements

This study was supported by NIH/NIDCD grants R01 DC02852 (P.I. Frank Guenther) and R01 DC37 (J. Perkell, PI). We would like to thank Satrajit S. Ghosh and Jonathan Brumberg for the development and maintenance of the DIVA model code.

References

Adams, M.R., & Hutchinson, J. (1974) The effects of three levels of auditory masking on selected vocal characteristics and the frequency of disfluency of adult stutterers. J Speech Hear Res. 17(4):682-8.

Adams, M.R., Lewis, J.I., & Besozzi, T.E. (1973) The effect of reduced reading rate on stuttering frequency. J Speech Hear Res. 16(4):671-5.

Bloodstein, O. (1995) A handbook on stuttering. San Diego, Calif.: Singular Pub. Group.

Fairbanks, G. (1954) Systematic research in experimental phonetics: 1. A theory of the speech mechanism as a servosystem. J Speech Hear Disord. 19, pp. 133-139.

Guenther, F.H., Ghosh, S.S., & Nieto-Castanon, A. (2003) A neural model of speech production. Proceedings of the 6th International Seminar on Speech Production, Sydney, Australia.

Guenther, F.H., Hampson, M., & Johnson, D. (1998) A theoretical investigation of reference frames for the planning of speech movements. Psychological Review, 105, pp. 611-633.

Kalinowski, J., Armson, J., Roland-Mieszkowski, M., Stuart, A., & Gracco, V.L. (1993) Effects of alterations in auditory feedback and speech rate on stuttering frequency. Lang Speech. 36 (Pt 1):1-16.

Maeda, S. (1990) Compensatory articulation during speech: Evidence from the analysis and synthesis of vocal tract shapes using an articulatory model. In W.J. Hardcastle and A. Marchal (Eds.), Speech production and speech modelling (pp. 131-149). Boston: Kluwer Academic Publishers.

Maraist, J.A., & Hutton, C. (1957) Effects of auditory masking upon the speech of stutterers. J Speech Hear Disord. 22(3):385-9.

Max, L., Guenther, F.H., Gracco, V.L., Ghosh, S.S., & Wallace, M.E. (2004) Unstable or insufficiently activated internal models and feedback-biased motor control as sources of dysfluency: A theoretical model of stuttering. Contemporary Issues in Communication Science and Disorders, 31, pp. 105-122.

Neilson, M.D., & Neilson, P.D. (1987) Speech motor control and stuttering: a computational model of adaptive sensory-motor processing. Speech Communication, 6, pp. 325-333.

Perkins, W.H., Bell, J., Johnson, L., & Stocks, J. (1979) Phone rate and the effective planning time hypothesis of stuttering. J Speech Hear Res. 22(4):747-55.

Sommer, M., Koch, M.A., Paulus, W., Weiller, C., & Buchel, C. (2002) Disconnection of speech-relevant brain areas in persistent developmental stuttering. Lancet. 360(9330):380-3.

Van Riper, C. (1971) The nature of stuttering. Englewood Cliffs, N.J.: Prentice-Hall.