Copyright by Niamh Eileen Kelly 2015


The Dissertation Committee for Niamh Eileen Kelly certifies that this is the approved version of the following dissertation:

An Experimental Approach to the Production and Perception of Norwegian Tonal Accent

Committee:
Rajka Smiljanić, Supervisor
Scott Myers
Megan Crowhurst
Harvey Sussman
Gjert Kristoffersen

An Experimental Approach to the Production and Perception of Norwegian Tonal Accent

by

Niamh Eileen Kelly, B.A., M.A.

DISSERTATION

Presented to the Faculty of the Graduate School of The University of Texas at Austin in Partial Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY

THE UNIVERSITY OF TEXAS AT AUSTIN
May 2015

"You don't set out to build a wall! You don't say, 'I'm going to build the biggest, baddest, greatest wall that's ever been built.' You don't start there. You say, 'I'm going to lay this brick as perfectly as a brick can be laid.' And you do that every single day and soon you have a wall."
- Will Smith

Acknowledgments

The work that resulted in this dissertation was a collaborative effort, and I have many people to thank. I could not have done this without the support of my committee, whom I sincerely thank: Rajka, for unfaltering support and encouragement, not to mention extensive comments and guidance in all aspects of the research and writing, as well as understanding, affirmation, and belief in me. For constant confidence and help, whenever needed. Scott, for solid advice on experiments and writing, and for always being available to redirect when necessary! Megan, for very beneficial experiences as a research assistant, for very practical advice about academia, and for inspired perspectives on this work. Harvey, for enthusiasm and willingness to learn about Norwegian tones, also for being a wonderful example of an academic who truly enjoys teaching. Gjert, for constant guidance and insight on the relevant literature, for enthusiasm beginning with the very first ideas for this work, for extensive help in finding target words and creating the stimuli sentences, and for confidence in my abilities.

To all in the Department of Linguistics; professors, colleagues, staff: thank you for such a supportive, motivating, warm environment. I am also very grateful for funding in the form of TAships, RAships, and AIships. These have provided me with invaluable experience and also allowed me to develop my teaching skills, something that I have enjoyed tremendously.

To all at NTNU who kindly allowed me to use their facilities, helped with participant recruitment and also made me feel very welcome: Wim van Dommelen, Jacques Koreman, Terje Lohndal, Dawn Behne. I would like to thank Allison Wetterlin, Arnold Dalen, Thorstein Fretheim, Randi Nilsen, Stian Hårstad and Jørn Almberg for their advice and guidance. Thanks to all participants in the experiments. I also thank Johan, Hilde and Olve for their hospitality in Trondheim. I had a lot of assistance with translations of fliers and posters and consent forms, for which I especially thank Johan and Øystein. I would also like to thank Miquel Simonet for guidance on how to use PsychoPy, Katrin Schneider for her advice on designing perception experiments, and Grzegorz Dogil for the opportunity to spend time at the University of Stuttgart.

I could not have done this without the unwavering support of my family. I am thankful to my parents for unending love and encouragement, to Deirdre, Triona and Eoin and to my friends in Ireland and all over the world, for love and support. My life in Austin would not have been as joyful without the companionship, support and laughter shared with so many wonderful friends. Stacy, Stephanie, Robyn, Lauren, Sean, Cindy, Alex, Whitney, Aimee, Justin, Brian, Taylor, Oren, Brooks, Megan, and all who have been part of my life here, thank you for all the good times. My second family, the Bennetts, thank you for always being there for me.

Finally, I thank the National Science Foundation for supporting the research used as the basis for this dissertation (Doctoral Dissertation Research Improvement Grant No. 1322700).

An Experimental Approach to the Production and Perception of Norwegian Tonal Accent

Publication No.

Niamh Eileen Kelly, Ph.D.
The University of Texas at Austin, 2015

Supervisor: Rajka Smiljanić

This dissertation examines the lexical tonal accent contrast of the Trøndersk dialect of East Norwegian from the perspective of both production and perception. The goal of the production study was to conduct an in-depth investigation of the tonal accent realization in this understudied dialect, as well as to examine how the lexical accents are impacted by pragmatic focus and sentential intonation. The Trøndersk dialect is unusual typologically in that it exhibits a tonal contrast on monosyllabic words. Therefore, the current study examines the contrast on disyllabic and monosyllabic words. Ten speakers were recorded reading target monosyllabic and disyllabic words representing each accent, in noncontrastive and contrastive focus, and also at the right edge of an accent phrase (AP). The goal of the perception study was to determine what cues listeners use to identify the accents.

The results of the acoustic analysis revealed that the main correlate of the disyllabic accent distinction in this dialect was in the timing of the F0 contour, with accent 2 having a later alignment of F0 landmarks and a higher F0 minimum than accent 1. In contrastive focus, the accent contrast was found to be enhanced. Accent 1 showed an expanded pitch range and accent 2 an even later alignment of the HL contour compared to noncontrastive focus. When produced at the end of an AP, both accents had a higher F0 minimum and lower AP boundary tone compared to AP-medial position. The AP-final position also had an influence on segment duration, such that the stressed vowels were shorter and final vowels were longer compared to the AP-medial position. The results of the production experiments thus revealed that contrastive focus and AP-final position both affected pitch cues even though these cues are primarily used to distinguish the lexical pitch contrasts. However, the variation in pitch contour introduced by these factors did not diminish the lexical contrast. In fact, the asymmetrical impact of focus on accent 1 and accent 2 words enhanced the distinction between the two accents.

For the monosyllabic contrast, the results revealed that in a noncontrastive focus realization, words with the circumflex accent have a wider HL contour compared to the unmarked accent. In contrastive focus, both accents have a wider pitch range and later low tone alignment. Unlike the effect of contrastive focus on disyllabic words where this increased the timing difference between the accents, the timing of the monosyllabic accents changed in the same direction in contrastive focus. Phonologically long vowels were also lengthened in this condition.

Based on the production results, a categorization of stimuli with manipulated pitch contours was conducted. This experiment tested which acoustic cues (height and alignment of F0 minimum, and alignment of F0 maximum and turning point from maximum to minimum) are necessary for the perception of the tonal contrast. The results are consistent with the production findings in that changes in all of the examined acoustic cues contributed to the shift in accent categorization. The later timing of the main F0 landmarks (F0 maximum, F0 minimum and turning point from maximum to minimum) induced accent 2 identification. Raising F0 minimum height also led to more accent 2 responses. The analysis of the perception patterns furthermore revealed that the effect of a later timing of F0 minimum was weak unless combined with a later timing of the other F0 landmarks, or a higher F0 minimum level, all of which contributed to more accent 2 responses. These results indicate that accent 1 is characterized by an early fall, and accent 2 by a salient initial high tone.

This comprehensive investigation provided an in-depth description of the monosyllabic and disyllabic accents in this understudied, more conservative dialect that is being replaced by less conservative urban varieties. This contributes to the literature on Scandinavian accentology. Furthermore, this study adds to the literature on the realization of focus in tonal accent languages, and how prosodically marked focus and sentence intonation interact with lexical accents. Finally, this work provides insights into how production and perception constraints shape processing of pitch variation.

Table of Contents

Acknowledgments
Abstract
List of Tables
List of Figures

Chapter 1. Introduction

Chapter 2. Background
  2.1 Previous Research into Scandinavian Tonal Accent
    2.1.1 The Trøndersk Variety of Norwegian
  2.2 The Tonal Accent Contrast
    2.2.1 The Disyllabic Accent Contrast
    2.2.2 The Monosyllabic Accent Contrast
    2.2.3 The Effect of Sentence-Level Intonation
    2.2.4 The Prosodic Effect of Focus
  2.3 The Perception of F0
    2.3.1 Perception of Lexical Tonal Accents
  2.4 Goals and Research Questions
  2.5 Outline

Chapter 3. Experiment 1: Disyllabic Accent Realization in Broad Focus and Contrastive Focus
  3.1 Methods
    3.1.1 Participants
    3.1.2 Materials
    3.1.3 Procedure
    3.1.4 Measurements and Analysis
  3.2 Results
  3.3 Discussion

Chapter 4. Experiment 2: Interaction of Disyllabic Accent Realization with Higher Level Intonation
  4.1 Methods
    4.1.1 Materials
    4.1.2 Measurements and Analysis
  4.2 Results
  4.3 Discussion

Chapter 5. Experiment 3: Monosyllabic Accent Realization in Broad Focus and Contrastive Focus
  5.1 Methods
    5.1.1 Materials
    5.1.2 Measurements and Analysis
  5.2 Results
  5.3 Discussion

Chapter 6. Experiment 4: Perception of Disyllabic Accents
  6.1 Methods
    6.1.1 Materials
    6.1.2 Listeners
    6.1.3 Procedure
    6.1.4 Analysis
  6.2 Results
  6.3 Discussion

Chapter 7. General Discussion and Conclusions

Appendices
  Appendix A. Test Sentences
  Appendix B. Tables of Disyllabic Raw Results
  Appendix C. Tables of Monosyllabic Raw Results

Bibliography

List of Tables

2.1 Summary of previous analyses
2.2 Experiments
3.1 Speakers
3.2 Disyllabic target words
3.3 Disyllabic dependent variables
3.4 Raw results
3.5 Statistical results
3.6 Raw results
3.7 Statistical results
3.8 Raw results
3.9 Statistical results
3.10 Raw results
3.11 Statistical results
3.12 Raw results
3.13 Statistical results
3.14 Raw results
3.15 Statistical results
3.16 Raw results
3.17 Statistical results
3.18 Raw results
3.19 Statistical results
3.20 Raw results
3.21 Statistical results
3.22 Raw results
3.23 Statistical results
3.24 Raw results
3.25 Statistical results
3.26 Raw results
3.27 Statistical results
3.28 Differences between disyllabic accents
3.29 Distance of F0 landmarks from segments
4.1 Raw results
4.2 Statistical results
4.3 Raw results
4.4 Statistical results
4.5 Raw results
4.6 Statistical results
4.7 Raw results
4.8 Statistical results
4.9 Raw results
4.10 Statistical results
4.11 Raw results
4.12 Statistical results
4.13 Raw results
4.14 Statistical results
4.15 Raw results
4.16 Statistical results
4.17 Raw results
4.18 Statistical results
4.19 Raw results
4.20 Statistical results
4.21 Raw results
4.22 Statistical results
4.23 Raw results
4.24 Statistical results
5.1 Monosyllabic target words
5.2 Monosyllabic dependent variables
5.3 Raw results
5.4 Statistical results
5.5 Raw results
5.6 Statistical results
5.7 Raw results
5.8 Statistical results
5.9 Raw results
5.10 Statistical results
5.11 Raw results
5.12 Statistical results
5.13 Raw results
5.14 Statistical results
5.15 Raw results
5.16 Statistical results
5.17 Raw results
5.18 Statistical results
5.19 Raw results
5.20 Statistical results
5.21 Raw results
5.22 Statistical results
5.23 Raw results
5.24 Statistical results
5.25 Differences between monosyllabic accents
6.1 Manipulation steps
6.2 Listeners
6.3 Logistic regression results
6.4 Majority response crossover points

List of Figures

2.1 Stockholm Swedish accents
2.2 Map of dialect region
2.3 East Norwegian accents
2.4 East Norwegian intonation
3.1 Disyllabic pitch contour and labels
3.2 Praat example pitch track
3.3 Disyllabic alignment cues
3.4 Disyllabic broad focus contours
3.5 Disyllabic contrastive focus contours
3.6 Phonological analysis
4.1 Disyllabic contours in AP-medial position
4.2 Disyllabic contours in AP-final position
5.1 Monosyllabic pitch contour and labels
5.2 Monosyllabic alignment cues
5.3 Monosyllabic broad focus contours
5.4 Monosyllabic contrastive focus contours
6.1 Accents showing landmarks that were manipulated
6.2 F0 maximum and minimum alignment steps
6.3 HTP alignment steps
6.4 F0 minimum height and alignment steps
6.5 Responses for F0 Maximum alignment
6.6 Responses for HTP
6.7 Responses for F0 Minimum alignment
B.1 Disyllabic means for F0 maximum
B.2 Disyllabic means for F0 minimum
B.3 Disyllabic means for slope of the rise
B.4 Disyllabic means for F0 maximum alignment
B.5 Disyllabic means for F0 minimum alignment
B.6 Disyllabic means for HTP alignment
B.7 Disyllabic means for slope of the fall
B.8 Disyllabic means for AP H% height
B.9 Disyllabic means for boundary slope
B.10 Disyllabic means for final vowel duration
B.11 Disyllabic means for AP H% timing
B.12 Disyllabic means for stressed vowel duration (long)
B.13 Disyllabic means for stressed vowel duration (short)
B.14 Disyllabic means for consonant duration (long)
B.15 Disyllabic means for consonant duration (short)
C.1 Monosyllabic means for F0 maximum
C.2 Monosyllabic means for F0 minimum
C.3 Monosyllabic means for vowel onset
C.4 Monosyllabic means for slope of the rise
C.5 Monosyllabic means for F0 maximum alignment
C.6 Monosyllabic means for F0 minimum alignment
C.7 Monosyllabic means for slope of the fall
C.8 Monosyllabic means for AP H% height
C.9 Monosyllabic means for boundary slope
C.10 Monosyllabic means for AP H% timing
C.11 Monosyllabic means for stressed vowel duration (long)
C.12 Monosyllabic means for stressed vowel duration (short)
C.13 Monosyllabic means for consonant duration (long)
C.14 Monosyllabic means for consonant duration (short)

Chapter 1

Introduction

In spoken languages, pitch can be employed in a number of ways. All languages, including English, use intonation (pitch changes across the course of a phrase) to express emotions such as surprise or anger, and to distinguish between different utterance types, such as questions or statements, and pragmatic information, such as contrastive focus. At least 42% (Maddieson, 2011) of the world's languages also use pitch changes within words to change the meaning of the word. Tone languages, such as Mandarin Chinese, may do this on every syllable, while tonal accent (or "pitch accent") languages, such as Norwegian and Lithuanian, do this only on stressed syllables (Hayes, 1995).

Languages with such lexical pitch changes (tone languages and tonal accent languages) not only have specific pitch contours on words, but they also use pitch across the sentence to express utterance type or pragmatic information. The question arises, then, as to how the lexical level and the sentence level interact. Research into the interaction of lexical and post-lexical tones has been conducted on a variety of languages (e.g., Bruce, 1977; Pierrehumbert and Beckman, 1988; Gussenhoven and Bruce, 1999; Xu, 1999; Ma et al., 2006; Scholz, 2012). European languages with lexical pitch contrasts tend to have simpler intonation systems than languages that do not have such a contrast (Gussenhoven and van der Vliet, 1999). Furthermore, while languages such as English and Dutch can express pragmatic focus by a change in peak height or alignment (e.g., Pierrehumbert, 1980; Cooper et al., 1985; Peters et al., 2014), this is restricted in languages with lexical pitch contrasts, which tend to use an increased pitch range for this purpose (Pierrehumbert and Beckman, 1988; Xu, 1999; Remijsen, 2002; Fournier et al., 2006).

Norwegian is a tonal accent language, with two contrasting accents, accent 1 and accent 2 (Storm, 1884; Vanvik, 1957; Fintoft, 1970; Elstad, 1978).

The tonal makeup and phonetic realization of the accents differ across the dialects of Norwegian (Gårding, 1973; Fintoft, 1970). Despite these differences, Fintoft makes the generalization that (1) the main peak is always earlier in accent 1, and (2) accent 1 never has more peaks than accent 2. He also notes that these changes do not occur abruptly as one moves through the country, rather, there are gradual changes in the relative position of the peak(s) and/or the frequency difference between the peaks (Fintoft, 1987, p. 44). The overlap in contours and the gradual changes across dialects lead to questions about what exactly characterizes the tonal accent contrast for each variety, as well as how much any changes due to sentence intonation and pragmatic context can modify the tonal contours without jeopardizing the contrast.

This dissertation is an experimental analysis of the tonal accent contrast from the perspective of both production and perception. Trøndersk, a variety of East Norwegian spoken in the Trøndelag region in central Norway, will function as a test case for examining the acoustic cues that distinguish the accents in continuous speech. This dialect has not been subjected to a large-scale quantitative analysis, particularly in terms of the interaction of the accents with higher level intonation. Some varieties of Trøndersk, furthermore, have the unique feature of exhibiting a tonal contrast on monosyllabic words, something that is not common in Norwegian or Swedish (Kristoffersen, 1992). Both the disyllabic and monosyllabic contrasts are examined, as are their interactions with pragmatic focus (for both word lengths) and sentence intonation (for disyllabic words). Finally, perception experiments explore which cues listeners are sensitive to when identifying the tonal accents in this dialect. Next I will provide the background motivating this research followed by the main goals and hypotheses.

Chapter 2 Background Tonal accent is a prosodic pattern found on stressed syllables (e.g., Beckman, 1986; Hyman, 2009). Yip (2002) describes accentual languages as a particular type of language in which tone is used in a rather limited way, with one (or perhaps two) tone melodies, either lexically linked to particular TBUs [tone bearing units] or perhaps attracted to a syllable selected as prominent by rhythmic principles (p.260). In such a language, two segmentally identical words can thus be distinguished by the tonal contour only: bønder farmers (accent 1) and bønner beans (accent 2) (the segments of both words are pronounced /"bøn:@r/) is a minimal pair in the Oslo variety of Norwegian. Hualde (2012) defines them further as a class of stress languages where words contrast in the tonal melody that is associated with the stressed syllable (p.1335). Hyman (2006) argues that what are generally referred to as pitch accent languages are not a homogeneous group, rather they tend to pick and choose how they instantiate this characteristic and may combine features of stress accent languages and tonal languages. As such, the meaning of the term tonal accent as relevant to Scandinavian will be described below. Varying instantiations of this phenomenon are found in Scandinavian languages (e.g., Bruce, 1977; Gårding, 1973; Fintoft, 1987), Lithuanian (Senn, 1966), Latvian (Karins, 1996; Derksen, 1966), Japanese (Pierrehumbert and Beckman, 1988), and some varieties of Korean (e.g., Kim, 1988), Basque (Hualde, 1991), Serbian and Croatian (Smiljanić, 2006), and Dutch and German (Gussenhoven, 2004). 2.1 Previous Research into Scandinavian Tonal Accent Tonal accent is also known as word accent, particularly in reference to Swedish (Bruce, 1977; Gårding, 1973), while Fintoft (1987) refers to it as a toneme system in his description of Norwegian. In this paper I will refer 3

to it as tonal accent, as in Kristoffersen (2000). The reason for this is to differentiate it from the pitch accent of intonation, and also to show that in Scandinavian languages, the prominence is not just independently one of pitch but is dependent on primary stress, and is, in fact, a means of indicating primary stress.¹ Finally, the term tonal accent seems clearer than word accent since the morphological word is not necessarily the domain of the accent, at least in some varieties of Norwegian, where the accent phrase² is in fact the domain of the accent (Kristoffersen, 2000).

The tonal accent found today in most varieties of Norwegian and Swedish is thought to have arisen historically from a tonal contrast between monosyllables and polysyllables in Old Norse (Oftedal, 1952; Kristoffersen, 2000). When monosyllables ending in an obstruent-sonorant sequence became disyllabic due to vowel insertion, they retained the tonal contour of monosyllables, thus creating a contrast in tonal contour rather than in syllable number. Another analysis, whereby stress was replaced with a lexical accent (Kock, 1884/85; Riad, 2003), has also been proposed.

The tonal makeup and phonetic realization of the accents differ across the Scandinavian language varieties (Gårding, 1973; Fintoft, 1970). For example, in Norway, some dialects have the tonal makeup of a high-low contour where others have a low-high contour (e.g., Almberg, 2004). The two contrasting accents are referred to as accent 1 and accent 2. The first comprehensive acoustic analysis of the tonal accent contrast in Stockholm Swedish revealed that the difference between accent 1 and accent 2 was in the timing of the F0 fall in relation to the stressed syllable, whereby it was later for accent 2 than accent 1, as seen in Figure 2.1 (Bruce, 1977). The nature of the contrast has been studied extensively, both impressionistically and experimentally, for a number of dialects of Swedish and Norwegian (e.g., Storm, 1884; Bjerrum, 1948; Vanvik, 1957; Fintoft, 1970; Gårding, 1973; Gårding and Lindblad, 1975; Bruce, 1977; Elstad, 1978; Lorentz, 1981; Riad, 1998; Kristoffersen, 2000; Van Dommelen, 2002; Van Dommelen and Nilsen, 2003; Segerup, 2003, 2004; Almberg, 2004; Gussenhoven, 2004; Riad, 2006).

¹ In this way it is similar to Serbian and Croatian and different from Japanese, and could be called a stress language with a lexical contrast in the alignment of pitch contours.
² The accent phrase is described in section 2.2.3.

Figure 2.1: Tonal Accents of Stockholm Swedish (Bruce, 1977). The beginning of the fall is marked by blue circles.

The dialect focused on in the current study is Trøndersk, an East Norwegian variety spoken in central Norway. This variety was chosen because it has not been as extensively described in terms of a large-scale analysis as other varieties, so the current analysis contributes to the typological literature. Also, one unusual feature of this variety is the fact that, unlike most varieties of Norwegian, it has a tonal accent contrast on monosyllabic words, with the more complex contour known as the circumflex accent (e.g., Kristoffersen, 1992; Almberg, 2001; Kristoffersen, 2011).

2.1.1 The Trøndersk Variety of Norwegian

There are two main varieties of Norwegian, East Norwegian and West Norwegian. East Norwegian comprises a group of dialects spoken in the southeast and central regions of Norway (Kristoffersen, 2000). Figure 2.2 shows the Trøndelag region, where Trøndersk is spoken, highlighting Trondheim (the capital city of the region). East Norwegian is generally referred to as a low-tone dialect, where accent 1 is a low tone (L) and accent 2 is a high-low melody, HL, and West Norwegian as a high-tone dialect, where accent 1 is H and accent 2 is a low-high melody, LH (e.g., Kristoffersen, 2000; Almberg, 2004). However, within these regions there is further variability, for example, in terms of which accent has a higher F0 peak.

Figure 2.2: Map showing the Trøndelag region (Bookcoverimgs.com, 2012)

2.2 The Tonal Accent Contrast

2.2.1 The Disyllabic Accent Contrast

The tonal accent contrast in the Trøndersk variety has been examined in some detail. The contrast has been described as a difference in tonal makeup, with accent 1 being L and accent 2 HL, thus aligning the Trøndersk variety with other varieties of East Norwegian (Nilsen, 1992), such as the Oslo variety (Fintoft, 1970; Kristoffersen, 2006b). In contrast, other studies have suggested that the difference lies in the alignment of the F0 contour, with both accents having a HL lexical tonal accent (Fintoft, 1970; Kristoffersen, 2006b). In a study analyzing recordings of six disyllabic word pairs spoken (in sentence final position, preceded by a pause) by 13 male speakers from Trondheim (the capital city of the region where Trøndersk is spoken), Fintoft (1970) found that accent 1 reaches its F0 minimum (L target) in the (initial) stressed vowel, while accent 2 has its initial H tone in the middle of the stressed vowel, and falls from there, as shown in Figure 2.3. In addition, the unstressed (second) vowel tends to be significantly longer in accent 2 words (Fintoft, 1970). Also examining the Trondheim variety, Wetterlin (2010) found both accents to have just an L contour, although accent 1 has a steeper fall than accent 2. In this variety, the L tone is found earlier in accent 1, where it occurs during the

first syllable, while in accent 2 it occurs in the second syllable. A similar difference in alignment was described for a variety spoken in the west of the Trøndelag region (Van Dommelen and Nilsen, 2003). While the two accents had a similar overall contour, the difference between them was in the timing of the F0 fall and rise, which both occurred earlier for accent 1. Examination of the tonal accent contrast in Oppdal, in the south of the Trøndelag region, revealed that both accents have a HL contour, with an earlier alignment in accent 1 than in accent 2 (Kristoffersen, 2006b). In Oppdal and Trondheim, the initial H is in the stressed syllable for both accents, while the following L is associated with the stressed syllable in accent 1 and the post-stressed syllable in accent 2 (Fintoft, 1970; Kristoffersen, 2006b).

These results show the conflicting analyses of the tonal accent contrast in Trøndersk (see Table 2.1), which center on whether it is one of timing or tonal makeup. The key to this lies in whether there is an initial H target in accent 1. If the presence of such a target can be discerned, both accents could be argued to have a HL contour. If not, accent 1 is L and accent 2 HL, and the contrast is in the tonal makeup. This question will be addressed in the current study. It should be noted that the current study does not examine the variety spoken in Trondheim, as other studies did; rather, it examines those spoken in towns around this city.

Figure 2.3: Trøndersk disyllabic accents, based on average contours: smilet (accent 1) and smile (accent 2). (Fintoft, 1970)

                                                Tone categories        Phonetic
Author                        Dialect           Accent 1   Accent 2    Difference
Fintoft (1970)                Trondheim         HL         HL          Timing, Onset?
Nilsen (1992)                 Trøndersk         L          HL          H tone
Van Dommelen & Nilsen (2003)  West Trøndersk    HL         HL          Timing
Wetterlin (2010)              Trondheim         L          L           Timing, F0 Range?
Kristoffersen (2006b)         Oppdal            HL         HL          Timing

Table 2.1: Summary of previous production findings on Trøndelag dialects

2.2.2 The Monosyllabic Accent Contrast

In the majority of Norwegian and Swedish dialects the accent contrast is only found on polysyllabic words. One explanation for this is that since the accent 2 contour has a later alignment and/or an extra tone in comparison to accent 1, it needs a second syllable in order for the later tones to surface (e.g., Haugen and Joos, 1952). Another explanation is that accent 2 derives from words in Old Norse that had at least one syllable following the main stress (Kristoffersen, p.c.). Some analyses regard all monosyllabic words as carrying accent 1 (Haugen, 1983; Felder et al., 2009). However, a small number of dialects, including Trøndersk, have instances of a tonal contrast surfacing on monosyllabic words, in this case due to apocope (Elstad, 1978; Kristoffersen, 1992, 2011). The tonal contrast on monosyllabic words in Trøndersk is realized as a difference between the circumflex accent and the unmarked accent (Almberg, 2001; Kristoffersen, 2011). The circumflex accent occurs on words in which the final vowel is deleted, but which are disyllabic in other varieties of Norwegian. This accent can surface on words that were originally either accent 1 or 2 (Almberg, 2001). The circumflex accent also occurs in the Nordland dialect of Norwegian (Almberg, 2001; Kristoffersen, 2011), but while the Trøndersk version can form from polysyllabic words of either accent, the Nordland form can only form from accent 2 (Elstad, 1982). Unmarked monosyllabic words

in East Norwegian have been described as being characterized only by an L tone (Dalen, 1985). A phonetic analysis of a small set of circumflex words found that this accent has a HL contour, with a longer vowel and a higher F0 at vowel onset than the unmarked monosyllabic accent (Almberg, 2001). The circumflex contour has also been described as a temporally displaced version of the unmarked contour (Almberg, 2001), but since circumflex is HL, this would suggest that the unmarked accent is also HL. Since the unmarked contour was previously described as just L (e.g., Dalen, 1985), Almberg (2001) suggests that in order to determine whether this is the case, the F0 contour before the monosyllabic accents must be examined. Since Almberg (2001) appears to be the only acoustic analysis of the monosyllabic accents, it is worth investigating further whether the displacement account holds up.

The goal of the current study is to examine the monosyllabic contrast in contrastive focus. It has been noted that the circumflex accent is moribund and therefore rare among young speakers (Dalen et al., 2008), so an acoustic analysis of it is crucial before it is lost. This will help elucidate the features of each monosyllabic accent and how pragmatic focus affects them. The current analysis also examines the anacrusis, that is, the unstressed syllables before the target word, which has not been examined before, to determine whether the circumflex accent is a displaced version of the unmarked accent, thus providing new evidence which will hopefully contribute toward resolving the earlier conflicting findings.

2.2.3 The Effect of Sentence-Level Intonation

The interaction of lexical pitch with sentence intonation has been examined in a variety of languages (e.g., Bruce, 1977; Pierrehumbert and Beckman, 1988; Gussenhoven and Bruce, 1999; Riad, 2006). In the current study, the goal is to examine how sentential intonation affects the accent contours in Trøndersk. Work on other languages has shown that the pitch level and alignment of a lexical tone can be affected by sentential intonation (Ma et al., 2006). This can occur in a variety of ways. For example, Gussenhoven and van der Vliet (1999) found that the lexical tonal accents of the Dutch dialect of Venlo have different pitch contours depending on the utterance type, position and pragmatic context. High boundary tones can induce a higher pitch

on lexical tones close to them (Myers, 2004). The realization of Cantonese tones in different positions was examined by Vance (1976), who found that the lexical tones were lowered in sentence-final position compared to medial position, due to sentence-final lowering. In Kammu, a language spoken in Laos, when a lexical high-low tone is followed by the sentence-final boundary H tone, the sentence-final boundary tone is not fully realized, and instead surfaces as a level or falling contour (Karlsson et al., 2010). Here, the authors suggest that the realization of the lexical tone supersedes the sentence intonation tones. In Thai, on the other hand, Abramson (1979) observed that the lexical tones were affected by sentence intonation but the contrasts were still preserved. In Mandarin, Lin (2004) found that lexical tones and sentence intonation affect different dimensions of the F 0 contour, whereby lexical tones were distinguished by their F 0 contour and sentence intonation was expressed through F 0 range. In order to tease apart the lexical accents from higher level intonation effects, first it is necessary to examine descriptions of intonation in Norwegian. East Norwegian sentence intonation is extensively described by the Trondheim Model (Fretheim, 1981, 1982). An utterance is composed of intonational phrases (IP) which are further composed of accent phrases (AP), specified for accent 1 or 2 depending on the accent of the word that heads the AP (e.g., Haugen and Joos, 1952; Fretheim, 1987a, 1991; Fretheim and Nilsen, 1989). Each AP starts with a primary stressed syllable at the left edge and includes any number of unaccented syllables before the next stressed syllable which is the head of the following AP. The right edge of the AP is delimited by a high boundary tone (H%) (Fretheim, 1987b; Nilsen, 1989; Kristoffersen, 2000), as shown in Figure 2.4 from Fretheim (1987b) (he uses AU where I have used AP). 
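The accent phrase structure described above can be illustrated with a minimal sketch. The input format (a list of syllable label and stress flag pairs), the function name, and the placeholder syllable labels are assumptions made for illustration only; they are not part of the Trondheim Model's own formalism. Unstressed syllables that precede the first stressed syllable are returned separately as the anacrusis.

```python
def group_into_accent_phrases(syllables):
    """Group (label, is_stressed) pairs into accent phrases (APs).

    Assumption for illustration: every primary stressed syllable opens a
    new AP, and following unstressed syllables join that AP until the next
    stressed syllable. Unstressed syllables before the first stress are
    returned separately as anacrusis.
    """
    anacrusis = []
    phrases = []
    for label, is_stressed in syllables:
        if is_stressed:
            phrases.append([label])      # a stressed syllable heads a new AP
        elif phrases:
            phrases[-1].append(label)    # unstressed: attach to the current AP
        else:
            anacrusis.append(label)      # no AP has started yet
    return anacrusis, phrases


# Placeholder labels: "S" = stressed, "u" = unstressed (hypothetical utterance).
example = [("u1", False), ("u2", False), ("S1", True), ("u3", False),
           ("u4", False), ("S2", True), ("u5", False)]
anacrusis, phrases = group_into_accent_phrases(example)
print(anacrusis)  # ['u1', 'u2']
print(phrases)    # [['S1', 'u3', 'u4'], ['S2', 'u5']]
```

In this sketch the right edge of each inner list is where the AP-final H% boundary tone would be expected; the grouping itself says nothing about tonal realization.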
While previous work provides important information about East Norwegian prosodic structure, none of these studies examined how IP- and AP-level boundary tones interact with the lexical tonal accents and whether this interaction impacts their realization. Borgstrøm (1962) noted that in the Oslo variety of East Norwegian, accent 1 may be more affected than accent 2 by higher-level intonation, leading to the tonal accent contrast being somewhat reduced (p. 36) in falling intonation. Work on Swedish showed that while the range of the rise or fall in tonal accent contours can be affected by sentence intonation, the contrast is preserved (Hadding-Koch, 1961, 1962).

One brief mention of the effect of the number of syllables in the AP in East Norwegian is in Teig (2001), who states that the contour of a two-syllable AP differs from that of a one-syllable AP. Although this was not the focus of the study, the pitch track shows a more marked drop to the lexical L when there is a second, unstressed syllable in the AP, something there is no time for in the...contour with only one syllable in the [AP] (p.224). This is an indication that the AP tones indeed do affect the lexical accent tones, something that will be explored in the current study.

Figure 2.4: Sentence intonation of East Norwegian, broken into Accent Phrases (Fretheim, 1987b). The high boundary tone can be seen in the raised contour at the right edge of each AP.

2.2.4 The Prosodic Effect of Focus

While there are various ways in which focus can be defined (related both to its meaning and scope), here the term contrastive focus is employed to denote a specific type of narrow focus (Chafe, 1976; Rooth, 1985; Gussenhoven, 2005; Selkirk, 2008; Katz and Selkirk, 2011). In this sense, a constituent under contrastive focus relates to a set of alternatives that are shared between the interlocutors. Numerous studies across prosodically different languages have documented the effect of narrow and contrastive focus on the realization of the F0 contour and segmental duration (Ladd, 1978, 1996; Gussenhoven, 1984; Beckman and Edwards, 1994; Sluijter and van Heuven, 1996; Campbell and Beckman, 1997; Remijsen and van Heuven, 2005; Zhang et al., 2006; Arvaniti et al., 2006; Prieto, 2014; Peters et al., 2014). In German, for instance, in narrow focus, intonational pitch accents are lowered in prenuclear position and deaccented in postnuclear position (Féry and Kügler,

2008). Narrow focus in English and Dutch is realized by a higher F0 peak and longer segments (Pierrehumbert, 1980; Cooper et al., 1985; Eefting, 1991; Cambier-Langeveld and Turk, 1999; Xu and Xu, 2005). A later alignment of the tonal targets in narrow focus was found for some varieties of Dutch and German (Peters et al., 2014).

Narrow focus can also change the shape of the lexical tonal accents. In languages with lexical pitch, an expanded pitch range (Pierrehumbert and Beckman, 1988; Xu, 1999; Remijsen, 2002; Fournier et al., 2006; Scholz, 2012) and greater articulatory force (Chen, 2010) are often used to mark narrow focus. In Swedish, single-peaked dialects expand the pitch range on the target word to signal narrow focus, while double-peaked dialects add a pitch gesture after the stressed syllable (Bruce, 2005). In Serbian and Croatian, Smiljanić (2003) found that narrow focus was indicated by the use of an expanded pitch range, a change in peak alignment, and vowel lengthening. The change in peak alignment was restricted, however, in the dialect with the lexical tonal accent. Interestingly, even closely related tonal languages exhibit differences in focus realization, such that post-focal F0 range compression is found in Beijing Mandarin but not in Taiwan Mandarin or Taiwanese (Chen et al., 2009).

Narrow focus impacts segmental durations, as mentioned above for English and Dutch (Cooper et al., 1985; Eefting, 1991). Similar segmental lengthening was found in dialects of Dutch and German (Peters et al., 2014). In a dialect of West Limburgian, Peters (2007) found that durational differences between the tonal accents (one accent had consistently longer syllables than the other accent) were increased in nuclear position. In Swedish, phonologically long segments were lengthened more than short segments in focus, thus exaggerating the phonological contrast between short and long vowels and consonants (Bruce, 1977; Bannert, 1979; Bruce, 1981). Also in Swedish, the unstressed syllable following the stressed syllable was lengthened under focus (Heldner and Strangert, 2001).

Just as phonologically short and long vowels were found to be more distinct in narrow focus, tonal accent contrasts can also be enhanced. Smiljanić (2003) found that in the Belgrade variety of Serbian, narrow focus caused asymmetric changes in the alignment of a low tonal target between the lexical accents, leading to a greater contrast between the accents in this

condition. In a slightly different way, focus enhances the lexical tonal accent contrast in the Venlo dialect of Dutch, where the contrast only surfaces when target words are focused or final (Gussenhoven and van der Vliet, 1999).

In East Norwegian, narrow focus is marked by a high tone at the right edge of the AP (Fretheim, 1987b; Nilsen, 1989; Kristoffersen, 2000). This tone contributes to the lexical item being perceived as focused even though the focus marker is a few syllables beyond the focused word (Abrahamsen, 2004). A similar pattern whereby the focus tone is not realized on the focused word is found in languages such as Bengali (Hayes and Lahiri, 1991) and Greek (Arvaniti et al., 2006). For Norwegian, the original H% of the AP and the focus H tone combine, causing the H% at the right edge of the AP to have a higher F0 (Fretheim, 1987b; Fretheim and Nilsen, 1989; Kristoffersen, 2000). Accent 1 words in East Norwegian were found to signal narrow focus with increased duration of the vowel, syllable and word (Mixdorff et al., 2010). An earlier AP H% alignment was found for accent 1 words in narrow focus compared to broad focus (Koreman et al., 2009), while Mixdorff et al. (2010) found the earlier AP H% alignment in both accents. It appears there is no description, however, of how narrow focus impacts the height or alignment of the lexical tones, and whether narrow focus enhances or reduces the tonal accent contrast. These questions are examined in the current study by using contrastive focus on the target words.

2.3 The Perception of F0

The fact that a speaker produces certain acoustic cues does not mean that the listener attends to all of them. For example, when multiple acoustic cues are available, listeners can weight one cue more heavily than others (Francis et al., 2008b), so the presence of a particular cue in production is not evidence for its use in perception.
Examining perception of high and rising tones in Korean, Chang (2013) found that the cues for perception lined up well with the descriptions of the tone production. Work on a variety of languages has found an effect of systematic manipulations of F0 on the perception of both tone and intonation, and this approach highlights what characteristics are necessary for the listener to perceive a particular feature (e.g., Shen, 1993; Gósy and Terken, 1994; Almberg and Husby, 2000; Gussenhoven and Chen, 2000; Francis et al., 2003; Shattuck-Hufnagel et al., 2004; Xu et al., 2006; Francis et al., 2008b; Shport, 2011; Chang, 2013; Liu, in press). The goal of the perception study conducted here is to examine further what cues listeners attend to in distinguishing the accents. Following the approach of the studies mentioned above, the results of the production study will be a starting point for determining what probable cues are used to distinguish tones in Trøndersk. The perception experiments will use stimuli with artificially manipulated cues in order to pinpoint the most salient cues for accent identification.

2.3.1 Perception of Lexical Tonal Accents

While a number of studies have examined how the contrasts are realized acoustically in various Scandinavian dialects, few studies have looked at the perception of the tonal accent contrast. In one study, Segerup (2004) found that listeners could correctly identify naturally produced tonal minimal pairs 96% of the time for one variety of West Swedish. With regard to which cues listeners use to identify the lexical accents, one study (Efremova et al., 1963) using gating experiments found that the shape of the F0 contour in the initial, stressed syllable contains important cues for distinguishing the accents in disyllabic words in Swedish. Similarly, Norwegian listeners were able to identify the two accents accurately even when presented with portions of the words up to the end of the initial, stressed vowel (Fintoft, 1970). Using synthesized contours, Bruce (1977) examined perception of the accent contrast in Stockholm Swedish and confirmed that listeners used the timing of the F0 contour in relation to the stressed syllable as the main cue in differentiating between the two accents, thus aligning production and perception findings closely.
Specifically, Swedish listeners identified accent 2 as long as the fall began 25% of the way into the vowel, or later. With regard to the perception of Norwegian tonal accents, two studies tested which aspects of the F0 contours were salient indicators of tonal categories. In a small-scale study, Fintoft and Mártony (1964) manipulated F0 peak height, alignment and the slope of the rise and fall to examine accent identification in the Oslo dialect. With just the first consonant-vowel syllable of the manipulated disyllabic words played to

the listeners, a level or rising F0 in the stressed vowel was identified as accent 1, and a falling F0 at the end of the stressed vowel was identified as accent 2. In another perception study, Fintoft (1970) used synthesized stimuli composed of sine wave signals with manipulated frequency contours, superimposed on segments. The results also showed that Norwegian listeners identified a level or rising frequency contour at the end of the stressed vowel as accent 1, and a falling contour at that point as accent 2. These results indicate that the alignment of the F0 contour is a salient cue for the perception of the lexical pitch contrasts in a variety of Scandinavian dialects.

In terms of perception, narrow focus has been found to affect accuracy in the perception of the tonal accents. Listeners were better at distinguishing Norwegian words that had been produced in isolation than those excised from context, presumably since the speakers emphasized the F0 contours when no context was present (Van Dommelen, 2002). Speakers of the Roermond dialect of Dutch were more accurate at distinguishing the accents when they had been excised from a narrow focus context than from pre-nuclear or post-nuclear contexts (Fournier et al., 2006). Chen (2010) mentions a pilot study where focused Chinese words were excised from context and presented to listeners, who identified them with an accuracy rate of 90%. (This was compared to tones in post-focal position, which had an accuracy rate of 65%.) Combined, these results suggest that narrow focus exaggerates cues to the tonal contrasts and therefore contributes to the enhanced word recognition. The current study examines the perception of accent 1 and accent 2 words with the goal of determining which cues are salient markers of the accentual distinction for listeners.

2.4 Goals and Research Questions

As described above, focus and sentence-level prosody can further affect the tonal contours. They impact segmental duration and determine distribution and identity of tonal events as well as their exact realization. Little work has directly examined how pragmatic focus and higher level sentential intonation impact the accent contrast in Norwegian. The current study examines the nature of the tonal accent contrasts in the Trøndersk variety. It expands on previous research by conducting detailed acoustic analyses examining a

number of acoustic cues (F0 maximum and minimum height and alignment, alignment of the F0 fall, F0 slope, vowel duration, accent phrase tone height and alignment) in a larger number of sentences and speakers. The stimuli used in this study control for the effect of sentence intonation, thereby investigating the lexical and intonation effects on the tonal contours separately. The question of interest is how much variation in the features that define the phonological tonal contrast can be allowed due to contrastive focus and sentence intonation while preserving the lexical contrast itself. Accordingly, the first goal of the production studies is to examine how the tonal contrast is implemented in Trøndersk. A second goal is to investigate how sentence intonation (position in utterance) and focus impact the realization of the tonal accent contrast. Experiment 1 (Chapter 3) examines the tonal accent contrast in disyllabic words and the effect of contrastive focus on this contrast. Experiment 2 (Chapter 4) examines the effect of sentential intonation on the tonal accents in disyllabic words. Experiment 3 (Chapter 5) examines the accent contrast on monosyllabic words in both broad (noncontrastive) focus and contrastive focus. Based on previous research, it is hypothesized that both disyllabic accents will have a HL contour and the tones in accent 2 will have a later alignment in relation to the segmental string than accent 1. Through this investigation, the current study will provide further insight into the question of whether accent 1 is L or HL (cf. Kristoffersen, 2006b) by examining the F0 contour of the sentence-initial words (the anacrusis) that precede the target words. It is predicted that given enough segmental material, the initial H of the accent 1 HL tonal accent will be observed.
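Several of the cues listed above (height and alignment of the F0 maximum and minimum, and the slope of the fall) can be read off a sampled F0 track. The following is a minimal sketch under stated assumptions, not the measurement procedure used in this study: the function name, the input format, and the choice to express alignment as a proportion of vowel duration are all hypothetical, and the numbers in the example are invented.

```python
def extract_f0_cues(times, f0, vowel_onset, vowel_offset):
    """Return height and alignment of the F0 maximum and minimum, plus fall slope.

    times: sample times in seconds; f0: F0 values in Hz (same length).
    Alignment is a proportion of vowel duration (0 = vowel onset,
    1 = vowel offset). This normalization is an assumption for illustration.
    """
    if len(times) != len(f0) or not times:
        raise ValueError("times and f0 must be non-empty and of equal length")
    duration = vowel_offset - vowel_onset
    if duration <= 0:
        raise ValueError("vowel_offset must be later than vowel_onset")

    i_max = max(range(len(f0)), key=lambda i: f0[i])
    # Search for the minimum only from the maximum onward (the end of the fall).
    i_min = min(range(i_max, len(f0)), key=lambda i: f0[i])

    def align(t):
        return (t - vowel_onset) / duration

    fall_time = times[i_min] - times[i_max]
    slope = (f0[i_min] - f0[i_max]) / fall_time if fall_time > 0 else 0.0
    return {
        "max_hz": f0[i_max], "max_align": align(times[i_max]),
        "min_hz": f0[i_min], "min_align": align(times[i_min]),
        "fall_slope_hz_per_s": slope,
    }


# Invented values, chosen only to make the arithmetic easy to follow.
times = [0.0, 0.125, 0.25, 0.375, 0.5]
f0 = [180, 200, 190, 150, 170]
cues = extract_f0_cues(times, f0, vowel_onset=0.0, vowel_offset=0.5)
print(cues["max_align"], cues["min_align"])  # 0.25 0.75
print(cues["fall_slope_hz_per_s"])           # -200.0
```

Expressing alignment relative to the vowel rather than in absolute time is one way to compare tokens with different segment durations; other reference points (for example, syllable or word boundaries) would work the same way.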
In comparing how the contrast is realized in AP-medial versus AP-final position, it is hypothesized that the AP-final boundary H% tone will be realized on the target word when in AP-final position, thus adding an extra tone to this unit. It is expected that the closer presence of the AP-final H% tone will cause an earlier alignment of the lexical tones in relation to the syllable, compared to AP-medial position. This immediately following H tone may also cause the lexical L tone (F0 minimum) to not be as low as in the non-AP-final condition. It is hypothesized that in contrastive focus, both accents will have a wider pitch range, earlier and higher AP H% tone, and longer segments. If indeed contrastive focus causes an enhancement of the tonal accent contrast, it is hypothesized that

for disyllabic words, accent 1 will have an earlier alignment of F0 landmarks in contrastive focus than in broad focus, while accent 2 will have a later alignment in contrastive focus. For the monosyllabic words, it is hypothesized that in broad focus, the circumflex accent will have a wider pitch range and later F0 minimum alignment than the unmarked monosyllabic accent. As in other languages with lexical F0 contrasts on monosyllabic words, it is also expected that contrastive focus will induce an expanded pitch range (e.g., Xu, 1999), and also a higher and earlier AP H%.

The goal of the current perception study is to examine in detail which acoustic cues are important for lexical accent identification in Trøndersk. The current investigation expands on previous perception studies by systematically manipulating more cues (F0 minimum height and alignment, F0 maximum alignment, and alignment of the turning point from maximum to minimum). This is done in order to examine which cues trigger categorical shifts between the two accents and whether any one of these F0 cues or a combination of them will shift the responses. A perception experiment was designed to determine whether listeners use the differences in these cues in making word identification decisions, and their relative importance. Listener responses will provide an insight into how these acoustic dimensions are perceived and used in processing of lexical pitch contrasts. This study further builds on previous perception work by examining a large number of listeners and by carefully controlling for sentence intonation effects on the lexical tonal contrast by focusing on words that were produced in sentence-medial positions and with neutral intonation. Experiment 4 (Chapter 6) presented listeners with manipulated tokens and assessed which cues were used to differentiate between the two accents.
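To make the idea of systematic cue manipulation concrete, the sketch below builds a stimulus continuum in which only the alignment of the F0 minimum changes, with each stimulus stylized as a piecewise-linear contour through four targets. This is an illustration under assumptions, not the procedure used in Experiment 4: the function names, the four-target stylization, the step size, and all numeric values are hypothetical.

```python
def make_alignment_continuum(onset, peak, minimum, offset, step_s, n_steps):
    """Shift the F0 minimum later in equal time steps; other targets stay fixed.

    Each target is a (time_s, hz) tuple. Returns one target list per stimulus.
    """
    stimuli = []
    for k in range(n_steps):
        shifted = (minimum[0] + k * step_s, minimum[1])
        if not peak[0] < shifted[0] < offset[0]:
            raise ValueError("shifted minimum must stay between peak and offset")
        stimuli.append([onset, peak, shifted, offset])
    return stimuli


def interpolate_contour(targets, n_points):
    """Sample a piecewise-linear F0 contour through the targets at even steps."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    targets = sorted(targets)
    start, end = targets[0][0], targets[-1][0]
    contour = []
    for k in range(n_points):
        # Clamp to the last target time to guard against rounding overshoot.
        t = min(end, start + (end - start) * k / (n_points - 1))
        for (t0, h0), (t1, h1) in zip(targets, targets[1:]):
            if t0 <= t <= t1:
                frac = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
                contour.append((t, h0 + frac * (h1 - h0)))
                break
    return contour


# Invented targets: onset, F0 maximum, F0 minimum, offset.
stimuli = make_alignment_continuum(
    onset=(0.0, 180), peak=(0.125, 200), minimum=(0.25, 150), offset=(0.5, 170),
    step_s=0.0625, n_steps=3)
print([s[2][0] for s in stimuli])  # [0.25, 0.3125, 0.375]
print([hz for _, hz in interpolate_contour(stimuli[0], 5)])
# [180.0, 200.0, 150.0, 160.0, 170.0]
```

Changing one cue per continuum keeps the others constant, so any shift in listener responses along the continuum can be attributed to that cue; crossing two such continua would give a grid for testing cue combinations.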
For the manipulated contours, it is hypothesized that listeners will pay attention to the alignment of the F0 fall and F0 minimum, and the height of the F0 minimum. Specifically, an earlier and lower F0 minimum is expected to induce more accent 1 responses, and a later and higher F0 minimum, more accent 2 responses.

This dissertation examines the production and perception of fundamental frequency (F0), or pitch, in Trøndersk. Table 2.2 lays out the experiments conducted. The following research questions form the focus of this investigation:

(1) Which acoustic cues characterize tonal accent distinctions in this understudied dialect? Does the implementation of the lexical contrast differ for monosyllabic and disyllabic words? (Experiments 1, 3)
(2) What is the impact of sentence-level intonation and position within the Accent Phrase (medial vs. final) on the lexical tonal contrast? (Experiment 2)
(3) What is the impact of pragmatic contrastive focus on the lexical tonal contrast? (Experiments 1, 3)
(4) Which acoustic cues are listeners sensitive to in differentiating the lexical tonal accents? (Experiment 4)

The systematic examination of the acoustic F0 patterns will provide an insight into what cues characterize the accents. In order to examine this, the current study involves carrier sentences designed to compare F0 and duration cues in broad focus and contrastive focus readings of target words, and also sentences designed to examine these cues when a target word is at the right edge of an Accent Phrase versus a number of syllables before this point. Perception experiments examine perception of naturally produced words excised from sentence context, as well as manipulated versions of target words, to determine what acoustic features consistently distinguish the accents.

Production experiments:
      Word Length    Accents                Condition
(1)   Disyllabic     1, 2                   Broad & Contrastive Focus
(2)   Disyllabic     1, 2                   AP-medial & AP-final
(3)   Monosyllabic   Unmarked, Circumflex   Broad & Contrastive Focus

Perception experiment:
      Word Length    Accents                Condition
(4)   Disyllabic     1, 2                   Manipulated Contours

Table 2.2: Experiments

2.5 Outline

The remaining chapters are organized as follows. Chapters 3 to 5 describe the production experiments examining the monosyllabic and disyllabic accents in Trøndersk and their interaction with higher-level intonation. Chapter 6 reports the experiments on the perception of the disyllabic accents. Chapter 7 provides a general discussion of the findings and how they relate to the literature on tonal accent and intonation.

Chapter 3
Experiment 1: Disyllabic Accent Realization in Broad Focus and Contrastive Focus

The first goal of this experiment was to examine how the two lexical tonal accents are realized in the Trøndersk dialect. The second goal was to examine how contrastive focus affects the accent contrast. It was hypothesized that in broad (noncontrastive) focus, both accents would have an HL contour, with accent 2 having a later alignment of tones in relation to the segmental string than accent 1 (Fintoft, 1970; Kristoffersen, 2006b). In contrastive focus, it was hypothesized that both accents would have longer segments, a higher and earlier AP H% tone (Koreman et al., 2009; Mixdorff et al., 2010), a higher F0 maximum and a lower F0 minimum (e.g., Xu, 1999). It may also be the case that accent 1 has an earlier alignment of tones, and accent 2 a later alignment, in broad focus than in contrastive focus.

3.1 Methods

3.1.1 Participants

Ten native speakers (6 female, 4 male) of the Trøndersk dialect, aged 18-45, participated in the experiment. They were recruited through posters and fliers around the campus of the Norwegian University of Science and Technology (NTNU), Trondheim, and were paid for their participation. Before the recording session, they filled out a language background questionnaire. The results confirmed that they were all from towns south and west¹ of Trondheim and had all grown up speaking Trøndersk at home. Their parents were also native speakers of this dialect.

¹ These towns were chosen because the circumflex accent still occurs here.

Speaker   Sex   Age range   Hometown
01        F     36-40       Tingvoll
02        F     25-30       Oppdal
03        F     30-35       Tingvoll
04        M     18-24       Øksendal
05        M     36-40       Rennebu
06        F     18-24       Surnadal
07        F     18-24       Sunndal
08        M     36-40       Halsa
09        M     36-40       Ålvundeid
10        F     18-24       Surnadal

Table 3.1: Speaker details

3.1.2 Materials

The target words were disyllabic and had initial stress. The stressed vowel was always /i/, to control for intrinsic pitch (Whalen and Levitt, 1995) and duration (Lindblom et al., 1981) differences. Only sonorant consonants appeared next to the stressed vowel, for example, ˈlimet 'the glue' (accent 1) and ˈminne 'memory' (accent 2). There were five target words for accent 1, each produced three times, giving 15 tokens. There were two target words² for accent 2, each produced seven or eight times, also giving 15 tokens. This gave 15 tokens per accent per speaker for each condition (noncontrastive and contrastive focus), for a total of 600 tokens (15 tokens × 2 accents × 2 conditions × 10 speakers).

For accent 1, four of the five target words contained a phonologically long vowel followed by a short consonant (V:C), and one of the five had a short vowel followed by a long consonant (VC:). For accent 2, one of the target words had V:C and one VC:. These segment duration differences were compensated for in the timing measurements (see below).

² Due to the constraints on vowel, stress, word class and consonant type, only two target words could be found for accent 2.

The target words (shown in Table 3.2) were produced in sentences (listed in Appendix A) to elicit either a noncontrastive or contrastive focus reading. For the noncontrastive focus condition, a content word a number
of syllables after the target word was contrasted with a word at the end of the sentence. This ensured that the target word did not receive contrastive focus. In the contrastive focus condition, the target word was contrasted with a word at the end of the sentence. In both conditions, the target word was preceded by two or three unstressed syllables, which were outside of any AP (Kristoffersen, 2006b). The target words were also followed by two unstressed syllables in the same AP, to ensure that the target word did not carry the H% boundary tone that marks the right edge of the AP in East Norwegian (Fretheim, 1987a). Example sentences for all conditions are below, with the target words highlighted in bold. (AP = accent phrase, IP = intonational phrase, IU = intonational utterance, based on the Trondheim Model.)

Accent 1, broad focus:
Det var glimtet i en film, men ikke i et stykke.
(((Det var (¹glimtet-i-en)AP (¹FILM)AP)IP, men itj i et (¹STYKKE)AP)IP)IU
'There was the flash in a film, but not in a play.'

Accent 1, contrastive focus:
Det var glimtet i en film, men ikke brannen.
(((Det var (¹GLIMTET-i-en)AP)IP ((¹film)AP, men itj (¹BRANNEN)AP)IP)IU
'There was the flash in a film, but not the fire.'

Accent 2, broad focus:
Det var et minne i en film, men ikke i et stykke.
(((Det var et (²minne-i-en)AP (¹FILM)AP)IP, men itj i et (¹STYKKE)AP)IP)IU
'There was a memory in a film, but not in a play.'

Accent 2, contrastive focus:
Det var et minne i en film, men ikke en drøm.
(((Det var et (²MINNE-i-en)AP)IP ((¹film)AP, men itj en (¹DRØM)AP)IP)IU
'There was a memory in a film, but not a dream.'
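The token counts stated for the materials can be checked with a short calculation. The sketch below is only an illustration and not part of the original analysis: all variable and function names are hypothetical, and the split of 7 and 8 repetitions across the two accent 2 words is an assumption consistent with the description "seven or eight times".

```python
# Token-count check for the disyllabic production materials.
# Assumption (not stated explicitly in the text): one accent 2 word was
# produced 7 times and the other 8 times, which yields the reported 15 tokens.

ACCENT1_WORDS = 5             # limet, linet, smilet, slimet, glimtet
ACCENT1_REPETITIONS = 3       # each accent 1 word produced three times
ACCENT2_REPETITIONS = (7, 8)  # assumed repetitions for the two accent 2 words

N_ACCENTS = 2
N_CONDITIONS = 2              # noncontrastive and contrastive focus
N_SPEAKERS = 10


def total_tokens(tokens_per_accent, n_accents, n_conditions, n_speakers):
    """Return the total number of tokens across accents, conditions and speakers."""
    return tokens_per_accent * n_accents * n_conditions * n_speakers


accent1_tokens = ACCENT1_WORDS * ACCENT1_REPETITIONS
accent2_tokens = sum(ACCENT2_REPETITIONS)

# The design is balanced only if both accents contribute the same token count.
assert accent1_tokens == accent2_tokens

total = total_tokens(accent1_tokens, N_ACCENTS, N_CONDITIONS, N_SPEAKERS)
print(accent1_tokens, accent2_tokens, total)
```

Under these assumptions, both accents contribute 15 tokens per speaker per condition, which matches the reported total of 600.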

Accent 1   Gloss              Accent 2   Gloss
limet      'the glue'         minne      'memory'
linet      'the flax/linen'   Line       girl's name
smilet     'the smile'
slimet     'the mucus'
glimtet    'the flash'

Table 3.2: Disyllabic target words

3.1.3 Procedure

The sentences were presented in slide format in a randomized order, with the different focus conditions interspersed. The same order was used for all participants. The participants were in control of when to move to the next slide. The sentences were written in the standard Bokmål orthography and also in a transcription of Trøndersk, to encourage the participants to use the Trøndersk dialect. They were instructed to speak in a casual manner, as they would at home. The recordings were made using Adobe Audition at a sampling rate of 44.1 kHz. The experiments took place in the phonetics studio at NTNU. The production experiments took 30-45 minutes per participant, and participants were paid 170 NOK (approx. US$30). All production experiments (Experiments 1-3) took place in the same sitting.

3.1.4 Measurements and Analysis

A number of measurements were taken using Praat (Boersma and Weenink, 2011) to examine the F0 contours and segment durations in detail. The tonal and segmental landmark labels are shown in Figure 3.1. Figure 3.2 shows a pitch track of part of a sentence from Praat, comparing the accent 2 word Line in noncontrastive focus and contrastive focus. Table 3.3 shows the calculations made for duration and pitch measurements. All measurements were made on the target word, except for AP H% measurements, which were made on the final syllable in the AP. AP H% timing was measured in milliseconds from the AP H% tone to the AP boundary. This
was measured to determine whether it is higher and earlier, as expected, in the contrastive focus condition.

Figure 3.1: Accent 1 contour of the word linet and the two following unstressed syllables in the AP, showing measurement points. S = beginning of the sentence; B = beginning of F0 rise; C1 = onset of target word; C2 = onset of second consonant (if present); V1 = vowel onset; C3 = onset of post-vocalic consonant; V2 = unstressed vowel onset; W = end of target word; AP = end of AP; H = F0 maximum; HTP = turning point from F0 maximum; L = F0 minimum; LTP = turning point from F0 minimum; APH = AP boundary tone.

The alignment of F0 minimum and high turning point (henceforth HTP, the point where F0 starts to fall, which occurs after a high plateau) were measured from vowel onset and then divided by the combined duration of the vowel and post-vocalic consonant. This was done to control for speaking rate differences. The duration of the vowel and following consonant were combined because some target words had V:C and some VC:, so combining these allowed for pooling of timing measures regardless of phonological vowel length. Because F0 maximum often occurred before word onset or vowel onset, especially for accent 1 words, the timing of this measure was divided by word duration. This was to compensate for speaking rate but also for onset type differences, since some target words had a complex onset, and some did not. These F0 landmarks were measured to determine whether, as hypothesized, accent 2 has a later tonal alignment than accent 1, and to examine whether contrastive focus affects tonal height and alignment. Slope
of the rise was the pitch difference between the beginning of the F0 rise and the F0 maximum, divided by the duration between these two points. This was measured to determine whether both accents had a rise to an initial H tone. Slope of the fall was the pitch difference from H to L (F0 maximum to minimum), divided by the duration between the two points. Boundary slope was the F0 difference between the turning point from L and the AP H% tone, divided by the duration between these two tonal events. This was measured in order to examine whether contrastive focus affects the pitch contour leading to the AP H% tone. Stressed vowel duration was examined separately for long and short vowels. Figure 3.3 indicates where the alignment cues occur on the accents.

Figure 3.2: Example pitch track highlighting accent 2 word Line in noncontrastive focus (top) and contrastive focus (bottom).

Vowel onset was determined by the beginning of periodicity in the waveform, higher intensity than the surrounding sonorants, and consistent formants in the spectrogram. F0 maximum and minimum were determined by examining the pitch track for the highest or lowest point. When this was unclear, the region was selected and the Praat function for choosing local maxima and minima was used. HTP was the point where the F0 height was equal to the F0 maximum height and began to drop, as determined by examining the pitch track.

A mixed-model multiple linear regression analysis was conducted using the lmerTest package in R (R Development Core Team, 2008). The indepen-