CUE-ENHANCEMENT STRATEGIES FOR NATURAL VCV AND SENTENCE MATERIALS PRESENTED IN NOISE

Valerie HAZAN and Andrew SIMPSON


Abstract

Two sets of experiments to test the perceptual benefits of enhancing information-rich regions of consonants in natural speech were performed. In the first set, hand-annotated consonantal regions of natural VCV stimuli were amplified to increase their salience, and filtered to stylize the cues they contained. In the second set, natural semantically unpredictable sentence (SUS) material was annotated and enhanced in the same way. Both sets of stimuli were combined with speech-shaped noise and presented to normally-hearing listeners. Both sets of experiments showed statistically significant improvements in intelligibility as a result of enhancement, although the increase was greater for VCV than for SUS material. These results demonstrate the benefits gained from enhancement techniques which use knowledge about acoustic cues to phonetic contrasts to improve the resistance of speech to background noise.

1. Introduction

This paper reports some work carried out as part of a program investigating the effects of acoustic cue enhancement on the intelligibility of natural and synthetic speech. This approach aims to enhance relatively clear speech before degradation by noise, reverberation or band-pass filtering, and therefore differs from conventional signal enhancement, which is largely concerned with the removal of additive noise through techniques such as spectral subtraction, adaptive filtering, adaptive noise cancellation and harmonic selection. While these methods appear to improve signal quality, they show only small increases in intelligibility (e.g., Cheng, O'Shaughnessy & Kabal, 1995).

When describing methods which enhance speech prior to degradation, a distinction must be made between those techniques which apply enhancements automatically to portions of the signal which display certain characteristics (e.g. regions characterised by fast spectral change) and those which apply enhancements to specific phonetic segments and so require the speech signal to be annotated in terms of its phonetic components. Automatic enhancement methods such as those involving high-frequency emphasis or removal of the first formant can have a significant effect on intelligibility; however, these have shown most benefit in conditions of extreme distortion, such as infinite clipping, which have a very substantial effect on signal quality (Niederjohn & Grotelueschen, 1976). More recently, Tallal and her colleagues (Tallal et al., 1996) have applied automatic enhancement techniques which involve amplifying regions of rapid spectral change and manipulating segment durations. These were found to be beneficial in speech training with language-disordered children believed to have specific difficulty in processing sounds containing fast spectral change.

Methods in which signals are segmented and labelled using phonetic knowledge permit the manipulation of specific, perceptually-important regions which may not be reliably identified via the types of signal processing technique described above. Many concentrate on enhancing landmark regions of the signal that are known to contain a high density of cues to phonetic identity (Stevens, 1985). These landmark regions can be inherently transient and of low amplitude, such as the perceptually-important formant transitions following plosive release, which are both brief and of low initial intensity as vocal fold vibration starts.
Phonetically-motivated enhancement approaches have been used to increase the salience of these information-bearing regions by increasing their relative intensity or duration. By making it easier for normally-hearing listeners to process acoustic cues contained in these segments, the speech signal could become more resistant to subsequent degradation. Techniques which enhance clear speech prior to degradation can be applied in telecommunications (e.g. telephone-based information services, or communication in noisy aircraft) where the communication channel can significantly degrade the speech signal. There are also applications in speech and language therapy and second language learning, where certain perceptually-important portions of the speech signal can be emphasised in a computer-based training system to help listeners develop phonetic discrimination abilities. Jamieson (1995) used such an approach successfully in auditory training with second-language learners. These techniques have also been investigated with a view to improving speech intelligibility for listeners suffering from different types of hearing disability. For example, Gordon-Salant (1986) explored the effects of increasing consonant duration and consonant-vowel intensity ratio in a set of nonsense syllables presented to normally-hearing and hearing-impaired listeners. The manipulation of intensity ratios had the greatest effect on intelligibility. This phonetically-motivated approach has therefore clearly been successful, although the need for pre-annotated material is a serious limitation in the use of these techniques.

The objectives of the work reported here are to determine the cue-enhancement strategies which are likely to have the greatest effect on intelligibility but which are also easily implemented in signal processing terms. Manipulations were primarily made to the relative intensity and the spectral shape of different portions of the signal. In the first experiment, the effect of cue-enhancement was examined using controlled nonsense Vowel-Consonant-Vowel (VCV) material which contained no contextual information. In this way, segmental intelligibility based on the perception of acoustic information alone can be evaluated. In the following experiments, similar cue-enhancement strategies were implemented in sentence-length material, which exhibits a much higher degree of variability in vocalic context and degree of coarticulation.

2. VCV material enhancement

2.1. Method

36 vowel-consonant-vowel (VCV) stimuli comprising the consonants /b,d,g,p,t,k,f,v,s,z,m,n/ in the context of the vowels /a,i,u/, spoken by a male speaker, were recorded and digitized at a 48 kHz sampling rate with 16-bit amplitude quantization. Annotations were made manually using a waveform editing tool to segment the stimuli into different sections. The relative levels of sections of the stimuli were then manipulated before the stimuli were reassembled by abutting adjoining segments and then down-sampling the resultant stimuli to 16 kHz to smooth any waveform discontinuities at segment boundaries. Amplitude manipulations were made by calculating the mean RMS level of each segment of the stimulus; with reference to this level, sample values within a segment were then scaled either to produce a relative amplitude increase, or to set the mean RMS level of a number of segments to the same value. After manipulation, stimuli were combined with noise which had the same spectral envelope as the long-term average spectrum of speech.
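As a rough illustration of these amplitude manipulations, the sketch below scales hand-annotated segments relative to their mean RMS level. The function names and the use of numpy are ours, not the authors' implementation.

import numpy as np

def mean_rms(segment):
    # Mean RMS level of one annotated segment (1-D array of samples).
    return np.sqrt(np.mean(np.asarray(segment, dtype=float) ** 2))

def boost_segment(segment, gain_db):
    # Relative amplitude increase of gain_db applied to one segment.
    return np.asarray(segment, dtype=float) * 10.0 ** (gain_db / 20.0)

def equalise_segment_levels(segments, target_rms):
    # Set the mean RMS level of each segment in a list to the same target value.
    out = []
    for seg in segments:
        level = mean_rms(seg)
        out.append(seg if level == 0 else np.asarray(seg, dtype=float) * (target_rms / level))
    return out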
Signal-to-noise ratios (SNRs) of 0 and -5 dB were calculated on a stimulus-by-stimulus basis and took into account any change in the amplitude of the stimulus produced as a result of enhancement. The noise started 100 ms before the onset of the first vowel and finished 100 ms after the end of the second.
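A minimal sketch of this mixing stage, assuming a pre-generated speech-shaped noise array at the same sampling rate (parameter names are illustrative, not the authors' code):

import numpy as np

def add_noise_at_snr(stimulus, noise, snr_db, fs, pad_ms=100):
    # Mix an (enhanced) stimulus with speech-shaped noise at the requested SNR.
    # The SNR is computed from the stimulus as presented, so any level change
    # introduced by the enhancement is taken into account; the noise starts
    # pad_ms before the stimulus and ends pad_ms after it.
    pad = int(round(pad_ms * fs / 1000.0))
    stimulus = np.asarray(stimulus, dtype=float)
    padded = np.concatenate([np.zeros(pad), stimulus, np.zeros(pad)])
    noise = np.asarray(noise, dtype=float)[:len(padded)]  # assumes the noise excerpt is long enough
    stim_rms = np.sqrt(np.mean(stimulus ** 2))
    noise_rms = np.sqrt(np.mean(noise ** 2))
    target_noise_rms = stim_rms * 10.0 ** (-snr_db / 20.0)
    return padded + noise * (target_noise_rms / noise_rms)

At the 16 kHz rate of the processed stimuli, pad_ms=100 corresponds to 1600 samples of leading and trailing noise.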

For all stimuli a distinction was made between (a) the transition regions between vowel and consonant, and (b) the consonantal constriction/occlusion regions, i.e. the burst transient, burst and aspiration, frication or nasality portions. For the transition portions, the problem of reduced amplitude as the consonant constriction/occlusion was formed or released was counteracted by amplifying the final five cycles of the first vowel, or the initial five cycles of the second vowel. This was done by setting the level of the first four cycles to the level of the fifth cycle. The amplitude of the consonant occlusion/constriction region was amplified by either 6 or 12 dB according to consonant category (see Table 1).

Figure 1: Waveforms of the burst and formant transition regions of /d/ in /AdA/, natural and enhanced conditions.
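One reading of this transition boost is sketched below: the vowel is assumed to have already been divided into pitch cycles, and the four cycles adjacent to the consonant are brought up to the level of a fifth, reference cycle. The cycle ordering and the RMS-based notion of "level" are our assumptions, not the authors' code.

import numpy as np

def boost_transition_cycles(cycles, ref_index=4):
    # Set the level of the four pitch cycles nearest the consonant to the level
    # of the fifth (reference) cycle. `cycles` is a list of five 1-D arrays,
    # ordered from the consonant boundary outwards into the vowel.
    def level(x):
        return np.sqrt(np.mean(np.asarray(x, dtype=float) ** 2))
    ref = level(cycles[ref_index])
    out = []
    for i, c in enumerate(cycles):
        lev = level(c)
        out.append(np.asarray(c, dtype=float) if i == ref_index or lev == 0
                   else np.asarray(c, dtype=float) * (ref / lev))
    return out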

Figure 2: Spectrograms of the burst and initial transition regions in /AdA/ (natural and BTF conditions).

In two further conditions, filtering was used to change the spectral content of perceptually-important regions in order to make them more discriminable. For plosives, the burst spectrum was examined to locate the greatest concentration of energy; the precise location varied depending on the vowel context but was around 300 Hz for labials, between 1.2 and 3 kHz for velars, and between 2.5 and 4 kHz for alveolars. The burst was then band-pass filtered to retain energy at and around this frequency, with the width of the pass-band set to four times the ERB (Glasberg and Moore, 1990) at this frequency. For the fricative stimuli, the frication region was filtered to enhance the contrast in its lower cut-off frequency, a cue to place of articulation in fricatives. The fricatives /f,v/ were high-pass and band-stop filtered respectively so that frication only appeared above 1 kHz; /s,z/ were filtered so that aperiodic energy only appeared above 4 kHz. No filtering was performed on nasal consonants.

In summary, the following test conditions were used: in condition B, only the occlusion/constriction region was amplified; in condition BT, both the occlusion/constriction and formant transition regions were amplified; in condition BF, the occlusion/constriction regions (for plosives and fricatives) were filtered before being amplified; in condition BFT, all types of manipulation were applied.

Class        B                BT                             BF                          BFT
Plosives     burst: +12 dB    burst: +12 dB; transitions+    burst: filtered, +12 dB     burst: filtered, +12 dB; aspiration: +6 dB; transitions+
Fricatives   friction: +6 dB  friction: +6 dB; transitions+  friction: filtered, +6 dB   friction: filtered, +6 dB; transitions+
Nasals       nasality: +6 dB  nasality: +6 dB; transitions+  nasality: +6 dB             nasality: +6 dB; transitions+

Table 1: Manipulations applied in the VCV Experiment.
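As an illustration of the burst filtering used in the BF and BFT conditions, the sketch below locates the spectral peak of a burst segment and applies a band-pass filter four ERBs wide, using the ERB formula of Glasberg and Moore (1990). The scipy-based Butterworth filter and its order are our assumptions, not the authors' processing chain.

import numpy as np
from scipy.signal import butter, filtfilt

def erb(f_hz):
    # Equivalent rectangular bandwidth (Hz) at centre frequency f_hz
    # (Glasberg & Moore, 1990).
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def filter_burst(burst, fs):
    # Band-pass a plosive burst around its spectral peak, with a pass-band
    # four ERBs wide (illustrative 4th-order Butterworth, zero-phase).
    burst = np.asarray(burst, dtype=float)
    spectrum = np.abs(np.fft.rfft(burst * np.hanning(len(burst))))
    freqs = np.fft.rfftfreq(len(burst), d=1.0 / fs)
    valid = freqs >= 100.0                          # ignore DC and very low frequencies
    peak = freqs[valid][np.argmax(spectrum[valid])]
    half_width = 2.0 * erb(peak)                    # total pass-band width = 4 ERB
    lo = max(peak - half_width, 1.0)
    hi = min(peak + half_width, fs / 2.0 - 1.0)
    b, a = butter(4, [lo, hi], btype="bandpass", fs=fs)
    return filtfilt(b, a, burst)

Because the ERB grows with centre frequency, this pass-band is automatically wider for alveolar bursts (2.5-4 kHz) than for labial bursts (around 300 Hz).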

2.2 Subjects

13 listeners aged between 20 and 35 with pure-tone thresholds below 20 dB HL were tested.

2.3 Test procedure

Listeners were tested individually in a sound-attenuating room, using a computer-based testing procedure. Stimuli were presented binaurally via AKG240DF headphones, and listeners responded by pointing at a consonant on the screen using a mouse. Listeners heard three blocks of each enhancement condition, and three blocks containing natural stimuli; each block contained five repetitions of the 36 stimuli. The presentation order was randomized across listeners. All listeners heard stimuli at 0 dB and -5 dB SNRs.

2.4 Results

Figure 3 shows the intelligibility scores for all conditions. ANOVAs revealed that the effect of test condition was significant at -5 dB SNR [F(4,48)=41.54; p<0.0001] and at 0 dB SNR [F(4,48)=16.04, p<0.0001]. At both SNRs, all enhanced conditions gave significantly higher intelligibility scores than the natural condition. Filtering combined with amplitude manipulations did produce a significant additional improvement at the worse SNR. The highest mean increase was 12% for -5 dB SNR and 6% for 0 dB SNR. The scores obtained for the BFT condition at -5 dB were nearly identical to those obtained for the unenhanced stimuli at 0 dB SNR. The effect of the enhancement therefore corresponds to an increase in signal-to-noise ratio of approximately 5 dB. The main effects of subject and vocalic context were also significant at both SNRs. Duncan's Multiple Range tests revealed that consonant perception in the /u/ context was significantly poorer than in the /i/ and /a/ contexts (see Figures 4 and 5).

Figure 3: Mean consonant intelligibility scores (% correct) for each test condition (Nat, B, BF, BT, BFT) at 0 dB and -5 dB SNR.

Information Transfer analyses were applied in order to determine how well consonants were recognised in terms of their voicing, place and manner of articulation, and how the enhancements applied affected the correct labelling in terms of these features. ANOVAs were then carried out on these voicing, place and manner scores. The pattern of errors obtained is consistent with what is known about consonant perception in noise. The voicing feature was robust and was well preserved in conditions of noise degradation. However, consonants were most often confused in terms of their place of articulation and, to a lesser extent, in terms of their manner of articulation. Enhancements led to a significant increase in the correct perception of place and manner of articulation. Results are presented here for the BFT condition, where all types of enhancements were applied. Results are presented separately for the consonants in the context of the vowels /A/, /i/ and /u/. It can be seen that recognition was poorest overall for consonants in the context of /u/ and that the same general patterns of errors are seen at both SNRs. In all three contexts, the greatest effect of the enhancements applied is in the correct recognition of the consonants' place of articulation (20% improvement in the context of /A/). Manner discrimination also improved slightly, which is likely to be due to a reduction in plosive/fricative confusions.
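A minimal sketch of such an information-transfer computation, from a stimulus-by-response confusion-count matrix for a feature (voicing, place or manner): the relative transmitted information is the mutual information between stimulus and response divided by the stimulus entropy. The implementation is illustrative rather than the analysis software actually used.

import numpy as np

def relative_info_transfer(counts):
    # counts: stimulus (rows) by response (columns) confusion-count matrix.
    p = np.asarray(counts, dtype=float) / np.sum(counts)
    px = p.sum(axis=1, keepdims=True)   # stimulus probabilities
    py = p.sum(axis=0, keepdims=True)   # response probabilities
    with np.errstate(divide="ignore", invalid="ignore"):
        mi = np.nansum(p * np.log2(p / (px * py)))   # transmitted information (bits)
        hx = -np.nansum(px * np.log2(px))            # stimulus entropy (bits)
    return mi / hx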

Figure 4: Correct identification (% information transfer) of the features of voicing, place and manner of articulation for consonants presented in three vocalic contexts at an SNR of -5 dB, for natural and BFT conditions.

Figure 5: Correct identification (% information transfer) of the features of voicing, place and manner of articulation for consonants presented in three vocalic contexts at an SNR of 0 dB, for natural and BFT conditions.

The data were analysed further to see which specific consonants benefited the most from the enhancement strategies applied. A bar chart showing the percentage of correct responses per consonant is presented in Figure 6. Overall, it can be seen that all but one consonant (/g/) showed either higher or similar scores in the enhanced condition relative to the control condition. As expected, the correct identification of place of articulation of voiced plosives was particularly difficult in noise. At SNR -5 dB, /d/ identification showed a dramatic improvement after enhancement.

Figure 6: Correct identification (%) of natural and enhanced (BFT condition) consonants at SNR -5 dB.

Confusion matrices for the data reveal which confusions occurring in the natural condition were disambiguated once enhancement strategies had been applied. At 0 dB SNR, a number of confusions are seen within the voiceless stop (/p/-/k/) and the voiced stop (/b/-/g/, /d/-/g/) classes. The nasals /m/ and /n/ are also frequently confused, as are the fricatives /f/ and /s/. Once enhancements were applied, only the nasal confusions and voiced-stop confusions remained. The enhancements therefore had the greatest effect in disambiguating voiceless stops and voiceless fricatives. A similar pattern was seen at SNR -5 dB.

3. Sentence material enhancement

Many studies evaluating the effect of enhancement have used nonsense VCV, CV or CVC syllables (e.g., Gordon-Salant, 1986). This is a necessary step, as it is only possible to analyse the effect of enhancement in stimuli in which the perceptual contribution of contextual information has been eliminated. However, the greater degree of coarticulation and the greater variety in vocalic context seen in sentence-level material may strongly affect the perceptual effect of enhancements. It is therefore important to test enhancement strategies with more natural sentence-length material whilst still controlling the contribution of contextual information.

General Method

The second set of experiments applied similar enhancement techniques to natural sentence materials. 50 semantically unpredictable sentences (SUS) (Benoit, Grice and Hazan, 1996), read by the same male speaker as in the VCV experiment, were recorded and digitized at 16 kHz with 16-bit amplitude quantization. SUS material was used in order to limit the amount of contextual information present; sentences were syntactically correct but contained words with no semantic relationship. They were constructed using five different grammatical structures, and each sentence contained four key words. Examples of SUS sentences are presented in Appendix I. A greater range of consonants, including affricates and approximants, was manipulated than in the VCV experiment; the consonants annotated were /b,d,g,p,t,k,f,s,t,h,v,z,n,n,ts,dz,l,r,w/. Sentences were annotated to identify the consonant constriction/occlusion and transition regions as described in the VCV experiment.

Sentence Experiment 1

a. Method

Following informal listening experiments with the SUS material enhanced in the same way as in VCV Experiment 2, some small adjustments were made to the enhancement strategies. Plosive and affricate bursts were filtered, but it was necessary to use wider pass-bands given the greater variation in burst centre frequency in this sentence-length material. The degree of amplification of the burst was reduced to 9 dB. Amplification was also applied to the aspiration segments in the voiceless stops. No filtering was applied to fricatives due to the increased variability in cut-off frequency in these phones in sentence material. In the formant transition regions, the five final and initial voicing cycles before and after the consonant occlusion/constriction region were boosted by 3 dB. After being manipulated, stimuli were combined with speech-shaped noise at 0 dB and 5 dB SNR. In addition, in order to check the effect of these small changes in enhancement levels, the same manipulations that were used in the SUS material were also applied to the VCV material described above.
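To make the annotation-driven procedure concrete, a sketch of how class-specific gains could be applied to labelled regions of a sentence is given below, with per-class settings in the spirit of Table 2 (shown below). The annotation format (labelled sample ranges) and the settings dictionary are our assumptions, and the burst-filtering step is omitted for brevity.

import numpy as np

# Hypothetical per-class settings (dB), mirroring the kind of values in Table 2.
SETTINGS = {
    "plosive":     {"burst": 9.0, "aspiration": 9.0, "transition": 3.0},
    "fricative":   {"friction": 6.0, "transition": 3.0},
    "affricate":   {"burst": 9.0, "friction": 6.0, "transition": 3.0},
    "approximant": {"constriction": 3.0, "transition": 3.0},
    "nasal":       {"nasality": 6.0, "transition": 3.0},
}

def enhance_sentence(signal, annotations, settings=SETTINGS):
    # Apply class-specific gains to annotated regions of a sentence.
    # `annotations` is a list of (start_sample, end_sample, consonant_class,
    # region_type) tuples produced by manual annotation.
    out = np.asarray(signal, dtype=float).copy()
    for start, end, cls, region in annotations:
        gain_db = settings.get(cls, {}).get(region)
        if gain_db is not None:
            out[start:end] *= 10.0 ** (gain_db / 20.0)
    return out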

Class          Manipulations
Plosives       burst: +9 dB, filtered; aspiration: +9 dB; transitions+
Fricatives     friction: +6 dB; transitions+
Affricates     burst: +9 dB, filtered; friction: +6 dB; transitions+
Approximants   constriction: +3 dB; transitions+
Nasals         nasality: +6 dB; transitions+

Table 2: Manipulations applied in Sentence Experiment 1.

Figure 6: Time-amplitude waveforms of the SUS sentence "The bright bit raised the weight" in natural and enhanced conditions.

b. Subjects

Separate groups of listeners were used for each SNR condition in order to avoid word-learning effects. All were aged between 20 and 35 with pure-tone thresholds below 20 dB HL. 12 listeners were tested in the 0 dB SNR condition and 13 in the 5 dB SNR condition.

c. Test procedure

Listeners were tested individually in a sound-attenuating room, using computer-controlled sentence presentation. Sentences were presented binaurally via AKG240DF headphones, and listeners responded by writing down the sentence heard on a response sheet. Each listener heard 25 SUS sentences in the natural condition and 25 in the enhanced condition. Sentence order within a block was randomized; which half of the sentence list a block was drawn from, and whether a subject heard the enhanced or natural condition first, were counterbalanced across subjects.

d. Results

Sentences were scored in terms of the number of key words correctly transcribed. Intelligibility scores were then obtained by calculating the percentage of key words correctly transcribed in each 25-sentence block (a total of 100 key words). Figure 7 shows the intelligibility scores for all conditions. At 5 dB SNR, the effect of enhancement was significant [F(1,8)=6.08, p=0.039]. The order in which conditions were presented and the sentence blocks used did not significantly affect test scores. At 0 dB SNR, the enhanced condition did not produce significantly higher scores than the natural condition. Results obtained for the VCV tests replicated those obtained above: at 0 dB SNR, mean intelligibility scores showed a significant increase from 76% to 83% (paired-difference t-test, p<0.001), as compared to an increase from 77% to 83% in VCV Experiment 2.

Figure 7: Effect of enhancement on mean intelligibility scores (% correct) at 0 and 5 dB SNR for natural and enhanced conditions, Sentence Experiment 1.

Little benefit of cue-enhancement on intelligibility was obtained for this sentence material. This experiment varied from the previous one in three important respects. First, the type of material itself was radically different: the sentence-length material imposed a greater cognitive load on the listeners, especially as the sentences used were semantically unpredictable. Second, a wider range of consonant classes with a greater variety of vocalic contexts was manipulated compared to previous experiments. Third, a different set of subjects was tested. The replication of previously-obtained VCV results with a different listener group makes it unlikely that listener effects are the cause of this difference. A detailed examination of the sentence results did suggest that some of the enhancements made to affricates and approximants had led to an increased number of errors for words containing those sounds. In order to test whether this was the cause of the poorer results compared with those obtained for the VCV material, a further experiment was set up using the same SUS material, but with manipulations made only to plosives, fricatives and nasals, as in the VCV experiment.

Sentence Experiment 2

a. Method

Further adjustments were made to the enhancement techniques used in Sentence Experiment 1. First, bursts were no longer filtered, as it was found that the filter bandwidths could not be reliably set due to the greater variability in burst centre frequency in continuous speech.

The degree of amplification of the burst and aspiration was also changed relative to Sentence Experiment 1. Second, a change was made in the way in which the initial and final vocalic cycles were amplified, in order to avoid discontinuities in the speech signal: vocalic cycles were amplified by between 4 and 2 dB, with the amplification graded so that the cycles nearest the occlusion were given the greatest amplification. After being manipulated, stimuli were combined with speech-shaped noise at 0 dB SNR.

Class          Enhancement
Plosives       burst: +12 dB; aspiration: +6 dB; transitions+ 4-2 dB
Fricatives     friction: +6 dB; transitions+ 4-2 dB
Affricates     not manipulated
Approximants   not manipulated
Nasals         nasality: +6 dB; transitions+ 4-2 dB

Table 3: Manipulations applied in Sentence Experiment 2.
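A sketch of this graded transition amplification: the gain falls from 4 dB for the pitch cycle nearest the occlusion to 2 dB for the cycle furthest from it, so that no abrupt level step is introduced. The linear spacing of the gains in dB is our assumption about how the grading was realised, not the authors' code.

import numpy as np

def graded_transition_boost(cycles, max_db=4.0, min_db=2.0):
    # Amplify the pitch cycles adjacent to a consonant occlusion with a gain
    # that decreases from max_db (cycle nearest the occlusion) to min_db
    # (cycle furthest from it). `cycles` is a list of 1-D arrays ordered from
    # the occlusion outwards into the vowel.
    gains_db = np.linspace(max_db, min_db, num=len(cycles))
    return [np.asarray(c, dtype=float) * 10.0 ** (g / 20.0)
            for c, g in zip(cycles, gains_db)]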

b. Subjects

12 listeners were tested. All were aged between 20 and 30 with pure-tone thresholds below 20 dB HL.

c. Test procedure

As in Sentence Experiment 1.

d. Results

Mean scores are presented in Figure 8. The effect of enhancement was significant [F(1,8)=19.66, p=0.002]. The effect of order of presentation was not significant, but there was a significant interaction between order of presentation and enhancement [F(1,1)=21.45; p=0.002]: listeners who heard the enhanced sentences second showed a greater increase in intelligibility scores, but this effect did not apply when the presentation order was reversed. Learning effects with SUS material have also been reported in other studies (e.g. Grice and Hazan, 1989).

Figure 8: Intelligibility scores for Sentence Experiment 2 (effect of enhancement on mean intelligibility scores, % correct, at 0 dB SNR, by presentation order: Nat-Enh, Enh-Nat, Combined).

Sentence material discussion

The extension of enhancement techniques from highly-controlled VCV material to sentence-level material did lead to a need for refinements of the enhancement strategies. This was due to the fact that consonants appeared in a much wider variety of vocalic contexts and were also inherently more variable in their spectral and temporal characteristics; as a result, the types and levels of enhancement which were appropriate in the VCV experiments sometimes led to abrupt changes in amplitude and other discontinuities in the sentence material which had a deleterious effect on intelligibility. Results obtained in Experiment 2, however, showed that more careful adjustments to the degree of amplification of certain constriction/occlusion and transition portions did lead to significant increases in sentence intelligibility as a result of cue-enhancement.

4. Conclusion

The work reported here has shown the benefit of speech pattern enhancements in improving perception by normally-hearing listeners in poor listening conditions. Despite the relatively gross manipulations made to the stimuli in this study, a significant improvement in intelligibility was achieved both for VCV and for sentence material. These enhancement techniques are currently being implemented within a diphone-based text-to-speech system to allow testing of the enhancement technique on the unlimited range of utterances that can be generated in this way.

5. Acknowledgements

This work was funded by an EPSRC project grant (GR/J10426).

6. References

Benoit, C., Grice, M. and Hazan, V. (1996) The SUS test: a method for the assessment of text-to-speech synthesis intelligibility using Semantically Unpredictable Sentences. Speech Communication, 18, in press.

Cheng, Y.M., O'Shaughnessy, D. & Kabal, P. (1995) Speech enhancement using a statistically derived filter mapping. Proceedings of the International Conference on Spoken Language Processing, Banff, October 1992, vol. 1.

Glasberg, B.R. and Moore, B.C.J. (1990) Derivation of auditory filter shapes from notched-noise data. Hearing Research, 47.

Gordon-Salant, S. (1986) Recognition of natural and time/intensity altered CVs by young and elderly subjects with normal hearing. Journal of the Acoustical Society of America, 80.

Jamieson, D.G. (1995) Techniques for training difficult non-native speech contrasts. Proceedings of the XIIIth International Congress of Phonetic Sciences, 4.

Nagarajan, S.S., Wang, X., Merzenich, M.M., Schreiner, C.E., Jenkins, W.M., Johnston, P.A., Miller, S.L., Byma, G. & Tallal, P. (1995) Speech modification algorithms for training language-learning impaired children. Proceedings of the Society for Neuroscience Conference.

Niederjohn, R.J. & Grotelueschen, J.H. (1976) The enhancement of speech intelligibility in high noise levels by high-pass filtering followed by rapid amplitude compression. IEEE Trans. ASSP-24, p. 277.

Stevens, K.N. (1985) Evidence for the role of acoustic boundaries in the perception of speech sounds. In V. Fromkin (ed.) Phonetic Linguistics: Essays in Honor of Peter Ladefoged. Academic Press, Orlando.

Tallal, P., Miller, S.L., Bedi, G., Byma, G., Wang, X., Nagarajan, S., Schreiner, C., Jenkins, W. & Merzenich, M. (1996) Language comprehension in language-learning impaired children improved with acoustically modified speech. Science, 271.


More information

Purpose of internal assessment. Guidance and authenticity. Internal assessment. Assessment

Purpose of internal assessment. Guidance and authenticity. Internal assessment. Assessment Assessment Internal assessment Purpose of internal assessment Internal assessment is an integral part of the course and is compulsory for both SL and HL students. It enables students to demonstrate the

More information

The Acquisition of English Intonation by Native Greek Speakers

The Acquisition of English Intonation by Native Greek Speakers The Acquisition of English Intonation by Native Greek Speakers Evia Kainada and Angelos Lengeris Technological Educational Institute of Patras, Aristotle University of Thessaloniki ekainada@teipat.gr,

More information

Perceived speech rate: the effects of. articulation rate and speaking style in spontaneous speech. Jacques Koreman. Saarland University

Perceived speech rate: the effects of. articulation rate and speaking style in spontaneous speech. Jacques Koreman. Saarland University 1 Perceived speech rate: the effects of articulation rate and speaking style in spontaneous speech Jacques Koreman Saarland University Institute of Phonetics P.O. Box 151150 D-66041 Saarbrücken Germany

More information

INPE São José dos Campos

INPE São José dos Campos INPE-5479 PRE/1778 MONLINEAR ASPECTS OF DATA INTEGRATION FOR LAND COVER CLASSIFICATION IN A NEDRAL NETWORK ENVIRONNENT Maria Suelena S. Barros Valter Rodrigues INPE São José dos Campos 1993 SECRETARIA

More information

Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction

Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction INTERSPEECH 2015 Robust Speech Recognition using DNN-HMM Acoustic Model Combining Noise-aware training with Spectral Subtraction Akihiro Abe, Kazumasa Yamamoto, Seiichi Nakagawa Department of Computer

More information

ADVANCES IN DEEP NEURAL NETWORK APPROACHES TO SPEAKER RECOGNITION

ADVANCES IN DEEP NEURAL NETWORK APPROACHES TO SPEAKER RECOGNITION ADVANCES IN DEEP NEURAL NETWORK APPROACHES TO SPEAKER RECOGNITION Mitchell McLaren 1, Yun Lei 1, Luciana Ferrer 2 1 Speech Technology and Research Laboratory, SRI International, California, USA 2 Departamento

More information

Learning Methods in Multilingual Speech Recognition

Learning Methods in Multilingual Speech Recognition Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex

More information

Presentation Format Effects in a Levels-of-Processing Task

Presentation Format Effects in a Levels-of-Processing Task P.W. Foos ExperimentalP & P. Goolkasian: sychology 2008 Presentation Hogrefe 2008; Vol. & Huber Format 55(4):215 227 Publishers Effects Presentation Format Effects in a Levels-of-Processing Task Paul W.

More information

Phonological encoding in speech production

Phonological encoding in speech production Phonological encoding in speech production Niels O. Schiller Department of Cognitive Neuroscience, Maastricht University, The Netherlands Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands

More information

Malicious User Suppression for Cooperative Spectrum Sensing in Cognitive Radio Networks using Dixon s Outlier Detection Method

Malicious User Suppression for Cooperative Spectrum Sensing in Cognitive Radio Networks using Dixon s Outlier Detection Method Malicious User Suppression for Cooperative Spectrum Sensing in Cognitive Radio Networks using Dixon s Outlier Detection Method Sanket S. Kalamkar and Adrish Banerjee Department of Electrical Engineering

More information