Institute of Phonetic Sciences University of Amsterdam First semester 2007
The Speech Organs
Speech
Speech production organs determine the acoustic characteristics of speech sounds.
Motivation: find explanations for acoustical attributes of sounds
- Relation between vocal tract shape and formants
- Why female formants are higher than male formants
- Characteristics of nasal and oral sounds
- Fricative sounds
- Spectral slope of vowels
Production of Speech
Main processes in the production of speech:
- Sound source (glottis and/or turbulent airstream)
- Shape of the vocal tract
- Radiation from the mouth
- Energy losses
Model: these processes are independent.
Source-Filter Theory
The source-filter theory models the production apparatus as two independent units:
- The source (the glottal source or noise generated at a constriction)
- The filter (resonances in the cavities of the vocal tract)
A speech sound is the result of a source signal being filtered.
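The statement that a speech sound is a filtered source signal can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and does not come from the slides: the source is a bare impulse train, the filter is a single two-pole digital resonator, and the values (100 Hz pitch, one resonance at 500 Hz with 80 Hz bandwidth, 10 kHz sampling) are arbitrary.

```python
import math

def impulse_train(f0, duration, fs):
    """Crude source (assumption): one unit impulse per pitch period."""
    n_samples = int(round(duration * fs))
    period = int(round(fs / f0))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator(signal, freq, bandwidth, fs):
    """Filter: two-pole resonator with a single resonance at freq (Hz)."""
    r = math.exp(-math.pi * bandwidth / fs)          # pole radius from bandwidth
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq / fs)
    a2 = -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2                    # y[n] = x[n] + a1*y[n-1] + a2*y[n-2]
        out.append(y)
        y2, y1 = y1, y
    return out

def level_at(signal, freq, fs):
    """Magnitude of the signal's Fourier sum evaluated at one frequency."""
    re = sum(s * math.cos(2.0 * math.pi * freq * n / fs) for n, s in enumerate(signal))
    im = sum(s * math.sin(2.0 * math.pi * freq * n / fs) for n, s in enumerate(signal))
    return math.hypot(re, im)

fs = 10000
source = impulse_train(f0=100, duration=0.1, fs=fs)
speech = resonator(source, freq=500, bandwidth=80, fs=fs)

# The source has equally strong harmonics; the filter imposes the resonance.
print(level_at(source, 500, fs), level_at(source, 1500, fs))
print(level_at(speech, 500, fs), level_at(speech, 1500, fs))
```

In this sketch the source alone has the same level at 500 Hz and 1500 Hz, while the filtered signal is much stronger at the resonance, which is the separation of roles the theory assumes.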
Excitation of the vocal tract
Figure from (Rosenberg, 1971).
- By volume velocity at the glottis
- Is pulse-like (open and closed phase)
  - Primarily because of the rapid closure of the glottis
- Slope at closure increases with increasing vocal effort
- Changes in pitch or intensity change t_open; the waveform can become more sinusoidal
- Changes in pitch or intensity change both t_open and the slope at closure
- Damping of formants is higher during the open phase
[Figure: glottal flow and glottal flow derivative against normalized time, with the open phase and the closed phase marked]
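A glottal flow pulse with an open and a closed phase can be sketched in Python as follows. The trigonometric shape used here is one that is commonly associated with Rosenberg (1971); whether it is the exact curve in the figure is an assumption, and the rise and fall fractions (0.4 and 0.16 of the period) are illustrative values only.

```python
import math

def glottal_flow(t, t_rise, t_fall):
    """Normalized glottal flow at time t within one period (peak value 1)."""
    if 0.0 <= t <= t_rise:                          # opening part of the open phase
        return 0.5 * (1.0 - math.cos(math.pi * t / t_rise))
    if t_rise < t <= t_rise + t_fall:               # closing part of the open phase
        return math.cos(math.pi * (t - t_rise) / (2.0 * t_fall))
    return 0.0                                      # closed phase

N = 1000                                            # samples per normalized period
T_RISE, T_FALL = 0.4, 0.16                          # illustrative assumptions
flow = [glottal_flow(n / N, T_RISE, T_FALL) for n in range(N + 1)]

# Finite-difference approximation of the flow derivative.
derivative = [(flow[n + 1] - flow[n]) * N for n in range(N)]

steepest = min(derivative)
t_steepest = derivative.index(steepest) / N
print(f"steepest falling slope {steepest:.2f} at t = {t_steepest:.3f}")
print(f"steepest rising slope  {max(derivative):.2f}")
```

With these values the steepest slope of the flow occurs right at closure, in line with the slide's point that the rapid closure of the glottis dominates the excitation.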
Creating a Source from Pitch Targets
    Create PitchTier... source 0 0.15
    Add point... 0 150
    Add point... 0.15 100
    To PointProcess
    To Sound (phonation)... 44100 0.9 0.05 0.7 0.03 3 4

    x1 = 1000
    y1 = 40
    x2 = 8000
    y2 = 40-3*12
    b = 1/(x2-x1)*ln(y1/y2)
    Draw function... x1 x2 1000 40*exp(-b*(x-x1))
[Figures: source waveform against time (s), 0 to 0.15 s; spectrum as sound pressure level (dB/Hz) against frequency (Hz), 0 to 8000 Hz, with the drawn slope line]
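The arithmetic behind the drawn slope line can be checked with a short Python sketch. It only re-evaluates the numbers from the script above; reading `3*12` as a drop of 12 dB per octave over the three octaves from 1000 Hz to 8000 Hz is an interpretation, not something the slide states.

```python
import math

# End points of the line, as in the script above.
x1, y1 = 1000.0, 40.0
x2, y2 = 8000.0, 40.0 - 3 * 12

# Decay constant chosen so that the curve passes through both end points.
b = 1.0 / (x2 - x1) * math.log(y1 / y2)

def slope_line(x):
    """Value of the drawn curve at frequency x (Hz)."""
    return y1 * math.exp(-b * (x - x1))

octaves = math.log2(x2 / x1)    # number of octaves between x1 and x2

print(octaves, y1 - y2)         # 3 octaves, 36 dB total drop
print(slope_line(x1), slope_line(x2))
```

The curve is an exponential in linear frequency, so it is pinned to the two end points by construction; values in between are whatever the exponential gives.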
Creating a Noise Source
For fricatives we need a noise source:
    Create Sound from formula... noise Mono 0 0.015 22050
    ... randomGauss(0,0.2)
[Figures: noise waveform against time (s), 0 to 0.015 s; spectrum as sound pressure level (dB/Hz) against frequency (Hz), 0 to 8000 Hz]
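For readers without Praat at hand, a rough Python counterpart of this noise source might look as follows. It mirrors the duration, sampling frequency and standard deviation of the command above; the function name, the fixed seed and the rounding of the sample count are assumptions of this sketch and say nothing about how Praat handles these details.

```python
import random

def gaussian_noise(duration, fs, sigma=0.2, seed=0):
    """White Gaussian noise samples with standard deviation sigma."""
    rng = random.Random(seed)              # fixed seed for repeatable output
    n_samples = round(duration * fs)       # assumption: round to nearest sample
    return [rng.gauss(0.0, sigma) for _ in range(n_samples)]

noise = gaussian_noise(duration=0.015, fs=22050)

mean = sum(noise) / len(noise)
std = (sum((s - mean) ** 2 for s in noise) / len(noise)) ** 0.5
print(len(noise), round(mean, 3), round(std, 3))
```

Because every sample is drawn independently, the long-term spectrum of such a signal is flat on average, which is what makes it usable as a fricative source before any filtering.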
Tube Models for Vowels
- Curvature of the tract can be neglected!
- Only the cross-sectional area matters
- Diameters are equal over large lengths of the vocal tract
- Lossless
- Number of segments: 1...
The 1-tube
Closed-open tube: the closed end is an anti-node, the open end is a node.
For length $l$: $l = (2n-1)\,\lambda/4$
We use $\lambda f = c$ and obtain $F_n = \frac{(2n-1)\,c}{4l}$
Shorter length, higher formants!
$c = 340$ m/s
- Male: $l = 0.17$ m, $F_n$ = 500, 1500, 2500, 3500, ... Hz
- Female: $l = 0.145$ m, $F_n$ = 586, 1759, 2931, 4103, ... Hz
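The formant values above follow directly from the quarter-wavelength formula, as this small Python sketch shows. The function name and its default arguments are choices made for illustration; the lengths and the speed of sound are the ones given on the slide.

```python
def tube_formants(length_m, n_formants=4, c=340.0):
    """Resonances F_n = (2n - 1) * c / (4 * l) of a closed-open tube, in Hz."""
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, n_formants + 1)]

male = tube_formants(0.17)
female = tube_formants(0.145)

print([round(f) for f in male])
print([round(f) for f in female])
```

Since every $F_n$ is inversely proportional to $l$, shortening the tube from 0.17 m to 0.145 m scales all formants up by the same factor of about 1.17.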
Deductions from a Straight Tube
- A constriction at a node/anti-node decreases/increases the resonance frequency.
- At the lip end there is always a node, so rounding lowers all formants.
- A velar constriction is at a node of $F_2$: lowering, as for [u].
The 2-tube
- Equal or unequal section lengths
- Not all vowels can be simulated, only some peripheral ones
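A two-tube model can be explored numerically with the sketch below. It assumes the standard lossless result for a back tube (length $l_1$, area $A_1$) closed at the glottis joined to a front tube (length $l_2$, area $A_2$) open at the lips, namely $\tan(k l_1)\tan(k l_2) = A_2/A_1$ with $k = 2\pi f/c$. This condition is not given on the slide, and the lengths and areas used are illustrative, not measured vocal tract data.

```python
import math

def two_tube_formants(l1, a1, l2, a2, f_max=4000.0, c=340.0, step=1.0):
    """Resonances (Hz) below f_max of a lossless closed-open two-tube model."""
    def g(f):
        # g(f) = 0 is tan(k*l1) * tan(k*l2) = a2 / a1 written without poles.
        k = 2.0 * math.pi * f / c
        return (a2 * math.cos(k * l1) * math.cos(k * l2)
                - a1 * math.sin(k * l1) * math.sin(k * l2))

    roots = []
    f_lo = step
    while f_lo < f_max:
        f_hi = f_lo + step
        if (g(f_lo) < 0.0) != (g(f_hi) < 0.0):      # sign change: bracketed root
            lo, hi = f_lo, f_hi
            for _ in range(50):                     # bisection refinement
                mid = 0.5 * (lo + hi)
                if (g(lo) < 0.0) != (g(mid) < 0.0):
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
        f_lo = f_hi
    return roots

uniform = two_tube_formants(0.085, 1.0, 0.085, 1.0)       # equal areas
narrow_back = two_tube_formants(0.085, 1.0, 0.085, 7.0)   # narrow back, wide front
wide_back = two_tube_formants(0.085, 7.0, 0.085, 1.0)     # wide back, narrow front

print([round(f) for f in uniform])
print(round(narrow_back[0]), round(wide_back[0]))
```

With equal areas the model reduces to the single tube of total length 0.17 m. Under the stated assumption, narrowing the back section raises the first resonance and narrowing the front section lowers it.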
The 4-tube of Fant (1960)
- The constriction area: 1 segment (fixed length: 0.05 m)
- Before and after the constriction: 2 segments
- The lips: 1 segment (fixed length: 0.01 m)
Three-parameter model of Fant (1960)¹:
1. distance of the constriction from the glottis
2. constriction area
3. lip opening
Effect of lip-rounding was a lowering of $F_2$: [i] vs [y]
¹ G. Fant (1960), Acoustic Theory of Speech Production, Mouton: The Hague.
Fricatives
- The relation between vocal tract shape and speech waveform is obscure
- Noise source location varies in the vocal tract
- Limited resources of X-ray data for model validation
- Fricatives depend less on tract shape than vowels
Nasals
- Opening of an extra cavity: difficult to model
- Nasal formants: fixed tube, somewhat longer than the vocal tract
- Uvular + post-velar: nasal formants at 300/400 + $k \cdot 800$ Hz
- Palatal to labial: anti-formants at $c / (4\,l_{\mathrm{mouth}})$
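The two rules of thumb for nasals can be turned into numbers with a short Python sketch. The function names are hypothetical, starting $k$ at 0 is an assumption about how the series is meant to be read, and the mouth cavity lengths are made-up example values rather than measurements.

```python
def nasal_formants(base_hz, count=4, spacing_hz=800.0):
    """Nasal formants at base + k * spacing for k = 0, 1, 2, ... (assumed start)."""
    return [base_hz + k * spacing_hz for k in range(count)]

def mouth_antiformant(l_mouth_m, c=340.0):
    """First anti-formant c / (4 * l_mouth) of the closed mouth cavity, in Hz."""
    return c / (4.0 * l_mouth_m)

print(nasal_formants(300.0))
print(nasal_formants(400.0))

# Hypothetical mouth cavity lengths (m): a longer cavity for a more front closure.
for l_mouth in (0.085, 0.05):
    print(l_mouth, round(mouth_antiformant(l_mouth)))
```

The anti-formant rule has the same quarter-wavelength form as the closed-open tube, so a shorter mouth cavity gives a higher anti-formant.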
[Rosenberg, 1971] A.E. Rosenberg (1971), Effect of glottal pulse shape on the quality of natural vowels, J. Acoust. Soc. Am. 49, 583-590.