Adding Japanese language synthesis support to the espeak system


Adding Japanese language synthesis support to the espeak system

Richard Pronk

Bachelor thesis
Credits: 18 EC
Bachelor Opleiding Kunstmatige Intelligentie
University of Amsterdam
Faculty of Science
Science Park, Amsterdam

Supervisor: dr. D.J.M. (David) Weenink
Institute of Phonetic Sciences
Faculty of Humanities
University of Amsterdam
Spuistraat, Amsterdam

June 28th, 2013

Abstract

In this paper we describe an addition to the espeak system that is capable of pronouncing the Japanese language. This implementation is used for the automatic segmentation of Japanese speech from Japanese text. The speech synthesiser that we use is part of the Praat speech analysis program and is based on the espeak text-to-speech engine. Because the Japanese writing system is very complex, i.e. it mixes several alphabets with logograms (kanji) and it does not use explicit word boundaries, we impose some restrictions on the input format. First, we require the user to make explicit where words end; second, we do not yet support logograms (kanji), because implementing this feature requires a pronunciation database. We provide hints on how these limitations can be overcome.

Contents

1 Introduction
  1.1 The Japanese writing systems
    1.1.1 Rules of the Hepburn romanization system
  1.2 The espeak system
  1.3 Praat
2 Literature review
3 Theoretical foundation
  3.1 Phonetic transcription
  3.2 Place of articulation
  3.3 Nasality
  3.4 Voicing
  3.5 Phonetic overview of the Japanese language
    3.5.1 Vowels
    3.5.2 Voiced and semi-voiced sounds
    3.5.3 Devoicing
    3.5.4 Particles
    3.5.5 Palatalised sounds
    3.5.6 Moraic nasal n
    3.5.7 Gemination
4 Implementation within espeak
  4.1 Word segmentation
  4.2 Pronunciation rules
    4.2.1 Normalisation to a single writing system
    4.2.2 Text to phoneme translation
    4.2.3 Phoneme definitions
  4.3 Input using Rōmaji
  4.4 Latin characters for abbreviations
  4.5 Kanji
5 Results and Evaluation
6 Conclusion
7 Future work
  7.1 espeak functionality
A How to use this initial implementation
B IPA for Japanese

1 Introduction

We describe an initial implementation of Japanese speech synthesis support for the espeak[3] system. This initial implementation will enable future research on Japanese phonetics to be carried out more easily. The implementation described in this paper is aimed at providing assistance during the segmentation of Japanese speech within the speech analysis system Praat[2]. The main focus of this paper is therefore the correct pronunciation of Japanese characters given the rules of the language, rather than perfectly natural-sounding Japanese. Furthermore, this paper provides an overview of Japanese phonetics and of the issues encountered while implementing the Japanese language within the espeak system.

1.1 The Japanese writing systems

The Japanese language uses three writing systems: hiragana ( ひらがな ), katakana ( カタカナ ) and kanji ( 漢字 ); to complicate things even further, sometimes even Latin characters are used in Japanese text. The hiragana and katakana writing systems make up the alphabet, covering all the possible sounds in the language. These writing systems have corresponding character sets, where each character represents one mora (a mora being one sound unit in the Japanese language).

         a        i        u        e        o
    -    あ (a)   い (i)   う (u)   え (e)   お (o)
    k    か (ka)  き (ki)  く (ku)  け (ke)  こ (ko)

         a        i        u        e        o
    -    ア (a)   イ (i)   ウ (u)   エ (e)   オ (o)
    k    カ (ka)  キ (ki)  ク (ku)  ケ (ke)  コ (ko)

Table 1: Example of hiragana chart (top) and katakana chart (bottom)

As seen in Table 1, the hiragana and katakana writing systems both have characters for the same sounds. In the Japanese language these writing systems are used in combination with each other: one sentence can consist of hiragana, katakana, kanji and even Latin characters. Kanji are Chinese characters which are widely used within Japanese texts; when a kanji character is not available for a word, hiragana is often used instead. Hiragana can also be combined with kanji characters for declensions and conjugations, and katakana is used to transcribe foreign-language words or to write loan words. Because all possible sounds in the Japanese language are covered by the hiragana and katakana writing systems, the pronunciation of kanji (i.e. the Chinese characters) can be written in terms of those characters (e.g. 漢字 かんじ ). The focus of the current implementation is therefore on being able to pronounce these Japanese characters, with the exception of kanji, due to its complexity and the lexical dependency of its pronunciation (section 4.5). Another supported input method, however, is rōmaji ( ローマ字 ), which allows for Japanese input using purely Latin characters. In this paper the modified Hepburn romanization system is used for the transcription from hiragana and katakana to rōmaji; it is also the romanization system supported by the provided implementation. There are more romanization systems available for the Japanese language, but besides being frequently used, the modified Hepburn system is also the one best adjusted to English pronunciation, which makes it the most suitable for the espeak system. The full hiragana and katakana charts with rōmaji transcription using the modified Hepburn romanization system can be seen in the links provided as footnotes.

1.1.1 Rules of the Hepburn romanization system

In order to properly convert Japanese characters to rōmaji, a number of rules must be adhered to. These rules have been compiled in the Hepburn romanization system, of which there are many versions; this section discusses the rules that are relevant to this paper.

The first rule is that long vowels must be indicated with a macron or circumflex, since /oo/ has a different pronunciation than /ō/ (see section 3.5.1). The exception is the vowel i, since /ii/ is always pronounced as a long vowel; the system implemented for this paper nevertheless also allows /ī/ as input, as this spelling is frequently used for loan words. As seen in Table 2, the vowel combination /ou/ is a special case:

    vowel combination   as single long vowel   as two separate vowels
    aa                  ā                      aa
    ii                  ī or ii                (not possible)
    uu                  ū                      uu
    ee                  ē                      ee
    oo                  ō                      oo
    ou                  ō                      ou

Table 2: Double vowel representation in the modified Hepburn romanization system

ou can be pronounced as a single long /ō/ or as two separate vowels. For example, 東京 ( とうきょう ) would be transcribed from hiragana as /toukyou/; the ou combination here, however, is not pronounced as two separate vowels but as a single long ō. Therefore the transcription of 東京 ( とうきょう ) in the modified Hepburn romanization system should be /tōkyō/.

The second rule within the modified Hepburn romanization system is that particles are written as pronounced. For example, the subject marker は is written as /wa/ (as pronounced) instead of /ha/, which would be the standard reading when the character does not have a grammatical function. The same goes for the particle へ, which is pronounced as /e/ instead of /he/, and the particle を, which is pronounced as /o/ instead of /wo/.

The third rule is that the moraic nasal ん is written as n, but as n' before vowels and y, in order to disambiguate it from the sounds of the n-row (the n-row being the sounds /na/, /ni/, /nu/, /ne/ and /no/). The sequence /no/, for instance, can be read in two different ways, as one mora or as two separate moras: /no/ can be read as の (no) or as ん (n) followed by お (o).

         a        i        u        e        o
    n    な (na)  に (ni)  ぬ (nu)  ね (ne)  の (no)      ん (n)

Table 3: The ambiguity of the syllable n

A good example of when this distinction is required is the pair 蟹 ( かに ), meaning crab, and 簡易 ( かんい ), meaning simplicity. Without this distinction both words are written as /kani/; this would be the correct pronunciation for crab, but simplicity should be written as /kan'i/ in order to be pronounced correctly.

The fourth and final relevant rule concerns double consonants (see section 3.5.7), which can be written as expected, with one exception: ch, which becomes tch. For example, 抹茶 ( まっちゃ ) becomes matcha instead of maccha.

So, to summarize, the relevant rules of the Hepburn romanization system are:

1. Long vowels are represented with a macron or circumflex
2. Particles are written as pronounced
3. The moraic nasal ん is written as n' before vowels and y, and as n otherwise
4. Gemination with ch becomes tch
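To make rules 3 and 4 concrete, here is a minimal Python sketch of a kana-to-rōmaji converter applying them. It is illustrative only (the actual implementation uses espeak's rule files, described in section 4), and the tiny KANA table, the Q/N placeholders and the helper name are hypothetical:

    # Minimal sketch of Hepburn rules 3 and 4; not the espeak rule format.
    # The table only covers the demo words below.
    KANA = {"か": "ka", "に": "ni", "い": "i", "け": "ke", "ま": "ma",
            "ちゃ": "cha", "っ": "Q", "ん": "N"}   # Q = sokuon, N = moraic nasal
    VOWELS = set("aiueo")

    def hepburn(text: str) -> str:
        # Longest match first, as in espeak's rule matching (section 4.2.2).
        syllables, i = [], 0
        while i < len(text):
            chunk = text[i:i + 2] if text[i:i + 2] in KANA else text[i]
            syllables.append(KANA[chunk])
            i += len(chunk)
        out = []
        for j, syl in enumerate(syllables):
            nxt = syllables[j + 1] if j + 1 < len(syllables) else ""
            if syl == "Q":    # rule 4: sokuon copies the next consonant; ch -> tch
                out.append("t" if nxt.startswith("ch") else nxt[:1])
            elif syl == "N":  # rule 3: n' before vowels and y, plain n otherwise
                out.append("n'" if (nxt[:1] in VOWELS or nxt.startswith("y")) else "n")
            else:
                out.append(syl)
        return "".join(out)

    print(hepburn("かに"))      # kani   (crab)
    print(hepburn("かんい"))    # kan'i  (simplicity)
    print(hepburn("まっちゃ"))  # matcha (not maccha)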

1.2 The espeak system

espeak is an open source text-to-speech (TTS) system, which mainly uses formant-based synthesis. Currently, espeak supports over 50 languages; at the start of this project, however, it did not support Japanese. The speech produced by espeak is highly configurable, but it is not as natural sounding as larger synthesisers which use unit-based synthesis and are based on recordings of human speech. In formant-based synthesis, voiced speech (e.g. vowels and sonorant consonants) is created using formants. Unvoiced consonants (e.g. /s/), on the other hand, are created using pre-recorded sounds, and voiced consonants (e.g. /z/) are created as a mixture of formant-based voiced sound and a pre-recorded unvoiced sound. The espeak system uses modular language data files, which are easy-to-understand text files. This way, a language can be added or modified without the need to understand the underlying source code of espeak.

1.3 Praat

Praat is a speech analysis system whose speech synthesiser is based on the espeak text-to-speech engine and therefore uses the language files provided by the espeak system. The implementation of Japanese pronunciation provided for espeak will therefore also be usable in the Praat system. This speech analysis program will also be used during the evaluation process later on.

2 Literature review

There are two types of articles related to this implementation: articles about formant synthesis and articles about Japanese phonetics. The first article of interest is the paper by Klatt (1980)[6], which describes software for a cascade/parallel formant synthesiser. This paper gives an in-depth view of how a formant synthesis system is built, which is useful for this research since the espeak program is based on such a system. The second paper of interest is by Klatt & Klatt (1990)[7], a continuation of the previous paper. It examines the analysis and synthesis of different types of voices, with the main focus on the differences between male and female voices, which is helpful for creating more natural sounding synthesis. Although, as previously stated, natural sounding synthesis is not the main focus of this paper, making the implementation sound as natural as possible will contribute to better results during the segmentation process. These articles about formant synthesis are relatively old, because current research focuses mainly on unit-based synthesis. Although unit-based synthesis provides more natural sounding output, it lacks configurability and theoretical grounding in the sounds to be produced. Fundamental research on speech synthesis is therefore mainly done on formant synthesis, whereas commercial products tend to use unit-based speech synthesis due to the higher quality of the speech output.

Among the literature concerning Japanese phonetics is the book by Vance (2008)[9], which provides an in-depth view of Japanese phonological research as well as an insight into the basics of phonological research itself. Another useful book for this research is by Kawase et al. (1978)[5], which provides theory on the pronunciation of the Japanese language, with its main focus on the mouth movements used during pronunciation. More specific papers on Japanese phonetics include the paper by Shigeto (2012/forthcoming)[8], which focuses on the actual duration of double consonants, and the paper by Bion et al. (2013)[1], which focuses on differences in vowel durations in the Japanese language. The paper by Halpern (n.d.)[4] provides insight into how a phonetic database can help by providing the phonological representation of words, which is essential for natural sounding speech synthesis due to the presence of lexically-dependent pronunciation in the Japanese language. That paper demonstrates an implementation of a phonetic database for the Japanese language and its usage. Although the idea of the phonetic database described there can be used, the database itself is not available under the terms of the General Public License (GPL), which is a requirement for this project.

The idea of this project is to combine these two types of articles by adding Japanese phonetics to a formant synthesis system, namely by implementing Japanese speech synthesis support in espeak.

3 Theoretical foundation

First, a number of phonetic terms and concepts used later in this paper are described. Afterwards, the phonetic aspects of the Japanese language itself are discussed.

3.1 Phonetic transcription

The International Phonetic Alphabet (IPA) provides a phonetic transcription of speech; this alphabet is used to describe the pronunciation of a language rather than to form words within the language. Thanks to this standardised alphabet, a language can be correctly pronounced without knowing the rules of the language. The IPA notation for the Japanese language can be found in appendix B.

3.2 Place of articulation

Articulation is the process of physically forming the sounds that result in the pronunciation of a word. This process uses various body parts, which are divided into active articulators and passive articulators. Active articulators are generally identified as the articulators that move during the formation of speech; examples include the tongue and the lower lip. In contrast, the passive articulators make little to no movement during this process; examples include the upper lip, the upper teeth and the roof of the mouth. The position of these articulators defines the resulting speech.

3.3 Nasality

Nasality refers to the effect of the velum in the articulation of consonants. An open velum (see Figure 1) allows air to escape through the nasal cavity (inner nose), whereas a closed velum forces the air to escape only through the oral cavity (inner mouth). A consonant being nasal therefore means that while articulating the consonant the velum is open, which allows air to escape through the nose.

3.4 Voicing

Voicing depends on the glottis, which refers to both the vocal cords and the open space between them. A small opening allows the vocal cords to vibrate, which results in voiced sound. The opening in the glottis can also be wide, in which case air passes through freely and the vocal folds vibrate little; this produces the so-called voiceless sounds. As an example, take the voiceless consonant s: when the s is pronounced you cannot feel any vibration in the vocal folds, whereas the voiced consonant z produces vibrations which can be felt. Furthermore, the glottis can also be closed, in which case no air can pass. The sound produced by obstructing the airflow by closing the glottis is called a glottal stop.

Figure 1: closed/open velum (taken from: Vance (2009))

3.5 Phonetic overview of the Japanese language

3.5.1 Vowels

The articulators alter the vocal resonances, which results in the formation of vowel sounds. Peaks in the spectra of vowel sounds are called vocal formants. These formants are extremely useful, for example, to distinguish between individual vowel sounds: distinguishing vowels can be done by comparing the formants, for which the first two formants tend to be sufficient. This is also used in the implementation within espeak, discussed later on (see section 4.2.3, Phoneme definitions). Another important aspect of vowels (especially in the Japanese language) is length, where the meaning of a word can depend on the length of a vowel. Take for instance the words 雪 ( ゆき ), which is read as /yuki/ and means snow, and 勇気 ( ゆうき ), which is read as /yu:ki/ and means courage. Here the u (transcribed as u:) is a long vowel, and the meaning of the word changes due to the length of the vowel. There is another thing to consider when talking about long vowels, namely how the double vowels (e.g. aa, ii, uu, ee, oo) should be pronounced, since these double vowels can be read in two different ways: as two separate vowels or as a single long vowel. The difference between the pronunciation of a single long vowel and two short vowels can be clearly heard in words like /satooya/ and /sato:ya/. In Figure 2 the pronunciation difference is clearly visible: /satooya/ is pronounced with two separate vowels, which can be seen from the drop between the vowels (i.e. where the arrows point). Although a good estimate of the pronunciation of double vowels can be made (e.g. by checking the boundaries of the kanji within a word), this would require a lexical analysis system. Unfortunately such a system is not yet available for this project, and automatically resolving vowel combinations is therefore out of the scope of this paper. Instead we require the user to make the input unambiguous with regard to double vowels by using the so-called prolonged sound mark ( ー ), which is already the standard way to explicitly indicate long vowels in Japanese text (e.g. おー ).

Figure 2: Long and short vowel distinction (taken from: Vance (2009))

3.5.2 Voiced and semi-voiced sounds

As previously described (section 3.4), a consonant is voiced when the vocal cords vibrate during pronunciation, whereas a consonant is voiceless when the vocal cords do not vibrate during pronunciation. In the Japanese writing system, whether a character is voiced is marked in the top right corner of the character with a so-called dakuten ( ゛). For example, in the character さ (sa) the first segment is voiceless; if we add the dakuten to this character, making it ざ (za), the first segment of the character is voiced. Adding a dakuten is possible for the characters of the k-row (which becomes the g-row), the s-row (z-row), the t-row (d-row) and the h-row (b-row). It is also possible to make characters semi-voiced by adding a so-called handakuten ( ゜) to the character. If we take, for example, は (ha) and add a handakuten, it becomes ぱ (pa), whereas adding the dakuten would have made it ば (ba). The addition of the handakuten is only possible for characters of the h-row, the resulting p-row sounds being the so-called semi-voiced sounds.

3.5.3 Devoicing

We have already seen the difference between voiced and voiceless consonants (e.g. z and s); it is, however, also possible to have devoiced vowels. Although no actual sound is produced, the mouth moves in the direction of the vowel. In the Japanese language there are a couple of rules for when devoicing takes place: for example, when the vowel i or u stands between voiceless phonemes, the vowel is devoiced. Take for example the word /sukiyaki/: here the vowel u stands between the voiceless consonant s and the voiceless consonant k, which causes the u to devoice. The word is therefore phonetically transcribed as /su0kiyaki/, where u0 denotes the devoiced u. There is another rule stating that the vowels u and i have a high probability of devoicing when they are preceded by a voiceless consonant and immediately followed by a pause. Since this is not always the case, however, a lexical analysis system would be needed to verify when it applies. Vance (1990) also states the following about the devoicing of vowels:

    When the /su/ is the last syllable of a polite nonpast verb form or the polite nonpast copula /desu/ です and immediately followed by a pause, devoicing is quite consistent for most Tokyo speakers. (Vance, 1990)

This sounds as if a lexical analysis system is needed to find the polite nonpast verbs; however, this is not the case, as will be shown in the implementation section: a small rule can be implemented which finds the polite nonpast verbs without a lexical analysis system. Vance (1990) also gives an additional exception:

    Vowel devoicing also interacts with intonation in an obvious way. If the last syllable in a sentence contains a short high vowel preceded by a voiceless consonant but has to carry the intonation for a question, the vowel doesn't devoice. (Vance, 1990)

What this means is that when a question is indicated with a question mark (i.e. there is a rising pitch), the last syllable is no longer devoiced; the need for an audible rise overrides the devoicing rule. Furthermore, when the vowel u is devoiced and the preceding consonant is s, the duration of the devoiced u is taken over by the s, which then lasts two moras. Therefore /desu/ is pronounced as /dess/ and phonetically transcribed as /desu0/ (u0 standing for a devoiced vowel u).

3.5.4 Particles

When a character has a grammatical function (it is then a so-called particle), its pronunciation can change from its original reading. Take for instance the Japanese sentence これは日本語です (which translates to "this is Japanese"): the character は is normally pronounced as /ha/, but since this character here has the grammatical function of a particle (i.e. it marks the subject of the sentence) it is pronounced as /wa/ (otherwise written as わ ). The same goes for the particle へ, which is pronounced as /e/ instead of /he/, and the particle を, which is pronounced as /o/ instead of /wo/.

3.5.5 Palatalised sounds

The palatalised sounds (as shown in the full hiragana and katakana charts) use a consonant-semivowel-vowel syllable structure, where the semivowel is a palatal approximant (written as y but phonetically transcribed as /j/). The semivowel-vowel part can consist of three different characters: ゃ (ya), ゅ (yu) and ょ (yo). During the articulation of the consonant the tongue is raised toward the hard palate and the alveolar ridge. The pronunciation of palatalised sounds goes as follows: き (ki) + ゃ (ya) → きゃ (kya); note that this is not pronounced as /kiya/ but as /kya/. Furthermore, notice that in palatalised sounds like these the ゃ (ya) character is written smaller than the normal や (ya) character (and the same goes for the ゅ (yu) and ょ (yo) characters).

3.5.6 Moraic nasal n

The moraic nasal n has an articulation which depends on the following sound: its place of articulation is altered depending on what follows. This gives the following articulation rules for the moraic nasal (as taken from Wikipedia, retrieved 28/06/13; a small sketch of these rules follows below):

1. uvular [ɴ] at the end of utterances and in isolation
2. bilabial [m] before [p], [b] and [m]
3. dental [n] before the coronals /d/, /t/ and /n/
4. velar [ŋ] before [k] and [g]
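These context rules map directly onto a lookup. The following Python sketch is illustrative only (the espeak implementation expresses this with ChangePhoneme conditions, see section 4.2.3); the fallback for contexts the four rules do not cover is an assumption, not part of the rule set above:

    from typing import Optional

    def moraic_nasal_allophone(next_phoneme: Optional[str]) -> str:
        if next_phoneme is None:            # rule 1: utterance-final / isolated
            return "ɴ"                      # uvular nasal
        if next_phoneme in ("p", "b", "m"): # rule 2: bilabial context
            return "m"
        if next_phoneme in ("t", "d", "n"): # rule 3: coronal context
            return "n"                      # dental nasal
        if next_phoneme in ("k", "g"):      # rule 4: velar context
            return "ŋ"
        # Assumption: contexts not covered above (e.g. vowels) fall back
        # to the uvular nasal.
        return "ɴ"

    print(moraic_nasal_allophone("g"))    # ŋ, as in りんご (ringo)
    print(moraic_nasal_allophone(None))   # ɴ, as in utterance-final にほん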

3.5.7 Gemination

In the Japanese language a double consonant is indicated by a so-called sokuon, which is represented as a small tsu character and is available in both the hiragana ( っ ) and the katakana ( ッ ) writing systems. The sokuon copies the onset of the following syllable. See for example the word けっか (kekka), meaning result: the sokuon in this word precedes the syllable か (ka), which starts with the consonant k. The sokuon therefore also becomes the letter k, making the double consonant. There is, however, one exception in which the sokuon does not simply copy the first following letter, namely ch, which becomes tch: 抹茶 ( まっちゃ ), for example, becomes matcha instead of maccha. The sokuon can also be used at the end of an utterance, where it indicates a glottal stop.

4 Implementation within espeak

4.1 Word segmentation

The current implementation within espeak requires the user to segment the input sentence themselves, due to the absence of a lexical analysis system (see section 7, Future work). For this, all words, and also particles, need to be separated with a space. A sentence like これはにほんごですか should therefore be entered into the system as これ は にほんご です か.

4.2 Pronunciation rules

The espeak system uses two kinds of text files to implement the pronunciation rules. The first is the *_rules file (in this case ja_rules, since we are working with the Japanese language), which contains the actual pronunciation rules; the second is the *_list file (ja_list in our case), which contains a lookup dictionary. The following rules are implemented in the ja_rules file unless noted otherwise. In order to correctly pronounce Japanese text, three main steps need to be taken (the first step does not apply to rōmaji input):

1. Normalising to a single writing system
2. Text to phoneme translation
3. Describing the Japanese phonemes

4.2.1 Normalisation to a single writing system

Every sound in the Japanese language can be written in terms of hiragana characters. The first step is therefore to normalise the input sentence to a single writing system (in this case the hiragana writing system); for katakana and half-width katakana this can be done with a straightforward replacement rule. Such a replacement rule is possible because there is a one-to-one conversion between the hiragana and katakana writing systems. The replace function within espeak works as follows:

    .replace
    a b

where a will be replaced by b. Each line specifies one or two alphabetic characters to be replaced by another one or two alphabetic characters. This substitution is done before the text to phoneme translation. The katakana characters are therefore placed on the a side, and the hiragana characters on the b side:

    .replace
    ア あ
    イ い
    ウ う

Palatalised sounds (e.g. きゃ (kya)) can be divided into two separate characters, the き (ki) character and the small ゃ (ya) character ( キ (ki) → き (ki) and ャ (ya) → ゃ (ya), therefore キャ (kya) → きゃ (kya) ), so the palatalisation rules of the text to phoneme conversion still apply (this rule being the same in hiragana as in katakana). Half-width katakana (katakana written smaller), however, requires one thing to be taken into consideration: the half-width ガ (ga) character consists of two parts, the カ (ka) and the voicing mark. The replacement function converts the first thing it matches; therefore, if the rule for the character カ (ka) were placed before the one for ガ (ga), the replace function would convert the half-width katakana ga to the hiragana character ka, leaving the voicing mark unparsed.
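As a point of comparison (this is not how espeak does it; espeak uses the .replace table above), the same full-width katakana-to-hiragana normalisation can be expressed compactly in Python, because the Unicode katakana block sits at a fixed offset of 0x60 above the hiragana block. Note that this simple version handles neither half-width katakana nor the voicing-mark ordering issue just described:

    # Katakana U+30A1..U+30F6 map one-to-one to hiragana U+3041..U+3096.
    KATAKANA_TO_HIRAGANA = {cp: cp - 0x60 for cp in range(0x30A1, 0x30F7)}

    def to_hiragana(text: str) -> str:
        return text.translate(KATAKANA_TO_HIRAGANA)

    print(to_hiragana("トーキョー"))  # とーきょー (the prolonged sound mark ー is kept)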

4.2.2 Text to phoneme translation

The text to phoneme translation contains the actual parsing of the pronunciation rules. These rules are given in groups, one for each letter or character; the espeak system uses them to find the best fit for each character of the input sentence. Every rule is given on a separate line and has the following syntax:

    [<pre>)] <match> [(<post>] <phoneme string>

This is best explained with a small example:

    .group あ
    あ      a
    あー     a:

The first line creates the group for the character あ (a): if the espeak system now encounters the character あ (a), it will be handled by this group. Now that the group is known, pronunciation rules are needed that state what to do when this group is encountered; in this case we want to translate あ into a and あー into a:. The <match> section of the first rule is therefore the hiragana character あ, and this is converted to the phoneme a (i.e. the <phoneme string> in this case is simply a). However, if a long vowel ( あー ) is found in the input sentence, this is a better match, and the second rule therefore translates あー (<match>) into a: (<phoneme string>).

The support for palatalised sounds is done in a similar way: the group of the first character is used (in this case き ), and to the rules of this group we add a match for the palatalised sounds with the corresponding phoneme string (in this case きゃ (<match>) and kya (<phoneme string>)):

    .group き
    き      ki
    きー     ki:
    きゃ     kya
    きゃー    kya:

Gemination (see section 3.5.7) is indicated by a small tsu character (in hiragana っ ) called the sokuon, which on its own does not have a reading. Therefore, a check is needed on which syllable follows; this can be done using the (<post> section of a rule. This section can express rules such as: if the next character starts with the consonant k, change the sokuon character to that consonant. This is exactly what has been done in this implementation; however, the characters starting with the same consonant have been grouped together in order to make the rules more efficient and more readable. This grouping has been done as follows:

    .L01 かきくけこ  //starts with k
    .L02 がぎぐげご  //starts with g
    .L03 さしすせそ  //starts with s
    ...

Here L01, L02, etc. each define a group of letter sequences (in this case the hiragana characters starting with the same consonant). The text to phoneme translation for the sokuon therefore becomes:

    .group っ
    っ (L01    k
    っ (L02    g
    っ (L03    s
    ...
    っ (_      ?

Notice that the rules now have a (<post> section, meaning that one of the characters of the specified group must follow the sokuon in order for the rule to apply. This translates the sokuon to the consonant specified in the <phoneme string>. Another rule in the sokuon pronunciation is that a sokuon at the end of a word is pronounced as a glottal stop (e.g. あっ ), indicated with a question mark. The (_ in the (<post> section means that the next symbol is a pause or a hyphen, which in this case means that the sokuon must be at the end of a word.

The rule implemented to devoice the last syllable of polite nonpast verbs is based on the fact that those verbs always end in masu (with the exception of the word desu). Our rule therefore simply puts the whole word masu in the <match> section, while the last syllable (u) must be at the end of the word (the (_ rule in the (<post> section). In doing so, all polite nonpast verbs are found and correctly devoiced; the exceptional case desu is handled in the same manner, with the addition of the _) rule in the <pre>) section.
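The group-based (<post> check above amounts to: look up the consonant group of the next character and emit that consonant, or a glottal stop at a word end. A Python sketch of the same idea (group contents abbreviated; the real rules live in ja_rules):

    # Consonant groups, mirroring .L01/.L02/.L03 above (abbreviated).
    GROUPS = {"k": "かきくけこ", "g": "がぎぐげご", "s": "さしすせそ"}

    def sokuon_phoneme(next_char: str) -> str:
        if next_char:
            for consonant, characters in GROUPS.items():
                if next_char in characters:
                    return consonant    # っか -> kka, っさ -> ssa, ...
        return "ʔ"                      # no following character: glottal stop

    print(sokuon_phoneme("か"))  # k  (けっか -> kekka)
    print(sokuon_phoneme(""))    # ʔ  (utterance-final あっ)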

4.2.3 Phoneme definitions

Defining the correct Japanese phonemes must be done in the phoneme file dedicated to the Japanese language (i.e. the ph_japanese file). In this file, a phoneme table for the Japanese language must be given; if a phoneme is not defined in the phoneme table but is used in the rules file (i.e. the ja_rules file), the espeak system will use its base phoneme. In these phoneme definitions, attributes like the correct IPA notation, the place of articulation and the reference to the corresponding sound files for that specific phoneme can be given. For a detailed explanation of the rules and possibilities within these definitions, please refer to the documentation on the espeak page. It is, for example, also possible to import phonemes from other files by using the import_phoneme statement, which copies a previously defined phoneme from a specified phoneme table. The phoneme table contains a list of phoneme definitions, which have the following structure:

    phoneme u
      ipa ɯ
      IF NOT nextPhW(isVoiced) AND NOT prevPhW(isVoiced) AND
          NOT thisPh(isWordEnd) AND thisPh(notWordStart) THEN
        ChangePhoneme(u0)
      ENDIF
      vowel starttype #u endtype #u
      length 83
      FMT(vowel/uu_bck)
    endphoneme

In this case the phoneme u is defined for the Japanese language; in this phoneme description the correct IPA notation as well as the correct length and the sound for that phoneme are given. The referenced sound file has been chosen relative to the formants of the Japanese vowel u and is available in the espeak sound database. The starttype and endtype allocate the phoneme to groups, so that functions can be tested on groups of phonemes. As can be seen, it is also possible to add if-statements to the phoneme descriptions: in this case, if the phoneme u is between voiceless phonemes and is not at the beginning or the end of a word, it is changed to the u0 phoneme, because the ChangePhoneme(u0) function changes the current phoneme to the phoneme u0, which is a devoiced vowel u:

    phoneme u0
      ipa ɯ̥
      IF prevPhW(s) THEN
        WAV(ufric/s_)
      ENDIF
      vowel starttype #u endtype #u
      length 83
    endphoneme

Here it can be seen that the phoneme has no sound defined, which is what makes the phoneme devoiced. However, if the preceding phoneme is the phoneme s, the s sound is given to the phoneme u0, due to the rule that the consonant s takes over the devoiced u when the phoneme u is devoiced and the preceding phoneme is s.

The ChangePhoneme function is also used to solve the problem of the moraic nasal n, as described earlier in this paper. For this, the different phoneme descriptions for the articulations of the syllable n (where the articulation point differs) have been imported from the ph_consonant file. Rules have then been added in the form of:

    IF nextPhW(p) OR nextPhW(b) OR nextPhW(m) THEN
      ChangePhoneme(m)
    ENDIF

so that the correct phoneme definitions are used for the different types of articulation of the syllable ん (n). An easier phoneme description is the one describing the long vowel, which (with respect to the previously shown phoneme u) only changes in length:

    phoneme u:
      ipa ɯː
      vowel starttype #u endtype #u
      length 153
      FMT(vowel/uu_bck)
    endphoneme

It is also possible to simply call other phonemes; this is for example the case with the y phoneme, which is phonetically transcribed as /j/:

    phoneme y
      ipa j
      CALL base/j
    endphoneme
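Read procedurally, the condition on the phoneme u above is a devoicing pass over the phoneme string. The following Python sketch is illustrative only, with an abbreviated voiceless set (the real logic lives in the phoneme definition); it reproduces the /sukiyaki/ example from section 3.5.3:

    # /u/ or /i/ between voiceless phonemes, and not at a word edge,
    # becomes its devoiced counterpart (u -> u0), as in the phoneme
    # definition above.
    VOICELESS = {"p", "t", "k", "s", "h", "f", "sh", "ch", "ts"}

    def devoice(phonemes: list) -> list:
        out = list(phonemes)
        for i in range(1, len(phonemes) - 1):        # skip word edges
            if (phonemes[i] in ("u", "i")
                    and phonemes[i - 1] in VOICELESS
                    and phonemes[i + 1] in VOICELESS):
                out[i] = phonemes[i] + "0"
        return out

    print(devoice(["s", "u", "k", "i", "y", "a", "k", "i"]))
    # ['s', 'u0', 'k', 'i', 'y', 'a', 'k', 'i']  -> /su0kiyaki/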

4.3 Input using Rōmaji

Because the modified Hepburn rōmaji writing system is already optimised for English pronunciation, only straightforward rules have to be implemented. For example, long vowels, which are input as ā, ī, ū, ē and ō, have groups like the following:

    .group a
    a      a

    .group ā
    ā      a:

One thing to take into consideration, however, is that phonemes written with two (or more) letters need to be explicitly denoted in the rules. Take for instance the phoneme ch: if we only have rules for c and h, the espeak system will use two separate phonemes c and h instead of the desired phoneme ch. The fix is to add the ch combination to the c group:

    .group c
    c      ch
    ch     ch

The moraic nasal n is another example that needs some extra care. As explained previously (section 1.1.1), the character n needs to be disambiguated from the characters of the n-row. The rules of the modified Hepburn romanization state that the character ん is written as n', rather than n, only when it is followed by a vowel or y. This is because when the letter n is followed by a consonant it must represent the character ん, since characters of the n-row are always followed by a vowel; the same holds when the letter n is at the end of a word. This gives us the following rules to disambiguate the character ん from the n-row:

    .group n
    n          n
    n (C       N
    n (_       N
    n'         N

where C stands for any arbitrary consonant, N is the phoneme for the character ん, and (_ means that the n is at the end of a word.

4.4 Latin characters for abbreviations

In Japanese, capitalised Latin characters are used for abbreviations. Some frequently used ones are JR and NHK; these abbreviations in Latin characters are used alongside the Japanese characters. The capitalised Latin characters are supposed to be pronounced as isolated English letters, but their pronunciation is adapted so that it can be produced with the available Japanese sounds. For example, JR is pronounced as / ジェイアール /, which is /jeia:ru/. Therefore rules of the following form are required:

    .group J
    J      jei

    .group R
    R      a:ru

However, the current version of espeak automatically decapitalises the input, and because we also allow uncapitalised Latin characters (the modified Hepburn romanization system), there are already groups of the following form:

    .group j
    j      j

    .group r
    r      r

Due to this (and the normalisation to uncapitalised characters), the current implementation pronounces JR as /jr/ instead of /jeia:ru/. If at a later point it becomes possible to turn off automatic decapitalisation for the Japanese language, this implementation will be sufficient.

4.5 Kanji

As previously stated, the current implementation does not support kanji. There is, however, a way in espeak to normalise kanji into hiragana, namely by using the lookup dictionary file (ja_list). The problem with this is that all possible kanji combinations and readings would need to be inserted into this file, which does not seem like an optimal solution. To illustrate how this would work, I added a single kanji compound ( 漢字 ) to the lookup dictionary file:

    $textmode
    漢字    かんじ

The $textmode flag indicates that a text-to-text conversion is wanted instead of the also possible text-to-phoneme conversion. When this kanji compound ( 漢字 ) is given as input, it is normalised to hiragana and then parsed by the system like any other hiragana input.
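Scaled up, this mechanism is a reading dictionary applied before rule matching. A Python sketch of such dictionary-based normalisation follows; the single entry mirrors the example above, and a real dictionary would need readings for every compound, ideally chosen by context:

    # Kanji compound -> hiragana reading, applied longest-match-first,
    # like espeak's ja_list $textmode entries.
    READINGS = {"漢字": "かんじ"}
    MAX_KEY_LENGTH = max(len(key) for key in READINGS)

    def normalise_kanji(text: str) -> str:
        out, i = [], 0
        while i < len(text):
            for length in range(MAX_KEY_LENGTH, 0, -1):
                chunk = text[i:i + length]
                if chunk in READINGS:
                    out.append(READINGS[chunk])
                    i += length
                    break
            else:
                out.append(text[i])     # pass non-kanji text through
                i += 1
        return "".join(out)

    print(normalise_kanji("漢字です"))   # かんじです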

5 Results and Evaluation

The best way to get an idea of the actual results of this implementation is simply to listen to the output of the speech synthesiser, which was also the main evaluation method after each new addition to the system. The parsing of Japanese text, however, was also a big part of this project. Here our main focus was that all syllables should be pronounced correctly, i.e. have the correct translation from Japanese text to phoneme transcription. We tested with a focus on the gemination of syllables, the moraic nasal n, and the correct IPA notations for a given input sentence (which indicates that the right phonemes are used), as well as the normalisation from katakana and half-width katakana to hiragana, in which all conversions to hiragana are now done correctly. For this, random words focusing on the issues stated above were entered; given the knowledge the system possesses of the pronunciation rules of Japanese characters, it transcribed the words correctly. At this point, however, not all pronunciation rules of the language could be implemented, mainly because the system does not yet have a lexical analysis component.

The actual use of this implementation will be during the automatic segmentation process within Praat. Although this Japanese language support was not yet available in Praat during the evaluation, the feature will be available at the time this implementation becomes available for espeak. Given a manually segmented output of the system, comparing this analysis with my own pronunciation of the same Japanese text shows many specific similarities, which gives high hopes for the dynamic time warping algorithm that is used during the segmentation process.

6 Conclusion

The provided implementation is capable of pronouncing hiragana, katakana and rōmaji using the Hepburn romanization system. The implementation still imposes restrictions on the user, however, which are unnatural: for example, the need to write tōkyō as / とーきょー / instead of / とうきょう / or even / 東京 /. The main goal of this paper, however, was to provide assistance during the segmentation process in Praat, a setting in which such restricted input is very likely acceptable. Moreover, this implementation already provides a good starting point for further research on Japanese speech synthesis support within espeak.

7 Future work

The current implementation focuses on correctly pronouncing Japanese characters and on using this during the segmentation process in Praat. The implementation comes with various restrictions, however, which oblige the user to alter the input in order to get correct pronunciation. This is of course not an optimal solution, and overcoming these restrictions requires some additional processing of the input. There are two main components that could help overcome these restrictions and allow more natural input, so that any Japanese sentence (with kanji etc.) can be processed by the system.

The first component is a lexical analysis system which can parse kanji and is capable of providing a grammatical analysis (i.e. showing which characters are particles). This type of system should also be able to normalise the kanji to the hiragana writing system (or even rōmaji), which can then be parsed by the implementation already present in espeak. Various lexical analysis systems are already available for the Japanese language; what is needed, however, is a system which works offline and is compatible with the General Public License (GPL).

Furthermore, the Japanese language uses a pitch accent, which is not yet implemented. For this, a phonetic database could be used, like the database described by Halpern (n.d.)[4], where a so-called binary pitch is used and the pitch is represented with a two-pitch-level model. In this model a pitch can be either high (H) or low (L), and each mora (of a specific word) has its own pitch level. For example, 雨 (rain) is pronounced /ame/ with a high-low (HL) pattern, whereas 飴 (candy) is /ame/ with a low-high (LH) pattern. This aspect of speech is needed due to the presence of words that differ only in accent, but also to make the system sound more natural. With these additional components the input sentence can be normalised to hiragana characters, which the current implementation is already capable of pronouncing, and the current restrictions (e.g. regarding particles) can be lifted, making the input more natural.

7.1 espeak functionality

Additional functionality that needs to be implemented within the espeak system in order to correctly pronounce Japanese text is the following: in the rules file (i.e. ja_rules) it must be possible to check for a question mark, in order to implement the exception where the rise in pitch overrides the devoicing process.

Also, due to the automatic decapitalisation of the input text, it is not yet possible to support both rōmaji and capitalised Latin characters for abbreviations: the capitalised Latin characters are automatically decapitalised and parsed as rōmaji syllables instead of being pronounced correctly as abbreviations.

References

[1] Ricardo A. H. Bion, Kouki Miyazawa, Hideaki Kikuchi, and Reiko Mazuka. Learning phonemic vowel length from naturalistic recordings of Japanese infant-directed speech. PLoS ONE, 8(2):e51594, 2013.

[2] Paul Boersma and David Weenink. Praat: doing phonetics by computer.

[3] Jonathan Duddington. espeak text to speech.

[4] Jack Halpern. The role of phonetics and phonetic databases in Japanese speech technology. n.d.

[5] I. Kawase, M. Sugihara, and Kokusai Kōryū Kikin. Nihongo, the pronunciation of Japanese. Japan Foundation, 1978.

[6] Dennis H. Klatt. Software for a cascade/parallel formant synthesizer. Journal of the Acoustical Society of America, 67:971-995, 1980.

[7] Dennis H. Klatt and Laura C. Klatt. Analysis, synthesis, and perception of voice quality variations among female and male talkers. Journal of the Acoustical Society of America, 87:820-857, 1990.

[8] Kawahara Shigeto. The phonetics of obstruent geminates, sokuon. 2012/forthcoming.

[9] T.J. Vance. The Sounds of Japanese with Audio CD. Cambridge University Press, 2008.

A How to use this initial implementation

Currently, the implementation allows input in hiragana and katakana characters, but also input using the modified Hepburn romanization system. However, because no lexical analysis is available at this point, some ambiguity in pronunciation occurs. In order to overcome these ambiguities, we require the user to make the input unambiguous with regard to pronunciation:

1. The system requires the user to add word segmentation (section 4.1)
2. Particles need to be written as pronounced (section 3.5.4)
3. Long vowels need to be written with the prolonged sound mark (section 3.5.1)

Therefore a sentence like / これは東京です / (this is Tōkyō) could be entered as / これ わ とーきょー です /, where there is no kanji, the user provides word segmentation, and long vowels are explicitly written with a prolonged sound mark. It is also possible to input the sentence using the modified Hepburn romanization system: /kore wa tōkyō desu/, which has no additional restrictions beyond the rules of the modified Hepburn romanization system.
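Assuming the language files described in this thesis are installed and the Japanese voice is registered under the name ja (an assumption; the actual voice name depends on how the files are installed), the implementation can then be tried from the espeak command line:

    espeak -v ja "これ わ とーきょー です"
    espeak -v ja "kore wa tōkyō desu"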

B IPA for Japanese

IPA for Japanese as found on Wikipedia:

    IPA      Japanese example        English approximation
    b        basho                   bog
    ç        hito                    hue
    ɕ        shita, shugo            sheep
    d        dōmo                    dome
    dz, z    zutto                   rods, zen
    dʑ, ʑ    jibun, gojū             jeep, garagist
    ɸ        fugu                    who
    g        gakusei                 gape
    h        hon                     hone
    j        yakusha                 yak
    k        kuru                    skate
    m        mikan                   much
    n        nattō                   not
    ɴ        nihon                   long
    ŋ        ringo, rinku            finger, pink
    p        pan                     span
    ɾ        roku                    close to /t/ in auto in American English
    s        suru                    sue
    t        taberu                  stop
    ts       tsunami                 cats
    tɕ       chikai, kinchō          itchy
    w        wasabi                  was
    ʔ        (in Ryukyu languages)   uh-oh!
    a        aru                     roughly like father
    e        eki                     roughly like met
    i        iru                     need
    i̥        yoshi, shita            (almost silent)
    o        oniisan                 roughly like sore
    ɯ        unagi                   roughly like foot
    ɯ̥        desu, sukiyaki          (almost silent)


CAS LX 522 Syntax I. Long-distance wh-movement. Long distance wh-movement. Islands. Islands. Locality. NP Sea. NP Sea 19 CAS LX 522 Syntax I wh-movement and locality (9.1-9.3) Long-distance wh-movement What did Hurley say [ CP he was writing ]? This is a question: The highest C has a [Q] (=[clause-type:q]) feature and

More information

A Neural Network GUI Tested on Text-To-Phoneme Mapping

A Neural Network GUI Tested on Text-To-Phoneme Mapping A Neural Network GUI Tested on Text-To-Phoneme Mapping MAARTEN TROMPPER Universiteit Utrecht m.f.a.trompper@students.uu.nl Abstract Text-to-phoneme (T2P) mapping is a necessary step in any speech synthesis

More information

Affricates. Affricates, nasals, laterals and continuants. Affricates. Affricates. Study questions

Affricates. Affricates, nasals, laterals and continuants. Affricates. Affricates. Study questions , nasals, laterals and continuants Phonetics of English 1 1. Tip artikulacije (type of articulation) /tʃ, dʒ/ su suglasnici (consonants) 2. Način artikulacije (manner of articulation) /tʃ, dʒ/ su afrikati

More information

<September 2017 and April 2018 Admission>

<September 2017 and April 2018 Admission> Waseda University Graduate School of Environment and Energy Engineering Special Admission Guide for International Students Master s and Doctoral Programs for Applicants from Overseas Partner Universities

More information

source or where they are needed to distinguish two forms of a language. 4. Geographical Location. I have attempted to provide a geographical

source or where they are needed to distinguish two forms of a language. 4. Geographical Location. I have attempted to provide a geographical Database Structure 1 This database, compiled by Merritt Ruhlen, contains certain kinds of linguistic and nonlinguistic information for the world s roughly 5,000 languages. This introduction will discuss

More information

1. REFLEXES: Ask questions about coughing, swallowing, of water as fast as possible (note! Not suitable for all

1. REFLEXES: Ask questions about coughing, swallowing, of water as fast as possible (note! Not suitable for all Human Communication Science Chandler House, 2 Wakefield Street London WC1N 1PF http://www.hcs.ucl.ac.uk/ ACOUSTICS OF SPEECH INTELLIGIBILITY IN DYSARTHRIA EUROPEAN MASTER S S IN CLINICAL LINGUISTICS UNIVERSITY

More information

NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches

NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches Yu-Chun Wang Chun-Kai Wu Richard Tzong-Han Tsai Department of Computer Science

More information

A Believable Accent: The Phonology of the Pink Panther

A Believable Accent: The Phonology of the Pink Panther William Pickett California State University, Fullerton A Believable Accent: The Phonology of the Pink Panther If the empirical data employed by a linguist is defined as that which is verifiable or provable

More information

CJS was honored to have Izukura share his innovative techniques with the larger UHM community, where he showcased indoor and outdoor

CJS was honored to have Izukura share his innovative techniques with the larger UHM community, where he showcased indoor and outdoor ʻ As the biggest program of the academic year, the Center for Japanese Studies hosted Mr. Akihiko Izukura, an internationally renown textile artist from Kyoto, Japan. From January 15 to February 15, Izukura

More information

Speaker Recognition. Speaker Diarization and Identification

Speaker Recognition. Speaker Diarization and Identification Speaker Recognition Speaker Diarization and Identification A dissertation submitted to the University of Manchester for the degree of Master of Science in the Faculty of Engineering and Physical Sciences

More information

Florida Reading Endorsement Alignment Matrix Competency 1

Florida Reading Endorsement Alignment Matrix Competency 1 Florida Reading Endorsement Alignment Matrix Competency 1 Reading Endorsement Guiding Principle: Teachers will understand and teach reading as an ongoing strategic process resulting in students comprehending

More information

DIBELS Next BENCHMARK ASSESSMENTS

DIBELS Next BENCHMARK ASSESSMENTS DIBELS Next BENCHMARK ASSESSMENTS Click to edit Master title style Benchmark Screening Benchmark testing is the systematic process of screening all students on essential skills predictive of later reading

More information

An Acoustic Phonetic Account of the Production of Word-Final /z/s in Central Minnesota English

An Acoustic Phonetic Account of the Production of Word-Final /z/s in Central Minnesota English Linguistic Portfolios Volume 6 Article 10 2017 An Acoustic Phonetic Account of the Production of Word-Final /z/s in Central Minnesota English Cassy Lundy St. Cloud State University, casey.lundy@gmail.com

More information

Demonstration of problems of lexical stress on the pronunciation Turkish English teachers and teacher trainees by computer

Demonstration of problems of lexical stress on the pronunciation Turkish English teachers and teacher trainees by computer Available online at www.sciencedirect.com Procedia - Social and Behavioral Sciences 46 ( 2012 ) 3011 3016 WCES 2012 Demonstration of problems of lexical stress on the pronunciation Turkish English teachers

More information

Evaluation of a Simultaneous Interpretation System and Analysis of Speech Log for User Experience Assessment

Evaluation of a Simultaneous Interpretation System and Analysis of Speech Log for User Experience Assessment Evaluation of a Simultaneous Interpretation System and Analysis of Speech Log for User Experience Assessment Akiko Sakamoto, Kazuhiko Abe, Kazuo Sumita and Satoshi Kamatani Knowledge Media Laboratory,

More information

Part I. Figuring out how English works

Part I. Figuring out how English works 9 Part I Figuring out how English works 10 Chapter One Interaction and grammar Grammar focus. Tag questions Introduction. How closely do you pay attention to how English is used around you? For example,

More information

To appear in the Proceedings of the 35th Meetings of the Chicago Linguistics Society. Post-vocalic spirantization: Typology and phonetic motivations

To appear in the Proceedings of the 35th Meetings of the Chicago Linguistics Society. Post-vocalic spirantization: Typology and phonetic motivations Post-vocalic spirantization: Typology and phonetic motivations Alan C-L Yu University of California, Berkeley 0. Introduction Spirantization involves a stop consonant becoming a weak fricative (e.g., B,

More information

Information Session 13 & 19 August 2015

Information Session 13 & 19 August 2015 Information Session 13 & 19 August 2015 Mr Johnie Goh Office of Global Education & Mobility Increase career prospects Immerse in another culture Complement your language studies in NTU Earn AUs during

More information

Fluency is a largely ignored area of study in the years leading up to university entrance

Fluency is a largely ignored area of study in the years leading up to university entrance JALT2009 Conference Proceedings 662 Timed reading: Increasing reading speed and fluency Reference data: Atkins, A. (2010) Timed reading: Increasing reading speed and fluency. In A. M. Stoke (Ed.), JALT2009

More information

Lexical phonology. Marc van Oostendorp. December 6, Until now, we have presented phonological theory as if it is a monolithic

Lexical phonology. Marc van Oostendorp. December 6, Until now, we have presented phonological theory as if it is a monolithic Lexical phonology Marc van Oostendorp December 6, 2005 Background Until now, we have presented phonological theory as if it is a monolithic unit. However, there is evidence that phonology consists of at

More information

The Strong Minimalist Thesis and Bounded Optimality

The Strong Minimalist Thesis and Bounded Optimality The Strong Minimalist Thesis and Bounded Optimality DRAFT-IN-PROGRESS; SEND COMMENTS TO RICKL@UMICH.EDU Richard L. Lewis Department of Psychology University of Michigan 27 March 2010 1 Purpose of this

More information

Acoustic correlates of stress and their use in diagnosing syllable fusion in Tongan. James White & Marc Garellek UCLA

Acoustic correlates of stress and their use in diagnosing syllable fusion in Tongan. James White & Marc Garellek UCLA Acoustic correlates of stress and their use in diagnosing syllable fusion in Tongan James White & Marc Garellek UCLA 1 Introduction Goals: To determine the acoustic correlates of primary and secondary

More information

age, Speech and Hearii

age, Speech and Hearii age, Speech and Hearii 1 Speech Commun cation tion 2 Sensory Comm, ection i 298 RLE Progress Report Number 132 Section 1 Speech Communication Chapter 1 Speech Communication 299 300 RLE Progress Report

More information

Books Effective Literacy Y5-8 Learning Through Talk Y4-8 Switch onto Spelling Spelling Under Scrutiny

Books Effective Literacy Y5-8 Learning Through Talk Y4-8 Switch onto Spelling Spelling Under Scrutiny By the End of Year 8 All Essential words lists 1-7 290 words Commonly Misspelt Words-55 working out more complex, irregular, and/or ambiguous words by using strategies such as inferring the unknown from

More information

UKLO Round Advanced solutions and marking schemes. 6 The long and short of English verbs [15 marks]

UKLO Round Advanced solutions and marking schemes. 6 The long and short of English verbs [15 marks] UKLO Round 1 2013 Advanced solutions and marking schemes [Remember: the marker assigns points which the spreadsheet converts to marks.] [No questions 1-4 at Advanced level.] 5 Bulgarian [15 marks] 12 points:

More information

CEFR Overall Illustrative English Proficiency Scales

CEFR Overall Illustrative English Proficiency Scales CEFR Overall Illustrative English Proficiency s CEFR CEFR OVERALL ORAL PRODUCTION Has a good command of idiomatic expressions and colloquialisms with awareness of connotative levels of meaning. Can convey

More information

Task Types. Duration, Work and Units Prepared by

Task Types. Duration, Work and Units Prepared by Task Types Duration, Work and Units Prepared by 1 Introduction Microsoft Project allows tasks with fixed work, fixed duration, or fixed units. Many people ask questions about changes in these values when

More information

Natural Language Processing. George Konidaris

Natural Language Processing. George Konidaris Natural Language Processing George Konidaris gdk@cs.brown.edu Fall 2017 Natural Language Processing Understanding spoken/written sentences in a natural language. Major area of research in AI. Why? Humans

More information

The NICT/ATR speech synthesis system for the Blizzard Challenge 2008

The NICT/ATR speech synthesis system for the Blizzard Challenge 2008 The NICT/ATR speech synthesis system for the Blizzard Challenge 2008 Ranniery Maia 1,2, Jinfu Ni 1,2, Shinsuke Sakai 1,2, Tomoki Toda 1,3, Keiichi Tokuda 1,4 Tohru Shimizu 1,2, Satoshi Nakamura 1,2 1 National

More information

REVIEW OF CONNECTED SPEECH

REVIEW OF CONNECTED SPEECH Language Learning & Technology http://llt.msu.edu/vol8num1/review2/ January 2004, Volume 8, Number 1 pp. 24-28 REVIEW OF CONNECTED SPEECH Title Connected Speech (North American English), 2000 Platform

More information

5. Margi (Chadic, Nigeria): H, L, R (Williams 1973, Hoffmann 1963)

5. Margi (Chadic, Nigeria): H, L, R (Williams 1973, Hoffmann 1963) 24.961 Tone-1: African Languages 1. Main theme the study of tone in African lgs. raised serious conceptual problems for the representation of the phoneme as a bundle of distinctive features. the solution

More information

SOUND STRUCTURE REPRESENTATION, REPAIR AND WELL-FORMEDNESS: GRAMMAR IN SPOKEN LANGUAGE PRODUCTION. Adam B. Buchwald

SOUND STRUCTURE REPRESENTATION, REPAIR AND WELL-FORMEDNESS: GRAMMAR IN SPOKEN LANGUAGE PRODUCTION. Adam B. Buchwald SOUND STRUCTURE REPRESENTATION, REPAIR AND WELL-FORMEDNESS: GRAMMAR IN SPOKEN LANGUAGE PRODUCTION by Adam B. Buchwald A dissertation submitted to The Johns Hopkins University in conformity with the requirements

More information

Automatic English-Chinese name transliteration for development of multilingual resources

Automatic English-Chinese name transliteration for development of multilingual resources Automatic English-Chinese name transliteration for development of multilingual resources Stephen Wan and Cornelia Maria Verspoor Microsoft Research Institute Macquarie University Sydney NSW 2109, Australia

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Speech Communication Session 2aSC: Linking Perception and Production

More information

Information Retrieval

Information Retrieval Information Retrieval Suan Lee - Information Retrieval - 02 The Term Vocabulary & Postings Lists 1 02 The Term Vocabulary & Postings Lists - Information Retrieval - 02 The Term Vocabulary & Postings Lists

More information

Handout #8. Neutralization

Handout #8. Neutralization Handout #8 Neutralization German obstruents ([-son]) [-cont, -delrel] [+lab, - cor, -back] p, b [-lab, +cor, -back] t, d [-lab, -cor, +back] k, g [-cont, +delrel] pf ts, ts [+cont, +delrel] f, v s, z,

More information

Speech Recognition at ICSI: Broadcast News and beyond

Speech Recognition at ICSI: Broadcast News and beyond Speech Recognition at ICSI: Broadcast News and beyond Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 The DARPA Broadcast News task Aspects of ICSI

More information

CLASSIFICATION OF PROGRAM Critical Elements Analysis 1. High Priority Items Phonemic Awareness Instruction

CLASSIFICATION OF PROGRAM Critical Elements Analysis 1. High Priority Items Phonemic Awareness Instruction CLASSIFICATION OF PROGRAM Critical Elements Analysis 1 Program Name: Macmillan/McGraw Hill Reading 2003 Date of Publication: 2003 Publisher: Macmillan/McGraw Hill Reviewer Code: 1. X The program meets

More information

Underlying Representations

Underlying Representations Underlying Representations The content of underlying representations. A basic issue regarding underlying forms is: what are they made of? We have so far treated them as segments represented as letters.

More information

WiggleWorks Software Manual PDF0049 (PDF) Houghton Mifflin Harcourt Publishing Company

WiggleWorks Software Manual PDF0049 (PDF) Houghton Mifflin Harcourt Publishing Company WiggleWorks Software Manual PDF0049 (PDF) Houghton Mifflin Harcourt Publishing Company Table of Contents Welcome to WiggleWorks... 3 Program Materials... 3 WiggleWorks Teacher Software... 4 Logging In...

More information

A Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many

A Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many Schmidt 1 Eric Schmidt Prof. Suzanne Flynn Linguistic Study of Bilingualism December 13, 2013 A Minimalist Approach to Code-Switching In the field of linguistics, the topic of bilingualism is a broad one.

More information

Opportunities for Writing Title Key Stage 1 Key Stage 2 Narrative

Opportunities for Writing Title Key Stage 1 Key Stage 2 Narrative English Teaching Cycle The English curriculum at Wardley CE Primary is based upon the National Curriculum. Our English is taught through a text based curriculum as we believe this is the best way to develop

More information

Richardson, J., The Next Step in Guided Writing, Ohio Literacy Conference, 2010

Richardson, J., The Next Step in Guided Writing, Ohio Literacy Conference, 2010 1 Procedures and Expectations for Guided Writing Procedures Context: Students write a brief response to the story they read during guided reading. At emergent levels, use dictated sentences that include

More information

Possibility to Prevent Learning Disabilities (LD) in School by Performing Special Developmental Intervention to them in Preschool period

Possibility to Prevent Learning Disabilities (LD) in School by Performing Special Developmental Intervention to them in Preschool period Possibility to Prevent Learning Disabilities (LD) in School by Performing Special Developmental Intervention to them in Preschool period Kiyoshi Amano, Institute of Cultural Science. Chuo University. Tokyo

More information

Noisy Channel Models for Corrupted Chinese Text Restoration and GB-to-Big5 Conversion

Noisy Channel Models for Corrupted Chinese Text Restoration and GB-to-Big5 Conversion Computational Linguistics and Chinese Language Processing vol. 3, no. 2, August 1998, pp. 79-92 79 Computational Linguistics Society of R.O.C. Noisy Channel Models for Corrupted Chinese Text Restoration

More information

Voice conversion through vector quantization

Voice conversion through vector quantization J. Acoust. Soc. Jpn.(E)11, 2 (1990) Voice conversion through vector quantization Masanobu Abe, Satoshi Nakamura, Kiyohiro Shikano, and Hisao Kuwabara A TR Interpreting Telephony Research Laboratories,

More information

On Developing Acoustic Models Using HTK. M.A. Spaans BSc.

On Developing Acoustic Models Using HTK. M.A. Spaans BSc. On Developing Acoustic Models Using HTK M.A. Spaans BSc. On Developing Acoustic Models Using HTK M.A. Spaans BSc. Delft, December 2004 Copyright c 2004 M.A. Spaans BSc. December, 2004. Faculty of Electrical

More information

English for Life. B e g i n n e r. Lessons 1 4 Checklist Getting Started. Student s Book 3 Date. Workbook. MultiROM. Test 1 4

English for Life. B e g i n n e r. Lessons 1 4 Checklist Getting Started. Student s Book 3 Date. Workbook. MultiROM. Test 1 4 Lessons 1 4 Checklist Getting Started Lesson 1 Lesson 2 Lesson 3 Lesson 4 Introducing yourself Numbers 0 10 Names Indefinite articles: a / an this / that Useful expressions Classroom language Imperatives

More information

Unit 9. Teacher Guide. k l m n o p q r s t u v w x y z. Kindergarten Core Knowledge Language Arts New York Edition Skills Strand

Unit 9. Teacher Guide. k l m n o p q r s t u v w x y z. Kindergarten Core Knowledge Language Arts New York Edition Skills Strand q r s Kindergarten Core Knowledge Language Arts New York Edition Skills Strand a b c d Unit 9 x y z a b c d e Teacher Guide a b c d e f g h i j k l m n o p q r s t u v w x y z a b c d e f g h i j k l m

More information

3 Character-based KJ Translation

3 Character-based KJ Translation NICT at WAT 2015 Chenchen Ding, Masao Utiyama, Eiichiro Sumita Multilingual Translation Laboratory National Institute of Information and Communications Technology 3-5 Hikaridai, Seikacho, Sorakugun, Kyoto,

More information

Correspondence between the DRDP (2015) and the California Preschool Learning Foundations. Foundations (PLF) in Language and Literacy

Correspondence between the DRDP (2015) and the California Preschool Learning Foundations. Foundations (PLF) in Language and Literacy 1 Desired Results Developmental Profile (2015) [DRDP (2015)] Correspondence to California Foundations: Language and Development (LLD) and the Foundations (PLF) The Language and Development (LLD) domain

More information

Using computational modeling in language acquisition research

Using computational modeling in language acquisition research Chapter 8 Using computational modeling in language acquisition research Lisa Pearl 1. Introduction Language acquisition research is often concerned with questions of what, when, and how what children know,

More information

HISTORY COURSE WORK GUIDE 1. LECTURES, TUTORIALS AND ASSESSMENT 2. GRADES/MARKS SCHEDULE

HISTORY COURSE WORK GUIDE 1. LECTURES, TUTORIALS AND ASSESSMENT 2. GRADES/MARKS SCHEDULE HISTORY COURSE WORK GUIDE 1. LECTURES, TUTORIALS AND ASSESSMENT Lectures and Tutorials Students studying History learn by reading, listening, thinking, discussing and writing. Undergraduate courses normally

More information

RANKING AND UNRANKING LEFT SZILARD LANGUAGES. Erkki Mäkinen DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A ER E P S I M S

RANKING AND UNRANKING LEFT SZILARD LANGUAGES. Erkki Mäkinen DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A ER E P S I M S N S ER E P S I M TA S UN A I S I T VER RANKING AND UNRANKING LEFT SZILARD LANGUAGES Erkki Mäkinen DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A-1997-2 UNIVERSITY OF TAMPERE DEPARTMENT OF

More information

Linguistics 220 Phonology: distributions and the concept of the phoneme. John Alderete, Simon Fraser University

Linguistics 220 Phonology: distributions and the concept of the phoneme. John Alderete, Simon Fraser University Linguistics 220 Phonology: distributions and the concept of the phoneme John Alderete, Simon Fraser University Foundations in phonology Outline 1. Intuitions about phonological structure 2. Contrastive

More information

Intra-talker Variation: Audience Design Factors Affecting Lexical Selections

Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Tyler Perrachione LING 451-0 Proseminar in Sound Structure Prof. A. Bradlow 17 March 2006 Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Abstract Although the acoustic and

More information

1 st Quarter (September, October, November) August/September Strand Topic Standard Notes Reading for Literature

1 st Quarter (September, October, November) August/September Strand Topic Standard Notes Reading for Literature 1 st Grade Curriculum Map Common Core Standards Language Arts 2013 2014 1 st Quarter (September, October, November) August/September Strand Topic Standard Notes Reading for Literature Key Ideas and Details

More information

DegreeWorks Advisor Reference Guide

DegreeWorks Advisor Reference Guide DegreeWorks Advisor Reference Guide Table of Contents 1. DegreeWorks Basics... 2 Overview... 2 Application Features... 3 Getting Started... 4 DegreeWorks Basics FAQs... 10 2. What-If Audits... 12 Overview...

More information

TEKS Comments Louisiana GLE

TEKS Comments Louisiana GLE Side-by-Side Comparison of the Texas Educational Knowledge Skills (TEKS) Louisiana Grade Level Expectations (GLEs) ENGLISH LANGUAGE ARTS: Kindergarten TEKS Comments Louisiana GLE (K.1) Listening/Speaking/Purposes.

More information

ADDIS ABABA UNIVERSITY SCHOOL OF GRADUATE STUDIES MODELING IMPROVED AMHARIC SYLLBIFICATION ALGORITHM

ADDIS ABABA UNIVERSITY SCHOOL OF GRADUATE STUDIES MODELING IMPROVED AMHARIC SYLLBIFICATION ALGORITHM ADDIS ABABA UNIVERSITY SCHOOL OF GRADUATE STUDIES MODELING IMPROVED AMHARIC SYLLBIFICATION ALGORITHM BY NIRAYO HAILU GEBREEGZIABHER A THESIS SUBMITED TO THE SCHOOL OF GRADUATE STUDIES OF ADDIS ABABA UNIVERSITY

More information

Listening and Speaking Skills of English Language of Adolescents of Government and Private Schools

Listening and Speaking Skills of English Language of Adolescents of Government and Private Schools Listening and Speaking Skills of English Language of Adolescents of Government and Private Schools Dr. Amardeep Kaur Professor, Babe Ke College of Education, Mudki, Ferozepur, Punjab Abstract The present

More information