Speech Synthesis Using Android


ISSN 2278-0211 (Online)

Shailesh S. Sangle
Assistant Professor, Department of Information Technology
MCT's Rajiv Gandhi Institute of Technology, Mumbai, India

Nilesh M. Patil
Assistant Professor, Department of Information Technology
MCT's Rajiv Gandhi Institute of Technology, Mumbai, India

Abstract: Speech synthesis is one of the leading application areas of natural language processing (NLP). Also known as text-to-speech (TTS), it is the capability of a device to speak text in different languages. A TTS application acts as an interface between two representations of information, text and speech, so that two parties can communicate effectively. Our main objective is to build a speech synthesis application for Android-based mobile phones. We developed the application in the Android environment, using the voice conversion libraries that Android provides. The resulting application is user friendly and reliable, and supports effective communication.

Keywords: NLP, TTS, Android, OS, SLR

1. Introduction

Speech synthesis is one of the major applications of NLP. We have developed a speech synthesis application for the Android operating system. Android is an open-source OS developed by Google and is widely used on embedded and mobile platforms, including mobile phones and tablets. Our work has three aspects. The first is converting English text to English speech. The second is converting regional-language text to regional-language speech. The third, and most important, is integrating the system into the Android environment. Android is the most common and popular platform on mobile devices, so the application can be deployed on a mobile phone for effective communication.

2. Text to Speech Conversion

Our system consists of a preprocessor, text analyzer, morphological analyzer, contextual analyzer, syntactic-prosodic parser, letter-to-sound module and prosody generator. The preprocessor checks the syntax of each sentence and splits it into a list of individual words. The text analyzer identifies numbers, abbreviations and idioms and expands them into full text where required. The morphological analyzer proposes all possible part-of-speech categories for each word taken individually, on the basis of its spelling. Inflected, derived and compound words are decomposed into their elementary graphemic units by simple regular grammars that exploit lexicons of stems and affixes. The contextual analyzer considers words in their context, which allows it to reduce the list of possible part-of-speech categories for each word, given the possible parts of speech of neighboring words. Finally, the syntactic-prosodic parser examines the remaining search space and finds the text structure that most closely relates to its expected prosodic realization. In this application we used an algorithmic approach to perform the TTS conversion.

Speech synthesis is the artificial production of human speech. It converts normal language text into speech. A TTS engine converts written text to a phonemic representation and then converts the phonemic representation to waveforms that can be output as sound.

A TTS engine is composed of a front end and a back end. The front end first preprocesses the input, converting raw text that contains symbols such as numbers and abbreviations into the equivalent written-out words. This process is also called normalization or tokenization. After the input text has been split into individual words, each word is classified. The front end then assigns a phonetic transcription to each word, and divides and marks the text into prosodic units such as phrases, clauses and sentences. Assigning phonetic transcriptions to words is known as text-to-phoneme conversion.
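The normalization step of the front end can be illustrated with a short Python sketch. This is not the authors' implementation: the abbreviation table, the function names `number_to_words` and `normalize`, and the choice to read numbers above 999 digit by digit are all assumptions made for illustration.

```python
# Minimal sketch of front-end text normalization (illustrative only).
# ABBREVIATIONS is a hypothetical table; a real system would use a large lexicon.
ABBREVIATIONS = {"dr.": "doctor", "mr.": "mister", "etc.": "et cetera"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    hundreds, rest = divmod(n, 100)
    tail = "" if rest == 0 else " " + number_to_words(rest)
    return ONES[hundreds] + " hundred" + tail


def normalize(text: str) -> list[str]:
    """Split raw text into lowercase words, expanding abbreviations and numbers."""
    words = []
    for token in text.lower().split():
        # Abbreviations are matched before punctuation is stripped,
        # because the trailing period is part of the abbreviation.
        if token in ABBREVIATIONS:
            words.extend(ABBREVIATIONS[token].split())
            continue
        token = token.strip(".,;:!?")
        if token.isascii() and token.isdigit():
            if int(token) <= 999:
                words.extend(number_to_words(int(token)).split())
            else:
                # Assumption: larger numbers are simply read digit by digit.
                words.extend(ONES[int(d)] for d in token)
        elif token:
            words.append(token)
    return words
```

For example, `normalize("Dr. Rao has 42 books.")` yields the word list `doctor, rao, has, forty, two, books`, which is the form a text-to-phoneme stage would consume.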
Once the phonetic equivalent is obtained, the next step is to look it up in the library to identify the voice representation of that specific word. Phonetic transcriptions and prosody information together make up the symbolic linguistic representation (SLR) that is output by the front end. At the final stage, the library is used to produce the person-specific voice: the back end converts the SLR into sound.

INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH & DEVELOPMENT Page 352

3. Related Work

Er. Sheilly Padda and Er. Nidhi discuss text-to-speech conversion for the Punjabi (Gurmukhi) language, along with various issues encountered when converting text to speech [1]. Eyob B. Kaise proposes algorithms and methods that address critical issues in developing a general Amharic text-to-speech synthesizer [2]. Aidan Kehoe proposes a number of guidelines to assist in the creation and testing of help material that may be presented to users via speech synthesis engines [3]. Erik Blankinship describes handicapped-accessible text-to-speech markup software developed for poetry and performance [4].

4. Proposed Work

The following steps were performed to develop the application. To obtain natural quality in the synthetic speech, we adopted concatenative speech synthesis techniques. Phonemes of the English language were used as the basic units, and a speech database for English was built from these phonemes. The input text was separated into English phonemes, each phoneme was looked up in the database, and the corresponding phoneme sounds were concatenated to generate the synthesized output speech.

We developed this application to provide an efficient language translator on mobile phones, giving users of hand-held devices the advantage of instantaneous and non-mediated translation from one human language to another. Two-way communication is possible between users with minimal time lag. The application converts English text to English speech, and converts Hindi text first to an English form and then to Hindi speech. However, a person can understand a sentence only if it is pronounced correctly, and gaps in pronunciation remain in mobile computing.
This application therefore provides a better pronunciation mechanism that users can understand. Current speech recognition APIs are only capable of recognizing a single word. This application will enhance speech recognition so that it recognizes sentences. The next feature is homophone detection. A homophone is a word that is pronounced the same as another word but differs in meaning (for example, to, too and two). The speech recognition engine will be able to distinguish such words according to the sentence in which they occur.

5. Algorithm

5.1. English Text to English Speech

The speech synthesis application was developed on Android 4.0.4. The following procedure, shown in flowchart A, was carried out to convert English text to English speech. First, we took text in the English language as input. Using a lexical analyzer, we split the text into individual words. We then searched the library for the equivalent phonetics of each word and arranged these phonetic units in the order of the English text. Finally, the corresponding phoneme sounds were concatenated to generate the synthesized output speech.
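The procedure in Section 5.1 can be sketched in a few lines of Python. The paper does not publish its lexicon, phoneme inventory or audio database, so everything named here is a hypothetical stand-in: `LEXICON` is a toy pronunciation table, `UNIT_DB` holds short sine tones in place of recorded phoneme sounds, and the 8 kHz sample rate is an assumption.

```python
import math

SAMPLE_RATE = 8000   # assumed sample rate in Hz
UNIT_SAMPLES = 400   # each placeholder unit lasts 50 ms at 8 kHz

# Hypothetical pronunciation lexicon: word -> phoneme symbols.
LEXICON = {
    "speak": ["S", "P", "IY", "K"],
    "to": ["T", "UW"],
    "me": ["M", "IY"],
}


def placeholder_unit(index: int) -> list[float]:
    """Stand-in for a recorded phoneme: a sine tone whose pitch depends on index."""
    freq = 200.0 + 40.0 * index
    return [math.sin(2.0 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(UNIT_SAMPLES)]


# Placeholder speech database: one waveform per phoneme in the lexicon.
PHONEMES = sorted({p for phones in LEXICON.values() for p in phones})
UNIT_DB = {p: placeholder_unit(i) for i, p in enumerate(PHONEMES)}


def text_to_phonemes(text: str) -> list[str]:
    """Split text into words and look up each word's phonemes, in text order."""
    phonemes = []
    for word in text.lower().split():
        word = word.strip(".,;:!?")
        if not word:
            continue
        if word not in LEXICON:
            raise KeyError(f"no pronunciation for {word!r}")
        phonemes.extend(LEXICON[word])
    return phonemes


def synthesize(text: str) -> list[float]:
    """Concatenate the stored unit of every phoneme into one sample sequence."""
    samples = []
    for phoneme in text_to_phonemes(text):
        samples.extend(UNIT_DB[phoneme])
    return samples
```

Calling `synthesize("Speak to me")` looks up eight phonemes and returns their units joined end to end. A practical concatenative synthesizer would also smooth the joins between units and apply prosody, both of which this sketch omits.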

5.2. Hindi Text to Hindi Speech

The speech synthesis application was developed on Android 4.0.4. The following procedure, shown in flowchart B, was carried out to convert Hindi text to Hindi speech. First, we took text in the Hindi language as input. Using a lexical analyzer, we split the text into individual words. We then mapped these tokens into the English language and, using the lexical analyzer again, split the mapped text into individual words. We searched the library for the equivalent phonetics of each word and arranged these phonetic units in the order of the English text. Finally, the corresponding phoneme sounds were concatenated to generate the synthesized output speech.

6. Results

Figure (a): The text box on the Android phone where text in the English language is given as input.

Figure (b): The screen on the Android phone after the "Speak to me" button is clicked; the audio output is given in English for the text in the text box.

Figure (c): The text box on the Android phone where text in the Hindi language is given as input.

Figure (d): The screen on the Android phone after the "Speak to me" button is clicked; the audio output is given in Hindi for the text in the text box.

7. Conclusion and Future Scope

We have developed a speech synthesis application in the Android environment. The application is user friendly and reliable, and supports effective communication. The system can help many people in their busy lives, especially people with low vision or reading disabilities, by letting them listen to their emails while relaxing, listen to ebooks, or study for exams by listening to notes. The work covers the English and Hindi languages. It can be extended to other regional languages such as Tamil and Gujarati. A specific person's voice could also be integrated with the system.

8. References

1. Er. Sheilly Padda, Er. Nidhi; A Step towards Making an Effective Text to Speech Conversion System, International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, Vol. 2, Issue 2, Mar-Apr 2012, pp. 1242-1244.
2. Eyob B. Kaise; Concatenative Speech Synthesis for Amharic using Unit Selection Method, MEDES '12, October 25-31, 2012, Addis Ababa, Ethiopia.
3. Aidan Kehoe; Designing Help Topics for Use with Text to Speech, SIGDOC '06, October 18-20, 2006, Myrtle Beach, South Carolina, USA.
4. Erik Blankinship; Tools for Expressive Text to Speech Markup, UIST '01, November 11-14, 2001, Orlando, FL, USA.