Speaker Independent Voice Recognition Calculator


Contemporary Engineering Sciences, Vol. 5, 2012, no. 3, 119-125

Yahya S. H. Khraisat
AL-Balqa' Applied University, Al-Huson University College, Electrical and Electronics Department, P.O. Box 50, Al-Huson 21510, Jordan
yahya@huson.edu.jo

Abstract

The voice activated calculator is a speaker-independent system that performs basic mathematical operations. It recognizes the isolated spoken digits from 0 to 9 and other words such as plus, minus, times, equal and clear. It then performs the corresponding arithmetic operation and displays the final answer on an LCD display.

Keywords: PIC Microcontroller, Voice Recognition Calculator, LCD display, Speech detection and Recognition

1. Introduction

Speaker-independent voice recognition systems are very likely to become a necessity in the workplace. Such systems could improve productivity and would be more convenient to use. The idea of hardware that can recognize any person's voice without the training time required by currently employed systems is a very promising one, and possibly a marketable one too. At its most basic level, speech recognition lets the user perform tasks in parallel (i.e. while hands and eyes are busy elsewhere) and still continue to work with the computer or appliance. Such hardware could also assist people with hand disabilities. The block diagram of this real-time system is shown below in Figure 1.

[Figure 1: Block diagram of the voice activated calculator. Blocks: Microphone (input); HM2007 speech detection and recognition; memory (static RAM, 8K x 8); calculation (PIC microcontroller); LCD display (output).]

1.1. Specifications

1.1.1. Speaker Dependent / Speaker Independent

Speech recognition is classified into two categories: speaker dependent and speaker independent.

Speaker dependent systems are trained by the individual who will be using the system. These systems can achieve a high command count and better than 95% accuracy for word recognition. The drawback of this approach is that the system responds accurately only to the individual who trained it. This is the most common approach employed in software for personal computers.

A speaker independent system is trained to respond to a word regardless of who speaks it. The system must therefore respond to a large variety of speech patterns, inflections and enunciations of the target word. The command word count is usually lower than for speaker dependent systems; however, high accuracy can still be maintained within processing limits. Industrial applications more often require speaker independent voice systems, such as the AT&T system used in the telephone systems [1].

1.1.2. Recognition Style

Speech recognition systems have another constraint concerning the style of speech they can recognize. There are three styles of speech: isolated, connected and continuous.

Isolated speech recognition systems can handle only words that are spoken separately. These are the most common speech recognition systems available today. The user must pause between each word or command spoken. The speech recognition circuit is set up to identify isolated words 0.96 seconds in length.

Connected speech recognition is a halfway point between isolated word and continuous speech recognition. It allows users to speak multiple words. The HM2007 can be set up to identify words or phrases 1.92 seconds in length. This reduces the word recognition vocabulary to 20 words.

Continuous speech is the natural conversational speech we are used to in everyday life. It is extremely difficult for a recognizer to sift through such speech, as the words tend to merge together. For instance, "Hi, how are you doing?" sounds like "Hi, howyadoin". Continuous speech recognition systems are on the market and are under continual development [2].

2. Testing and Validation

2.1 Training the Speech Recognition Circuit

The keypad and digital display are used to communicate with and program the HM2007 chip. The keypad is made up of 12 normally open momentary contact switches. When the circuit is turned on, 00 is shown on the digital display, the red LED (READY) is lit and the circuit waits for a command.

[Figure 2: The speech recognition system]

The complete schematic of the speech recognition circuit is shown below in Figure 3.

[Figure 3: The complete schematic of the speech recognition circuit]

The complete schematic of the interfacing design is shown below in Figure 4.

[Figure 4: The complete schematic of the interfacing design]

2.2 Training Words for Recognition

Press 1 on the keypad (the display will show 01 and the LED will turn off), then press the TRAIN key (the LED will turn on) to place the circuit in training mode for word one. Say the target word clearly into the onboard microphone (near the LED). The circuit signals acceptance of the voice input by blinking the LED off and then on. The word (or utterance) is now identified as word 01. If the LED did not flash, start over by pressing 1 and then the TRAIN key.

You may continue training new words in the circuit. Press 2 then TRAIN to train the second word, and so on. The circuit will accept and recognize up to 20 words (numbers 1 through 20). It is not necessary to train all word spaces. If you only require 10 target words, that is all you need to train.

2.3 Testing Recognition

Repeat a trained word into the microphone. The number of the word should be displayed on the digital display. For instance, if the word "directory" was trained as word number 20, saying "directory" into the microphone will cause the number 20 to be displayed.

2.4 Error Codes

The chip provides the following error codes:

55 = word too long
66 = word too short
77 = no match

When an external circuit is interfaced through the chip's data bus, the decoding circuit must distinguish word numbers from error codes. The circuit must therefore be designed to recognize error codes 55, 66 and 77 and not confuse them with word spaces 5, 6 and 7.

2.5 Clearing Memory

To erase all words in memory, press 99 and then CLR. The numbers will quickly scroll by on the digital display as the memory is erased.

2.6 Changing & Erasing Words

Trained words can easily be changed by overwriting the original word. For instance, suppose word six was the word "Capital" and you want to change it to the word "State". Simply retrain the word space by pressing 6, then the TRAIN key, and saying the word "State" into the microphone. To erase the word without replacing it with another word, press the word number (in this case 6) and then press the CLR key.
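The requirement that error codes 55, 66 and 77 must not be confused with word spaces 5, 6 and 7 can be illustrated in software. The following is a minimal sketch in C, not the code used in this design. It assumes that the microcontroller reads one byte from the recognizer's data bus and that the byte holds two packed BCD digits (tens digit in the high nibble), matching the two-digit display. The names hm_status_t and hm_decode are hypothetical.

```c
#include <stdint.h>

/* Outcome of decoding one byte read from the recognizer's data bus. */
typedef enum {
    HM_WORD,          /* a trained word number, 1..20 in this circuit */
    HM_ERR_TOO_LONG,  /* display code 55 */
    HM_ERR_TOO_SHORT, /* display code 66 */
    HM_ERR_NO_MATCH,  /* display code 77 */
    HM_INVALID        /* anything else, e.g. 00 or a non-BCD nibble */
} hm_status_t;

/*
 * ASSUMPTION: 'raw' carries two packed BCD digits, tens digit in the
 * high nibble, so the display value 55 arrives as 0x55 and word 5 as 0x05.
 * On HM_WORD the word number is stored in *word; otherwise *word is
 * left unchanged.
 */
static hm_status_t hm_decode(uint8_t raw, uint8_t *word)
{
    uint8_t tens = (uint8_t)(raw >> 4);
    uint8_t ones = (uint8_t)(raw & 0x0F);

    /* Reject nibbles that are not decimal digits. */
    if (tens > 9 || ones > 9) {
        return HM_INVALID;
    }

    uint8_t value = (uint8_t)(tens * 10 + ones);

    /* Check the error codes on the full two-digit value, so that
       55/66/77 are never mistaken for word spaces 5/6/7. */
    switch (value) {
    case 55: return HM_ERR_TOO_LONG;
    case 66: return HM_ERR_TOO_SHORT;
    case 77: return HM_ERR_NO_MATCH;
    default: break;
    }

    /* The circuit accepts word numbers 1 through 20. */
    if (value >= 1 && value <= 20) {
        *word = value;
        return HM_WORD;
    }
    return HM_INVALID;
}
```

With this sketch, a byte of 0x05 decodes to word 5, while 0x55 decodes to the "word too long" error. Comparing both digits together, rather than only the low digit, is what keeps the two cases apart.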
In our system we used the following word assignments:

01 one
02 two
03 three
04 four
05 five
06 six
07 seven
08 eight
09 nine
10 plus
11 minus
12 multiply
13 divide

3. Conclusion

At the end of the design, the system performed most of its functions, although it did not work as a whole. The voice-activated calculator was able to recognize spoken words in natural environments and to run the recognition code. The resulting output was displayed on the LCD in real time, and the calculations performed by the PIC microcontroller in real time were also working properly. The time taken for the whole process was very acceptable (about 1 second).

However, the major obstacle to successful implementation of the recognition system was the HM2007 itself. Due to the memory restrictions, the size of the training codebook could not be made larger. Only 10 numbers per classification word were used in this system. A larger, more controlled training set could result in higher recognition accuracy.

4. Future Work

For future development of this system, we can add more SRAM to the speech recognition circuit so that it can recognize operations on numbers of more than one digit. We can also implement more operations, such as log, ln, square root and many more, by modifying the code of the PIC controller. We can also implement analog voice output instead of the LCD so that the calculator can be used by blind people too.

While the voice activated calculator did not work as expected in the demonstration, enough testing has been done to show that the system is still a feasible one. It is thus imperative to understand from this the importance of making the right engineering decisions for this design.

References

[1] http://mi.eng.cam.ac.uk/comp.speech/section6/q6.1.html
[2] Development of Isolated Word Speech Recognition System, Antanas LIPEIKA, Joana LIPEIKIENĖ, Laimutis TELKSNYS
[3] http://en.wikipedia.org/wiki/speech_recognition
[4] http://itp.nyu.edu/physcomp/sensors/reports/hm2007voicerecognitionic
[5] http://www.pcguide.com/ref/ram/typessram-c.html
[6] http://www.datasheet4u.com/html/7/4/l/74ls373_motorolainc.pdf.html
[7] http://www.doctronics.co.uk/4511.htm
[8] http://www.datasheetcatalog.org/datasheet/motorola/74ls48.pdf
[9] http://www.embedds.com/matrix-keypad-interfacing-with-microcontrollers/
[10] http://www.eidusa.com/electronics_voltage_regulator.htm
[11] http://en.wikipedia.org/wiki/crystal_oscillator
[12] http://www.massmind.org/images/www/hobby_elec/e_pic1.htm
[13] http://www.futurlec.com/pic16f877_controller.shtml

Received: November, 2011