CHAPTER 1 INTRODUCTION

Morgan Maxwell

1. INTRODUCTION

Speech is a multifaceted system whose study draws on several disciplines, and the scientific study of speech is a challenging task. In the form of an acoustic waveform, speech carries the linguistic message from the speaker to the hearer. Human beings depend on speech as well as text for their day-to-day communication. Speech is a complex process in which the values of the speech parameters fluctuate continually. Because speech is dynamic, even a single speaker cannot utter the same word twice in exactly the same manner. It is therefore difficult to analyze the speech parameters and obtain static values for a theoretical explanation of speech. The scientific study of speech consequently demands close attention and subject knowledge from different disciplines such as acoustics, psychology, statistics and computer science.

Speech conveys the speaker's intended message, the speaker's attitude to life and the exact content of the subject under discussion. Moreover, speech is considered the primary mode of human communication. Remarkably, there is no organ dedicated solely to the production of speech. The organs involved, such as the lungs, larynx, tongue, nose, lips and teeth, have the primary function of supporting life through breathing, tasting and eating. With regard to human speech production and perception, one may ask why speech is such a highly complicated process from a biological perspective. Even a single speaker's utterances vary from one occasion to another because of factors such as state of mind and emotion. Another important question is how we perceive or recognize speech units in the multifaceted system of speech. The scientific explanation is that pressure fluctuations in the surrounding air set the eardrum vibrating.
The middle ear transmits this vibrational energy to the inner ear, where the cochlea converts the mechanical vibration into nerve signals. Areas in the left or right hemisphere of the brain, such as Broca's area, Wernicke's area and the arcuate fasciculus, then process these nerve signals to recognize speech. Given this complex biological process, a scientist can model the biological activities of speech in a knowledge base only to a limited extent.

Intonation, one of the important suprasegmental features, is among the more complex properties of speech, and its theoretical and scientific study has gained immense popularity in the technological field. In order to model intonation patterns for speech applications, one should bear in mind that intonation, like other speech properties, is a complex and dynamic process. Intonation is defined as the speech melody that arises in a speaker's speech from pitch fluctuation. Pitch fluctuation is closely related to variation in the fundamental frequency (f0). For this reason, intonation is also defined as the temporal change of the fundamental frequency (f0). It arises from the frequency of vibration of the vocal cords, and this vibration can be measured as an f0 contour in the time domain. Some linguists view f0 contours or pitch levels as the vital information for determining the basic properties of intonation in speech. Pitch is perceived by the speaker and the hearer at different levels in order to comprehend the message, and it depends basically on the rate of vibration of the vocal cords within the larynx. Pitch is a perceptual unit, whereas fundamental frequency (f0) is an acoustic unit. In pitch perception, a listener can judge whether an utterance is high or low and can also judge the voice quality of the speech. Intonation has been described as the fluctuation of voice pitch as applied to the whole sentence. Languages generally use pitch variation to express discourse meaning and emotional or attitudinal meaning.

Some researchers argue that there is a clear-cut distinction between prosody and intonation, while others hold that the distinction is minute. Prosody refers to the suprasegmental features (i.e. those above the segmental level) or the rhythmic pattern of language, and it also covers the intonational aspect of language. In addition to intonation, prosodic analyses usually handle the temporal (duration) and air pressure (intensity) parameters.
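The idea of measuring vocal cord vibration as an f0 contour in the time domain can be illustrated with a minimal sketch. It uses plain autocorrelation peak picking, which is a swapped-in textbook technique and not necessarily the algorithm used by the tools in this study. The function names, the 75 to 500 Hz search range, the 40 ms frame and the 10 ms hop are all assumptions made for illustration, and the input is a synthetic tone rather than recorded speech.

```python
import math

def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate the f0 (Hz) of one frame from its autocorrelation peak.

    Hypothetical helper: the search range is an assumed default.
    Returns 0.0 when no positive autocorrelation peak is found.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]                    # remove DC offset
    lag_min = int(sample_rate / f0_max)              # shortest period searched
    lag_max = min(int(sample_rate / f0_min), n - 1)  # longest period searched
    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

def f0_contour(signal, sample_rate, frame_ms=40.0, hop_ms=10.0):
    """Return (time in seconds, f0 in Hz) pairs, one per analysis frame."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    contour = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        contour.append((start / sample_rate, estimate_f0(frame, sample_rate)))
    return contour

# Synthetic 200 Hz tone standing in for a voiced stretch of speech (assumption).
sample_rate = 16000
tone = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(1600)]
contour = f0_contour(tone, sample_rate)
```

Real speech would additionally need a voicing decision and protection against octave errors, which this sketch leaves out.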
Speech parameters such as duration and intensity are likewise considered essential acoustic features with a significant impact on the intonation of an utterance. Ladd (1996) defined prosody as the suprasegmental features that convey sentence-level pragmatic meaning. Sentences usually convey lexical meaning, but when combined with prosodic features they give additional meaning to the utterance. Crystal (1975) says that prosodic features are the meaningful contrastive units in speech and that they arise from pitch, loudness and duration. In human communication, prosodic features play a key role in speech production and perception, and they make the system of communication productive and accurate. Since speech is considered the primary mode of human communication, knowledge of prosodic features is essential for studying human communication, and speech in particular, in a scientific way.

Intonation study falls into two broad categories. One is the qualitative, perceptual or psychological level of study, and the other is the quantitative level of study. Perceptual analysis of intonation was the first approach to this research problem, and the study later turned to the quantitative or acoustic level. The acoustic analysis of intonation became popular because of its scientific and quantitative nature and results. To study the basic and advanced features of human speech and the human communication system, one should know the prosodic features of speech, especially intonation, thoroughly. In conclusion, since speech technology is an emerging technological field in the current era, intonation modeling and its scientific study occupy a significant research space in modern Natural Language Processing technology.

AIM OF THE STUDY

The aim of this study is to explore Malayalam intonation with the help of speech software and to find the intonation patterns of different sentence types in Malayalam, i.e. interrogative sentences, declarative sentences etc.

OBJECTIVE OF THE STUDY

The objective of the present research is to develop formal rules for Malayalam intonational phonology and phonetics. The phonetic features of Malayalam intonation are used to frame the rules of Malayalam intonational phonology. One important goal of this research is the statistical analysis of speech parameters such as fundamental frequency (f0) and duration at the syllable level.
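A syllable-level statistical analysis of this kind might begin with per-speaker means and standard deviations of f0 and duration. The sketch below is only an illustration of that first step: the speaker labels, syllable labels and all numeric values are invented placeholders and not data from this study, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

# Invented placeholder rows: (speaker, syllable label, f0 in Hz, duration in ms).
rows = [
    ("F1", "syl1", 220.0, 150.0),
    ("F1", "syl2", 240.0, 180.0),
    ("F1", "syl3", 200.0, 210.0),
    ("M1", "syl1", 110.0, 160.0),
    ("M1", "syl2", 130.0, 170.0),
    ("M1", "syl3", 120.0, 180.0),
]

def summarize_by_speaker(rows):
    """Return the mean and sample standard deviation of f0 and duration per speaker."""
    grouped = defaultdict(list)
    for speaker, _syllable, f0_hz, duration_ms in rows:
        grouped[speaker].append((f0_hz, duration_ms))
    summary = {}
    for speaker, values in grouped.items():
        f0s = [v[0] for v in values]
        durations = [v[1] for v in values]
        summary[speaker] = {
            "mean_f0_hz": mean(f0s),
            "sd_f0_hz": stdev(f0s),            # needs at least two syllables
            "mean_duration_ms": mean(durations),
            "sd_duration_ms": stdev(durations),
        }
    return summary

summary = summarize_by_speaker(rows)
for speaker, stats in summary.items():
    print(speaker, stats)
```

Grouping by speaker keeps the comparison across speakers explicit. Grouping by sentence type or syllable position instead would follow the same pattern.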
In short, this study will provide a formal model of Malayalam intonation, and the model and the experimental results can be useful for text-to-speech systems, automatic speech recognition and other speech technology applications.

HYPOTHESES OF THE STUDY

The present study focuses on the following hypotheses. In sentence-level analysis of intonation, pitch level and pitch terminal, together commonly called the pitch contour, will form different intonation patterns for different types of Malayalam sentences, such as declarative sentences, yes-or-no interrogative sentences, question-word interrogative sentences, imperative sentences and debitive sentences (see endnote 1). It will nevertheless be possible to find a uniform intonation pattern for all declarative sentences, and this will apply to the other sentence types mentioned above as well. The intonational phonology of Malayalam, with reference to Malayalam syllable rules and structure, will frame the intonation patterns, and this model is more appropriate for creating fundamental frequency (f0) contours. The analysis distinguishes the meaning-bearing unit, i.e. the semantics, of one type of sentence from that of another by considering their different intonation patterns. In the statistical analysis of speech parameters, the f0 values and pitch contours of different speakers show considerable similarity. There will be both similarity and dissimilarity in pitch variation and durational variation, but similarity will outweigh dissimilarity across speakers.

SCOPE OF THE STUDY

Although intonation study has a significant role in theoretical linguistics, this research will also be useful to technological disciplines such as computational linguistics, forensic linguistics and speech science. In computational linguistics, the phonetic modeling of intonation for speech synthesis and speech recognition is one of the challenging tasks in overcoming the robotic quality of machine speech and achieving natural speech. The implementation of prosody (mainly intonation and duration) in text-to-speech systems has recently opened up different research areas in intonation. To achieve intelligibility and naturalness in speech synthesis systems, artificial intelligence approaches such as neural networks can be used to model the intonation pattern. Although all these approaches have limitations in making speech natural, some have succeeded to some extent in computing the perceptual quality of intonation.
A new approach incorporates prosodic knowledge into feature extraction for speech recognition, so intonation study has recently gained immense popularity in speech recognition research. Forensic linguistics, an emerging research field, exploits knowledge of intonation for speaker identification. Intonation is among the vital information for determining the intrinsic properties of a speaker's speech. The study of intonation and its application to speech has also been contributing to research in speech and hearing science. Most speech technology applications, such as automatic speech recognition, speaker recognition and text-to-speech systems, are designed using various statistical approaches. The analytical and statistical study of intonation parameters, especially f0 and duration, will therefore be useful for modeling speech application systems designed on statistical knowledge. Moreover, intonation research is an ever-growing subject that explores speech in all directions, and this research also tries to tackle problems in speech technology.

METHODOLOGY

The research methodology is primarily concerned with the collection and analysis of speech data for intonation study. Various sentence types, such as interrogative and declarative sentences, as well as emotive sentences, are selected for recording. To achieve high audio quality, these sentences were recorded at a sampling frequency of 48 kHz and downsampled to 22 kHz. The speech samples were then quantized at 16 bits per sample. Seven female and seven male speakers were selected to record the different types of sentences, and a test-battery paragraph on a news subject is also included in the speech data. The instrumental analysis is done on a computer with the help of speech software. Different software tools are used for the analysis, as follows. Cool Edit is used to record the speech data and also for noise reduction and downsampling. An important consideration before recording was that the speech data should contain most of the prosodic information.
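The downsampling and 16-bit quantization steps can be sketched as follows. This is a deliberately naive illustration and not the procedure Cool Edit applies: it resamples by linear interpolation without the low-pass (anti-aliasing) filter that audio editors normally use. The target rate of 22,050 Hz is an assumption, since the text states only "22 kHz", and both function names are hypothetical.

```python
import math

def resample_linear(samples, src_rate, dst_rate):
    """Naive resampling by linear interpolation (no anti-aliasing filter)."""
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for k in range(n_out):
        pos = k * src_rate / dst_rate       # fractional index into the source
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out

def quantize_16bit(samples):
    """Map floats in [-1.0, 1.0] to signed 16-bit integers, clipping overshoots."""
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))
        out.append(int(round(s * 32767)))
    return out

# One second of a synthetic 150 Hz tone at 48 kHz (stand-in for a recording).
SRC_RATE = 48000
DST_RATE = 22050  # assumed value for the "22 kHz" mentioned in the text
recording = [0.5 * math.sin(2 * math.pi * 150.0 * n / SRC_RATE)
             for n in range(SRC_RATE)]

downsampled = resample_linear(recording, SRC_RATE, DST_RATE)
pcm = quantize_16bit(downsampled)
print(len(recording), len(downsampled), min(pcm), max(pcm))
```

At 16 bits (2 bytes) per sample, one second of single-channel audio at the assumed 22,050 Hz rate occupies 44,100 bytes before any file header.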
The pitch information can be extracted from the speech waveform with the help of the Speech Analyzer software. It determines the intonation contours from which the intonation patterns of the various sentence types are formed. Emotional sentences also yield intonation contours for emotions, but it is somewhat difficult to map the intonation contours to different emotional states. In short, the analysis and experimentation will produce a formal model of intonation that suits various applications.
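One simple way to turn an extracted pitch contour into a pattern label is to look at the direction of the final pitch movement. The sketch below is a hypothetical illustration and not the rule set developed in this thesis: the function names, the choice of the last quarter of the voiced frames as the "terminal" region and the one-semitone threshold are all assumptions, and the contours are invented values.

```python
import math

def hz_to_semitones(f0_hz, ref_hz):
    """Express f0_hz relative to ref_hz in semitones (12 per octave)."""
    return 12.0 * math.log2(f0_hz / ref_hz)

def terminal_movement(f0_contour, tail_fraction=0.25, threshold_st=1.0):
    """Label the final pitch movement of a contour as rise, fall or level.

    f0_contour is a list of f0 values in Hz, with 0 marking unvoiced frames.
    tail_fraction and threshold_st are illustrative assumptions.
    """
    voiced = [f for f in f0_contour if f > 0]
    if len(voiced) < 2:
        return "undefined"
    tail_len = max(2, int(len(voiced) * tail_fraction))
    tail = voiced[-tail_len:]
    change = hz_to_semitones(tail[-1], tail[0])
    if change > threshold_st:
        return "rise"
    if change < -threshold_st:
        return "fall"
    return "level"

# Invented contours (Hz) standing in for extracted pitch tracks.
rising = [200, 200, 0, 200, 200, 200, 210, 230, 260]
falling = [220, 215, 210, 200, 190, 180, 160, 140]
flat = [200] * 8

labels = [terminal_movement(c) for c in (rising, falling, flat)]
print(labels)
```

Working in semitones rather than hertz makes the threshold comparable across speakers with different pitch ranges, for example male and female voices.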

Speech Analyzer is a Windows program designed to assist users in speech analysis. It was developed by the Summer Institute of Linguistics (SIL), USA, for computational analysis that extracts the acoustic properties of speech sounds. In addition to various acoustic measurements of speech, Speech Analyzer is used to perform fundamental frequency, spectrographic and spectral analysis, and duration measurements. One important application of this software is annotation work, such as building annotated speech corpora. Speech Analyzer has proved its fitness for extracting the acoustic properties of sounds accurately. Cool Edit, developed by Syntrillium Software Corporation, Phoenix, AZ, USA, is used to record the speech data for intonation study. It can also be used to examine frequency components and other details through its Frequency Analysis, Statistics and Spectral View features. Acoustic analysis was also done with the PRAAT speech software, developed by Paul Boersma and David Weenink of the University of Amsterdam, Netherlands. The methodology adopted in this research is scientific in nature, which is why the analysis of Malayalam intonation patterns yields descriptive and formal rules of linguistic analysis that can also serve prosody implementation.

ORGANISATION OF THE THESIS

The thesis is organized into five chapters. The first chapter, 'Introduction', discusses the key concepts of speech science and intonation. It also deals with the aim, scope and methodology of the present study. The second chapter, 'Speech science and Intonation', focuses on acoustic phonetics, the prosodic features of intonation, and speech science and its application in the technical field. It also gives a broad idea of speech technology, from fundamental to advanced concepts.
The third chapter, 'Review of Literature', reviews previous work in intonation theory and intonation modeling. The review is scientific and critical in nature. The fourth chapter, 'Malayalam Intonation Analysis', investigates the intonation patterns of Malayalam with reference to different types of sentences. It gives a broad description of intonational phonology and the phonetic features of intonation. The statistical analysis of speech parameters and the observations drawn from it also form part of the fourth chapter. The fifth chapter, 'Conclusion', gives a brief discussion of the subjects presented in this thesis. The research findings are discussed in a scientific manner.

LIMITATIONS OF THE STUDY

The present study focuses mainly on the intonation and duration features of Malayalam speech. For prosody implementation in speech systems, other prosodic features such as tempo (the rate of speech), rhythm and voice quality should also be considered and incorporated into text-to-speech systems or other speech applications. The present study does not give much attention to these prosodic features, owing to the space and time limitations of the thesis. Another limitation is that the study may not be able to deal with dialectal variation in speech. It is concerned mainly with the standard dialect of Malayalam only.

ENDNOTES

1. Debitive sentence: The debitive is a sentence type that expresses the mood of a sentence. Malayalam sentences such as nii pookaṇam and nii pookaṇṭa express a strong command. Here -aṇam and -aṇṭa express the mood or modality of the verb. This sentence type is described in R. E. Asher's book "Malayalam".
