The ORD Speech Corpus of Russian Everyday Communication "One Speaker's Day": Creation Principles and Annotation
Alexander Asinovsky, Natalia Bogdanova, Marina Rusakova, Anastassia Ryko, Svetlana Stepanova, and Tatiana Sherstinova

St. Petersburg State University, St. Petersburg, Universitetskaya nab. 11, Russia
{a.s.asinovsky,mvrusakova,sherstinova}@gmail.com, {nvbogdanova 2005,aryko,stsvet 2002}@mail.ru

Abstract. The main aim of the ORD speech corpus is to record Russian spontaneous speech in natural communicative situations. The corpus provides unique linguistic material that supports fundamental research in many fields and helps to solve various practical tasks, especially in speech technology. The paper describes the methodology used to create the ORD corpus and presents its system of annotations.

1 Introduction

Since the 1990s, national corpora of spontaneous speech have been created in many countries. The first audio corpus based on the spontaneous oral speech of subjects organized in a demographically balanced sample is a part of the Spoken Component of the British National Corpus [12]. So far, there is no truly representative corpus of Russian everyday speech, although various research groups have been building databases of oral speech for at least the past 40 years [1]. Although some progress has been made in the exploration of informal speech, this domain certainly calls for further investigation. For example, it is not known how many wordform (morpheme, sentence) tokens a speaker produces and perceives per hour, day, or month; how many different language units he or she processes over various periods of time; or what the average duration of sounding speech is within these periods.
The lack of linguistic information of this kind is understandable: the investigations needed to obtain such results are extremely labour- and time-consuming, and the technical tools for them became available only relatively recently. Now both the technical means and the theoretical background for the exploration of spontaneous oral speech exist. Nevertheless, these investigations should be treated as highly innovative, because they can be carried out only by professional teams that include many researchers.

During the last decades a great amount of Russian natural oral speech data has been collected. Unfortunately, when gathering speech material, researchers use different collection methods and pursue their individual scientific goals. As a result, the available resources are non-uniform and uncoordinated, and it is rather difficult to integrate the data into a single representative corpus open for general use. Therefore, building a full-fledged descriptive database of Russian oral informal speech is a prerequisite for progress in those linguistic fields that are concerned with the native speaker, natural speech and, on a larger scale, communicative behaviour in general.

V. Matoušek and P. Mautner (Eds.): TSD 2009, LNAI 5729. © Springer-Verlag Berlin Heidelberg 2009

2 The Main Concepts of the ORD Corpus

2.1 Methodological Background

The main goal of the present investigation is to record Russian spontaneous speech in natural communicative situations. Firstly, this means that nothing should interfere with the speakers' usual habits of communicative behaviour in particular speech situations during recording. For example, speech communication at breakfast should take place in the circumstances that are habitual to each individual: in the same setting and with the same interlocutors as usual, with the same level of accompanying noise (an open window, a running refrigerator, etc.). Secondly, every subject should speak as he or she does in standard situations, without changing the usual length of utterances, speech topics, or repertoire during recording. For example, if a subject is used to reading a newspaper, he should not give up this habit and should not arrange communication that is unusual for that moment (e.g., intentionally talking with other members of the family or inviting unusual guests) in order to increase the amount of speech material within the recorded time. At the same time, at the first stage of our investigation the participants were asked to make a deliberate choice of a particular day for recording. For example, it is preferable to make the recording on an ordinary day rather than on a day when the subject is about to do something unusual (e.g., going on an excursion, or being away from home for the whole day because of a seasonal rush at the office) [2].
2.2 Technical Equipment

Recording is made with the help of a dictaphone. First the subject makes all the necessary settings and fastens the dictaphone to his clothes (in a pocket or a special bag). Such a mode of recording inevitably causes non-uniformity in the quality of the obtained speech data. In our research we have been using the Olympus WS-320M dictaphone, which provides more than 35 hours of high-quality recording. The relatively low quality of recordings obtained in this way (compared to studio recordings) is an unavoidable consequence of fieldwork aimed at a natural experiment with human speech behaviour.

2.3 Speakers Selection and Training

At this stage of the research the study is not directly aimed at describing the Russian language in the whole richness of its manifestations. Only one form of its functioning is being studied, viz. the speech of naïve native speakers of Russian who live in a city where the Standard language dominates, where there is no significant impact of dialects or other languages, and where the population is professionally heterogeneous and unbiased in terms of age and gender distribution.
St. Petersburg is a model city of this type, which is why the subjects were selected there. The objective of the present research implies that subjects should never be chosen from among speakers whose occupation requires any specific level of speech control or self-consciousness in this respect; indeed, it is known that such professional skills may have a strong impact on an individual's speech production.

The preparatory work with the subject consists of two stages: 1) ensuring the necessary degree of naturalness of the subject's behaviour in communicative situations; 2) ensuring the technical quality of the recordings necessary for further speech analysis.

The first of these two stages is very complicated. During a pilot experiment, in which members of the research team (i.e., linguists themselves) recorded their own days of speech, it was found that, despite being highly motivated to obtain adequate results, the subjects were hardly able to forget that they were carrying a dictaphone: they could not abandon the idea that their communication with the people around them, especially with their close relatives, was going to become available to their acquaintances and colleagues. Finally, it was decided to gather recordings under conditions of complete anonymity in order to make them as natural as possible.

The following procedure has been developed. A non-member of the research team, a psycholinguist by specialization, joins the procedure. He addresses potential subjects, for example, a group of people working at a particular enterprise. When instructing potential subjects, he guarantees that under no circumstances will he personally come into contact with the materials later obtained from those subjects. When the recordings are made, this intermediate collaborator hands them over to the research team.
The subjects do not give their actual names in any form, but they have to fill in a questionnaire concerning their age, profession, place of birth and other sociological data. As a result, the research team obtains recordings of speakers who are completely unknown to them; moreover, they have never met one another in person. The shortcoming of this approach is that professional linguists have no control at the recording stage; hence there is a relatively high proportion of spoiled material and technical flaws in the recorded files, which make phonetic analysis of the corresponding fragments impossible.

The second stage is implemented by instructing the prospective subjects how to use dictaphones for the best results.

2.4 Creating the ORD Corpus

The abbreviation ORD stems from the Russian "Odin Rechevoj Den", literally translated as "one day of speech". A demographically balanced group of subjects was formed for the first series of recordings, comprising 30 participants (the base speakers, or informants) representing various social and age strata of the population of St. Petersburg. After detailed instructions, the subjects recorded all the speech communication in which they took part during one day. In addition, all of them filled in questionnaires and underwent psychological testing.
The overall length of the recorded material is 240 hours, of which 170 hours contain speech data suitable for further analysis. Besides the subjects' speech, that of their 520 interlocutors was also recorded. The interlocutors were people of different ages (from 3 to 68 years), professions and occupations (e.g., salesmen, conductors, managers, lecturers, doctors, librarians, IT professionals, students, and many others), in friendly, family, professional or other relations with the subjects. The materials contain diverse genres and styles of speech: professional conversations with colleagues, telephone talks, lectures, practical lessons in foreign languages, and communication with friends and relatives during walks, parties, breakfasts, dinners, etc. The topics of these conversations range from discussions of dental problems with a dentist to conversations about religion, life and death. Recordings were made at home, on public transport, while walking outside, at the university, in a military college, in coffee bars, in shops, in an amusement park, etc. The corpus was divided into 2202 communication episodes, of which 134 have already been transcribed in detail [2].

The recordings are processed by professional linguists. The initial stage of this process consists of a preliminary description of the material and its orthographic transcription. Generally, when oral speech is studied, the burden of deciphering is often placed on the participants of the dialogues themselves; they are the people best able to interpret the least audible fragments and to provide the fullest extralinguistic information relevant to the production of the texts in question. In our research, for the sake of speech naturalness, the transcribers could not witness the communicative behaviour of the speakers.
The fact that the recordings are transcribed by non-participants results in a partial loss of information. This loss should be viewed as an inevitable price for the naturalness of the material gathered.

3 System of Annotations

3.1 Annotation Software

Two professional annotation tools, ELAN and Praat, are used to annotate the ORD corpus. The first is ELAN (EUDICO Linguistic Annotator), developed at the Max Planck Institute for Psycholinguistics (Nijmegen, The Netherlands) with the aim of providing a sound technological basis for the annotation and exploitation of multimedia recordings. This tool makes it possible to create, edit, visualize and search annotations for video and audio data. Being specifically designed for the analysis of language, ELAN is a convenient tool for multi-level annotation of communication and speech [5]. Figure 1 shows a fragment of multi-level annotation of the ORD corpus for a number of tiers made in ELAN. ELAN is used for annotation of most tiers (Episodes, Frase, Events, Words, etc.) apart from the phonetic ones (Sounds, Syllables, and others presented in phonetic transcription), which are made in Praat. Praat is a professional phonetic tool for analyzing and manipulating speech. It was created by Paul Boersma and David Weenink (Institute of Phonetic Sciences, University of Amsterdam, The Netherlands) [8]. The structures of ELAN and Praat annotations are fully compatible.
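As an illustration of how tier annotations stored by ELAN might be read programmatically, the following minimal sketch parses a tiny ELAN-style XML document (the EAF format) with the Python standard library. The sample document, its annotation values, the linguistic type names and the helper name read_tier are invented for this example and are not taken from the ORD corpus; only time-aligned annotations are handled.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written document in the style of an ELAN .eaf file.
# All values here are illustrative, not real ORD data.
SAMPLE_EAF = """
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="2300"/>
  </TIME_ORDER>
  <TIER TIER_ID="Frase" LINGUISTIC_TYPE_REF="phrase">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>first phrase</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>second phrase</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="Events" LINGUISTIC_TYPE_REF="event">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a3" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>phone ring</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""

ROOT = ET.fromstring(SAMPLE_EAF.strip())


def read_tier(root, tier_id):
    """Return (start_ms, end_ms, text) for each time-aligned annotation on a tier."""
    # Map time slot identifiers to millisecond values; slots without a value are skipped.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
        if ts.get("TIME_VALUE") is not None
    }
    result = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            if start is None or end is None:
                continue  # annotation is not anchored to known times
            text = ann.findtext("ANNOTATION_VALUE", default="")
            result.append((start, end, text))
    return result


for start, end, text in read_tier(ROOT, "Frase"):
    print(start, end, text)
```

Keeping each tier as a list of (start, end, text) intervals is one simple choice; the same shape could also hold intervals taken from a Praat annotation, which is convenient when annotations from both tools are compared.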
Fig. 1. A fragment of multi-level annotation of the ORD corpus made in ELAN. Tiers: Frase (orthographic transcripts), SynIdeal (syntagmas), Words (orthographic), WordIdeal and WordReal (words in phonetic transcription), POS (part of speech), GramForm (grammatical form), SyntRoles (syntactic role of the word), Syllable (real phonetic transcription), MorphemeReal (morpheme in real phonetic transcription), SoundReal (sounds in real phonetic transcription).

3.2 Annotation Principles

The main principles of multi-level annotation for a spoken corpus were given in [3]. Primary annotation of speech in the ORD corpus is made in ELAN and involves annotation of the following data types:

- Frase: orthographic transcripts of phrases, which are the main units of description. Besides transcripts, it contains references to pauses and noisy fragments. Independent type.
- Event: non-language audio events (dog barking, squeak of a door, phone ring, radio program, etc.), including some voiced events not connected with speech (e.g., cough, yawn, laugh, moan, etc.). Independent type.
- Speaker: the person who pronounced the corresponding phrase (Frase), either a base speaker (subject) or one of his/her interlocutors. This type depends on Frase (Symbolic Association).
- Voice: a special characteristic of speech for the given phrase (Frase) or its segment (e.g., hoarse, whisper, scanning, irritated, imitating, ironical, dramatic, etc.). Depends on Frase (Included In).
- FraseComment: all kinds of comments on phrase realisation and the researcher's remarks. Depends on Frase (Symbolic Association).
- Notes: may contain other useful information. Independent type.
- Episode (Communication episode): refers to the general communicative situation. Independent type.

These 7 main data types are annotated in ELAN on 7 corresponding tiers: Frase, Events, Speaker, Voice, FraseComments, Notes, Episodes.
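The seven primary tiers and their dependencies can be written down as a small data structure, which makes the hierarchy easy to check. The sketch below is only an illustration: the tier names and dependency labels come from the list above, while the class TierType and the functions validate and children_of are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Dependency labels as named in the text.
INDEPENDENT = "Independent"
SYMBOLIC_ASSOCIATION = "Symbolic Association"
INCLUDED_IN = "Included In"


@dataclass(frozen=True)
class TierType:
    name: str                     # tier name as listed in the text
    constraint: str               # how the tier relates to its parent
    parent: Optional[str] = None  # None for independent tiers


PRIMARY_TIERS = [
    TierType("Frase", INDEPENDENT),
    TierType("Events", INDEPENDENT),
    TierType("Speaker", SYMBOLIC_ASSOCIATION, parent="Frase"),
    TierType("Voice", INCLUDED_IN, parent="Frase"),
    TierType("FraseComments", SYMBOLIC_ASSOCIATION, parent="Frase"),
    TierType("Notes", INDEPENDENT),
    TierType("Episodes", INDEPENDENT),
]


def validate(tiers):
    """Check that independent tiers have no parent and dependent tiers name an existing one."""
    names = {t.name for t in tiers}
    for t in tiers:
        if t.constraint == INDEPENDENT and t.parent is not None:
            raise ValueError(f"{t.name}: independent tier must not have a parent")
        if t.constraint != INDEPENDENT and t.parent not in names:
            raise ValueError(f"{t.name}: unknown parent tier {t.parent!r}")
    return True


def children_of(tiers, parent):
    """List the tiers that depend directly on the given parent tier."""
    return [t.name for t in tiers if t.parent == parent]


validate(PRIMARY_TIERS)
print(children_of(PRIMARY_TIERS, "Frase"))
```

In this reading, Frase is the only parent tier at the primary level: Speaker, Voice and FraseComments all hang off it, while Events, Notes and Episodes stand alone.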
Some spelling rules for transcripts (tier Frase): Speech is written in standard orthography. Speech is divided into conventional sentences [10]. Transcripts may include the following symbols:

Symbol                        Meaning
/ (//)                        marks of syntagmatic (phrasal) division
H                             fragment of unintelligible speech
P                             barely intelligible speech
Π                             pause
()                            short hesitation pause
(...)                         long hesitation pause, which may be filled
(m-m), (a-a), (e-e), (a-m)    hesitation pause filled by sounds
!, ?                          marks indicating exclamatory and interrogative utterances
ca...                         interrupted word
...                           unfinished phrases
(?)                           questionable or ambiguous transcript
#                             change of speaker in overlapping speech
@                             remark inserted by another speaker in overlapping speech fragments

A speaker's code is attached to each phrase (e.g., M2). If an interlocutor is unknown or unidentified, his/her code is marked with the symbol X (e.g., MX1). When interlocutors speak simultaneously, the symbol # is used as a delimiter between individual codes (e.g., S01#F5), or the symbol @ is used for inserted remarks (e.g., S07@M2 means that while S07 is speaking, his second male interlocutor inserts a remark). Figure 2 shows a sample of primary annotation made for three tiers.

Fig. 2. Sample of primary annotation for three tiers (Frase, Speaker, Events)

The second stage of corpus annotation is made on the lexical level and includes tagging for the following main tiers: Words (spelling), POS, GramForm (grammatical form), and SyntRole (syntactic role). A description of the annotations made on phonetic levels in Praat may be found in [9].
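The speaker code conventions described above (codes such as M2 and MX1, and the combined forms S01#F5 and S07@M2) lend themselves to a small parser. The following is a minimal sketch under stated assumptions: the text gives M2 as a male interlocutor and X as the mark for an unknown speaker, whereas reading S as the base speaker and F as a female interlocutor is an assumption made here, and the names SpeakerCode, parse_code and parse_speaker_field are hypothetical.

```python
import re
from dataclasses import dataclass

# One letter for the role, an optional X for "unknown or unidentified", then a number.
# Assumption: S, M and F are the only role letters; the text shows S01, M2, MX1 and F5.
_CODE = re.compile(r"^(?P<role>[SMF])(?P<unknown>X?)(?P<num>\d+)$")


@dataclass(frozen=True)
class SpeakerCode:
    role: str      # "S", "M" or "F"
    unknown: bool  # True when the code carries the X mark
    number: int    # running number of the speaker


def parse_code(code):
    """Parse a single code such as 'M2', 'MX1' or 'S01'."""
    m = _CODE.match(code)
    if m is None:
        raise ValueError(f"unrecognised speaker code: {code!r}")
    return SpeakerCode(m["role"], m["unknown"] == "X", int(m["num"]))


def parse_speaker_field(field):
    """Return (relation, codes) for one value of the Speaker tier.

    relation is 'simultaneous' for codes joined by '#', 'insert' for codes
    joined by '@' (the first code is the main speaker), and 'single' otherwise.
    """
    if "#" in field:
        relation, parts = "simultaneous", field.split("#")
    elif "@" in field:
        relation, parts = "insert", field.split("@")
    else:
        relation, parts = "single", [field]
    return relation, [parse_code(p) for p in parts]


for value in ["M2", "MX1", "S01#F5", "S07@M2"]:
    print(value, parse_speaker_field(value))
```

Separating the relation ('simultaneous' or 'insert') from the individual codes keeps overlap information available for later counts, for example how often a base speaker is interrupted by inserted remarks.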
4 Conclusion

Though the creation of the ORD corpus is still in progress, its diverse speech material and annotations have already given rise to a number of linguistic and interdisciplinary studies. Let us mention just some of the papers. In [4] the most frequent reduced wordforms of spontaneous Russian speech are described. The work [10] discusses general problems of separating various linguistic units from a real speech stream, whereas the paper [11] describes the dynamics of communication episodes in everyday life. Article [6] relates to psycholinguistic studies and concerns the dependence of speech characteristics on the speaker's mental state and personality. In [13] one may find a sociological description of the speakers recorded for the ORD, and the paper [7] presents a study of everyday rhetoric and its techniques. A new project that has been started on the material of the ORD corpus is the creation of an audio dictionary of Russian morphemes.

Acknowledgements

The first recordings and the creation of the ORD corpus database were supported by the Russian Foundation for Humanities within the framework of the project "Speech Corpus of Russian Everyday Communication 'One Speaker's Day'" (project # e/Ya). At present the creation of the corpus is supported by the program of the Russian Ministry of Education titled "Sound Form of Russian Grammar System in Communicative and Informational Approach" and by the grant of the Russian Foundation for Humanities "Development of an Information System for Monitoring of Russian Spoken Language" (project # v).

References

1. Asinovsky, A.S., Arkhipova, E.A., Bogdanova, N.V., Rusakova, M.V., Ryko, A.I., Stepanova, S.B., Sherstinova, T.Y.: Polevaya lingvisticheskaya praktika. Uchebno-metodicheskij kompleks slozhnoj struktury. Chast 1. Teoreticheskie osnovy i metodika sbora lingvisticheskikh dannykh dl'a predstavlenia ikh v linguisticheskom korpuse russkogo yazyka. St. Petersburg (2007)
2. Asinovsky, A.S., Bogdanova, N.V., Rusakova, M.V., Stepanova, S.B., Sherstinova, T.Y.: Zvukovoj korpus russkogo yazyka povsednevnogo obschenia "Odin rechevoj den": koncepcia i sostoyanie formirovania. In: Kompjuternaya lingvistika i intellektualnye tekhnologii. Vypusk, Moscow. Po materialam mezhd. konferencii Dialog, vol. 7 (14), pp (2008)
3. Asinovsky, A.S., Koroleva, I.V., Rusakova, M.V., Ryko, A.I., Philippova, N.S., Stepanova, S.B.: On Integral Multilevel Annotation of a Spoken Russian Corpus. In: Proc. of the XIIth International Conference "Speech and Computer" SPECOM 2007, Moscow (2007)
4. Bogdanova, N.V.: Allegrovye formy russkoj rechi: ot proiznositel'noj redukcii k pis'mennoj fiksacii i leksikalizacii v yazyke. Mat-ly XXXVII mezhd. filologicheskoj konferencii. Vypusk 18. Fonetika. St. Petersburg (2008)
5. ELAN - Linguistic Annotator. Version 3.6,
6. Koroleva, I.V.: Individual'nye sostoyania i svoistva yazykovoj lichnosti: vliyanie na lingvisticheskuju strukturu vyskazyvanij. Mat-ly XXXVII mezhd. filologicheskoj konferencii. Vypusk 21. St. Petersburg. pp (2008)
7. Markasova, E.V.: Ritoricheskaya enantiosemia v korpuse russkogo yazyka povsednevnogo obschenia "Odin rechevoj den". In: Kompjuternaya lingvistika i intellektualnye tekhnologii. Vypusk Dialog, Moscow, vol. 7(14), pp (2008)
8. Praat: Doing Phonetics by computer,
9. Ryko, A.I., Stepanova, S.B.: Mnogourovnevaya lingvisticheskaya razmetka zvukovogo korpusa russkogo yazyka. In: Kompjuternaya lingvistika i intellektualnye tekhnologii. Vypusk. Po materialam mezhd. konferencii Dialog, Moscow, vol. 7 (14), pp (2008)
10. Ryko, A.I., Stepanova, S.B.: Problemy vychlenenia jedinic analiza spontannogo ustnogo teksta. In: Mat-ly XXXVII mezhd. filologicheskoj konferencii. Vypusk, St. Petersburg, vol. 21, pp (2008)
11. Sherstinova, T.Y.: "Odin rechevoj den" na vremennoj shkale: o perspektivakh issledovania dinamicheskikh processov na materiale zvukovogo korpusa. In: Vestnik Sankt-Peterburgskogo universiteta, Seria 9: Filologia, Vostokovedenie, Zhurnalistika, Chast 2, St. Petersburg, vol. 4, pp (2008)
12. The British National Corpus
13. Zobnina, E.A.: Social'nye characteristiki govoriaschego: objektivnye dannye i ekspertnaya ocenka rechi (po materialam zvukovogo korpusa "Odin rechevoj den"). In: Mat-ly XXXVII mezhd. filologicheskoj konferencii. Vypusk, St. Petersburg, vol. 21, pp (2008)
More informationPossessive have and (have) got in New Zealand English Heidi Quinn, University of Canterbury, New Zealand
1 Introduction Possessive have and (have) got in New Zealand English Heidi Quinn, University of Canterbury, New Zealand heidi.quinn@canterbury.ac.nz NWAV 33, Ann Arbor 1 October 24 This paper looks at
More informationReview in ICAME Journal, Volume 38, 2014, DOI: /icame
Review in ICAME Journal, Volume 38, 2014, DOI: 10.2478/icame-2014-0012 Gaëtanelle Gilquin and Sylvie De Cock (eds.). Errors and disfluencies in spoken corpora. Amsterdam: John Benjamins. 2013. 172 pp.
More informationLanguage Acquisition Chart
Language Acquisition Chart This chart was designed to help teachers better understand the process of second language acquisition. Please use this chart as a resource for learning more about the way people
More informationCOMPETENCY-BASED STATISTICS COURSES WITH FLEXIBLE LEARNING MATERIALS
COMPETENCY-BASED STATISTICS COURSES WITH FLEXIBLE LEARNING MATERIALS Martin M. A. Valcke, Open Universiteit, Educational Technology Expertise Centre, The Netherlands This paper focuses on research and
More informationChapter 9: Conducting Interviews
Chapter 9: Conducting Interviews Chapter 9: Conducting Interviews Chapter Outline: 9.1 Interviewing: A Matter of Styles 9.2 Preparing for the Interview 9.3 Example of a Legal Interview 9.1 INTERVIEWING:
More informationMandarin Lexical Tone Recognition: The Gating Paradigm
Kansas Working Papers in Linguistics, Vol. 0 (008), p. 8 Abstract Mandarin Lexical Tone Recognition: The Gating Paradigm Yuwen Lai and Jie Zhang University of Kansas Research on spoken word recognition
More informationDevelopment of the First LRs for Macedonian: Current Projects
Development of the First LRs for Macedonian: Current Projects Ruska Ivanovska-Naskova Faculty of Philology- University St. Cyril and Methodius Bul. Krste Petkov Misirkov bb, 1000 Skopje, Macedonia rivanovska@flf.ukim.edu.mk
More informationThe Impact of the Multi-sensory Program Alfabeto on the Development of Literacy Skills of Third Stage Pre-school Children
The Impact of the Multi-sensory Program Alfabeto on the Development of Literacy Skills of Third Stage Pre-school Children Betina von Staa 1, Loureni Reis 1, and Matilde Conceição Lescano Scandola 2 1 Positivo
More informationText Type Purpose Structure Language Features Article
Page1 Text Types - Purpose, Structure, and Language Features The context, purpose and audience of the text, and whether the text will be spoken or written, will determine the chosen. Levels of, features,
More informationProcedia - Social and Behavioral Sciences 141 ( 2014 ) WCLTA Using Corpus Linguistics in the Development of Writing
Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 141 ( 2014 ) 124 128 WCLTA 2013 Using Corpus Linguistics in the Development of Writing Blanka Frydrychova
More informationCommunicative Language Teaching (CLT): A Critical and Comparative Perspective
ISSN 1799-2591 Theory and Practice in Language Studies, Vol. 3, No. 9, pp. 1579-1583, September 2013 Manufactured in Finland. doi:10.4304/tpls.3.9.1579-1583 Communicative Language Teaching (CLT): A Critical
More informationProgram Matrix - Reading English 6-12 (DOE Code 398) University of Florida. Reading
Program Requirements Competency 1: Foundations of Instruction 60 In-service Hours Teachers will develop substantive understanding of six components of reading as a process: comprehension, oral language,
More informationFilms for ESOL training. Section 2 - Language Experience
Films for ESOL training Section 2 - Language Experience Introduction Foreword These resources were compiled with ESOL teachers in the UK in mind. They introduce a number of approaches and focus on giving
More informationCLASSIFICATION OF PROGRAM Critical Elements Analysis 1. High Priority Items Phonemic Awareness Instruction
CLASSIFICATION OF PROGRAM Critical Elements Analysis 1 Program Name: Macmillan/McGraw Hill Reading 2003 Date of Publication: 2003 Publisher: Macmillan/McGraw Hill Reviewer Code: 1. X The program meets
More informationELP in whole-school use. Case study Norway. Anita Nyberg
EUROPEAN CENTRE FOR MODERN LANGUAGES 3rd Medium Term Programme ELP in whole-school use Case study Norway Anita Nyberg Summary Kastellet School, Oslo primary and lower secondary school (pupils aged 6 16)
More informationLower and Upper Secondary
Lower and Upper Secondary Type of Course Age Group Content Duration Target General English Lower secondary Grammar work, reading and comprehension skills, speech and drama. Using Multi-Media CD - Rom 7
More informationOutreach Connect User Manual
Outreach Connect A Product of CAA Software, Inc. Outreach Connect User Manual Church Growth Strategies Through Sunday School, Care Groups, & Outreach Involving Members, Guests, & Prospects PREPARED FOR:
More informationIntensive English Program Southwest College
Intensive English Program Southwest College ESOL 0352 Advanced Intermediate Grammar for Foreign Speakers CRN 55661-- Summer 2015 Gulfton Center Room 114 11:00 2:45 Mon. Fri. 3 hours lecture / 2 hours lab
More informationSARDNET: A Self-Organizing Feature Map for Sequences
SARDNET: A Self-Organizing Feature Map for Sequences Daniel L. James and Risto Miikkulainen Department of Computer Sciences The University of Texas at Austin Austin, TX 78712 dljames,risto~cs.utexas.edu
More informationFort Lewis College Institutional Review Board Application to Use Human Subjects in Research
Fort Lewis College Institutional Review Board Application to Use Human Subjects in Research Submit this application by email attachment to IRB@fortlewis.edu I believe this research qualifies for a Full
More informationArchitecture of Creativity and Entrepreneurship: A Participatory Design Program to Develop School Entrepreneurship Center in Vocational High School
Architecture of Creativity and Entrepreneurship: A Participatory Design Program to Develop School Entrepreneurship Center in Vocational High School Yandi Andri Yatmo & Paramita Atmodiwirjo Department of
More informationPart I. Figuring out how English works
9 Part I Figuring out how English works 10 Chapter One Interaction and grammar Grammar focus. Tag questions Introduction. How closely do you pay attention to how English is used around you? For example,
More informationTutoring First-Year Writing Students at UNM
Tutoring First-Year Writing Students at UNM A Guide for Students, Mentors, Family, Friends, and Others Written by Ashley Carlson, Rachel Liberatore, and Rachel Harmon Contents Introduction: For Students
More information1 st Quarter (September, October, November) August/September Strand Topic Standard Notes Reading for Literature
1 st Grade Curriculum Map Common Core Standards Language Arts 2013 2014 1 st Quarter (September, October, November) August/September Strand Topic Standard Notes Reading for Literature Key Ideas and Details
More informationAn Empirical Analysis of the Effects of Mexican American Studies Participation on Student Achievement within Tucson Unified School District
An Empirical Analysis of the Effects of Mexican American Studies Participation on Student Achievement within Tucson Unified School District Report Submitted June 20, 2012, to Willis D. Hawley, Ph.D., Special
More informationOne Stop Shop For Educators
Modern Languages Level II Course Description One Stop Shop For Educators The Level II language course focuses on the continued development of communicative competence in the target language and understanding
More informationCross Language Information Retrieval
Cross Language Information Retrieval RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Acknowledgment.............................................
More informationDEPARTMENT OF JAPANESE LANGUAGE AND STUDIES
FCC Curriculum 98 DEPARTMENT OF JAPANESE LANGUAGE AND STUDIES The Department of Japanese Language and Studies has two majors: Japanese Linguistics and Teaching Methods Japanese Studies Students entering
More informationIntroduction to the Common European Framework (CEF)
Introduction to the Common European Framework (CEF) The Common European Framework is a common reference for describing language learning, teaching, and assessment. In order to facilitate both teaching
More informationLING 329 : MORPHOLOGY
LING 329 : MORPHOLOGY TTh 10:30 11:50 AM, Physics 121 Course Syllabus Spring 2013 Matt Pearson Office: Vollum 313 Email: pearsonm@reed.edu Phone: 7618 (off campus: 503-517-7618) Office hrs: Mon 1:30 2:30,
More informationMonticello Community School District K 12th Grade. Spanish Standards and Benchmarks
Monticello Community School District K 12th Grade Spanish Standards and Benchmarks Developed by the Monticello Community High School Spanish Department Primary contributors to the 9 12 Spanish Standards
More informationA Case Study: News Classification Based on Term Frequency
A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center
More informationIndividual Component Checklist L I S T E N I N G. for use with ONE task ENGLISH VERSION
L I S T E N I N G Individual Component Checklist for use with ONE task ENGLISH VERSION INTRODUCTION This checklist has been designed for use as a practical tool for describing ONE TASK in a test of listening.
More informationCOMMUNICATION & NETWORKING. How can I use the phone and to communicate effectively with adults?
1 COMMUNICATION & NETWORKING Phone and E-mail Etiquette The BIG Idea How can I use the phone and e-mail to communicate effectively with adults? AGENDA Approx. 45 minutes I. Warm Up (5 minutes) II. Phone
More informationEconomics. Nijmegen School of Management, Radboud University Nijmegen
Economics Nijmegen School of Management, Radboud University Nijmegen QANU, October 2012 Quality Assurance Netherlands Universities (QANU) Catharijnesingel 56 PO Box 8035 3503 RA Utrecht The Netherlands
More informationOffice: Colson 228 Office Hours: By appointment
1 Welcome to English 101: Composition and Rhetoric Section: 300 CRN# 82076 Fall 2015 1:00 PM to 2:15 PM Tuesdays, we meet in in Clark 410 Thursdays, we meet in Clark 212 Instructor: Shaun Turner Phone:
More informationFlorida Reading Endorsement Alignment Matrix Competency 1
Florida Reading Endorsement Alignment Matrix Competency 1 Reading Endorsement Guiding Principle: Teachers will understand and teach reading as an ongoing strategic process resulting in students comprehending
More informationReviewed by Florina Erbeli
reviews c e p s Journal Vol.2 N o 3 Year 2012 181 Kormos, J. and Smith, A. M. (2012). Teaching Languages to Students with Specific Learning Differences. Bristol: Multilingual Matters. 232 p., ISBN 978-1-84769-620-5.
More informationThe development and implementation of a coaching model for project-based learning
The development and implementation of a coaching model for project-based learning W. Van der Hoeven 1 Educational Research Assistant KU Leuven, Faculty of Bioscience Engineering Heverlee, Belgium E-mail:
More informationLinguistic Variation across Sports Category of Press Reportage from British Newspapers: a Diachronic Multidimensional Analysis
International Journal of Arts Humanities and Social Sciences (IJAHSS) Volume 1 Issue 1 ǁ August 216. www.ijahss.com Linguistic Variation across Sports Category of Press Reportage from British Newspapers:
More informationStefan Engelberg (IDS Mannheim), Workshop Corpora in Lexical Research, Bucharest, Nov [Folie 1] 6.1 Type-token ratio
Content 1. Empirical linguistics 2. Text corpora and corpus linguistics 3. Concordances 4. Application I: The German progressive 5. Part-of-speech tagging 6. Fequency analysis 7. Application II: Compounds
More informationCourse Syllabus Advanced-Intermediate Grammar ESOL 0352
Semester with Course Reference Number (CRN) Course Syllabus Advanced-Intermediate Grammar ESOL 0352 Fall 2016 CRN: (10332) Instructor contact information (phone number and email address) Office Location
More informationSummer in Madrid, Spain
Summer in Madrid, Spain with the Coast Community College District Program dates: July 2 - July 31, 2007 ACCENT International Consortium for Academic Programs Abroad Immerse yourself in experiential learning
More informationLiteracy THE KEYS TO SUCCESS. Tips for Elementary School Parents (grades K-2)
Literacy THE KEYS TO SUCCESS Tips for Elementary School Parents (grades K-2) Randi Weingarten president Lorretta Johnson secretary-treasurer Mary Cathryn Ricker executive vice president OUR MISSION The
More informationThe Isett Seta Career Guide 2010
The Isett Seta Career Guide 2010 Our Vision: The Isett Seta seeks to develop South Africa into an ICT knowledge-based society by encouraging more people to develop skills in this sector as a means of contributing
More informationREG. NO. 2010/003266/08 SNAP EDUCATION (ASSOCIATION INC UNDER SECTION 21) PBO NO PROSPECTUS
REG. NO. 2010/003266/08 SNAP EDUCATION (ASSOCIATION INC UNDER SECTION 21) PBO NO. 930035281 PROSPECTUS Member: Mrs AM Van Rijswijk Principal +27 (0)83 236 1766 9 De Dam St, Vierlanden, Durbanville, 7550
More informationOn Human Computer Interaction, HCI. Dr. Saif al Zahir Electrical and Computer Engineering Department UBC
On Human Computer Interaction, HCI Dr. Saif al Zahir Electrical and Computer Engineering Department UBC Human Computer Interaction HCI HCI is the study of people, computer technology, and the ways these
More informationFirst Grade Curriculum Highlights: In alignment with the Common Core Standards
First Grade Curriculum Highlights: In alignment with the Common Core Standards ENGLISH LANGUAGE ARTS Foundational Skills Print Concepts Demonstrate understanding of the organization and basic features
More informationLearning Methods in Multilingual Speech Recognition
Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex
More informationTeaching Task Rewrite. Teaching Task: Rewrite the Teaching Task: What is the theme of the poem Mother to Son?
Teaching Task Rewrite Student Support - Task Re-Write Day 1 Copyright R-Coaching Name Date Teaching Task: Rewrite the Teaching Task: In the left column of the table below, the teaching task/prompt has
More informationVorlesung Mensch-Maschine-Interaktion
Vorlesung Mensch-Maschine-Interaktion Models and Users (1) Ludwig-Maximilians-Universität München LFE Medieninformatik Heinrich Hußmann & Albrecht Schmidt WS2003/2004 http://www.medien.informatik.uni-muenchen.de/
More informationAuthor: Justyna Kowalczys Stowarzyszenie Angielski w Medycynie (PL) Feb 2015
Author: Justyna Kowalczys Stowarzyszenie Angielski w Medycynie (PL) www.angielskiwmedycynie.org.pl Feb 2015 Developing speaking abilities is a prerequisite for HELP in order to promote effective communication
More informationFIGURE 8.2. Job Shadow Workplace Supervisor Feedback Form.
JOB SHADOW FEEDBACK FORM Student: Date of Job Shadow: Job Shadow Site: Phone: Email: Job Shadow Contact: 1. Did you have any concerns or comments about the student s behavior? Yes No 2. Would you be willing
More informationNottingham Trent University Course Specification
Nottingham Trent University Course Specification Basic Course Information 1. Awarding Institution: Nottingham Trent University 2. School/Campus: Nottingham Business School / City 3. Final Award, Course
More information