International Conference on Language Documentation and Conservation. March 2009

Turning a linguist's lexical database into a community dictionary
Ulrike Mosel, University of Kiel
International Conference on Language Documentation and Conservation, 11 March 2009

The Teop language documentation project

Language documentation
- Corpus: recordings with transcriptions, translations, comments, pictures; edited versions of recordings; written texts
- Sketch grammar: examples; collection of single sentences with translation
- Lexical database: headwords, translations, examples with source references
- Photos, drawings, videos

Typology
- about the world: encyclopedias
- about language: dictionaries
  - monolingual: native speaker, learner
  - bilingual: active, passive

Typology
- dictionaries: monolingual, bilingual
- encyclopedia
- community dictionaries

Dictionaries for endangered languages are special dictionaries
(ordinary dict. / community dict.)
- economic basis: commercial / funding agencies
- time frame: decades / 3-10 years, part time
- purpose: translating, L2 learning / language maintenance
- users: general public / small community, academics
- lexicographers: professionals / linguists, community members
- linguistic resources: huge corpora, old dictionaries / language documentation

Dictionaries for endangered speech communities: users & purpose
Linguists
- understand and analyse texts
- passive dictionary for translation
- linguistic information
Native speakers
- preservation of cultural memory
- education
- language maintenance, passive & active learner dictionary
- linguistic & encyclopedic information

Content and structure of the TLD
Not an end product, but a dynamic tool, containing information on:
- the inherent grammatical features of lexical units (gender, valency)
- grammatical relations between words (conversion, derivation, composition)
- semantic features
- semantic relations to other lexical units
- translation equivalents
- extralinguistic reality

Lexical database vs. dictionary
(lexical database / dictionary)
- space: unlimited / restricted
- macrostructure: multi-dimensional / linear
- purpose: constantly growing, unbiased resource and tool for researchers / user oriented, for the speech community
- content: moderately selective / highly selective
- meanings: text meanings / meaning potentials
- examples: citations / illustrative examples

Lexical database vs. dictionary
(lexical database / dictionary)
- orthography: can easily deal with variation / standardization preferable
- grammar: citations contain speech errors, interferences from dominant languages / standard forms preferable

Problem: Text meaning vs. meaning potential
The database does not capture the meaning of the lexemes, only the various senses a word has in the particular contexts of the corpus. What we consider distinct senses is also influenced by the translation equivalents.
Example: naovana
- "... and Gaivaa became a naovana that we call seagull ..."
- "... then you get two cockatoo feathers, this is a white naovana, ..."
naovana 'bird', but other naovana: flying foxes, insects

Text meanings arise from combinations, not from any one word individually.
Problem: citations are not suitable as example sentences.
"... stabbed him ... with her hand?"

"This old woman, Sharphand, stabbed him to death with her hand."

The creation of a dictionary is a different job
- revision of sense discrimination
- systematic ordering of senses
- revision of examples
- moderate standardization of orthography and grammar
The same applies to electronic lexica like LEXUS. You need two versions:
a) database as a tool for researchers and lexicographers
b) community version
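The two-version idea can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the field names, the `edited` flag and the `community_view` helper are hypothetical and do not describe the structure of the Teop database or of LEXUS. The example strings are taken from the naovana citations quoted on the slides, and marking one of them as edited is likewise assumed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    text: str
    edited: bool  # hypothetical flag: True once revised as an illustrative example


@dataclass
class DbEntry:
    headword: str
    senses: List[str]        # senses after revision and ordering
    examples: List[Example]  # raw corpus citations and edited examples together


def community_view(entry: DbEntry, max_examples: int = 1) -> dict:
    """Project a full database entry onto a smaller community-dictionary entry."""
    # Keep only examples that were revised; raw citations stay in the database.
    chosen = [ex.text for ex in entry.examples if ex.edited][:max_examples]
    return {
        "headword": entry.headword,
        "senses": list(entry.senses),
        "examples": chosen,
    }


entry = DbEntry(
    headword="naovana",
    senses=["bird"],
    examples=[
        Example("and Gaivaa became a naovana that we call seagull", edited=False),
        # Edited status is assumed here purely for illustration.
        Example("this is a white naovana", edited=True),
    ],
)

print(community_view(entry))
```

The database entry keeps every citation, while the community version is derived from it and shows only the revised material, so the research tool and the dictionary can grow at different speeds.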

Problem: time management
The lexical database is too big to be transformed into a dictionary within a short time.
- no comprehensive dictionary
- never promise such a dictionary

Necessity is the mother of invention
The first monolingual Samoan dictionary
- Ministry of Youth, Culture and Sports, Western Samoa
- Australian South Pacific Culture Fund
- 1 year, 10 000 AUD (6000 )
- compiled 1994, published 1997
How much can be done in one year?
Jakob Grimm, Wilhelm Grimm et al. 1852-1960. Deutsches Wörterbuch (German Dictionary), 16 volumes
1863 (death of Jakob Grimm): A - Frucht

Alphabetical method vs. thematic method
Alphabetical method
- A, B, C, D, ...
- How do you collect headwords? Start with A.
- Time planning? Setting priorities?
Thematic method
- house building, fishing, ...
- How do you collect headwords? Filter the database, interview experts.
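The "filter the database" step of the thematic method could look roughly like the following Python sketch. It assumes that each database entry carries semantic-domain tags; the `domains` field, the tag names and the `select_theme` helper are hypothetical. The three headwords and their rough glosses come from the slides, but the records themselves are toy data, not actual database entries.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Entry:
    headword: str
    gloss: str
    domains: Tuple[str, ...] = ()  # hypothetical semantic-domain tags


def select_theme(entries: List[Entry], theme: str) -> List[Entry]:
    """Return the entries tagged with the given theme, sorted by headword."""
    return sorted(
        (e for e in entries if theme in e.domains),
        key=lambda e: e.headword,
    )


# Toy records built from words mentioned on the slides; the tags are assumed.
ENTRIES = [
    Entry("naovana", "bird", ("birds",)),
    Entry("marahiri", "a fish that lives in the sea", ("fishes", "the sea")),
    Entry("kabuu", "a palm whose wood is used for floors", ("trees", "house")),
]

for e in select_theme(ENTRIES, "house"):
    print(e.headword, "-", e.gloss)
```

A filter like this only yields a first headword list for a mini-dictionary. Vocabulary that is missing from the corpus still has to come from interviews with experts.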

Themes of Teop mini-dictionaries: house, canoe, fishes, fishing, the sea, shellfish, trees, other plants, gardening, cooking, body, health, kinship, ceremonies
Further advantages of mini-dictionaries:
- specialised vocabulary is less frequently used, hence more endangered
- specialised vocabulary is less polysemous

Problem: specialised vocabulary is under-represented in the corpus
Solution: supplementary recordings
Supplementary recordings in Teop for the House-Dictionary:
- How to build a men's house
- How to make bamboo walls
- How to make the floor from the wood of the kabuu palm
- How to make the thatch

Further problems of specialised vocabulary
- it is difficult to translate
- linguists have no expertise in ethno-sciences
- indigenous experts lack the necessary proficiency in the target language
Solution: vernacular encyclopedic descriptions with translations

Problem: descriptions are not an indigenous genre
But they
- show the expressive potential of the language
- reflect the native speaker's metalinguistic knowledge
- can show the native speaker's conceptualisation of extralinguistic phenomena, e.g. taxonomies
- are useful to understand the meaning of words
Training: Explain what X is to a child.

Sensitive training in writing definitions
marahiri: 'The marahiri is a fish that lives in the sea. The marahiri has no scales, its body is slippery' (in Teop language)
What characteristics are essential?

Sensitive training
- avoid patronising
- encourage them to keep and/or develop their own way of explaining words and things
- avoid style guides
naovana 'bird': 'We eat many birds. Birds are a good food. Only the birds that have a story are the ones that we do not eat. These birds are Pasukokoreo, Topeipei, Toai and Koo.'
Content of encyclopaedic information: cf. Coward & Grimes (2000: 138-153); consult anthropologists, ethnobotanists, etc.

Encyclopaedic information:
'The tuna is a big fish. These fish only stay in the ocean and eat little ocean fishes. The small tunas also stay in the ocean. This fish is eaten by the people. This fish has a white belly and its sides are also white, but its back has black and white stripes.'

Conventional and idiosyncratic language use in indigenous lexicography
Let different people work on items that are presumably described in a similar way, e.g. house & canoe or fishes & birds.
- similarities of house and canoe descriptions: topic = thing; predicate = put s.th. somewhere
- similarities of fishes and birds descriptions: topic = animal; predicate = habit / properties like size, colour, shape

Linguistic observations
- conventionalized constructions for descriptions of things, properties and events
- systematic patterns of polysemy and word formation
Example for systematic heterosemy/conversion
(noun: part of the house / verb: add this part, e.g. wall, onto the house)
- rafter / put up the rafters
- bamboo wall / put up the bamboo wall
- fence / put up the fence

Conclusions
- A lexical database and a dictionary are two very different things.
- A dictionary is not a by-product.
- A dictionary requires hard work and active involvement of the speech community.
- Start with a mini-dictionary.
- Use vernacular encyclopedic descriptions.
- Be rewarded by a culturally and linguistically interesting, and completed, little dictionary!

Fish dictionary

References
Atkins, B.T. Sue & Rundell, Michael. 2008. The Oxford Guide to Practical Lexicography. OUP.
Coward, David F. & Grimes, Charles E. 2000. Making Dictionaries. SIL International, Waxhaw, North Carolina.
Hanks, Patrick. 2008. Do word meanings exist? In Thierry Fontenelle (ed.), Practical Lexicography. A Reader. OUP, pp. 125-134.
Kilgarriff, Adam. 2008. I don't believe in word senses! In Thierry Fontenelle (ed.), Practical Lexicography. A Reader. OUP, pp. 135-151.