CHAPTER 2 THESAURUS CONSTRUCTION AND ITS ROLE IN INDEXING

2.1 Introduction
2.2 Composition of Thesaurus
    2.2.1 Preferred Terms/Descriptor
    2.2.2 Non-preferred Terms/Non-descriptor
    2.2.3 Related Terms
    2.2.4 Semantic Relations
    2.2.5 Meaning of USE and Used For (UF) in a Thesaurus
    2.2.6 Scope Notes
2.3 How to Build a Thesaurus
    2.3.1 Collecting Terms
    2.3.2 Modification of Terms as Per the Local Requirements
    2.3.3 Establishing Relations
    2.3.4 Thesaurus Display Format
2.4 Role of Thesaurus in Indexing
2.5 Conclusion
References

2.1 Introduction

A thesaurus is a structured representation of the keywords associated with one or more subject domains. It is also a tool for vocabulary control: it guides indexers and end users in the use of terms and so helps to improve the quality of search results. Generally, a thesaurus is designed for indexing and subsequent retrieval of documents in a specific subject domain. Examples include the ERIC thesaurus for education resources, the Macrothesaurus for economic resources, the GESIS thesaurus for social science, and the legal thesaurus developed by R.L. Burton.

A thesaurus plays an important role in the organization of information and its subsequent retrieval. According to Aitchison et al. (2000), the primary role of the thesaurus is information retrieval. Information retrieval is one of the most important aspects of librarianship, as it deals with the needs of the library's end users.

2.2 Composition of Thesaurus

A thesaurus provides different types of information for indexers as well as for end users. It comprises terms together with the semantic relations between them. These include preferred terms (descriptors), non-preferred terms (non-descriptors), related terms (RT), narrower terms (NT), broader terms (BT), and USE and Used For (UF) references. To use a thesaurus effectively when indexing documents and assigning keywords, one needs a clear understanding of these relations. The relations used in a thesaurus are defined below.

2.2.1 Preferred Terms/Descriptor

A thesaurus needs to indicate which terms indexers may use for indexing documents. The most appropriate terms selected while indexing a bibliographic record are known as descriptors or preferred terms. Descriptors are the basis of any thesaurus and form the major part of a controlled vocabulary. A descriptor serves as the starting point and guides the indexer to appropriate related terms and narrower terms.

2.2.2 Non-preferred Terms/Non-Descriptor

A thesaurus should also indicate the terms that indexers and searchers cannot use when a concept has several synonyms. Such synonymous terms, which cannot be assigned to a bibliographic record as subject headings, are known as non-descriptors or non-preferred terms. A non-preferred term in a thesaurus provides a link to the preferred term (descriptor) to be assigned to a record. In other words, a non-descriptor guides the indexer to the appropriate preferred term.
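The link from a non-preferred term to its descriptor can be pictured as a simple lookup. The following is a minimal sketch in Python, not a description of any actual thesaurus software. The names `USE_REFERENCES`, `DESCRIPTORS` and `resolve_descriptor` are hypothetical. The sample term pairs are borrowed from examples used in this chapter, and treating the US spelling Labor as a non-preferred term is an assumption made for illustration.

```python
# Hypothetical mapping from non-preferred terms to their descriptors.
# Keys are stored in lower case so that lookups ignore capitalisation.
USE_REFERENCES = {
    "sex discrimination": "Gender Discrimination",
    "labor": "Labour",  # assumption: the US spelling is non-preferred
}

# Hypothetical set of descriptors (preferred terms).
DESCRIPTORS = {"Gender Discrimination", "Labour"}


def resolve_descriptor(term):
    """Return the descriptor an indexer should assign for `term`.

    A descriptor is returned in its canonical form, a non-preferred
    term is replaced by the descriptor it points to, and an unknown
    term yields None.
    """
    cleaned = term.strip().lower()
    for descriptor in DESCRIPTORS:
        if descriptor.lower() == cleaned:
            return descriptor
    return USE_REFERENCES.get(cleaned)
```

Under these assumptions, an indexer who enters a non-preferred synonym is silently steered to the descriptor, while a term that is in neither list is flagged (here by `None`) as a candidate for review.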

2.2.3 Related Terms

A standard thesaurus, in addition to pointing from non-preferred terms to the preferred term, also lists terms that are related to a specific descriptor. These terms are called related terms (RTs).

2.2.4 Semantic Relations

According to Weinberg (1998), the thesaurus structure embodies rigorous semantic relationships and reflects the principle of post-coordination of terms. Weinberg (1998) also holds that rigorous semantic relationships allow a user to enter the thesaurus and identify the appropriate search term(s). Thesauri contain three types of semantic relationship: equivalence, hierarchy and association. BT (Broader Term), NT (Narrower Term), USE and UF (Used For) are some examples of the notation used for semantic relations in a thesaurus.

2.2.5 Meaning of USE and Used For (UF) in a Thesaurus

A non-preferred term is normally linked to its corresponding preferred term by a USE reference. The corresponding reference in the opposite direction is UF (Used For). For example:

Gender Discrimination
    UF: Sex Discrimination

In this example, the preferred term is Gender Discrimination and the corresponding non-preferred term is Sex Discrimination.

2.2.6 Scope Notes

A scope note guides the indexer to the exact meaning of a descriptor and assists in selecting it appropriately. The more scope notes a thesaurus provides, the better its quality tends to be. Scope notes are very important because most indexers are not subject experts. A scope note also improves the indexing skill of a professional and raises the quality of the index.

2.3 How to Build a Thesaurus

A thesaurus which only lists the preferred and non-preferred terms is known as an enumerative thesaurus. Building a thesaurus involves collecting terms, modifying terms, deciding whether each term is a descriptor or a non-descriptor, establishing semantic relations, and writing scope notes to define concepts. The steps involved in constructing a thesaurus are described below.

2.3.1 Collecting Terms

The first step in constructing a thesaurus is collecting a set of terms. Some of the collected terms become preferred terms, while the rest may become non-preferred terms. Before collecting terms, one has to decide the sources from which they will be identified. Terms may be taken from existing thesauri, indexes, dictionaries and glossaries; extracted from textual metadata and content such as titles, abstracts and full text; or derived from discussion with a subject expert. Generally, the terms in a thesaurus should be nouns or noun phrases, and proper nouns should be excluded.

2.3.2 Modification of Terms as Per the Local Requirements

The majority of the terms collected for a thesaurus may be used as nouns, noun phrases or adjectives. While building a thesaurus, it is extremely important to use the terms that end users are most likely to search with at the time of retrieval. Such terms may vary from one country to another. For example, the descriptor Reservation Policy, the most accepted term in India, corresponds to Affirmative Action in the United States of America (USA). Similarly, spelling varies across countries, especially between the USA and the United Kingdom (UK): the term Labour is spelt Labor in the USA. Therefore, while constructing a thesaurus it is important to take these and similar variations into account.

2.3.3 Establishing Relations

The third step in building a thesaurus is establishing relationships between terms. There are three types of relationship in any thesaurus: equivalence, hierarchy and association. According to Aitchison et al. (2000), the equivalence relationship is generally established between a descriptor and a non-descriptor. They also state that the hierarchical relationship links a superordinate term to its subordinate terms (hyponyms). This relation is expressed using BTs (Broader Terms) and NTs (Narrower Terms), which together establish the hierarchy in a thesaurus.
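The reciprocal nature of BT and NT references can be sketched in a few lines of Python. This is an illustrative sketch only: the class name `Thesaurus`, its methods and the sample terms are hypothetical and are not drawn from the thesaurus built in this study.

```python
from collections import defaultdict


class Thesaurus:
    """Toy store for hierarchical (BT/NT) relations between descriptors."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its BTs
        self.narrower = defaultdict(set)  # term -> its NTs

    def add_hierarchy(self, broader_term, narrower_term):
        """Record one BT/NT pair; both directions are stored together."""
        if broader_term == narrower_term:
            raise ValueError("a term cannot be its own broader term")
        self.narrower[broader_term].add(narrower_term)
        self.broader[narrower_term].add(broader_term)

    def ancestors(self, term):
        """Return every term reachable by following BT references upwards."""
        found = set()
        stack = [term]
        while stack:
            current = stack.pop()
            for parent in self.broader.get(current, ()):
                if parent not in found:
                    found.add(parent)
                    stack.append(parent)
        return found


thesaurus = Thesaurus()
# Sample terms are invented for illustration.
thesaurus.add_hierarchy("Discrimination", "Gender Discrimination")
thesaurus.add_hierarchy("Social Problems", "Discrimination")
```

Recording both directions in a single call keeps the BT entry under the narrower term and the NT entry under the broader term from drifting apart, which is the reciprocity the BT/NT notation relies on.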

According to Weinberg (1998), an associative relationship holds between two terms whose meanings overlap. The associative relationship may be symmetrical, e.g., gold is related to money and money is related to gold, or asymmetrical, e.g., population control is related to family planning, but there is no related-term reference in the opposite direction. (Someone searching for family planning is unlikely to be interested in population control.)

2.3.4 Thesaurus Display Format

The final step in constructing a thesaurus is to create a display format suited to the nature of the thesaurus. A thesaurus can be displayed in two ways: (i) alphabetical sequence and (ii) classified sequence. In an alphabetical thesaurus, terms are arranged in a single alphabetical order and the associated terms are displayed under each descriptor. In a classified thesaurus, all the terms related to the same concept appear in the same place, and the entire thesaurus is arranged in a faceted manner with all the relations. According to Soergel (1974), the relationships in an alphabetical thesaurus should follow a specific sequence. He recommends that the descriptor appear first, followed by Scope Note (SN), Broader Term (BT), Narrower Term (NT) and Related Term (RT), always in the same order.

2.4 Role of Thesaurus in Indexing

Generally, an index entry represents a concept elaborated in a piece of information. Indexing describes or identifies a document by assigning preferred terms that represent its subject content. Thesauri (controlled vocabularies), classification systems and subject headings are the three most recognized and widely accepted indexing tools. The major roles of a thesaurus are to give indexers a general understanding of the subject area, to outline the inter-relationships between concepts, and to provide definitions of terms, as described by Aitchison, Gilchrist & Bawden (2000). Using a thesaurus to index a specific collection improves the quality of information retrieval in that subject domain.

2.5 Conclusion

A thesaurus plays a vital role in indexing and in the subsequent retrieval of indexed documents. A bibliographic database, print or electronic, contains a large number of records, which need surrogates for effective and exhaustive retrieval. A descriptor is a surrogate for the subject, which embodies the thought content of a document. Therefore, indexing with a standard thesaurus and searching with appropriate terms should retrieve highly relevant documents. Thesaurus construction involves multiple tasks which need to be executed in a logical sequence. All the steps described above were followed while constructing the thesaurus for Indian social science literature in the present research study.

References

1. Aitchison, J., Gilchrist, A., & Bawden, D. (2000). Thesaurus construction and use: a practical manual (4th ed.). London: Aslib.
2. Cleveland, D. B., & Cleveland, A. D. (2001). Introduction to indexing and abstracting (3rd ed.). Englewood, Colo.: Libraries Unlimited.
3. Craven, T. (2008). Thesaurus Construction. Retrieved June 13, 2016, from http://publish.uwo.ca/~craven/677/thesaur/main00.htm
4. ERIC - Education Resources Information Center. (2016). Retrieved June 13, 2016, from https://eric.ed.gov/
5. Redmond-Neal, A., Hlava, M. M. K., Milstead, J. L., & American Society for Information Science and Technology (Eds.). (2005). ASIS&T thesaurus of information science, technology, and librarianship (3rd ed.). Medford, N.J.: Published for the American Society for Information Science and Technology by Information Today, Inc.
6. Soergel, D. (1974). Indexing languages and thesauri: Construction and maintenance. Los Angeles: Melville Publishing Company.
7. Weinberg, B. H. (1998). Thesaurus Design for Information Systems. Seminar presented at Vocabulary Links: Thesaurus Design for Information Systems, New York. Retrieved from http://www.allegrotechindexing.com/article02.htm