

Title: On the unsuitability of the COENG encoding model for Khmer
Source: Cambodian Committee for Standardization of Khmer Characters in Computers
Date: 2002-05-03

We welcome Mr. Michael Everson's recent submission (ISO/IEC JTC1/SC2/WG2 N2412) on the suitability of the COENG encoding model for Khmer, though we cannot agree with him on the main points. We would also appreciate it if he could bring counterarguments, if any, to the remaining points we raised in our earlier documents (ISO/IEC JTC1/SC2/WG2 N2380R and N2406).

First of all, we have to reconfirm a basic point. The model he calls the "COENG encoding model" was called the "virama model" until recently. The critical decision to adopt the existing model in 1998 was made principally on the reasoning that "(t)he main benefit of the virama model was ease of implementation as it is a well-known model" (ISO/IEC JTC1/SC2/WG2 N1729). We have previously shown that Khmer script has no virama sign acting as a general killer, unlike, for example, Devanagari script. The proponents of the current model therefore had to invent a fictional character that serves only as a control code, which led to a model different from the virama model. The fact that they had to change the name of the model when justifying it for Khmer supports our position that it does not correspond to the Khmer reality.

Moreover, the ease of implementation of the existing model is denied even by implementers themselves, which nullifies the reasoning of N1729. For both rendering and sorting, the explicitly encoded subscript model is better than the existing model. In sum, the existing model was decided on the basis of critical misunderstandings.

Now we wish to turn to refuting the new points raised in N2412.

On the inadequateness of the existing model

Mr. Everson quoted a figure from Daniels & Bright 1990 to show that Khmer script came from the Indian Pallava prototype, a descendant of Brahmi. (We found the figure, numbered 55, on p. 448 of Peter T. Daniels and William Bright, eds., The World's Writing Systems, Oxford University Press, 1996.) We have never argued against the point that Brahmi script is an ancestor of Khmer script. We can, however, also point to significant differences with regard to each of the five points of similarity advanced to justify using the same model.

1. While Khmer does indeed have independent vowel characters, their use is limited. Khmer script has another way to represent the same initial vowel sounds: the consonant character QA, which stands for a glottal stop, followed by a dependent vowel sign. The independent vowel characters are used almost exclusively in old words, and some of these words can also be written using QA + dependent vowel sign. [Khmer example] (give) can be written as [Khmer example]. Other words, especially new words, are written using QA + dependent vowel sign only. [Khmer example] (what, something) cannot be written as [Khmer example]. [Khmer example] (hotel) cannot be written as [Khmer example]. The existence of the consonant character QA is one proof of the unique development of Khmer script.

2. While each consonant in Khmer does have an inherent vowel, the Khmer system introduces a new feature: it divides the consonant characters into two series, and the inherent vowel sound of a consonant character depends on the series it belongs to. There are many pairs of characters whose consonant sounds are the same but whose inherent vowel sounds differ.

3. While vowel signs are added to change the inherent vowel sound, the two-series system described above means that the sound of a given vowel sign changes according to the series of the consonant character it is attached to.

4 & 5. These are important points. Another figure, on p. 380 of the 1996 Daniels & Bright book referred to above, shows that Brahmi script diverged into northern and southern scripts before the third century. Pallava is among the southern scripts, while Devanagari belongs to the northern group.

The northern scripts generally form a ligature-like conjunct to represent a consonant cluster, in which the original elements can no longer be distinguished. Sometimes a single conjunct has multiple representation forms. These scripts use a killer sign (virama) to suppress the preceding inherent vowel sound. Historically its use was limited to marking the absence of the inherent vowel sound of a syllable-final consonant, but in the modern age it is also used to suppress the inherent vowel of the first consonant(s) in a consonant cluster in order to simplify complex conjuncts.

This is not always the case with the southern scripts. For them, complex conjuncts representing consonant clusters are rather exceptional. Tamil script has a real general killer sign (pulli), which makes most such conjuncts unnecessary. Telugu developed another way: it has consonant signs that are independent from consonant characters, and it attaches them to the first consonant character to denote consonant clusters. Such differences between northern and southern scripts can easily be seen in the examples of kta that Mr. Everson showed on p. 1 of N2412.

Khmer script came from the southern line, but has had its own history of development for more than 1400 years. It developed a complete system of consonant signs that are positioned below a consonant character. Because of this vertical positioning, a consonant sign is called coeng (leg, foot). Please note that coeng means a subscript consonant sign as a whole, not a killer sign like virama. A consonant character and a subscript consonant sign are completely independent entities. In most cases you can combine them as you like without changing their shapes. Complex conjuncts to represent consonant clusters are not necessary at all.

This system also widened the use of the consonant signs. Sometimes they are used to denote a final consonant sound in a syllable, as follows:

[Khmer example] = [Khmer example] (name)
[Khmer example] = [Khmer example] (both)

The existing COENG encoding model is based on a fictional general killer sign arbitrarily named COENG. This model was invented on the ground that a Khmer subscript consonant sign can be interpreted as a combination of COENG and a consonant character, because a subscript consonant kills the preceding inherent vowel like virama. This reasoning, however, is not adequate for Khmer script. Please see the examples above. In these cases, the consonant sign DOES NOT KILL any preceding inherent vowel sound. Subscript consonant signs in Khmer have more roles than were expected by those who invented the COENG encoding model.

We can refer to another example. Not only a consonant character but also an independent vowel character can have a consonant sign below it. There is no change in the initial vowel sound in the following cases:

[Khmer example] (give)
[Khmer example] (exclamation of solemn affirmation)

These features show the uniqueness of Khmer compared with Indic scripts, especially Devanagari.

The logic of the virama model is artificial. As Mr. Everson himself admits, there is no virama in the original Brahmi script itself, which means that it is not a common or natural feature of the scripts derived from Brahmi. It is just one possible way to handle complex conjuncts for consonant clusters efficiently, through a system of ligature control that relies on the phonetic function of the virama, namely killing the preceding inherent vowel. Thus Mr. Everson's assertion that all the scripts rooted in Brahmi should use the existing model is groundless. It is clear that such logic is not adequate for Khmer.

As shown above, Khmer script has its own unique structure. The existence of subscript consonant signs independent from consonant characters is the core of that structure. Consequently, the explicitly encoded subscript model is far better than the existing model, not only for storing data but also for sorting, searching and rendering, precisely because it fits the structure of the script itself.

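The difference between the two models discussed above can be made concrete with a minimal Python sketch. It encodes the cluster kta under both models. The code points for KA, TA and COENG are those of the Unicode Khmer block; the subscript code point used for the explicitly encoded subscript model is purely hypothetical (a Private Use Area value chosen for illustration), and the function names are likewise assumptions of this sketch, not part of any proposal.

```python
import unicodedata

# Code points from the Unicode Khmer block (U+1780..U+17FF).
KA = "\u1780"     # KHMER LETTER KA
TA = "\u178F"     # KHMER LETTER TA
COENG = "\u17D2"  # KHMER SIGN COENG

# Hypothetical: stand-in Private Use Area code point for a
# "subscript TA" character in an explicitly encoded subscript model.
HYPOTHETICAL_SUBSCRIPT = {TA: "\uE08F"}


def encode_coeng_model(base, subscript):
    """Existing model: base + COENG control character + consonant."""
    return base + COENG + subscript


def encode_explicit_model(base, subscript):
    """Sketch of the explicit model: base + one subscript character."""
    return base + HYPOTHETICAL_SUBSCRIPT[subscript]


def describe(text):
    """List the character name of every code point in the text."""
    return [unicodedata.name(ch, "<private use, hypothetical>") for ch in text]


for label, text in [
    ("COENG model", encode_coeng_model(KA, TA)),
    ("explicit subscript model", encode_explicit_model(KA, TA)),
]:
    print(label, len(text), describe(text))
```

In the first encoding the subscript exists only as a sequence of a control character and a consonant character, so any process that sorts, searches or renders the text has to recognize that sequence first. In the second, the subscript sign is a single code point of its own.
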
On the process

As for the lack of the due process that is necessary in making international standards, we set out the basic facts in ISO/IEC JTC1/SC2/WG2 N2406 and will not repeat them here. We limit ourselves to saying that we stand by our position that an irregular and unacceptable process was followed, without proper consultation with the designated national body.

The tentative results of the five meetings Mr. Everson mentioned were summarized in a private report of the National Higher Education Task Force dated August 14, 1996, addressed to Mr. Maurice Bauhahn. Although it is true that eminent linguists gathered, they did not decide any official or final stance of Cambodia. The report itself says that it is not sufficient. This task force was not given a mandate to make an official decision on this issue. It had nothing to do with the national standards body of Cambodia, which had already been registered with ISO in 1995.

Nevertheless, it is still useful to confirm here that the report clearly listed subscript consonant signs independently from consonant characters among the necessary characters that should be encoded. While non-Cambodians might have suggested that they accept the virama model, they evidently refused to do so. Mr. Everson's assertion that they were not explicitly against the virama model is not supported by the facts shown in the report. We would like to add that some of the scholars mentioned by Mr. Everson clearly support the current Cambodian stance.

On ROBAT

In modern Khmer script, ROBAT has no active role like that of repha in Devanagari, and is seen as just a diacritic. In some old loan words from Sanskrit/Pali, it is pronounced according to its original rule, i.e. just before the base character above which it is attached. In other such old loan words, however, it is not pronounced at all, and is kept only as information about the original spelling.

Although words containing ROBAT can certainly still be seen today, it is not a rule of Khmer script itself to spell a consonant cluster beginning with RO using ROBAT. The rule is to write the consonant character RO with the subscript consonant sign of another consonant below it. More than a hundred examples can be found in Chuon Nath's dictionary, the Cambodian standard. Some words are written in both ways:

[Khmer example] = [Khmer example] (civilized)
[Khmer example] (king hermit)
etc.

Thus Mr. Everson's proposal to deprecate ROBAT, based on the premise that the consonant character RO cannot have a subscript consonant sign, is not acceptable.

On other points

Mr. Everson is trying to play down some of the strong points of the explicitly encoded subscript model we are proposing, but he cannot deny them. That is enough for us. The ultimate reasons for not adopting our model seem to be procedural ones. We also have much to say about procedures, as we wrote in N2406.

Mr. Everson asserts that the UCS as a universal encoding standard and interchange platform would be compromised if our requests were accepted. We do not think so. "Universal" does not mean "all the same". It should mean that everyone can enjoy it. For that purpose, the credibility of Unicode for everyone should be important. Please understand that we are making our proposal to make UCS/Unicode better, not to put it down.

Finally, we would like to add another point. Even if an encoding is not a good one, there is no problem once it has been approved by the concerned parties. The Khmer case is an irregular one in which such due process was not followed. Cambodia has never approved the existing standard. It was never informed or consulted officially. We believe this is a special case, so nobody needs to worry about possible changes to the standards of other scripts that were established with the approval of the concerned parties.