Representing Sumbawa in Unicode


L2/16-096
2016-04-29

pandey@umich.edu
April 29, 2016

1 Introduction

This document offers an approach for representing the Satera Jontal or Sumbawa script in Unicode. This script is used for writing Sumbawa (ISO 639: smw), a Malayo-Polynesian language spoken on Sumbawa, Indonesia. It is an extension of Buginese with language-specific letters and alternate forms.

2 Script Details

Sumbawa contains the following letters, listed by phonemic value:

/k/ /tʃ/ /h/ /g/ /dʒ/ /z/ /ŋ/ /ɲ/ /x/ /p/ /j/ /sj/ /b/ /r/ /f/ /m/ /l/ /q/ /t/ /w/ /ɗ/ /d/ /s/ /n/ /ʔ/, /a/

and the following vowel signs:

/i/ /e/ /u/ /o/

and a vowel-silencing sign.

The structure of Sumbawa is similar to that of Buginese. Each consonant letter possesses the inherent vowel /a/. This vowel is changed by applying dependent vowel signs, which attach to the left, right, above, and below the base consonant. A bare consonant is indicated by a sign that silences the inherent vowel. Some prenasalized consonants are represented using distinctive letters. Several consonants have alternate forms that may co-occur with the regular forms of letters.

The letter /a/ is a vowel carrier and represents the independent form of the vowel /a/. Independent forms of other vowels are represented by attaching vowel signs to the carrier:

/a/ = carrier alone
/i/ = carrier + vowel sign
/e/ = carrier + vowel sign
/o/ = carrier + vowel sign

The vowel-silencing sign is used as follows:

/ka/ + vowel-silencing sign = /k/
/ga/ + vowel-silencing sign = /g/

3 Comparison of Buginese and Sumbawa repertoires

Several letters are shared between Buginese and Sumbawa, but there are differences in the forms and values of letters, as well as letters used in Sumbawa for sounds that are not represented in the standard Buginese script. The comparison in the following subsections uses Buginese as the basis, as it is already encoded in Unicode.
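Before turning to the comparison, the composition rules of section 2 (inherent vowel, dependent vowel signs, vowel carrier, vowel-silencing sign) can be illustrated with a minimal Python sketch. It makes several assumptions that are not part of this document: it borrows the existing Buginese code points for the carrier and for all four vowel signs (the status of the Sumbawa /e/ and /o/ signs is still open), and it stands in for the unencoded vowel-silencing sign with a private-use code point, U+E000, which is purely a placeholder. The function names are hypothetical.

```python
import unicodedata

CARRIER_A = "\u1A15"  # BUGINESE LETTER A, used here as the vowel carrier
VOWEL_SIGNS = {
    "i": "\u1A17",  # BUGINESE VOWEL SIGN I
    "u": "\u1A18",  # BUGINESE VOWEL SIGN U
    "e": "\u1A19",  # BUGINESE VOWEL SIGN E (assumed usable for Sumbawa /e/)
    "o": "\u1A1A",  # BUGINESE VOWEL SIGN O (assumed usable for Sumbawa /o/)
}
# Placeholder only: the vowel-silencing sign has no assigned code point here,
# so a private-use character stands in for it.
VIRAMA_PLACEHOLDER = "\uE000"


def syllable(consonant, vowel="a"):
    """Encode a consonant with a vowel; vowel=None yields a bare consonant."""
    if vowel is None:
        return consonant + VIRAMA_PLACEHOLDER  # inherent /a/ is silenced
    if vowel == "a":
        return consonant  # inherent vowel, nothing to add
    # Vowel signs follow the consonant in the encoded text, wherever they
    # attach visually.
    return consonant + VOWEL_SIGNS[vowel]


def independent_vowel(vowel):
    """Encode an independent vowel as carrier + vowel sign."""
    return syllable(CARRIER_A, vowel)


def describe(text):
    """List character names, labelling the placeholder explicitly."""
    return [unicodedata.name(ch, "PLACEHOLDER (private use)") for ch in text]


if __name__ == "__main__":
    ka = "\u1A00"  # BUGINESE LETTER KA
    for label, text in [
        ("/ka/", syllable(ka)),
        ("/ki/", syllable(ka, "i")),
        ("/k/", syllable(ka, None)),
        ("/e/", independent_vowel("e")),
    ]:
        print(label, describe(text))
```

Keeping the vowel-silencing sign as a single named constant means only one line would change once a real code point is assigned.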

3.1 Current repertoire for Buginese

3.1.1 Consonants

The Buginese block contains 23 letters. Of these, 12 are identical or nearly identical in Sumbawa:

ᨀ ᨆ ᨈ ᨉ ᨊ ᨍ ᨎ ᨑ ᨓ ᨔ ᨖ ᨕ

Seven consonants have different forms:

ᨁ ᨂ ᨄ ᨌ ᨐ ᨒ ᨅ

The Buginese block contains four letters that are not used in Sumbawa. These represent prenasalized consonants of the Bugis language:

ᨃ ᨇ ᨋ ᨏ

3.1.2 Vowel signs

Buginese contains 5 vowel signs, of which 2 are identical in Sumbawa (vowel signs i and u), while 2 have different forms (vowel signs e and o). One sign is not used in Sumbawa (vowel sign ae).

3.1.3 Punctuation

Both Buginese marks of punctuation are used in Sumbawa. The ꧏ U+A9CF JAVANESE PANGRANGKEP is used for marking repetition of syllables.

3.2 Missing characters

The Buginese block does not have characters that correspond to the following 6 letters required for representing Sumbawa:

/z/ /f/ /x/ /q/ /sj/ /ɗ/

Additionally, there are 5 alternate forms of letters that have the potential of being treated as distinctive characters rather than as glyphic variants:

/dʒ/ /dʒ/ /r/ /a/ /a/

Buginese does not have a vowel-silencing sign, but such a character is used in Sumbawa.

4 Approach for encoding

The Buginese block contains 30 characters: 23 consonant letters, 5 vowel signs, and 2 punctuation signs. Representing Sumbawa in Unicode requires 30 characters: 25 letters, 4 combining vowel signs, and 1 vowel-silencing sign. Of these letters, 13 are distinctive, while 12 can be represented using existing Buginese characters. Of the vowel signs, 2 are identical, 2 may be considered to be alternate forms, and 1 does not occur in Sumbawa. In total, a minimum of 14 new characters is required for Sumbawa: the 13 distinctive letters and the vowel-silencing sign. There is a potential to encode an additional 7 characters: 5 alternate letters and 2 alternate vowel signs.

The following actions are required:

1. As the Buginese block in the BMP has only two spaces remaining, and as there is no free space in the BMP, a new block should be created in the SMP with the name Buginese Extensions. The block should encompass at least 5 columns to accommodate characters from other orthographies.

2. Encode the following 13 letters in Buginese Extensions. As some letters may also be used in orthographies for other languages, character names should be generic and not specific to Sumbawa:

ga nga pa ba ca ya la za kha sya fa qa dda

3. Encode the vowel-silencing sign (virama) as a combining sign in the existing Buginese block (see L2/16-075).

4. Determine whether the following vowel signs are distinctive characters or glyphic variants:

vowel sign e
vowel sign o

5. Identify the status of the following alternate forms as distinctive letters or glyphic variants:

western ja
eastern ja
western ra
western a
eastern a

A formal proposal for encoding letters of Sumbawa and other Buginese-based scripts is forthcoming.

5 References

Miller, Christopher. 2010. Unicode Technical Note #35: Indonesian and Philippine Scripts and Extensions. http://www.unicode.org/notes/tn35/

Pandey, Anshuman. 2016. Proposal to encode VIRAMA signs for Buginese. L2/16-075. http://www.unicode.org/l2/l2016/16075-buginese-virama-signs

Shiohara, Asako. 2014. The Satera Jontal Script in the Sumbawa District in Eastern Indonesia. Presented at the International Workshop on Endangered Scripts of Island Southeast Asia, Tokyo University of Foreign Studies, February-March 2014. http://lingdy.aacore.jp/doc/endangered-scripts-issea/asako_shiohara_paper.pdf
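As a cross-check of the repertoire comparison in section 3.1.1 and the character counts in section 4, the following minimal Python sketch recomputes the totals. The grouping of code points follows the three letter lists given above; the code point values themselves come from the Buginese code chart rather than from this document, and the function names are hypothetical.

```python
# Letters of the Buginese block (U+1A00..U+1A16), grouped as in section 3.1.1.
IDENTICAL = [0x1A00, 0x1A06, 0x1A08, 0x1A09, 0x1A0A, 0x1A0D,
             0x1A0E, 0x1A11, 0x1A13, 0x1A14, 0x1A16, 0x1A15]  # shared forms
DIFFERENT = [0x1A01, 0x1A02, 0x1A04, 0x1A0C, 0x1A10, 0x1A12, 0x1A05]
UNUSED = [0x1A03, 0x1A07, 0x1A0B, 0x1A0F]  # prenasalized, not in Sumbawa

MISSING_LETTERS = 6        # /z/ /f/ /x/ /q/ /sj/ /ɗ/ (section 3.2)
SUMBAWA_VOWEL_SIGNS = 4    # /i/ /e/ /u/ /o/
VOWEL_SILENCING_SIGNS = 1
ALTERNATE_LETTERS = 5      # western/eastern ja, western ra, western/eastern a
ALTERNATE_VOWEL_SIGNS = 2  # vowel signs e and o


def check_partition():
    """True if the three groups cover the 23 Buginese letters exactly once."""
    groups = IDENTICAL + DIFFERENT + UNUSED
    block_letters = set(range(0x1A00, 0x1A17))
    return len(groups) == len(set(groups)) and set(groups) == block_letters


def sumbawa_tally():
    """Recompute the totals stated in section 4."""
    new_letters = len(DIFFERENT) + MISSING_LETTERS
    letters = len(IDENTICAL) + new_letters
    return {
        "letters": letters,
        "new_letters": new_letters,
        "required": letters + SUMBAWA_VOWEL_SIGNS + VOWEL_SILENCING_SIGNS,
        "new_minimum": new_letters + VOWEL_SILENCING_SIGNS,
        "optional": ALTERNATE_LETTERS + ALTERNATE_VOWEL_SIGNS,
    }


if __name__ == "__main__":
    print("partition ok:", check_partition())
    for key, value in sumbawa_tally().items():
        print(key, value)
```

The tally reproduces the figures in section 4: 25 letters, 30 required characters, a minimum of 14 new characters, and 7 optional ones.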

Figure 1: Title page of a Sumbawa script primer (from Shiohara 2014).

Figure 2: Road signs in Sumbawa script (from Shiohara 2014).

Figure 3: Chart showing characters of Satera Jontal or the Sumbawa script. Source: http://omniglot.com/writing/sumbawa.htm