Feedback on Draft Devanagari Script Behaviour for Hindi Version

Feedback on Draft Devanagari Script Behaviour for Hindi, Version 1.4.10

S. No. | Feedback/Remark from TDIL-D Portal Users | Comments

1. The definition of the Indic syllable has been revised as under:

   V[m] | {CH}C[v][m] | CH

The linguistic definition of the Indic syllable has been mapped to ABNF (Augmented Backus-Naur Form) for the purposes of text segmentation, line breaking, drop letters, and letter spacing in horizontal and vertical text. The definition is elaborated below, taking Hindi as an example.

The definition is a combination of three rules:

Rule 1: V[m]
Rule 2: {CH}C[v][m]
Rule 3: CH (this rule is applicable only at the end of a word)

where

V (upper case) is a complete vowel
m is a modifier (anusvara / visarga / chandrabindu)
C is a consonant as per the Unicode definition, which may or may not include a nukta
v (lower case) is any dependent vowel or vowel sign (mātrā)
H is the halant / virama
| is the rule separator
[ ] encloses an optional item
{ } encloses an item or items that may be repeated (zero or more times, as the examples under Rule 2 show)

Examples:

Rule 1: V[m]

Sl. No. | Examples | Definition
1. | अ, ई, उ | V (vowel) is a syllable
2. | अ, उ, आ | V + modifier is a syllable

Rule 2: {CH}C[v][m]

Sl. No. | Examples | Definition
1. | र, क, ज, ऱ, म | A consonant is a syllable
2. | प प,क ख,च त, ज जज जव, त कक ऱ,त क न | Zero or more consonant + virama sequences followed by a consonant is a syllable
3. | तत, त क त, त क नत, त क नयत, फ क | Zero or more consonant (+ nukta) + virama sequences followed by a consonant is a syllable
4. | ततत, त क नयतत, फ कज, क यत | Zero or more consonant (+ nukta) + virama sequences followed by a consonant (+ nukta) followed by a vowel sign is a syllable
5. | त, त, स त र, त, फ कज | Zero or more consonant (+ nukta) + virama sequences followed by a consonant (+ nukta) followed by a modifier is a syllable
6. | त क नयतत: त क नयय, त क नयय, फ कज,ह | Zero or more consonant (+ nukta) + virama sequences followed by a consonant (+ nukta) followed by a vowel sign and a modifier is a syllable
7. | स,स ज जज,ख वत | Zero or more consonant + halant sequences followed by a consonant followed by a vowel sign is a syllable

Rule 3: CH

त्, व्, म्, भ् etc. are syllables in Hindi only at the end of a word.

Examples of combinations of the rules:

1. स्वागतम् = CHCv + C + C + CH has the following syllables: स्वा (CHCv), ग (C), त (C), म् (CH)
2. भरतनाट्यम् = C + C + C + Cv + CHC + CH has the following syllables: भ (C), र (C), त (C), ना (Cv), ट्य (CHC), म् (CH)
3. द बयद ध = C + CHCv + CHCv has the following syllables: द (C), बय (CHCv), द ध (CHCv)

The proposed definition is generic in nature and has already been tested for 11 Indian languages, i.e. Hindi, Marathi, Bengali, Nepali, Tamil, Telugu, Kannada, Gujarati, Punjabi, Oriya and Malayalam. The new rule for the occurrence of CH (consonant + halant) at the end of a word has been introduced. Testing of the remaining languages is underway.

From Prashant Verma, Sr. Software Engineer, W3C India

2. Please refer to Annexure-1 for suggestions on the Draft Standard Devanagari Script Behaviour.

From Mahesh Chander Vashisth
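The syllable definition in item 1 lends itself to a small regular-expression sketch. The Python below reads the three rules as V[m], {CH}C[v][m] and CH and applies them to a single word. The function name `segment` and the code point ranges chosen for each class are illustrative assumptions, not taken from the draft. Rule 3 is approximated: it is tried last, so it only matches a consonant + halant that no consonant follows, rather than checking for a true word boundary.

```python
import re

# Character classes (assumed ranges from the Unicode Devanagari block).
V = "[\u0904-\u0914\u0960\u0961\u0972-\u0977]"   # independent (complete) vowels
M = "[\u0901-\u0903]"                            # chandrabindu, anusvara, visarga
C = "(?:[\u0915-\u0939\u0958-\u095F]\u093C?)"    # consonant, optionally with nukta
VS = "[\u093E-\u094C\u0962\u0963]"               # dependent vowel signs (matras)
H = "\u094D"                                     # halant / virama

SYLLABLE = re.compile(
    f"{V}{M}?"                            # Rule 1: V[m]
    f"|(?:{C}{H})*{C}(?!{H}){VS}?{M}?"    # Rule 2: {CH}C[v][m]
    f"|{C}{H}"                            # Rule 3: CH (approximates word-final)
)

def segment(word):
    """Return the syllables of a word; unmatched characters are skipped."""
    return SYLLABLE.findall(word)

# Swagatam: CHCv + C + C + CH
print(segment("\u0938\u094D\u0935\u093E\u0917\u0924\u092E\u094D"))
```

The negative lookahead after the final consonant of Rule 2 keeps a trailing consonant + halant together, so that it falls through to Rule 3 instead of being split into a bare consonant and an orphaned halant.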

3. 1. Insofar as Akshar is concerned, once it is frozen, we can modify the text accordingly.
   2. Also, the Indic definition currently included has Devanagari examples only, and they should be language specific. (Clarification sought on the above statement: which section is being referred to?)
   3. A section for developers was possible for Hindi because CHD had the Manak Hindi document for Hindi. Such documents do not exist for other languages, and hence the section for developers has been left out. We request TDIL, DeitY to help us locate such documents, if available, from the respective states. To the best of our knowledge, most of them do not have one.

From Mahesh Kulkarni, Associate Director and HoD, GIST

4. CDAC has provided script behaviour documents (Part A) for six languages plus Hindi, and others are in the pipeline. However, as mentioned in our earlier mail, unlike Hindi, Part B (Guidelines for Developers) cannot be provided for the other languages, since to the best of our knowledge the State IT Secretaries do not have documentation pertaining to the same. In a discussion with you on 25th July 2014, the following was decided.

1. Part B comprises two sections:
   1. Technical Guidelines
   2. Linguistic Information pertaining to the language

2. Insofar as the technical guidelines were concerned, it was decided as under:

SECTION | HEAD | RESPONSIBILITY/CONTACT INSTITUTION
1.1, 1.2 | Script and Historical approach | Community, State IT Secretary
1.3 to 1.6 | Encoding Principles and Akshar | Information about Akshar is provided for each language in Part A, Section 6.2. It is recommended that the same be removed from Part B in the case of Hindi. TDIL to provide the same.
1.7 | UAX Segmentation Rules | TDIL to provide the same.
1.8 | Rendering rules | Detailed description not available for other languages in Ch. 9.0 of Unicode. IT Secretaries to be requested to provide the same.
1.9 | ZWJ/ZWNJ | Made available in Annexure 2 for each language.
1.10 | Cursor Movement and Deletion | These are derived from Akshar, and it was felt that Microsoft be contacted to provide the same for other languages. The Hindi rules were taken from the Microsoft site.
1.11 | Normalization | This section was specifically provided for Hindi, where this affects the script. No such rules exist for other languages. However, the Unicode normalisation rules should be referred to: http://unicode.org/reports/tr15/

3. Insofar as the Linguistic Information is concerned, it is requested that each State IT Secretary be asked to provide the same, as mandated by the respective state government for that particular language. It was also suggested that the Style Guides prepared by CIIL be referred to by the State Governments for mandating the said linguistic information.

It was felt that the following disclaimer be put on the website for the Script Behaviour documents:

DISCLAIMER
The script behaviour document comprises a set of recommendations laid down by the experts for the use of the community. It does not purport in any way to be prescriptive in nature, nor is it to be interpreted as a Standard. In case of any difference of opinion, please provide feedback on the portal. Your contribution will be appreciated.

From Mahesh Kulkarni, Associate Director and HoD, GIST
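To illustrate why the Unicode normalisation rules referred to under section 1.11 (http://unicode.org/reports/tr15/) matter for Devanagari, the following Python sketch uses the standard `unicodedata` module to compare nukta letters with their consonant + nukta spellings. The helper name `same_text` is a hypothetical choice for this example and does not come from the draft.

```python
import unicodedata

def same_text(a, b):
    """Compare two strings after bringing both to Normalization Form C."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

nnna_single = "\u0929"          # DEVANAGARI LETTER NNNA as one code point
nnna_sequence = "\u0928\u093C"  # NA followed by NUKTA

qa_single = "\u0958"            # DEVANAGARI LETTER QA as one code point
qa_sequence = "\u0915\u093C"    # KA followed by NUKTA

# The raw strings differ, but they are canonically equivalent.
print(nnna_single == nnna_sequence)            # False
print(same_text(nnna_single, nnna_sequence))   # True
print(same_text(qa_single, qa_sequence))       # True

# NFC composes NA + NUKTA into U+0929, but QA is a composition exclusion,
# so its NFC form is the two-code-point sequence KA + NUKTA.
print(unicodedata.normalize("NFC", nnna_sequence) == nnna_single)  # True
print(unicodedata.normalize("NFC", qa_single) == qa_sequence)      # True
```

Normalising both sides before comparison, search or sorting avoids treating visually identical Hindi text as different merely because one input method emitted the precomposed letter and another emitted the sequence.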

Annexure-1