English to Marathi Rule-based Machine Translation of Simple Assertive Sentences

G.V. Garje, G.K. Kharate and M.L. Dhore

Abstract: This paper presents a proposed system for machine translation of simple assertive sentences in English to their Marathi counterparts using a rule-based approach. The system takes a simple assertive English sentence as input and performs lexical analysis to produce tokens (lexemes). Each token generated by the lexical analyzer is searched in the English dictionary (lexicon). If the token is present in the lexicon, it is retrieved along with its morphological information. Where required, local word grouping is performed based on the morphological information of the tokens. The local word groups thus formed are checked against the grammar rules of English for the syntactic validity of the input sentence, using a bottom-up parsing technique. For syntactically correct input, the Marathi token corresponding to each English token is searched in the target language dictionary. If Marathi tokens are found for all English tokens, the Marathi sentence is generated using Marathi grammar rules. The paper emphasizes the development of production rules for simple English sentences and an innovative rearrangement of target language tokens.

Index Terms: Language Translation, Lexical Analysis, Local Word Grouping, Local Word Separator, Machine Translation, Morphological Analysis, NLP.

I. INTRODUCTION

India is a multilingual, multicultural country where 22 official languages and approximately 2000 dialects are spoken by different communities [2]. English and Hindi are used for official work in the majority of the states of India, while state governments predominantly carry out their official work in their regional languages such as Hindi, Bengali, Marathi, Tamil, Kannada, Telugu, Punjabi, Gujarati, Oriya etc.
The people of different states use these regional languages for oral as well as written communication. All official documents and reports of the Union government are published in English or in both English and Hindi. Translating these documents manually into a regional language is a very time-consuming and costly task. Hence there is a need to develop good machine translation (MT) systems in order to establish better communication between the state and Union governments and to support the exchange of information amongst people of different states with different regional languages. English continues to be the link language in India. Machine translation has a much greater significance in breaking the language barrier within the sociological and regional structure [1]. A few MT systems from English to Indian languages have been developed for specific domains, and the work is still going on [2]. Developing general-purpose English to Indian language MT systems is difficult because of the complex and free word order nature of different Indian languages. English has simple, complex and compound sentences. Simple sentences are further subdivided into Interrogative, Assertive, Negative and Exclamatory. Developing a tool for each sub-type of simple sentence and integrating them to form a full-fledged MT tool could be a better option. Marathi is a low-resource Indian language.

Author affiliations:
G.V. Garje is an Associate Professor and Head of the Department of Computer Engineering, PVG's College of Engineering & Technology, Pune, Maharashtra, India (gvg_comp@pvgcoet.ac.in).
G.K. Kharate is Principal at Matoshri College of Engineering & Research Centre, Eklahare, Nashik, Maharashtra, India (gkkharate@rediffmail.com).
M.L. Dhore is a Professor of Computer Engineering at Vishwakarma Institute of Technology, Pune, Maharashtra, India (manikdhore@vit.edu).
A tool for translating simple interrogative English sentences to Marathi has already been attempted [1]. We propose a system for translating simple assertive English sentences into Marathi sentences. Machine translation has different architectures such as Direct, Transfer-Based, Interlingua, Statistical, Example-Based and Hybrid. Each has its advantages and disadvantages, and the approach can be selected based on the application domain. We have selected the transfer-based architecture with a rule-based approach to implementation [2][8].

II. RELATED WORK

The field of Natural Language Processing has emerged in its own right, and a large number of research groups around the world are working on it. In India, individual researchers, organizations and consortia have worked continuously for the last 15 years on MT systems for English to Indian languages and between Indian languages. A few noteworthy systems are described below.

The Anusaaraka project (1995), started by Akshar Bharati at IIT Kanpur and now continued at IIIT Hyderabad, addresses Indian language to Indian language MT and has been tested on translating simple Telugu sentences into the corresponding Hindi sentences.

MANTRA (1997, 1999) is an MT system started by Akshar Bharati and further developed by Hemant Darbari and Manish Kumar Pande for the Rajya Sabha Secretariat, the Upper House of the Parliament of India, to translate parliamentary proceedings such as papers to be laid on the Table of the House and Bulletin Part-I and Part-II.

Anglabharati (2001) and Anglabharti-II (2004) are MT systems developed by R.M.K. Sinha et al. for English to Indian languages.

The Shiva and Shakti MT systems (2003) were designed using an example-based approach and a combination of rule-based and statistical approaches. The Shakti system works for three target languages: Hindi, Marathi and Telugu. Shiva and Shakti are two English to Hindi MT systems developed jointly by Carnegie Mellon University, USA, IIIT Hyderabad and IISc Bangalore, India.

The MaTra system (2004, 2006), developed by Ananthakrishnan R. et al., uses a transfer-based approach to translate news, annual reports and technical phrases from English to Hindi.

A consortium of institutions, namely C-DAC Mumbai, IISc Bangalore, IIIT Hyderabad, C-DAC Pune, IIT Mumbai, Jadavpur University Kolkata, IIIT Allahabad, Utkal University Bhubaneswar, Amrita University Coimbatore and Banasthali Vidyapeeth, Banasthali, is developing EILMT (2006-), an MT system for English to Indian languages in the tourism and healthcare domains. The project is funded by the Department of Information Technology, MCIT, Government of India. The consortium has developed the Sampark MT system (2009). The role of C-DAC Mumbai is to develop statistical models and resources for a statistical MT (SMT) system from English to Hindi, Marathi and Bengali.

A rule-based MT system from English to Urdu using the transfer approach was developed by Naila Ata et al. at Karachi, Pakistan [6]. They handled case phrases and verb postpositions through concepts of Paninian grammar. Dilshan De Silva et al. developed a Sinhala to English translator with various built-in tools such as a grammar tool, dictionary, Unicode fonts, debugging tool and add-word tool [4]. Many more MT systems have been developed in India and abroad.

III. SYSTEM OVERVIEW

English has an SVO (Subject-Verb-Object) structure, whereas Marathi follows an SOV structure and has relatively free word order [1][8]. The overall architecture of the system is depicted in Fig. 1. The input to the system is a simple assertive English sentence such as "He is going to school". The Lexical Analyzer splits the input sentence into tokens (lexemes) separated by delimiters. The Lexicon (dictionary) contains a set of known words along with their complete morphological information, such as root, category, case, gender and number. The Morphological Analyzer accepts a token and checks whether that token is present in the lexicon. If the token is present, the system retrieves its complete morphological information.

Fig. 1. Architecture of proposed system

IV. COMPONENTS OF THE SYSTEM

The proposed system is composed of the following components:
Lexical Analyzer
Morphological Analyzer
Parser
Mapping module
Local word Grouper
Tokens rearrangement
Transliteration

A. Lexical Analyzer

This module splits the given English sentence into tokens and removes delimiters.
Input: Sentence
Output: Tokens/Lexemes.

B. Morphological Analyzer

Given a word, the morphological analyzer identifies the root and the grammatical features of the word. For languages that are not rich in inflections, a simple lookup dictionary that contains all the word forms would be sufficient. Creating such a dictionary for inflectionally rich languages is nontrivial and requires huge storage and high-performance computing. The best alternative is to keep a dictionary of root words and to attach the grammatical prefixes and suffixes during target language generation. Part of Speech (POS) tagging is the process of assigning a part of speech to each word in the sentence.
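The lexical and morphological analysis steps described above can be illustrated with a minimal Python sketch. It is only a sketch under stated assumptions: the `LEXICON` layout and the function names `tokenize` and `analyze` are hypothetical and not taken from the authors' implementation, and the entries simply mirror the morphological information given for the example sentence "He is going to school".

```python
import re

# Hypothetical miniature lexicon (assumption): keys are surface forms,
# values hold the morphological information listed for the example sentence.
LEXICON = {
    "he": {"root": "he", "category": "pronoun", "gender": "male",
           "person": "third", "number": "singular"},
    "is": {"root": "is", "category": "auxiliary"},
    "going": {"root": "go", "category": "verb"},
    "to": {"root": "to", "category": "preposition"},
    "school": {"root": "school", "category": "noun", "gender": "female",
               "number": "singular", "person": "third", "type": "common"},
}


def tokenize(sentence):
    """Lexical analysis: split into lowercase word tokens, dropping delimiters."""
    return re.findall(r"[A-Za-z']+", sentence.lower())


def analyze(sentence):
    """Morphological analysis: look every token up in the lexicon.

    Raises KeyError when a token is missing, since the described system
    only proceeds when each token is found in the dictionary.
    """
    analysed = []
    for token in tokenize(sentence):
        if token not in LEXICON:
            raise KeyError(f"token not in lexicon: {token}")
        analysed.append((token, LEXICON[token]))
    return analysed


if __name__ == "__main__":
    for token, info in analyze("He is going to school."):
        print(token, info)
```

Storing only one entry per surface form keeps the sketch short; a root-word dictionary with separate inflection handling, as the text recommends for inflectionally rich languages, would need an additional stemming step that is not shown here.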
Identifying the part of speech (noun, verb, adjective, adverb and so on) of each word helps in analyzing the role of each constituent in the sentence.
Input: Token
Output: Gender, Number, Person, Tense, Root.

Example:
Input: He is going to school
Output of morphological analyzer:
He [Root: he, Gender: male, Person: third, Number: singular, Category: pronoun]
is [Root: is, Category: auxiliary]
going [Root: go, Category: verb]
to [Root: to, Category: preposition]
school [Root: school, Category: noun, Gender: female, Number: singular, Person: third, Type: common]

C. Shallow Parser

Shallow parsing involves identifying simple noun phrases, verb groups, adjectival phrases and adverbial phrases in a sentence, along with the boundaries of these chunks and their labels. Normally in language processing, sentences are parsed to identify their full syntactic structure. There are more similarities than differences between Indian languages; for example, the Marathi and Hindi language pair does not require a full parse. In this system, MT is performed without a full sentential parser. Structural transformation is required only when a source language structure does not have an equivalent structure in the target language. A partial (shallow) parse is sufficient to identify the specific constituents of the sentence that have to undergo transformation. This component also includes the task of transliteration, which is done among Indian languages. Transliteration allows a word or words to be rendered in the script of the reader.
Input: Tokens
Output: Syntactically checked sentence

D. Mapping Module

In this module, each English word is mapped to the corresponding Marathi word. Pratyayas (suffixes) are attached to verbs and nouns according to gender, number, person and tense. Pratyayas are attached to adjectives according to the noun or pronoun they modify.

E. Local Word Grouper

Adjectives are grouped with their corresponding nouns and pronouns (not all of these need be present in every case). If only a pronoun or only a noun exists, it is considered one group. Adverbs are grouped with their corresponding verbs. Adverbs may be absent; in such cases a single verb acts as one group.
Prepositions are grouped with the respective noun or pronoun.

Transforming Algorithm: In the Local Word Grouper (LWG) [3], all tokens are grouped after morphological information has been assigned to them. The pronoun "your" and the noun "name" become a Noun Phrase (NP) according to the rules given below. The NP and VP-PPS become NP-VP-PPS. Finally, the pronoun "what", the auxiliary verb "is" and NP-VP-PPS form a sentence S. In this way the LWG proceeds using a bottom-up parsing technique. The rules used in the LWG for "What is your name?" are as follows:

PRONOUN := he | NULL
AUX := is | NULL
NP := name | NULL
NP := PRONOUN NP
VP := VERB
VERB := NULL
PP := PREP NP
PREP := NULL
VP-PPS := VP PP
NP-VP-PPS := NP VP-PPS
S := PRONOUN AUX NP-VP-PPS

Fig. 2. Parse tree generated by syntax analysis

The structure of a valid sentence, represented as a parse tree, is shown in Fig. 2. Syntactic analysis uses the result of morphological analysis to build a structural description of the sentence. A shallow parser is designed for the source language, and the lexicon is built to store the root words of the source language followed by those of the target language. The rules above form a simple context-free phrase structure grammar for English.

The Local Word Separator separates the tokens of the sentence produced by the LWG so that the corresponding token can be searched in the target language dictionary. The Mapping Block maps the tokens of the source language (English) to the corresponding tokens of the target language (Marathi).
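The bottom-up grouping described above can be illustrated with a small shift-reduce recognizer in Python. The grammar used here is a simplified assumption chosen to cover the category sequence of "He is going to school" (PRONOUN AUX VERB PREP NOUN); it is not the paper's exact rule set, and the names `RULES`, `parse` and `is_valid` are hypothetical.

```python
# Simplified production rules (assumption): each entry is (left-hand side,
# right-hand side). Reduction is greedy and tries the rules in this order.
RULES = [
    ("NP", ("NOUN",)),
    ("VP", ("VERB",)),
    ("PP", ("PREP", "NP")),
    ("VP-PPS", ("VP", "PP")),
    ("S", ("PRONOUN", "AUX", "VP-PPS")),
]


def reduce_once(stack):
    """Replace the top of the stack with a rule's left-hand side if one matches."""
    for lhs, rhs in RULES:
        n = len(rhs)
        if len(stack) >= n and tuple(stack[-n:]) == rhs:
            del stack[-n:]
            stack.append(lhs)
            return True
    return False


def parse(categories):
    """Shift each category onto the stack, reducing as far as possible after each shift."""
    stack = []
    for category in categories:
        stack.append(category)
        while reduce_once(stack):
            pass
    return stack


def is_valid(categories):
    """A sentence is accepted when the whole input reduces to the start symbol S."""
    return parse(categories) == ["S"]


if __name__ == "__main__":
    print(parse(["PRONOUN", "AUX", "VERB", "PREP", "NOUN"]))
```

A greedy reducer like this is adequate only for small, unambiguous grammars; a rule set with competing reductions would need lookahead or backtracking.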

Though Marathi is a relatively free word order language, a few units occur in fixed order. Noun phrases (NP) and verb phrases (VP) can be formed using only local information and, more importantly, they provide sufficient information for further processing of the sentence. In this way the LWG provides all the necessary information with minimal computational effort.

F. Tokens rearrangement Algorithm

The local word groups need to be rearranged in order to generate a valid target language (Marathi) sentence. After studying a number of research papers and analyzing a set of sentences, we propose an algorithm for the rearrangement of words. A regular pattern is observed in most of the sentences analyzed: keeping the first group (the noun phrase) as it is and then reversing the order of the remaining groups produces a valid Marathi sentence. This technique works for most of the sentences tested using the algorithm presented in this paper. In our assessment, it is simpler to understand and implement than the other methods we considered. This forms the major part of the research work.

Algorithm:
1. Read in the English sentence.
2. Perform lexical analysis, i.e. obtain tokens.
3. Perform morphological analysis, i.e. retrieve each token from the dictionary along with its morphology.
4. Check the syntax of the input sentence using production rules.
   If (syntax is okay) {
      Retrieve corresponding Marathi tokens
      Perform mapping of English tokens to corresponding Marathi tokens
      Attach inflections (pratyayas) to Marathi tokens
   } else go to step 1
5. Apply the tokens rearrangement algorithm.
6. Perform transliteration (use any transliteration tool).
7. Generate the target language (Marathi) sentence.
8. End

Tokens rearrangement Algorithm:
1. Traverse the input sentence from left to right.
2. While (not end of sentence) {
      Keep the position of the NP/Noun intact and traverse the sentence, skipping the NP/Noun, till the end
      Reverse the sentence till the NP/Noun
      Map source language tokens to target language tokens
   }
3. End

Example:
Input English Sentence: I went to the market with my mother to buy apples.
Word to Word Transliterated Mapping for Local Language: mi gelo la bajara barobara mazhya AI NyAsAThi kharida saparchamda
Local Word Grouping: mi gelo bajarala mazhya AIbarobara kharidanyasathi saparchamda
After Applying Rearrangement Algorithm: mi saparchamda kharidanyasathi mazhya AIbarobara bajarala gelo
Translation in Marathi using Transliteration: [Devanagari text missing]

G. Transliteration

After the rearrangement, the text, which is transliterated in Western (Roman) script, is converted into Devanagari script. Many transliteration tools are available to convert source language script to target language script (Roman to Devanagari in this case). Here it is done using the Akshara Bridge tool.

V. MATHEMATICAL MODEL

The overall process of rule-based translation is described by the following mathematical model. The terminology used in the model is:

STT: Source Translation Token, equivalent to a single word of English.
TTT: Target Translation Token, equivalent to a single word of Marathi.
$S = \{STT_1, STT_2, \ldots, STT_n\}$ is the set of tokens in English.
$T = \{TTT_1, TTT_2, \ldots, TTT_n\}$ is the set of tokens in Marathi.
$T_s \subseteq T$ is the subset of tokens of category Noun.
$T_v \subseteq T$ is the subset of tokens of category Verb.
$T_o \subseteq T$ is the subset of tokens of category Object.
$T_a \subseteq T$ is the subset of tokens of category Article.
$T_p \subseteq T$ is the subset of tokens of category Preposition.
$T_u \subseteq T$ is the subset of tokens of category Auxiliary verb.
$S$ is a legitimate input sentence in the source language English.
$T$ is the output sentence in the target language Marathi.
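The rearrangement rule of Section IV-F (keep the first local word group in place and reverse the order of the remaining groups) can be sketched in Python as follows. The function name `rearrange` and the representation of a sentence as a list of group strings are assumptions made for illustration; the groups themselves are taken from the worked example in Section IV-F.

```python
def rearrange(groups):
    """Keep the first local word group fixed and reverse the remaining groups."""
    if not groups:
        return []
    return [groups[0]] + list(reversed(groups[1:]))


if __name__ == "__main__":
    # Local word groups for "I went to the market with my mother to buy apples."
    groups = [
        "mi",                 # first group (subject), left in place
        "gelo",
        "bajarala",
        "mazhya AIbarobara",  # treated here as one group (assumption)
        "kharidanyasathi",
        "saparchamda",
    ]
    print(" ".join(rearrange(groups)))
```

Treating "mazhya AIbarobara" as a single group is what keeps the possessive in front of its noun after reversal; with the two words as separate groups the output order would differ from the paper's example.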

$C \subseteq S \times T$ is a bilingual corpus/dictionary of root words.
$M : STT_i \to TTT_i$ is the translation model used to directly map $STT_i \in S$ to $TTT_i \in T$ using the bilingual corpus $C$.
$LWG(T)$ is the language model used to perform local word grouping using the grammar of the target language Marathi, based on whether an article, preposition or auxiliary verb appears before $TTT_i$.

Step 1: Input sentence $S$.

Step 2: $n = \mathrm{count}(S)$, where $\mathrm{count}(S)$ counts the number of tokens (words) in the sentence $S$.

Step 3: $T = \bigcup_{i=1}^{n} M(STT_i)$, with $M(STT_i) = TTT_i$. Translate individual tokens from the source to the destination language using the corpus $C$.

Step 4: For $i = 1$ to $n$,
$\mathrm{tag}(TTT_i) = \begin{cases} s & \text{if } TTT_i \in T_s \\ v & \text{if } TTT_i \in T_v \\ o & \text{if } TTT_i \in T_o \\ a & \text{if } TTT_i \in T_a \\ p & \text{if } TTT_i \in T_p \\ u & \text{if } TTT_i \in T_u \end{cases}$
Tag the individual tokens (words) of the translated sentence $T$.

Step 5: $LWG(T)$ applies the reorderings $as \to sa$, $aps \to sap$, $pv \to vp$, $ps \to sp$, $uv \to vu$. The language model is used for local word grouping using the grammar of Marathi.

Step 6: $T = TTT_1 \cup \bigcup_{i=n}^{2} TTT_i$, where $TTT_1 \in T$ and $TTT_i \in T$.

Step 7: Translation in Marathi using transliteration.

VI. TARGET LANGUAGE (MARATHI) GENERATION

A. Noun

1. Karant to noun: If a preposition is present in the noun phrase (NP), the karant pratyaya is applied to the NP as per Table I. For example, in "He came with me", "me" forms a preposition-noun group, and "maza" becomes "mazya".

2. Shasti pratyaya to noun: If a noun in the sentence carries the possessive ('s), then a pratyaya (cha, chi, che etc.) is applied to it depending on the gender of the noun following it. For example, in "That is Goraksha's book", "Goraksha's" becomes "Gorakshache".

3. Karant to proper noun: No karant is applied to a proper noun. For example, in "He came for Alekh", "Alekh" is a proper noun, hence no karant is applied to it.

Likewise, rules are written for all cases of nouns, pronouns, adjectives and participles. Some of the rules are shown in Table I.

B. Verb and Auxiliary

The pratyaya of a verb depends on the tense of the sentence and on gender, number and person, as shown in Table II and Table III. The output is generated in Roman script.
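As a hedged illustration of how a verb pratyaya might be selected from person and number, the sketch below encodes the masculine present-tense forms of the root "ja" listed in Table III as a lookup table. The names `PRESENT_MASCULINE_SUFFIX` and `conjugate_present` are hypothetical, and only the six forms given in Table III are covered; feminine and neuter forms and other tenses are not modelled.

```python
# Masculine present-tense suffixes taken from Table III (root "ja").
# Keys are (person, number); the dictionary layout is an assumption.
PRESENT_MASCULINE_SUFFIX = {
    (1, "singular"): "to",
    (1, "plural"): "to",
    (2, "singular"): "tos",
    (2, "plural"): "ta",
    (3, "singular"): "to",
    (3, "plural"): "tat",
}


def conjugate_present(root, person, number):
    """Attach the present-tense masculine pratyaya to a verb root.

    The result is written root-suffix, matching the notation of Table III.
    """
    suffix = PRESENT_MASCULINE_SUFFIX[(person, number)]
    return f"{root}-{suffix}"


if __name__ == "__main__":
    for person in (1, 2, 3):
        for number in ("singular", "plural"):
            print(person, number, conjugate_present("ja", person, number))
```

Whether the same suffixes apply unchanged to other verb roots is not stated in the paper, so the function should be read as a table lookup for "ja" rather than a general conjugator.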
The Roman-script output can be converted to Marathi (Devanagari) script using existing software.

TABLE I
KARANT

Male:
A noun of "a" karant changes to "aa" karant, e.g. Ram -> Rama (singular), Nag -> Nagan (plural).
A noun of "aa" karant changes to "ya" karant, e.g. Amba -> Ambya (singular), Killa -> Killya (plural).

Female:
A noun of "a" karant changes to "e" in the singular and "aa" in the plural, e.g. Jibh -> Jibhe (singular) and Jibha (plural).
A noun of "aa" karant changes to "e" karant, e.g. Shala -> Shale (singular) and Shala (plural); Bhasha -> Bhashe (singular) and Bhasha (plural).

TABLE II
ADJECTIVE/PARTICIPLE/PRONOUN

Number      Male    Female    Neuter
Singular    a       i         e
Plural      e       ya        e

TABLE III
VERB FOR PRESENT TENSE IN MARATHI, ROOT JA (MALE)

Person          Singular      Plural
Ist Person      mi ja-to      amhi ja-to
IInd Person     tu ja-tos     tumhi ja-ta
IIIrd Person    to ja-to      te ja-tat

VII. RESULTS

The following are some of the test cases processed with the algorithm presented in this paper:
1) I am nice
2) She chose him
3) It was done by his mother
4) My father told me to work myself for his movie
5) They themselves will be very enthusiastic for their nice and amazing movie
6) I went to market with my mother to buy apples
7) She has a nice doll having blue eyes
8) My favourite book was stolen
9) My dog should go running for food
10) I have been speaking with him
11) My father had been eating mango
12) I did it for my father

VIII. CONCLUSION AND FUTURE SCOPE

This paper presents a system for machine translation of simple assertive English sentences to their Marathi counterparts. It follows the transfer approach with rule-based translation and emphasizes assertive sentences and a reordering algorithm for target language generation. It is difficult to frame generalized rules for Marathi because the grammars of English and Marathi differ considerably. The system was tested on 115 different simple assertive sentences using our production rules and produced satisfactory results.

The major challenge in machine translation is to resolve ambiguity in the meaning of words in a sentence. For example, in "He is standing near the bank", the word "bank" has two possible contexts: the bank of a river or a bank that holds money. Resolving lexical and structural ambiguity remains a big challenge for researchers.

The grammar of English sometimes allows the sequence of words to change without changing the meaning of the sentence. For example, "Should Ram have gone to the store?" can be written as "Should have Ram gone to the store?". The former sentence is translated correctly by our system but the latter is not. To allow such flexibility, the rules need to be made more general.

The translation of simple assertive sentences discussed in this paper can be extended to other sub-types of simple sentences such as imperative, negative and exclamatory sentences. The scope can be further expanded to compound and complex sentences.

REFERENCES

[1] Goraksh V. Garje, Manisha Marathe, Urmila Adsule, "Translation of Simple Interrogative English Sentences to Marathi Sentences", in Proceedings of ICWET 10, Mumbai, Maharashtra, India, February 26-27, 2010, pp.
[2] G.V. Garje, G.K. Kharate, (2013, Oct.) "Survey of Machine Translation Systems in India", International Journal on Natural Language Computing (IJNLC) [Online], Vol. 2, No. 4, pp. Available:
[3] P. Ray, V. Harish, S. Sarkar, A. Basu, "Part of Speech Tagging and Local Word Grouping Techniques for Natural Language Parsing in Hindi", in Proceedings of the 1st International Conference on Natural Language Processing (ICON), Mysore, India, 2003, pp.
[4] Dilshan De Silva, Asanga Alahakoon, Imesha Udayangani, Vishva Kumara, Devinda Kolonnage, Harindu Perera, Samantha Thelijjagoda, "Sinhala to English Language Translator", in Proceedings of the 4th International Conference on Information and Automation for Sustainability (ICIAFS), Colombo, Sri Lanka, 2008, pp.
[5] Dr. Shridhar Shanvare, "Abhinav Marathi Vyakaran, Marathi Lekhan", Vidya Vikas Mandal, Nagpur.
[6] Naila Ata, Bushra Jawaid, Amir Kamran, "Rule based English to Urdu Machine Translation", in Proceedings of the Conference on Language and Technology (CLT 07), University of Karachi, Pakistan, 2007, pp.
[7] Rajiv Sangal, Vineet Chaitanya, "Natural Language Processing: a Paninian Perspective", Akshar Bharati Group, PHI publication.
[8] R.M.K. Sinha, A. Jain, "AnglaHindi: an English to Hindi Machine-Aided Translation System", in Proceedings of the 9th MT Summit (MTS), New Orleans, USA, Sep. 2003, pp.
[9] Uzair Muhammad, Atif Khan, "Handling Proper nouns in machine translation from English into Urdu", Journal of Information & Communication Technology, Vol. 1, No. 2, 2007, pp.


National Literacy and Numeracy Framework for years 3/4

National Literacy and Numeracy Framework for years 3/4 1. Oracy National Literacy and Numeracy Framework for years 3/4 Speaking Listening Collaboration and discussion Year 3 - Explain information and ideas using relevant vocabulary - Organise what they say

More information

Today we examine the distribution of infinitival clauses, which can be

Today we examine the distribution of infinitival clauses, which can be Infinitival Clauses Today we examine the distribution of infinitival clauses, which can be a) the subject of a main clause (1) [to vote for oneself] is objectionable (2) It is objectionable to vote for

More information

Grammars & Parsing, Part 1:

Grammars & Parsing, Part 1: Grammars & Parsing, Part 1: Rules, representations, and transformations- oh my! Sentence VP The teacher Verb gave the lecture 2015-02-12 CS 562/662: Natural Language Processing Game plan for today: Review

More information

First Grade Curriculum Highlights: In alignment with the Common Core Standards

First Grade Curriculum Highlights: In alignment with the Common Core Standards First Grade Curriculum Highlights: In alignment with the Common Core Standards ENGLISH LANGUAGE ARTS Foundational Skills Print Concepts Demonstrate understanding of the organization and basic features

More information

Using a Native Language Reference Grammar as a Language Learning Tool

Using a Native Language Reference Grammar as a Language Learning Tool Using a Native Language Reference Grammar as a Language Learning Tool Stacey I. Oberly University of Arizona & American Indian Language Development Institute Introduction This article is a case study in

More information

What the National Curriculum requires in reading at Y5 and Y6

What the National Curriculum requires in reading at Y5 and Y6 What the National Curriculum requires in reading at Y5 and Y6 Word reading apply their growing knowledge of root words, prefixes and suffixes (morphology and etymology), as listed in Appendix 1 of the

More information

Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data

Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Ebba Gustavii Department of Linguistics and Philology, Uppsala University, Sweden ebbag@stp.ling.uu.se

More information

Comprehension Recognize plot features of fairy tales, folk tales, fables, and myths.

Comprehension Recognize plot features of fairy tales, folk tales, fables, and myths. 4 th Grade Language Arts Scope and Sequence 1 st Nine Weeks Instructional Units Reading Unit 1 & 2 Language Arts Unit 1& 2 Assessments Placement Test Running Records DIBELS Reading Unit 1 Language Arts

More information

Inleiding Taalkunde. Docent: Paola Monachesi. Blok 4, 2001/ Syntax 2. 2 Phrases and constituent structure 2. 3 A minigrammar of Italian 3

Inleiding Taalkunde. Docent: Paola Monachesi. Blok 4, 2001/ Syntax 2. 2 Phrases and constituent structure 2. 3 A minigrammar of Italian 3 Inleiding Taalkunde Docent: Paola Monachesi Blok 4, 2001/2002 Contents 1 Syntax 2 2 Phrases and constituent structure 2 3 A minigrammar of Italian 3 4 Trees 3 5 Developing an Italian lexicon 4 6 S(emantic)-selection

More information

Machine Learning from Garden Path Sentences: The Application of Computational Linguistics

Machine Learning from Garden Path Sentences: The Application of Computational Linguistics Machine Learning from Garden Path Sentences: The Application of Computational Linguistics http://dx.doi.org/10.3991/ijet.v9i6.4109 J.L. Du 1, P.F. Yu 1 and M.L. Li 2 1 Guangdong University of Foreign Studies,

More information

Advanced Grammar in Use

Advanced Grammar in Use Advanced Grammar in Use A self-study reference and practice book for advanced learners of English Third Edition with answers and CD-ROM cambridge university press cambridge, new york, melbourne, madrid,

More information

BULATS A2 WORDLIST 2

BULATS A2 WORDLIST 2 BULATS A2 WORDLIST 2 INTRODUCTION TO THE BULATS A2 WORDLIST 2 The BULATS A2 WORDLIST 21 is a list of approximately 750 words to help candidates aiming at an A2 pass in the Cambridge BULATS exam. It is

More information

COMMISSIONER AND DIRECTOR OF SCHOOL EDUCATION ANDHRA PRADESH :: HYDERABAD NOTIFICATION FOR RECRUITMENT OF TEACHERS 2012

COMMISSIONER AND DIRECTOR OF SCHOOL EDUCATION ANDHRA PRADESH :: HYDERABAD NOTIFICATION FOR RECRUITMENT OF TEACHERS 2012 COMMISSIONER AND DIRECTOR OF SCHOOL EDUCATION ANDHRA PRADESH :: HYDERABAD NOTIFICATION FOR RECRUITMENT OF TEACHERS 2012 INFORMATION BULLETIN 1. In pursuance of the orders of the Government in G.O.Ms.No.159,

More information

Basic Parsing with Context-Free Grammars. Some slides adapted from Julia Hirschberg and Dan Jurafsky 1

Basic Parsing with Context-Free Grammars. Some slides adapted from Julia Hirschberg and Dan Jurafsky 1 Basic Parsing with Context-Free Grammars Some slides adapted from Julia Hirschberg and Dan Jurafsky 1 Announcements HW 2 to go out today. Next Tuesday most important for background to assignment Sign up

More information

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,

More information

Specifying a shallow grammatical for parsing purposes

Specifying a shallow grammatical for parsing purposes Specifying a shallow grammatical for parsing purposes representation Atro Voutilainen and Timo J~irvinen Research Unit for Multilingual Language Technology P.O. Box 4 FIN-0004 University of Helsinki Finland

More information

Taught Throughout the Year Foundational Skills Reading Writing Language RF.1.2 Demonstrate understanding of spoken words,

Taught Throughout the Year Foundational Skills Reading Writing Language RF.1.2 Demonstrate understanding of spoken words, First Grade Standards These are the standards for what is taught in first grade. It is the expectation that these skills will be reinforced after they have been taught. Taught Throughout the Year Foundational

More information

CROSS LANGUAGE INFORMATION RETRIEVAL: IN INDIAN LANGUAGE PERSPECTIVE

CROSS LANGUAGE INFORMATION RETRIEVAL: IN INDIAN LANGUAGE PERSPECTIVE CROSS LANGUAGE INFORMATION RETRIEVAL: IN INDIAN LANGUAGE PERSPECTIVE Pratibha Bajpai 1, Dr. Parul Verma 2 1 Research Scholar, Department of Information Technology, Amity University, Lucknow 2 Assistant

More information

Subject: Opening the American West. What are you teaching? Explorations of Lewis and Clark

Subject: Opening the American West. What are you teaching? Explorations of Lewis and Clark Theme 2: My World & Others (Geography) Grade 5: Lewis and Clark: Opening the American West by Ellen Rodger (U.S. Geography) This 4MAT lesson incorporates activities in the Daily Lesson Guide (DLG) that

More information

Adjectives tell you more about a noun (for example: the red dress ).

Adjectives tell you more about a noun (for example: the red dress ). Curriculum Jargon busters Grammar glossary Key: Words in bold are examples. Words underlined are terms you can look up in this glossary. Words in italics are important to the definition. Term Adjective

More information

AQUA: An Ontology-Driven Question Answering System

AQUA: An Ontology-Driven Question Answering System AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.

More information

A First-Pass Approach for Evaluating Machine Translation Systems

A First-Pass Approach for Evaluating Machine Translation Systems [Proceedings of the Evaluators Forum, April 21st 24th, 1991, Les Rasses, Vaud, Switzerland; ed. Kirsten Falkedal (Geneva: ISSCO).] A First-Pass Approach for Evaluating Machine Translation Systems Pamela

More information

Proof Theory for Syntacticians

Proof Theory for Syntacticians Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 5, 2012 Logics for Linguistics Many different kinds of logic are directly applicable to formalizing theories in syntax

More information

Opportunities for Writing Title Key Stage 1 Key Stage 2 Narrative

Opportunities for Writing Title Key Stage 1 Key Stage 2 Narrative English Teaching Cycle The English curriculum at Wardley CE Primary is based upon the National Curriculum. Our English is taught through a text based curriculum as we believe this is the best way to develop

More information

Chunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence.

Chunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence. NLP Lab Session Week 8 October 15, 2014 Noun Phrase Chunking and WordNet in NLTK Getting Started In this lab session, we will work together through a series of small examples using the IDLE window and

More information

HinMA: Distributed Morphology based Hindi Morphological Analyzer

HinMA: Distributed Morphology based Hindi Morphological Analyzer HinMA: Distributed Morphology based Hindi Morphological Analyzer Ankit Bahuguna TU Munich ankitbahuguna@outlook.com Lavita Talukdar IIT Bombay lavita.talukdar@gmail.com Pushpak Bhattacharyya IIT Bombay

More information

Cross Language Information Retrieval

Cross Language Information Retrieval Cross Language Information Retrieval RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Acknowledgment.............................................

More information

Basic Syntax. Doug Arnold We review some basic grammatical ideas and terminology, and look at some common constructions in English.

Basic Syntax. Doug Arnold We review some basic grammatical ideas and terminology, and look at some common constructions in English. Basic Syntax Doug Arnold doug@essex.ac.uk We review some basic grammatical ideas and terminology, and look at some common constructions in English. 1 Categories 1.1 Word level (lexical and functional)

More information

Impact of Controlled Language on Translation Quality and Post-editing in a Statistical Machine Translation Environment

Impact of Controlled Language on Translation Quality and Post-editing in a Statistical Machine Translation Environment Impact of Controlled Language on Translation Quality and Post-editing in a Statistical Machine Translation Environment Takako Aikawa, Lee Schwartz, Ronit King Mo Corston-Oliver Carmen Lozano Microsoft

More information

Two methods to incorporate local morphosyntactic features in Hindi dependency

Two methods to incorporate local morphosyntactic features in Hindi dependency Two methods to incorporate local morphosyntactic features in Hindi dependency parsing Bharat Ram Ambati, Samar Husain, Sambhav Jain, Dipti Misra Sharma and Rajeev Sangal Language Technologies Research

More information

CORPUS ANALYSIS CORPUS ANALYSIS QUANTITATIVE ANALYSIS

CORPUS ANALYSIS CORPUS ANALYSIS QUANTITATIVE ANALYSIS CORPUS ANALYSIS Antonella Serra CORPUS ANALYSIS ITINEARIES ON LINE: SARDINIA, CAPRI AND CORSICA TOTAL NUMBER OF WORD TOKENS 13.260 TOTAL NUMBER OF WORD TYPES 3188 QUANTITATIVE ANALYSIS THE MOST SIGNIFICATIVE

More information

1 st Quarter (September, October, November) August/September Strand Topic Standard Notes Reading for Literature

1 st Quarter (September, October, November) August/September Strand Topic Standard Notes Reading for Literature 1 st Grade Curriculum Map Common Core Standards Language Arts 2013 2014 1 st Quarter (September, October, November) August/September Strand Topic Standard Notes Reading for Literature Key Ideas and Details

More information

Minimalism is the name of the predominant approach in generative linguistics today. It was first

Minimalism is the name of the predominant approach in generative linguistics today. It was first Minimalism Minimalism is the name of the predominant approach in generative linguistics today. It was first introduced by Chomsky in his work The Minimalist Program (1995) and has seen several developments

More information

EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar

EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar Chung-Chi Huang Mei-Hua Chen Shih-Ting Huang Jason S. Chang Institute of Information Systems and Applications, National Tsing Hua University,

More information

Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG

Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG Dr. Kakia Chatsiou, University of Essex achats at essex.ac.uk Explorations in Syntactic Government and Subcategorisation,

More information

LING 329 : MORPHOLOGY

LING 329 : MORPHOLOGY LING 329 : MORPHOLOGY TTh 10:30 11:50 AM, Physics 121 Course Syllabus Spring 2013 Matt Pearson Office: Vollum 313 Email: pearsonm@reed.edu Phone: 7618 (off campus: 503-517-7618) Office hrs: Mon 1:30 2:30,

More information

An Interactive Intelligent Language Tutor Over The Internet

An Interactive Intelligent Language Tutor Over The Internet An Interactive Intelligent Language Tutor Over The Internet Trude Heift Linguistics Department and Language Learning Centre Simon Fraser University, B.C. Canada V5A1S6 E-mail: heift@sfu.ca Abstract: This

More information

Listening and Speaking Skills of English Language of Adolescents of Government and Private Schools

Listening and Speaking Skills of English Language of Adolescents of Government and Private Schools Listening and Speaking Skills of English Language of Adolescents of Government and Private Schools Dr. Amardeep Kaur Professor, Babe Ke College of Education, Mudki, Ferozepur, Punjab Abstract The present

More information

Pseudo-Passives as Adjectival Passives

Pseudo-Passives as Adjectival Passives Pseudo-Passives as Adjectival Passives Kwang-sup Kim Hankuk University of Foreign Studies English Department 81 Oedae-lo Cheoin-Gu Yongin-City 449-791 Republic of Korea kwangsup@hufs.ac.kr Abstract The

More information

11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation

11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation tatistical Parsing (Following slides are modified from Prof. Raymond Mooney s slides.) tatistical Parsing tatistical parsing uses a probabilistic model of syntax in order to assign probabilities to each

More information

1. Introduction. 2. The OMBI database editor

1. Introduction. 2. The OMBI database editor OMBI bilingual lexical resources: Arabic-Dutch / Dutch-Arabic Carole Tiberius, Anna Aalstein, Instituut voor Nederlandse Lexicologie Jan Hoogland, Nederlands Instituut in Marokko (NIMAR) In this paper

More information

Chapter 4: Valence & Agreement CSLI Publications

Chapter 4: Valence & Agreement CSLI Publications Chapter 4: Valence & Agreement Reminder: Where We Are Simple CFG doesn t allow us to cross-classify categories, e.g., verbs can be grouped by transitivity (deny vs. disappear) or by number (deny vs. denies).

More information

Unit 8 Pronoun References

Unit 8 Pronoun References English Two Unit 8 Pronoun References Objectives After the completion of this unit, you would be able to expalin what pronoun and pronoun reference are. explain different types of pronouns. understand

More information

Approaches to control phenomena handout Obligatory control and morphological case: Icelandic and Basque

Approaches to control phenomena handout Obligatory control and morphological case: Icelandic and Basque Approaches to control phenomena handout 6 5.4 Obligatory control and morphological case: Icelandic and Basque Icelandinc quirky case (displaying properties of both structural and inherent case: lexically

More information

Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments

Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Cristina Vertan, Walther v. Hahn University of Hamburg, Natural Language Systems Division Hamburg,

More information

Named Entity Recognition: A Survey for the Indian Languages

Named Entity Recognition: A Survey for the Indian Languages Named Entity Recognition: A Survey for the Indian Languages Padmaja Sharma Dept. of CSE Tezpur University Assam, India 784028 psharma@tezu.ernet.in Utpal Sharma Dept.of CSE Tezpur University Assam, India

More information

IN THIS UNIT YOU LEARN HOW TO: SPEAKING 1 Work in pairs. Discuss the questions. 2 Work with a new partner. Discuss the questions.

IN THIS UNIT YOU LEARN HOW TO: SPEAKING 1 Work in pairs. Discuss the questions. 2 Work with a new partner. Discuss the questions. 6 1 IN THIS UNIT YOU LEARN HOW TO: ask and answer common questions about jobs talk about what you re doing at work at the moment talk about arrangements and appointments recognise and use collocations

More information

Twitter Sentiment Classification on Sanders Data using Hybrid Approach

Twitter Sentiment Classification on Sanders Data using Hybrid Approach IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders

More information

UKLO Round Advanced solutions and marking schemes. 6 The long and short of English verbs [15 marks]

UKLO Round Advanced solutions and marking schemes. 6 The long and short of English verbs [15 marks] UKLO Round 1 2013 Advanced solutions and marking schemes [Remember: the marker assigns points which the spreadsheet converts to marks.] [No questions 1-4 at Advanced level.] 5 Bulgarian [15 marks] 12 points:

More information

Improving the Quality of MT Output using Novel Name Entity Translation Scheme

Improving the Quality of MT Output using Novel Name Entity Translation Scheme Improving the Quality of MT Output using Novel Name Entity Translation Scheme Deepti Bhalla Department of Computer Science Banasthali University Rajasthan, India deeptibhalla0600@gmail.com Nisheeth Joshi

More information

Written by: YULI AMRIA (RRA1B210085) ABSTRACT. Key words: ability, possessive pronouns, and possessive adjectives INTRODUCTION

Written by: YULI AMRIA (RRA1B210085) ABSTRACT. Key words: ability, possessive pronouns, and possessive adjectives INTRODUCTION STUDYING GRAMMAR OF ENGLISH AS A FOREIGN LANGUAGE: STUDENTS ABILITY IN USING POSSESSIVE PRONOUNS AND POSSESSIVE ADJECTIVES IN ONE JUNIOR HIGH SCHOOL IN JAMBI CITY Written by: YULI AMRIA (RRA1B210085) ABSTRACT

More information

Thornhill Primary School - Grammar coverage Year 1-6

Thornhill Primary School - Grammar coverage Year 1-6 Thornhill Primary School - Grammar coverage Year 1-6 Year Topic Examples Terminology Importance Using full stops and capital letters to demarcate s We sailed to the land where the wild things are. Sentence

More information

Interactive Corpus Annotation of Anaphor Using NLP Algorithms

Interactive Corpus Annotation of Anaphor Using NLP Algorithms Interactive Corpus Annotation of Anaphor Using NLP Algorithms Catherine Smith 1 and Matthew Brook O Donnell 1 1. Introduction Pronouns occur with a relatively high frequency in all forms English discourse.

More information

Constructing Parallel Corpus from Movie Subtitles

Constructing Parallel Corpus from Movie Subtitles Constructing Parallel Corpus from Movie Subtitles Han Xiao 1 and Xiaojie Wang 2 1 School of Information Engineering, Beijing University of Post and Telecommunications artex.xh@gmail.com 2 CISTR, Beijing

More information

Dickinson ISD ELAR Year at a Glance 3rd Grade- 1st Nine Weeks

Dickinson ISD ELAR Year at a Glance 3rd Grade- 1st Nine Weeks 3rd Grade- 1st Nine Weeks R3.8 understand, make inferences and draw conclusions about the structure and elements of fiction and provide evidence from text to support their understand R3.8A sequence and

More information

ScienceDirect. Malayalam question answering system

ScienceDirect. Malayalam question answering system Available online at www.sciencedirect.com ScienceDirect Procedia Technology 24 (2016 ) 1388 1392 International Conference on Emerging Trends in Engineering, Science and Technology (ICETEST - 2015) Malayalam

More information

Language Acquisition Fall 2010/Winter Lexical Categories. Afra Alishahi, Heiner Drenhaus

Language Acquisition Fall 2010/Winter Lexical Categories. Afra Alishahi, Heiner Drenhaus Language Acquisition Fall 2010/Winter 2011 Lexical Categories Afra Alishahi, Heiner Drenhaus Computational Linguistics and Phonetics Saarland University Children s Sensitivity to Lexical Categories Look,

More information

MAHATMA GANDHI KASHI VIDYAPITH Deptt. of Library and Information Science B.Lib. I.Sc. Syllabus

MAHATMA GANDHI KASHI VIDYAPITH Deptt. of Library and Information Science B.Lib. I.Sc. Syllabus MAHATMA GANDHI KASHI VIDYAPITH Deptt. of Library and Information Science B.Lib. I.Sc. Syllabus The Library and Information Science has the attributes of being a discipline of disciplines. The subject commenced

More information

The Internet as a Normative Corpus: Grammar Checking with a Search Engine

The Internet as a Normative Corpus: Grammar Checking with a Search Engine The Internet as a Normative Corpus: Grammar Checking with a Search Engine Jonas Sjöbergh KTH Nada SE-100 44 Stockholm, Sweden jsh@nada.kth.se Abstract In this paper some methods using the Internet as a

More information

Improved Effects of Word-Retrieval Treatments Subsequent to Addition of the Orthographic Form

Improved Effects of Word-Retrieval Treatments Subsequent to Addition of the Orthographic Form Orthographic Form 1 Improved Effects of Word-Retrieval Treatments Subsequent to Addition of the Orthographic Form The development and testing of word-retrieval treatments for aphasia has generally focused

More information

Context Free Grammars. Many slides from Michael Collins

Context Free Grammars. Many slides from Michael Collins Context Free Grammars Many slides from Michael Collins Overview I An introduction to the parsing problem I Context free grammars I A brief(!) sketch of the syntax of English I Examples of ambiguous structures

More information

The College Board Redesigned SAT Grade 12

The College Board Redesigned SAT Grade 12 A Correlation of, 2017 To the Redesigned SAT Introduction This document demonstrates how myperspectives English Language Arts meets the Reading, Writing and Language and Essay Domains of Redesigned SAT.

More information

Copyright 2017 DataWORKS Educational Research. All rights reserved.

Copyright 2017 DataWORKS Educational Research. All rights reserved. Copyright 2017 DataWORKS Educational Research. All rights reserved. No part of this work may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic or mechanical,

More information

Beyond the Pipeline: Discrete Optimization in NLP

Beyond the Pipeline: Discrete Optimization in NLP Beyond the Pipeline: Discrete Optimization in NLP Tomasz Marciniak and Michael Strube EML Research ggmbh Schloss-Wolfsbrunnenweg 33 69118 Heidelberg, Germany http://www.eml-research.de/nlp Abstract We

More information

Software Maintenance

Software Maintenance 1 What is Software Maintenance? Software Maintenance is a very broad activity that includes error corrections, enhancements of capabilities, deletion of obsolete capabilities, and optimization. 2 Categories

More information

Linking Task: Identifying authors and book titles in verbose queries

Linking Task: Identifying authors and book titles in verbose queries Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,

More information

Defragmenting Textual Data by Leveraging the Syntactic Structure of the English Language

Defragmenting Textual Data by Leveraging the Syntactic Structure of the English Language Defragmenting Textual Data by Leveraging the Syntactic Structure of the English Language Nathaniel Hayes Department of Computer Science Simpson College 701 N. C. St. Indianola, IA, 50125 nate.hayes@my.simpson.edu

More information

Name of Course: French 1 Middle School. Grade Level(s): 7 and 8 (half each) Unit 1

Name of Course: French 1 Middle School. Grade Level(s): 7 and 8 (half each) Unit 1 Name of Course: French 1 Middle School Grade Level(s): 7 and 8 (half each) Unit 1 Estimated Instructional Time: 15 classes PA Academic Standards: Communication: Communicate in Languages Other Than English

More information

The Discourse Anaphoric Properties of Connectives

The Discourse Anaphoric Properties of Connectives The Discourse Anaphoric Properties of Connectives Cassandre Creswell, Kate Forbes, Eleni Miltsakaki, Rashmi Prasad, Aravind Joshi Λ, Bonnie Webber y Λ University of Pennsylvania 3401 Walnut Street Philadelphia,

More information

2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases

2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases POS Tagging Problem Part-of-Speech Tagging L545 Spring 203 Given a sentence W Wn and a tagset of lexical categories, find the most likely tag T..Tn for each word in the sentence Example Secretariat/P is/vbz

More information

Loughton School s curriculum evening. 28 th February 2017

Loughton School s curriculum evening. 28 th February 2017 Loughton School s curriculum evening 28 th February 2017 Aims of this session Share our approach to teaching writing, reading, SPaG and maths. Share resources, ideas and strategies to support children's

More information