An Arabic Semantic Parser and Meaning Analyzer


AbdulMalik Al-Salman, Yousef Al-Ohali, and Maha AlRabiah
Computer Science Department, KSU, Saudi Arabia
{Salman,

Abstract

The Arabic language is very rich in derivations, vocabulary, and grammatical structures. Determining the correct meaning of a word in a non-vowelized Arabic sentence is not a trivial task, since Arabic is very rich in polysemy. This paper attempts to resolve word sense ambiguity by building a semantic parser powered by a statistical semantic analyzer, which may help improve machine translation, question answering and other Arabic NLP systems. The parser was built in three steps. The first step was to acquire the grammatical rules of Arabic covered in an Arabic grammar textbook, and to develop constraints that help resolve part of the parsing ambiguity. The grammar and the constraints were then written in an XML format to make them readable and available for future use. The second step was to build the semantic parser, which assigns grammatical structures to the input sentence. The final step was to apply a statistical semantic technique to the resulting grammatical structures to determine the most accurate structure, the one that resolves the word sense ambiguity and yields the most accurate meaning of each word.

1. Introduction

Researchers work on many different applications of natural language understanding [1], including information retrieval, knowledge acquisition, translation, summarization, and categorization. In natural language understanding, morphology serves as the basic layer on which the higher syntactic and semantic layers are built. It is the linguistic system that governs how the words of a language are built. There are two kinds of morphology: inflectional and derivational.
Inflectional morphology adds a suffix and/or a prefix to the root of a word, generating another word that keeps the same grammatical category and the same basic meaning. Derivational morphology derives new words from a finite set of roots using a set of predefined morphological `patterns' or `forms'. The new words may have completely different grammatical categories. The morphological pattern also represents the category of the word, and in some cases indicates its syntactic and semantic roles.

The automation of morphological analysis has come a long way in English and also in Arabic. In Arabic, a word may be a verb, a noun or a particle. Arabic is mainly a derivational language, where most words are derived from a finite set of roots using a finite set of morphological `forms' or `patterns'. According to [7], the total number of roots in Arabic varies slightly from dictionary to dictionary, but it remains around 10,000 on average. Around 70% of these are tri-literal while the rest are 4-letter roots, and the total number of morphological forms is around 900.

There exist several approaches and algorithms for morphological analysis and generation, such as "the typical conventional algorithm", "the sliding window approximate matching (SWAM)", "the finite state transducers (FSTs) approach" and "the two-level finite state machine approach" [1][7]. Many of these approaches are applicable to Arabic, and many Arabic morphological analyzers have been developed. For example, Al-Affendi developed "The Arabic Morphological Analyzer (AMA)" based on the SWAM algorithm. Kenneth R. Beesley built a system for morphological analysis and generation of Arabic using the finite-state approach, as part of his work at the Xerox Research Center [3][4]. Buckwalter produced an Arabic morphological analyzer system that converts Arabic letters into Latin letters using Buckwalter's transliteration scheme [5]. Darwish developed a light Arabic stemmer that removes common prefixes and suffixes from the word to reach the stem [6]. Al-Shalabi and Evens built a computational morphology system for Arabic that works only for non-vowelized words [2].

To examine how the grammatical structure of a sentence can be computed, two things must be considered: the grammar, which specifies the acceptable structures that can produce a correct sentence, and the parsing technique, which is the method of analyzing a sentence to determine its structure according to the grammar. For a computer to deal with a natural language, the structure of the language must be described in symbols and notations that the computer can process, specifying all the legal structures in that language. Several grammar formalisms exist, including Context-Free Grammars (CFGs), Augmented CFGs, Transition Network Grammars, Augmented Transition Networks, Definite Clause Grammars (DCG) [1] and Affix Grammars over a Finite Lattice (AGFL) [15]. Most of these formalisms have been applied successfully to Arabic, as in [8][9][10] and [11].

A parsing algorithm can be described as a procedure that searches through the various ways of combining grammatical rules to find a combination that generates a tree representing a possible structure of the input sentence. Parsing techniques include top-down parsing, bottom-up parsing, bottom-up chart parsing, top-down chart parsing, top-down parsing with Recursive Transition Networks, and recursive descent parsing [1].

Parsing Arabic sentences is a difficult task. The difficulty comes from several sources. One is that sentences are long and complex: the average length of a sentence is 20 to 30 words, and it may exceed 100 words [11]. Another difficulty comes from the sentence structure.
The Arabic sentence is complex and syntactically ambiguous due to the frequent use of grammatical relations, the order of words, phrases and conjunctions. The omission of diacritics (vowels) in written Arabic and the presence of elliptic personal pronouns (الضمائر المستترة) make things more difficult. Due to these difficulties, little work has been done on developing parsers for Arabic. Farouk [11] adapted a simple top-down parsing algorithm implemented in Prolog to parse Arabic sentences using DCG. Othman et al. developed a semantic bottom-up chart parser for Arabic using a Unification Based Grammar (UBG), implemented in Prolog [17][18]. Other successful parsers include AraParse [19], which uses the AGFL formalism.

Semantic analysis is the study of meaning communicated through language [20]. Modern approaches to semantic analysis are often grouped into two classes. The first class consists of logic-based or symbolic systems, which aim to produce a deep and rich semantic and pragmatic interpretation of a text. They generally use representations based on predicate logic, and include the complex knowledge structures and inference rules necessary to interpret connected texts. The second class adopts machine-learning techniques, such as statistical techniques, allowing systems to be trained directly from examples of input/output pairs. These systems forgo deep representations in favor of directly modeling the task to be performed. They tend to reformulate the task of understanding as a pattern-recognition problem.

In the logic-based approach, the meaning of a sentence is represented in a formal representation language using logical frames, which can be derived using first-order predicate calculus (FOPC) or lambda calculus. Semantic grammars can also be used as a method of semantic interpretation for a specific domain.
Semantic grammars can also be augmented to produce a logical form in the normal way, or the parse tree of the semantic grammar itself can be used as a logical form [1] [14].

The field of natural language processing has undergone a fundamental shift toward machine-learning methods, especially statistical methods. The availability of large annotated corpora has made possible a methodology based on training systems on labeled data and quantitatively evaluating their performance on held-out test data. Successful examples of systems built on statistical techniques include the Hidden Understanding Model (HUM), described in [21] and [22], which is based entirely on trained statistical models derived from annotated corpora, and Chill [16] [23], a learning semantic parser that maps natural-language database queries (sentences) into executable Prolog queries (detailed logical forms).

Arabic semantic analysis, on the other hand, has received little attention and research. So far, there exists no formal theory of semantics that provides a complete and consistent account of all the phenomena of Arabic. Haddad et al. [12] [13] attempted to model the Arabic sentence using an FOPC representation and the lambda calculus. Although we did not come across any publications or applications of semantic grammars for Arabic, we think that they can be applied to it successfully. Machine-learning approaches to semantic representation can likewise be applied to Arabic in the same way they are applied to English. However, they require Arabic corpora in specific domains to train systems. To the best of our knowledge, such Arabic corpora are unfortunately not widely available, which makes the progress of these methods in Arabic very slow.

2. System Analysis & Design

Building the Arabic Semantic Parser and Analyzer system was done in four stages: building the morphological analyzer, creating an XML document of Arabic grammar, building the semantic parser and, finally, building the semantic analyzer. To understand the overall system architecture, see the data flow diagrams and the pseudo code (Figures 1, 2 and 3). The following sub-sections briefly describe the stages of building the system.

Figure 1: Context Diagram (the user passes an Arabic sentence to process 0, the Arabic Semantic Parser & Meaning Analyzer, which returns the sentence grammatical structure, the word meanings and the sentence meaning probability)

Figure 2: Level-0 Data Flow Diagram (processes: Identify Word, Build Sentence Tree, Parse Tree, Disambiguating Words; data stores: Morphological Analyser, Grammar XML, Arabic Semantic Lexicon; data flows: Arabic sentence, word morphological and semantic features, sentence tree, DOM tree, sentence grammatical structures, word meanings, sentence meaning probability)

Bottom-up chart parser:
1- Read the sentence.
2- Build the sentence tree:
   a. Create the root node, and let it be the parent node.
   b. While it is not the end of the sentence:
      i. Identify a word.
      ii. Search for the word in the Morphological Analyzer database, and for each matching word create a new object of the class Verb, Noun or Particle, depending on the word's type, storing the morphological and semantic features of that word in the object created.
      iii. Append the object(s) created as children of the parent node.
      iv. Set each one of the child nodes as a parent, and continue to step (i).
3- Build the parse tree:
   a. Read the Grammar XML document, and convert it into a DOM (Document Object Model) tree.
   b. For each path in the sentence tree:
      i. Store the path in a sequence.
      ii. Determine the sentence type (verbal or nominal).
      iii. Parse the branch VPhrase or NPhrase in the DOM tree, depending on the sentence type.
      iv. For each rule in the branch, quit parsing if a node does not match, and move to the next rule.
   c. If the sequence matches a given rule, store it in the parse tree.
4- Get the word meanings and calculate the probability:
   a. For each path in the parse tree:
      i. Retrieve the word from the semantic lexicon, using the stem.
      ii. Compare the supporting words with the rest of the words in the path. For each match, increment a score.
      iii. Sum the scores for each path (sentence).
   b. Calculate the probability for each path.

Figure 3: The System's Pseudo Code
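Step 2 of the pseudo code can be illustrated with a small sketch. It is not the authors' Java implementation: the `Analysis` class, the in-memory `LEXICON` dictionary and the English glosses are hypothetical stand-ins for the Verb/Noun/Particle classes and the Morphological Analyzer database. Each root-to-leaf path of the sentence tree is simply one combination of the readings of the words, so the paths can be enumerated as a Cartesian product.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Analysis:
    """One morphological reading of a surface word (hypothetical structure)."""
    word: str
    kind: str     # "verb", "noun" or "particle"
    meaning: str  # English gloss, for illustration only


# Toy stand-in for the Morphological Analyzer database (assumed content).
LEXICON = {
    "ذهب": [Analysis("ذهب", "verb", "left"), Analysis("ذهب", "noun", "gold")],
    "حمد": [Analysis("حمد", "noun", "proper name")],
}


def sentence_paths(sentence):
    """Return every root-to-leaf path of the sentence tree as a list of readings."""
    readings = [LEXICON.get(word, []) for word in sentence.split()]
    # One path per combination of readings; an unknown word yields no path at all.
    return [list(path) for path in product(*readings)]


paths = sentence_paths("ذهب حمد")
for path in paths:
    print([(a.word, a.kind) for a in path])
```

With two readings for the first word and one for the second, the sketch yields two paths, which is the ambiguity the later parsing and scoring stages are meant to reduce.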

2.1 Building the Arabic morphological analyzer

Parsers usually depend on the results generated by morphological analyzers, and this project is no exception. The morphological analyzer we used adopts an approach similar to the one proposed by Othman et al., with some modifications [18]. It depends on a semantic lexicon, the Semantic Lexicon Database, which stores the lexical and syntactic features of each word. Since developing a morphological analyzer is not in the scope of this project, we simulated the morphological analyzer by storing the morphological features of the words in the Semantic Lexicon Database (Table 1).

Table 1: Semantic Lexicon Database

Morphological features
   Verbs: stem, root, connected subject (subject), connected object (object), prefix and prefix category (prefix_category).
   Nouns: stem, root, attached suffix (suffix), attached suffix gender (suffix_gender), attached suffix number (suffix_number), attached prefix (prefix) and attached prefix category (prefix_category).
   Particles: attached suffix (suffix), attached suffix gender (suffix_gender) and attached suffix number (suffix_number).
Syntactic features
   Verbs: tense, voice, transitivity, subject_gender (sbj_gender), subject_number (sbj_number), object_gender (obj_gender), object_number (obj_number), the irab case (irab_case) and the verb's category.
   Nouns: gender, number, irab case (irab_case), noun's adjectivability, category and whether it is a definite noun (definite) or a perfect noun (perfect_noun).
   Particles: gender, number and category.
Semantic features
   Verbs: subject rationality (sbjrat) and object rationality (objrat), supporting stems, meaning.
   Nouns: noun's rationality (rational), supporting stems, meaning.
   Particles: -

2.2 Creating an XML document of Arabic grammar

As mentioned earlier, a grammar is the set of symbols and notations that describe the legal sentence structures of a language, and it is essential for every parser.
From this perspective, we reviewed all (or most) of the Arabic grammar rules found in Arabic grammar textbooks, and tried to work out which of these rules can be represented in a form that a computer can process. This proved very difficult, since Arabic is very rich in its grammatical structures. Finally, we chose to represent the grammar in Backus Naur Form (BNF), which was part of a previous work on building an Arabic parser [24], because it can easily be transformed into a grammar that the computer can process. For that purpose we selected a comprehensive Arabic grammar textbook [28], and tried to represent most of the rules in that book in BNF (see Appendix A). These rules were revised by a linguistic specialist, and an HTML version is also available.

The next step was to construct the grammar to be used by the parser. We chose to represent the grammar in an XML document (Grammar.xml) to make it extendable and available to other researchers and for future work. The Grammar document is composed of two parts: a VPhrase (الجملة_الفعلية) containing verb phrases, and an NPhrase (الجملة_الاسمية) containing noun phrases. This separation between noun and verb phrases reduces the time required to search the document for the structure of a given sentence. Both the VPhrase and the NPhrase contain a set of rules. Each rule (قاعدة) contains a set of words (كلمة) that represent the words in a sentence and a set of constraints (ضابط), if available, on those words (see Figure 4 for the Document Type Definition (DTD), and Figure 5 for a snapshot of the document's DOM tree). The constraints impose restrictions on the semantic and syntactic features of the words, helping to reduce ambiguity.

The XML document fragment in Figure 6 specifies the grammar rule of a verb phrase consisting of a verb (فعل) and a subject (فاعل). The rule (قاعدة) has two words (كلمة). The first word has the type (نوع) "فعل - verb". It is in the active voice (مبني للمعلوم), has no connected subject (ضمير متصل), and its grammar position is "فعل". The second word has the type "اسم - noun", and its grammar position should be "فاعل" (في محل رفع فاعل). The rule also has four constraints (ضابط). The first constraint specifies that the verb's subject gender (جنس_الفاعل) should be equal to the subject's gender (الجنس). The second constraint specifies that the verb's subject rationality (الفاعل_عاقل) should be equal to the subject's rationality (عاقل). The third constraint specifies that the subject's irab case (الحالة_الإعرابية) should be either nominative (مرفوع) or not specified (-). Finally, the last constraint specifies that if the subject is a sound masculine plural (جمع مذكر سالم), it should not be without a noon (محذوف النون). Appendix B lists the Arabic Grammar rules XML document.
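The four constraints of the verb-subject rule described above can be sketched as data plus a small evaluator. This is a minimal illustration under stated assumptions, not the system's code: the English feature names (`sbj_gender`, `sbj_rational`, `irab_case`, `category`), the dictionary layout and the function names are hypothetical, chosen to mirror the "operands, optional values, relation" shape of a constraint.

```python
EQUALS, NOT_EQUALS = "equals", "not equals"

# Each constraint names one or two operands as (grammar position, feature),
# an optional list of literal values, and a relation (assumed layout).
CONSTRAINTS = [
    {"operands": [("verb", "sbj_gender"), ("subject", "gender")],
     "values": [], "relation": EQUALS},
    {"operands": [("verb", "sbj_rational"), ("subject", "rational")],
     "values": [], "relation": EQUALS},
    {"operands": [("subject", "irab_case")],
     "values": ["nominative", "-"], "relation": EQUALS},
    {"operands": [("subject", "category")],
     "values": ["noon omitted"], "relation": NOT_EQUALS},
]


def holds(constraint, words):
    """Evaluate one constraint against the feature dictionaries of the words."""
    found = [words[pos].get(feature, "-") for pos, feature in constraint["operands"]]
    if constraint["values"]:
        matched = found[0] in constraint["values"]  # feature vs. literal values
    else:
        matched = found[0] == found[1]              # feature vs. feature
    return matched if constraint["relation"] == EQUALS else not matched


def rule_accepts(words):
    """A candidate reading survives only if every constraint holds."""
    return all(holds(c, words) for c in CONSTRAINTS)


good = {
    "verb": {"sbj_gender": "masculine", "sbj_rational": "yes"},
    "subject": {"gender": "masculine", "rational": "yes",
                "irab_case": "-", "category": "proper noun"},
}
bad = {"verb": good["verb"], "subject": {**good["subject"], "gender": "feminine"}}

print(rule_accepts(good), rule_accepts(bad))
```

Keeping the constraints as data rather than code reflects the paper's choice of storing them in the grammar document, so that new rules can be added without touching the parser.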
<!ELEMENT قواعد_النحو (الجملة_الفعلية, الجملة_الاسمية)>
<!ELEMENT الجملة_الفعلية (قاعدة+)>
<!ELEMENT الجملة_الاسمية (قاعدة+)>
<!ELEMENT قاعدة (كلمة+, ضابط*)>
<!ELEMENT كلمة (النوع, الزمن?, البناء?, التعدي?, معرفة?, علم?, التصنيف?, الحالة_الإعرابية?, صفة?, السابقة?, نوع_السابقة?, الفاعل?, المفعول_به?, جنس_الفاعل?, عدد_الفاعل?, جنس_المفعول_به?, عدد_المفعول_به?, الفاعل_عاقل?, المفعول_به_عاقل?, الأداة?, اللاحقة?, عدد_اللاحقة?, جنس_اللاحقة?, الجنس?, العدد?, الموقع)>
<!ELEMENT ضابط (الموقع_الإعرابي+, القيمة*, العلاقة)>
<!ELEMENT الموقع_الإعرابي (#PCDATA)>
<!ATTLIST الموقع_الإعرابي الخاصية (الجذر | الفاعل | المفعول_به | السابقة | نوع_السابقة | الزمن | البناء | التعدي | جنس_الفاعل | عدد_الفاعل | جنس_المفعول_به | عدد_المفعول_به | الحالة_الإعرابية | التصنيف | الفاعل_عاقل | المفعول_به_عاقل | اللاحقة | عدد_اللاحقة | جنس_اللاحقة | الجنس | العدد | صفة | معرفة | علم | عاقل | الأداة | الفعل_مجرد | الاسم_مجرد) "الفاعل">
<!ELEMENT النوع (#PCDATA)>
<!ELEMENT صفة (#PCDATA)>
<!ELEMENT السابقة (#PCDATA)>
<!ELEMENT نوع_السابقة (#PCDATA)>
<!ELEMENT الزمن (#PCDATA)>
<!ELEMENT البناء (#PCDATA)>
<!ELEMENT التعدي (#PCDATA)>
<!ELEMENT معرفة (#PCDATA)>
<!ELEMENT علم (#PCDATA)>
<!ELEMENT الجنس (#PCDATA)>
<!ELEMENT العدد (#PCDATA)>
<!ELEMENT الفاعل (#PCDATA)>
<!ELEMENT المفعول_به (#PCDATA)>
<!ELEMENT جنس_الفاعل (#PCDATA)>
<!ELEMENT عدد_الفاعل (#PCDATA)>
<!ELEMENT الفاعل_عاقل (#PCDATA)>
<!ELEMENT جنس_المفعول_به (#PCDATA)>
<!ELEMENT عدد_المفعول_به (#PCDATA)>
<!ELEMENT المفعول_به_عاقل (#PCDATA)>
<!ELEMENT الأداة (#PCDATA)>
<!ELEMENT اللاحقة (#PCDATA)>
<!ELEMENT جنس_اللاحقة (#PCDATA)>
<!ELEMENT عدد_اللاحقة (#PCDATA)>
<!ELEMENT نوع_الاسم (#PCDATA)>
<!ELEMENT نوع_الفعل (#PCDATA)>
<!ELEMENT نوع_الأداة (#PCDATA)>
<!ELEMENT الحالة_الإعرابية (#PCDATA)>
<!ELEMENT الموقع (#PCDATA)>
<!ELEMENT العلاقة (#PCDATA)>
<!ELEMENT القيمة (#PCDATA)>

Figure 4: XML Document DTD

Figure 5: XML Document DOM Tree

<!-- فعل - فاعل -->
<قاعدة>
   <كلمة>
      <النوع>فعل</النوع>
      <البناء>مبني للمعلوم</البناء>
      <الفاعل>-</الفاعل>
      <الموقع>فعل</الموقع>
   </كلمة>
   <كلمة>
      <النوع>اسم</النوع>
      <الموقع>فاعل</الموقع>
   </كلمة>
   <ضابط>
      <الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي>
      <الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
      <العلاقة>يساوي</العلاقة>
   </ضابط>
   <ضابط>
      <الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي>
      <الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
      <العلاقة>يساوي</العلاقة>
   </ضابط>
   <ضابط>
      <الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي>
      <القيمة>مرفوع</القيمة>
      <العلاقة>يساوي</العلاقة>
   </ضابط>
   <ضابط>
      <الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي>
      <القيمة>محذوف النون</القيمة>
      <العلاقة>لا يساوي</العلاقة>
   </ضابط>
</قاعدة>

Figure 6: XML Grammar Rule Example
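To show how a rule such as the one in Figure 6 might be read into memory, the following sketch uses Python's standard `xml.etree.ElementTree` module rather than the Java DOM API used by the system. The shortened rule text, the `load_rule` function and the tuple layout of a constraint are assumptions made for illustration, and the element names are assumed to read قاعدة, كلمة, ضابط, الموقع_الإعرابي, القيمة and العلاقة.

```python
import xml.etree.ElementTree as ET

# A shortened verb-subject rule with a single constraint (illustrative only).
RULE_XML = """
<قاعدة>
  <كلمة><النوع>فعل</النوع><الموقع>فعل</الموقع></كلمة>
  <كلمة><النوع>اسم</النوع><الموقع>فاعل</الموقع></كلمة>
  <ضابط>
    <الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي>
    <الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
    <العلاقة>يساوي</العلاقة>
  </ضابط>
</قاعدة>
"""


def load_rule(xml_text):
    """Split one rule element into its word patterns and its constraints."""
    rule = ET.fromstring(xml_text.strip())
    words, constraints = [], []
    for child in rule:
        if child.tag == "كلمة":
            # Each word pattern becomes a {feature name: required value} mapping.
            words.append({field.tag: field.text for field in child})
        elif child.tag == "ضابط":
            operands = [(op.text, op.get("الخاصية"))
                        for op in child if op.tag == "الموقع_الإعرابي"]
            values = [v.text for v in child if v.tag == "القيمة"]
            relation = next(r.text for r in child if r.tag == "العلاقة")
            constraints.append((operands, values, relation))
    return words, constraints


words, constraints = load_rule(RULE_XML)
print(len(words), "word patterns,", len(constraints), "constraint(s)")
```

Each constraint comes back as (operands, values, relation), where an operand pairs a grammar position with the feature named in the الخاصية attribute.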

2.3 Building the semantic parser

The semantic parser implements a bottom-up chart parser, which takes as input a "sentence tree" containing the sentence with all possible morphological combinations of its words. For example, the input sentence "ذهب حمد إلى المسجد" generates the sentence tree in Figure 7. This tree is passed to the semantic parser, which in turn parses every possible path in the tree against the Grammar XML document. This rejects most of the paths and greatly reduces the degree of parsing ambiguity (Figure 8 shows the resulting parse tree). The rules in the XML document were written without the use of recursion. This was very helpful in speeding up the parsing process, since the parser does not have to descend recursively very deep in the DOM tree while searching for a valid rule.

Figure 7: The Semantic Tree of "ذهب حمد إلى المسجد" (a root with one branch per morphological reading of each word; node types: اسم Noun, فعل مبني للمعلوم Active verb, فعل مبني للمجهول Passive verb, حرف Particle)
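The path filtering described in this section can be sketched as a lookup against flat, non-recursive rules. The rule lists, the function names and the choice of deciding the sentence type from the first word are assumptions for illustration, and the constraint checks on word features are omitted.

```python
# Flat rules per branch: each rule is a plain sequence of word types (assumed).
RULES = {
    "VPhrase": [["verb", "noun"], ["verb", "noun", "particle", "noun"]],
    "NPhrase": [["noun", "noun"], ["noun", "noun", "particle", "noun"]],
}


def parse_path(kinds):
    """Return (branch, rule) for the first rule matching a path, else None."""
    # Assumption: a sentence starting with a verb is verbal, otherwise nominal.
    branch = "VPhrase" if kinds[0] == "verb" else "NPhrase"
    for rule in RULES[branch]:
        if rule == kinds:  # no recursion: a rule either matches the path or not
            return branch, rule
    return None


def parse_tree(paths):
    """Keep only the paths of the sentence tree that match some rule."""
    accepted = []
    for kinds in paths:
        match = parse_path(kinds)
        if match is not None:
            accepted.append((kinds, match[0]))
    return accepted


candidate_paths = [
    ["verb", "noun", "particle", "noun"],  # verbal reading
    ["noun", "noun", "particle", "noun"],  # nominal reading
    ["verb", "verb", "particle", "noun"],  # rejected: matches no rule
]
for kinds, branch in parse_tree(candidate_paths):
    print(branch, kinds)
```

Because every rule is a flat sequence, matching a path costs one comparison per rule in the chosen branch, which is the speed benefit the text attributes to avoiding recursion.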

Figure 8: The Parse Tree of "ذهب حمد إلى المسجد" (two surviving paths under the root. Verbal sentence: ذهب as فعل مبني للمعلوم (active verb), حمد as فاعل (noun), إلى as حرف جر, المسجد as اسم مجرور. Nominal sentence: ذهب as مبتدأ مضاف (noun), حمد as مضاف إليه (noun), إلى as حرف جر and المسجد as اسم مجرور, together forming خبر - شبه الجملة)

2.4 Building the semantic analyzer

Arabic is very rich in polysemy, i.e. it has many words with the same pronunciation and spelling but totally different meanings, so determining the correct meaning of a word in a non-vowelized Arabic sentence is a difficult task [25]. It is closely related to the overall understanding of the sentence's meaning and the meanings of its words. This phase represents the final and most important stage of this research work: building a semantic analyzer that can identify the correct or most accurate meaning of a word in the sentence. The idea adopted here is purely statistical. For each word (stem) in the semantic lexicon, we store its meaning (extracted from [26][27][29]) along with a group of words that appear frequently with it. The semantic analyzer takes the parse tree as input, and counts how many of the words supporting a particular word's meaning are present in the sentence. It then computes the probability that the meaning is correct for the whole sentence. The sentence(s) with the highest probability have the most accurate meaning. In our previous example, the parse tree is input to the semantic analyzer, which in turn computes the probability of correctness of both sentences and assigns meanings to the words (see Figure 9).
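The scoring idea can be sketched as follows. The paper does not state the exact probability formula, so normalising each path's score by the sum of all scores is an assumption of this sketch, as are the `SUPPORT` table, its supporting words, the English glosses and the function names.

```python
# Supporting stems per (stem, meaning) pair; toy data invented for illustration.
SUPPORT = {
    ("ذهب", "left"): {"إلى", "مسجد"},
    ("ذهب", "gold"): {"خاتم"},
}


def score(path):
    """Count supporting stems of each word that occur elsewhere in the path."""
    stems = [stem for stem, _ in path]
    total = 0
    for i, (stem, meaning) in enumerate(path):
        others = set(stems[:i] + stems[i + 1:])
        total += len(SUPPORT.get((stem, meaning), set()) & others)
    return total


def probabilities(paths):
    """Normalise the scores so they sum to 1 (assumed formula)."""
    scores = [score(p) for p in paths]
    total = sum(scores)
    # If no supporting word occurs at all, no meaning is recognized.
    return [s / total if total else 0.0 for s in scores]


verbal = [("ذهب", "left"), ("حمد", "proper name"), ("إلى", "to"), ("مسجد", "mosque")]
nominal = [("ذهب", "gold"), ("حمد", "proper name"), ("إلى", "to"), ("مسجد", "mosque")]
result = probabilities([verbal, nominal])
print(result)
```

With this toy data only the verbal reading finds supporting words in the sentence, so it receives the whole probability mass. A sentence in which no supporting word occurs gets zero for every reading, which corresponds to the "meaning not recognized" case reported in the experiments.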

Figure 9: The Output of the Semantic Analyzer (the two parse trees of the example sentence annotated with word meanings and with sentence probabilities 1 and 0; the meanings shown are غادر and معدن نفيس for ذهب, اسم علم for حمد, and مكان تعبد المسلمين for المسجد)

3. System Implementation

The Arabic Semantic Parser & Analyzer was implemented in Java using Borland's JBuilder9. Programming in Java makes the code portable and machine independent. Both databases were created using Microsoft SQL Server.

System Classes

The system has 7 classes (there are other classes used for the GUI, which we ignore here). We briefly describe each of these classes and their main functions, along with their UML diagrams. The diagrams were generated using Borland's JBuilder 2006 Enterprise.

Main Class: As its name implies, this is the basic class that calls the other classes and glues everything together. Figure 10 shows a simple UML diagram of this class.

Figure 10: A UML Diagram of Class Main

Verb Class: This class represents a word of type verb, including all the required features of this word. It also includes a constructor that creates a new instance of this class and initializes it with the values of the input SQL query result set. Figure 11 shows a simple UML diagram of this class.

Figure 11: A UML Diagram of Class Verb

Noun Class: This class represents a word of type noun, including all the required features of this word. It also includes a constructor that creates a new instance of this class and initializes it with the values of the input SQL query result set. Figure 12 shows a simple UML diagram of this class.

Figure 12: A UML Diagram of Class Noun

Particle Class: This class represents a word of type particle, including all the required features of this word. It also includes a constructor that creates a new instance of this class and initializes it with the values of the input SQL query result set. Figure 13 shows a simple UML diagram of this class.

Figure 13: A UML Diagram of Class Particle

ConnectToDB Class: This class establishes a connection to the Morphological Analyzer database through the functions Load_driver() and connection(). It also generates the "sentence tree" through the function ProcessSentence(), which reads an input sentence and then builds the equivalent tree of that sentence, creating appropriate objects from the verb, noun and particle classes. Figure 14 shows a simple UML diagram of this class.

Figure 14: A UML Diagram of Class ConnectToDB

CreateDom Class: This class creates the DOM document of the XML file (Grammar.xml) through the function parsexmlfile(). It also uses the functions TraverseTree(), Traverse(), traverse_field() and traverse_const() to parse the DOM tree, looking for the sentence structure and creating the "parse tree". Figure 15 shows a simple UML diagram of this class.

Figure 15: A UML Diagram of Class CreateDom

GetMeaning Class: This class establishes a connection to the Arabic Semantic Lexicon database. It then gets the meanings of the words in the "parse tree" and computes the score of each word through the function ComputeScore(). Finally, it computes the overall probability of sentence correctness through the function ComputeProbability(). Figure 16 shows a simple UML diagram of this class.

Figure 16: A UML Diagram of Class GetMeaning

4. Experimental Results

Testing the system was done in two stages: testing the parser, and then testing the semantic analyzer. To test the parser, for each rule we built five correct sentences and three incorrect ones from the words available in the lexicon. The results were satisfactory (see Tables 2 and 3). The parser parsed 59 of the 60 correct sentences correctly and rejected all 36 incorrect ones (Appendix C contains more detailed testing). Some of the sentences had more than one correct grammatical structure, not all of which are semantically correct. The sentence that was not parsed correctly is:

ضرب حمد محمد

where "حمد" is the object and "محمد" is the subject. Here the object precedes the subject, and both are rational. The problem is that the verb "ضرب" can take both a rational subject and a rational object. Therefore, there is no way to determine whether the first word is a subject or an object unless the words are vowelized, which is not the case in our example.

Table 2: Results of Testing the Semantic Parser

        Correct sentences                                  Incorrect sentences
Rule    Number  Success  Failure  Success %  Failure %    Number  Success  Failure  Success %  Failure %
1       5       4/5      1/5      80%        20%          3       3/3      0/3      100%       0%
2       5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
3       5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
4       5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
5       5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
6       5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
7       5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
8       5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
9       5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
10      5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
11      5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
12      5       5/5      0/5      100%       0%           3       3/3      0/3      100%       0%
Total   60      59/60    1/60     98%        2%           36      36/36    0/36     100%       0%

To test the semantic analyzer, the sentences that passed the semantic parsing test were passed to the semantic analyzer. The meanings of fewer than half of these sentences were recognized (26 of 59, or 44%), and almost all of the recognized meanings were correct (25 of 26, or 96%; see Table 3). The meanings of the other sentences were not recognized because those sentences contained no words supporting the meanings.

Table 3: Results of Testing the Semantic Analyzer

Rule        Number of   Sentences with a    Success  Failure  Sentences without a
            sentences   recognized meaning  rate     rate     recognized meaning
1           4           4/4                 4/4      0/4      0/4
2           5           3/5                 3/3      0/3      2/5
3           5           1/5                 1/1      0/1      4/5
4           5           0/5                 -        -        5/5
5           5           3/5                 3/3      0/3      2/5
6           5           5/5                 4/5      1/5      0/5
7           5           0/5                 -        -        5/5
8           5           1/5                 1/1      0/1      4/5
9           5           0/5                 -        -        5/5
10          5           4/5                 4/4      0/4      1/5
11          5           1/5                 1/1      0/1      4/5
12          5           4/5                 4/4      0/4      1/5
Total       59          26/59               25/26    1/26     33/59
Percentage              44%                 96%      4%       56%

5. Conclusion & Future Work

Syntactic and morphological analysis of Arabic has received a great deal of attention from researchers in past years, and many successful morphological and parsing systems have been developed. Semantic analysis, however, has received little attention and research, and Arabic is still in its infancy regarding semantic and discourse analysis. A lot of effort and research is required to improve Arabic natural language understanding. The aim of this paper is to describe a system that is capable of resolving the ambiguity that arises in understanding the meaning of a sentence. The system succeeded in resolving the ambiguity resulting from polysemy. There are many avenues to enhance this work, among them the following:

- Connecting the system to a real morphological analyzer that is capable of finding all the possible readings of a given word.
- Completing the semantic lexicon by adding more entries.
- Completing the XML grammar rules. The document currently contains 12 rules.
- Testing the system extensively.
- Building a system that collects the supporting words automatically from a large Arabic corpus.

English References

1. Allen James, Natural Language Understanding, Benjamin/Cummings Publishing Company, 1995, 2nd edition.
2. Al-Shalabi Riyad and Evens Martha, "A Computational Morphology System for Arabic". In proceedings of COLING-ACL 98, Montreal, Quebec, Canada.
3. Beesley Kenneth, "Arabic Morphology Using Only Finite-State Operations". In proceedings of the Workshop on Computational Approaches to Semitic Languages, COLING-ACL 98, Montreal, Quebec, August, 1998, pp
4. Beesley Kenneth, "Finite-State Morphological Analysis And Generation of Arabic at Xerox Research: Status and plans in 2001". In proceedings of the Arabic Language Processing: Status and Prospect--39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France, 2001, pp
5. Buckwalter Tim, Buckwalter Arabic Morphological Analyzer Version 1.0. Linguistic Data Consortium, University of Pennsylvania, LDC Catalog No.: LDC2002L
6. Darwish, K., "Building A Shallow Arabic Morphological Analyzer In One Day". In M. Rosner and S. Wintner (Eds.), Computational Approaches to Semitic Languages, an ACL'02 Workshop, Philadelphia, PA, 2002, pp
7. El-Affendi M. A., "An LVQ connectionist solution to the non-determinacy Problem in Arabic morphological analysis: a learning hybrid algorithm", Natural Language Engineering, vol. (8), Cambridge University Press, 2002, pp
8. Elnaggar Ayman, "A Phrase Structure Grammar of the Arabic Language". In Proceedings of the 13th COLING, Vol. 3, Helsinki, Finland, 1990, pp
9. El-Shishiny H., "A Formal Description of Arabic Syntax in Definite Clause Grammar". In Proceedings of the 13th COLING, Vol. 3, Helsinki, Finland, 1990, pp
10. Everhard Ditters, A Formal Grammar for the Description of Sentence Structure in Modern Standard Arabic. In proceedings of ACL/EACL01: Conference of the European Chapter, Workshop: Arabic Language Processing: Status and Prospects,
11. Farouk Ahmad, "Developing an Arabic Parser in a Multilingual Machine Translation System", M. Sc. Thesis, Computer and Information Science Department, Cairo University,
12. Haddad Bassam and Yaseen Mustafa, Towards Semantic Composition of ARABIC: A λ-drt Based Approach, MT Summit IX Workshop: Machine Translation for Semitic Languages: Issues and Approaches, USA, September 23,
13. Haddad Bassam and Yaseen Mustafa, Towards Understanding Arabic: A Logical Approach for Semantic Representation, ACL/EACL01: Conference of the European Chapter, Workshop: Arabic Language Processing: Status and Prospects,
14. Jurafsky Daniel and Martin James, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice-Hall,
15. Koster C.H.A., "Affix Grammars for Natural Languages". In: H. Alblas and B. Melichar (Eds.), Attribute Grammars, applications and systems. SLNCS 545, Heidelberg, 1991, pp
16. Mooney Raymond, "Learning Semantic Parsers: An Important but Under-Studied Problem". In proceedings of the AAAI 2004 Spring Symposium on Language Learning: An Interdisciplinary Perspective, Stanford, CA, March 2004, pp
17. Othman E., Shaalan K., and Rafea A., "A Chart Parser for Analyzing Modern Standard Arabic Sentence". In proceedings of the MT Summit IX Workshop on Machine Translation for Semitic Languages: Issues and Approaches, New Orleans, Louisiana, U.S.A.,
18. Othman Eman, Shaalan Khaled, Rafea Ahmed, Towards Resolving Ambiguity In Understanding Arabic Sentence. In proceedings of the International Conference on Arabic Language Resources and Tools, NEMLAR, Egypt, 2004, pp
19. Ouersighni R., "A Major Offshoot of the DIINAR-MBC Project: AraParse, a Morphosyntactic Analyzer for Unvowelled Arabic Texts". In the proceedings of the Arabic NLP Workshop at ACL/EACL
20. Saeed John, Semantics, Blackwell Publishing, 2003, 2nd edition.
21. Schwartz R., Miller S., Stallard D. and Makhoul J., "Hidden understanding models for statistical sentence understanding". In proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Munich, 1997, pp
22. Schwartz R., Miller S., Stallard D., and Makhoul J., "Language Understanding Using Hidden Understanding Models". In proceedings of ICSLP, 1996, pp
23. Zelle, J. M., and Mooney, R. J., "Learning to parse database queries using inductive logic programming".
In Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), pp Arabic References ٢٤. " تغريد العفيصان سامية المهوس مها الربيعة و فاتن القحطاني. "ا داة لتحليل الجملة العربية". مشروع تخرج لدرجة البكالوريوس الرياض صفر ١٤٢١ ه. ٢٥ د.. ا حمد محمد المعتوق. "الا لفاظ المشتركة المعاني في اللغة العربية طبيعتها ا هميتها مصادرها" جامعة الملك فهد للبترول والمعادن الظهران. ٢٦. مجد الدين محمد بن يعقوب الفيروز ا بادي. "القاموس المحيط" دار الفكر بيروت ١٤١٥ ه. ٢٧. ا حمد بن محمد بن علي الفيومي المقري. "المصباح المنير" المكتبة العصرية بيروت الطبعة الثانية ١٤٢٠ ه. ٢٨. ا حمد مختار عمر محمد حماسة عبداللطيف ومصطفى النحاس زهران. "النحو الا ساسي" دار الفكر العربي مصر ١٤١٧ ه. ٢٩. محمد بن ا بي بكر بن عبد القادر الرازي. "مختار الصحاح" مو سسة الرسالة بيروت الطبعة السابعة. Appendix A تراآيب الجملة العربية Arabic Grammar Rules جملة اسمية جملة فعلية. جملة مفيدة ----> جملة اسمية + حرف عطف + جملة اسمية جملة اسمية + حرف عطف + جملة فعليه (حرف استفهام جملة اسمية ----> ا ن وا خواتها +) مبتدا معرفة (+صفة معرفة)(+حال) (+حرف عطف + مبتدا معرفة)+ خبر(+ صفة) (+حال) (+حرف عطف + حرف استفهام + مبتدا نكرة (+ صفة) + خبر- شبه حرف نفي + مبتدا نكرة (+ صفة) + خبر- شبه جملة خبر) (+شبه جملة) خبر- شبه جملة + مبتدا معرفة حرف نفي + خبر+ مبتدا معرفة (حرف استفهام ا ن جملة (حرف استفهام ا ن وا خواتها +) (حرف يدخل على الفعل +) فعل ناسخ + مبتدا معرفة + خبر ) حرف وا خواتها +) خبر- شبه جملة + مبتدا نكرة (+صفة) يدخل على الفعل +) ا فعال الظن + فاعل + مفعول به ١ + مفعول به ٢ (حرف يدخل على الفعل +) ا فعال الشروع والرجاء والمقاربة + مبتدا (+"ا ن") + خبر- جملة فعلية فعل مضارع 16

17 ف " ف " + فاعل (+ جملة فعلية + حرف عطف + جملة فعلية جملة فعلية + حرف عطف + جملة اسمية فعل جملة فعلية ----> صفة) (+ بدل) + ) مفعول به (+ صفة) (+ بدل) (+ حال)) (+ مفعول مطلق ( (+ظرف (+ مضاف ا ليه) (+صفة)) ) حرف يدخل على الفعل +) فعل مضارع + فاعل (+ صفة) (+ بدل) + ) مفعول به (+ صفة) (+ بدل) (+حال) ( (+ شبه جملة) (+ (+ جار (+ بدل) ) حرف يدخل على الفعل+) فعل مضارع + ناي ب فاعل (+ صفة) مفعول مطلق ( (+ مفعول لا جله ( + فاعل (+ صفة) (+ بدل) + ) حرف يدخل على الفعل +) فعل لمفعولين ومجرور ( (+ مفعول مطلق ( (+ مفعول لا جله) ) حرف يدخل على الفعل الماضي+) فعل ما ض مفعول به (+ صفة) (+ بدل) + مفعول به (+ صفة) (+ بدل) (+ مفعول لا جله ( ) +) بدل) + ) مفعول به +) صفة) +) بدل) +) حال)) +) مفعول مطلق ( +) مفعول لا جله ( +) شبه جملة ( + فاعل +) صفة) (+جار و مجرور ( (+ مفعول مطلق ( (+ (+ بدل) حرف يدخل على الفعل الماضي+) فعل ما ض + ناي ب فاعل (+ صفة) (+ + فاعل (+ صفة) (+ بدل) (+ بدل) مفعول به (+ صفة) (حرف يدخل على الفعل +) فعل مضارع + مفعول لا جله ( + فاعل مفعول به (+ صفة) (+ بدل) (حرف يدخل على الفعل الماضي +) فعل ماض + مفعول مطلق ( (+ مفعول لا جله ( ) حرف يدخل على الفعل +) ا فعال ا خرى ١ + فاعل + مفعول به ١ مفعول مطلق ( (+ مفعول لا جله ( (+ صفة) (+ بدل) (+ ) " +) جواب الشرط حرف الشرط + فعل الشرط +( " +) جواب الشرط + مفعول به ٢ اسم الشرط + فعل الشرط +( حرف نفي +) جملة فعلية (+ مستثنى منه ( + حرف استثناء + مستثنى (حرف يدخل على الفعل +) فعل لثلاث مفاعيل + فاعل + + مفعول به ٢ جملة فعلية + ا فعال الاستثناء + مفعول به حرف نداء + منادى (+ جملة اسمية جملة + مفعول به ١ مفعول به فعلية ( حرف قسم + المقسم به + جواب القسم اسم غير معرفة. مبتدا نكرة ----> تعريف + اسم غير معرفة اسم علم اسم مبني مصدر مو ول مبتدا نكرة + مضاف ا ليه. مبتدا معرفة ----> تعريف + صفة. صفة معرفة ----> ----> مصدر مو ول اسم معرب ضمير اسم ا شارة اسم موصول الا عداد المركبة بعض الظروف وما ركب منها اسم خبر جارومجرور اسم استفهام جملة اسمية جملة فعلية. الفعل ----> تعريف + اسم غير معرفة اسم علم بدل اسم اسم غير معرفة + مضاف ا ليه. ----> فاعل مفعول به ناي ب فاعل ----> مصدر من الفعل: ضرب ا رمي ا... 
مفعول مطلق ----> مصدر من الفعل: ضرباً، رمياً ... | مفعول مطلق + صفة | مفعول مطلق + مضاف إليه.
صفة ----> اسم غير معرفة.
مفعول لأجله ----> اسم غير علم (+ جار ومجرور) | اسم معرب مثنى (+ جار ومجرور) | اسم معرب جمع (+ جار ومجرور).
مفعول به ١ ----> مبتدأ معرفة.
مفعول به ٢ ----> خبر.
حال ----> ساخناً | بارداً | واو الحال + جملة اسمية | جملة فعلية | شبه جملة.
المستثنى منه ----> تعريف + اسم غير علم | اسم معرب مثنى | اسم معرب جمع.
المستثنى ----> اسم.
اسم ----> (تعريف +) اسم معرب | اسم مبني | مصدر مؤول.
اسم معرب ----> اسم علم | اسم غير علم | اسم معرب مثنى | اسم معرب جمع.
اسم معرب مثنى ----> اسم غير علم + "ان" | اسم غير علم + "ين".
اسم معرب جمع ----> اسم غير علم + "ون" | اسم غير علم + "ين" | اسم غير علم + "ات" | جمع تكسير لاسم غير علم.
اسم غير معرفة ----> امرأة | مدينة | فتاة.
اسم غير علم ----> امرأة | مدينة | فتاة.
اسم علم ----> فاتن | محمد | مها | سعيد | تغريد | عمر | سامية.
اسم مبني ----> ضمير | اسم إشارة | اسم موصول | اسم شرط | اسم استفهام | الأعداد المركبة | بعض الظروف وما ركب منها | اسم الفعل | اسم شرط غير جازم.
مصدر مؤول ----> مصدر مؤول من "أن والفعل" | مصدر مؤول من "أن واسمها وخبرها".
شبه جملة ----> جار ومجرور | ظرف + مضاف إليه.
جار ومجرور ----> حرف جر + اسم مجرور.
اسم مجرور ----> اسم.
اسم إشارة ----> هذا | هذه | هؤلاء | ذاك | ذلك | تلك | أولئك | هنا | ههنا | هناك | هنالك | هذان | هاتان | هذين | هاتين.
اسم موصول ----> الذي | التي | الذين | اللاتي | اللائي | من | ما | اللذان | اللتان | اللذين | اللتين.
اسم شرط ----> من | ما | مهما | متى | أيان | أين | أينما | أنى | حيثما | كيفما | أي.
اسم استفهام ----> من | ما | متى | أين | كم | كيف | أي | أيان | أنى.
الأعداد المركبة ----> من ١١ إلى ١٩ (ما عدا ١٢) + تمييز.
التمييز ----> اسم نكرة منصوب.
ظرف ----> ظرف مكان | ظرف زمان.
ظرف مكان ----> فوق | تحت | حول | أمام | إزاء.
ظرف زمان ----> شهراً | صباحاً | لحظة | ليلاً | صيفاً | يوماً ...
مضاف إليه ----> تعريف + (اسم غير علم | اسم معرب مثنى | اسم معرب جمع) | اسم علم.
المنادى ----> اسم غير علم + مضاف إليه | اسم غير علم | اسم علم.
بعض الظروف وما ركب منها ----> حيث | أمس | الآن | إذ | ليل نهار | بين.
جمع تكسير لاسم غير علم ----> مساجد | مصانع.
اسم شرط غير جازم ----> إذا | أو | لو | كلما | لولا | لوما.
اسم الفعل ----> هيهات | شتان | سرعان | آه | أف | آمين | عليك | حذار | صه | إيه | حي.

الضمير ----> ضمير رفع منفصل | ضمير نصب منفصل | ضمير رفع متصل | ضمير نصب متصل | ضمير جر متصل | ضمير مستتر.
ضمير رفع منفصل ----> أنا | نحن | أنتَ | أنتِ | أنتما | أنتم | أنتن | هو | هي | هما | هم | هن.
ضمير نصب منفصل ----> إياي | إيانا | إياك | إياكما | إياكم | إياكن | إياه | إياها | إياهما | إياهم | إياهن.
ضمير رفع متصل ----> تاء الفاعل | نا | ألف الاثنين | واو الجماعة | ياء المخاطبة | نون النسوة.
ضمير نصب متصل ----> ياء المتكلم | نا | كاف المخاطب | هاء الغائب.
ضمير جر متصل ----> ياء المتكلم | نا | كاف المخاطب | هاء الغائب.
المقسم به ----> لفظ الجلالة | اسم-صفة من صفات الله تعالى + مضاف إليه-لفظ الجلالة.
جواب القسم ---->
  "ل" + فعل مضارع + "نون التوكيد" + فاعل (+ صفة) (+ حال) (+ مفعول به)
  | "لقد" + فعل ماض + فاعل (+ صفة) (+ حال) (+ مفعول به)
  | "لا" + جملة فعلية - فعل مضارع
  | "ما" + جملة فعلية - فعل ماض
  | "ما" + جملة اسمية
  | "إن" + مبتدأ + "ل" + خبر.
فعل ----> فعل مضارع | فعل ماض | فعل أمر.
فعل ماض ----> ذهب | جاء | حضر.
فعل أمر ----> اذهب | قم | كل.
فعل مضارع ----> ("أ" | "ن" | "ي" | "ت") + فعل ماض.
أفعال أخرى ١ ----> أفعال اليقين | أفعال التحويل.
أفعال الاستثناء ----> ماعدا | ماخلا | ماحاشا.
أفعال الظن ----> ظن | خال | حسب | زعم | جعل | هب.
أفعال اليقين ----> رأى | علم | وجد | ألفى | تعلم (بمعنى اعلم).
أفعال التحويل ----> صير | حول | جعل | رد | اتخذ.
فعل لمفعولين ----> كسا | ألبس | أعطى | منح | سأل | منع.
أفعال لثلاث مفاعيل ----> أعلم | أرى | نبّأ | أنبأ | خبّر | أخبر | حدّث.
أفعال ناسخة ----> كان | صار | ليس | أصبح | أضحى | ظل | ما زال | ما دام | بات | أمسى.
فعل الشرط ----> جملة فعلية.
جواب الشرط ---->
  جملة اسمية
  | فعل جامد + جملة اسمية
  | فعل أمر + فاعل (+ مفعول به)
  | لا الناهية + جملة فعلية
  | اسم استفهام + جملة فعلية
  | حرف استفهام + جملة فعلية
  | (حرف نفي | لن | قد | "س" | سوف) + جملة فعلية.
فعل جامد ----> نعم | بئس | حبذا.
أفعال الشروع والرجاء والمقاربة ----> أفعال الشروع | أفعال الرجاء | أفعال المقاربة.
أفعال الشروع ----> أخذ | أنشأ | بدأ | جعل.
أفعال الرجاء ----> عسى.
أفعال المقاربة ----> أوشك | كاد.
حرف يدخل على الفعل ----> حرف يدخل على الفعل المضارع | حرف يدخل على الفعل الماضي.
حرف يدخل على الفعل المضارع ----> حرف نصب | حرف جزم | حرف نفي | قد | سين | سوف | حرف استفهام.
حرف يدخل على الفعل الماضي ----> قد | حرف عطف | حرف استفهام.
حرف الشرط ----> إن.
حرف جر ----> من | إلى | عن | على | في | الباء | الكاف | اللام | واو القسم | تاء القسم | رب | مذ | منذ | واو رب | عدا | خلا | حاشا.
إن وأخواتها ----> إنّ | أنّ | لكنّ | كأنّ | لعل | ليت | لا.
حرف نداء ----> ياء | أيا | هيا | أي | الهمزة.
حرف استثناء ----> إلا.
حرف عطف ----> الواو | الفاء | ثم | أو | أم | لكن | لا | بل | حتى.
حرف استفهام ----> الهمزة | هل.
حرف نصب ----> أن | لن | كي | إذن | لام التعليل | فاء السببية | حتى.
حرف قسم ----> الباء | التاء | الواو.
حرف جزم ----> لم | لما | لام الأمر | لا الناهية.
حرف نفي ----> ما | لا.
التعريف ----> ال.
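
The rules in Appendix A have the shape of a context-free grammar, so a few of them can be tried out directly. The following is a minimal sketch, not the parser described in the paper: it encodes the شبه جملة (prepositional or adverbial phrase), جار ومجرور, and ظرف rules as Python dictionaries and checks whether a token sequence can be derived from a symbol. The names `GRAMMAR`, `LEXICON`, `parse`, and `recognizes` are hypothetical, the word lists are shortened, and اسم مجرور and مضاف إليه are simplified to proper nouns only.

```python
# Hypothetical encoding of a few Appendix A rules.
# Each nonterminal maps to a list of alternatives; each alternative is a
# sequence of symbols that must appear in order.
GRAMMAR = {
    "شبه جملة": [["جار ومجرور"], ["ظرف", "مضاف إليه"]],
    "جار ومجرور": [["حرف جر", "اسم مجرور"]],
    "اسم مجرور": [["اسم علم"]],   # simplification: the appendix allows any اسم
    "مضاف إليه": [["اسم علم"]],   # simplification: only the proper-noun alternative
    "ظرف": [["ظرف مكان"], ["ظرف زمان"]],
}

# Shortened terminal lists taken from the appendix.
LEXICON = {
    "حرف جر": {"من", "إلى", "عن", "على", "في"},
    "ظرف مكان": {"فوق", "تحت", "حول", "أمام"},
    "ظرف زمان": {"صباحا", "ليلا"},
    "اسم علم": {"محمد", "مها", "سعيد"},
}


def parse(symbol, tokens, pos=0):
    """Yield every position at which `symbol` can end when it starts at `pos`."""
    if symbol in LEXICON:
        if pos < len(tokens) and tokens[pos] in LEXICON[symbol]:
            yield pos + 1
        return
    for alternative in GRAMMAR[symbol]:
        ends = [pos]
        for child in alternative:
            # Advance every partial match through the next symbol.
            ends = [e2 for e in ends for e2 in parse(child, tokens, e)]
        yield from ends


def recognizes(symbol, sentence):
    """True if the whole whitespace-tokenized sentence derives from `symbol`."""
    tokens = sentence.split()
    return len(tokens) in set(parse(symbol, tokens))


print(recognizes("شبه جملة", "على محمد"))   # preposition + proper noun
print(recognizes("شبه جملة", "فوق مها"))    # adverb of place + proper noun
```

This top-down search would loop on the left-recursive sentence rules (for example جملة اسمية + حرف عطف + جملة اسمية), so a chart-based method would be needed for the full grammar.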

Appendix B. Arabic Grammar Rules XML Document

<?xml version="1.0" encoding="windows-1256"?>
<!DOCTYPE قواعد_النحو SYSTEM "Grammar.dtd">
<قواعد_النحو>
<الجملة_الفعلية>

<!-- فعل - فاعل - مفعول به -->
<!-- verb - subject - object -->
<النوع>فعل</النوع>
<البناء>مبني للمعلوم</البناء>
<التعدي>١</التعدي>
<الفاعل>-</الفاعل>
<الموقع>فعل</الموقع>
<السابقة>-</السابقة>
<الموقع>فاعل</الموقع>
<السابقة>-</السابقة>
<الموقع>مفعول به</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="المفعول_به_عاقل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">مفعول به</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الزمن">فعل</الموقع_الإعرابي>
<القيمة>أمر</القيمة>
<العلاقة>لا يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي>
<القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مفعول به</الموقع_الإعرابي>
<القيمة>منصوب</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="التصنيف">مفعول به</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>

<!-- حرف يدخل على الفعل الماضي - فعل ماضي - فاعل - مفعول به -->
<!-- particle preceding a past-tense verb - past-tense verb - subject - object -->
<النوع>أداة</النوع>
<الموقع>حرف لا محل له من الإعراب</الموقع>
<النوع>فعل</النوع>
<الزمن>ماضي</الزمن>
<البناء>مبني للمعلوم</البناء>
<التعدي>١</التعدي>
<الفاعل>-</الفاعل>
<الموقع>فعل</الموقع>
<السابقة>-</السابقة>
<الموقع>فاعل</الموقع>
<السابقة>-</السابقة>
<الموقع>مفعول به</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="المفعول_به_عاقل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">مفعول به</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الأداة">حرف لا محل له من الإعراب</الموقع_الإعرابي>
<القيمة>هل</القيمة>
<القيمة>قد</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي>
<القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مفعول به</الموقع_الإعرابي>
<القيمة>منصوب</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">مفعول به</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>

<!-- فعل - فاعل (ضمير متصل) - مفعول به -->
<!-- verb - subject (attached pronoun) - object -->
<النوع>فعل</النوع>
<البناء>مبني للمعلوم</البناء>
<التعدي>١</التعدي>
<المفعول_به>-</المفعول_به>
<الموقع>فعل والفاعل ضمير متصل</الموقع>
<السابقة>-</السابقة>
<الموقع>مفعول به</الموقع>
<الموقع_الإعرابي الخاصية="الفاعل">فعل والفاعل ضمير متصل</الموقع_الإعرابي>
<القيمة>ت</القيمة>
<القيمة>نا</القيمة>

<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مفعول به</الموقع_الإعرابي>
<القيمة>-</القيمة>
<القيمة>منصوب</القيمة>
<الموقع_الإعرابي الخاصية="المفعول_به_عاقل">فعل والفاعل ضمير متصل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">مفعول به</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="التصنيف">مفعول به</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>

<!-- فعل - فاعل (ضمير متصل) - مفعول به (ضمير متصل) -->
<!-- verb - subject (attached pronoun) - object (attached pronoun) -->
<النوع>فعل</النوع>
<البناء>مبني للمعلوم</البناء>
<التعدي>١</التعدي>
<الموقع>فعل والفاعل ضمير متصل والمفعول به ضمير متصل</الموقع>
<التصنيف>ظرف</التصنيف>
<الموقع>ظرف</الموقع>
<الموقع_الإعرابي الخاصية="الفاعل">فعل والفاعل ضمير متصل والمفعول به ضمير متصل</الموقع_الإعرابي>
<القيمة>ت</القيمة>
<القيمة>نا</القيمة>
<الموقع_الإعرابي الخاصية="المفعول_به">فعل والفاعل ضمير متصل والمفعول به ضمير متصل</الموقع_الإعرابي>
<القيمة>ه</القيمة>
<القيمة>ها</القيمة>
<القيمة>ك</القيمة>
<القيمة>كما</القيمة>
<القيمة>كم</القيمة>
<القيمة>كن</القيمة>
<القيمة>هما</القيمة>
<القيمة>هم</القيمة>
<القيمة>هن</القيمة>

<!-- فعل - فاعل - صفة - مفعول به -->
<!-- verb - subject - adjective - object -->
<النوع>فعل</النوع>
<البناء>مبني للمعلوم</البناء>
<التعدي>١</التعدي>
<الفاعل>-</الفاعل>
<الموقع>فعل</الموقع>
<السابقة>-</السابقة>
<الموقع>فاعل</الموقع>
<علم>لا</علم>
<صفة>نعم</صفة>
<الموقع>صفة</الموقع>
<السابقة>-</السابقة>
<الموقع>مفعول به</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="العدد">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="العدد">صفة</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">صفة</الموقع_الإعرابي>
<ضابط />
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="المفعول_به_عاقل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">مفعول به</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي>
<القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">صفة</الموقع_الإعرابي>
<القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مفعول به</الموقع_الإعرابي>
<القيمة>منصوب</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="التصنيف">مفعول به</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>

<!-- فعل - فاعل - جار ومجرور -->
<!-- verb - subject - prepositional phrase -->
<النوع>فعل</النوع>
<البناء>مبني للمعلوم</البناء>
<الموقع>فعل</الموقع>
<الموقع>فاعل</الموقع>
<النوع>أداة</النوع>
<التصنيف>حرف جر</التصنيف>
<الموقع>حرف جر</الموقع>
<معرفة>نعم</معرفة>
<السابقة>-</السابقة>
<الموقع>اسم مجرور</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي>

<الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي>
<القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>

<!-- فعل - فاعل - جار ومجرور (حرف الباء أو التاء) -->
<!-- verb - subject - prepositional phrase (preposition bā' or tā') -->
<النوع>فعل</النوع>
<البناء>مبني للمعلوم</البناء>
<الموقع>فعل</الموقع>
<الموقع>فاعل</الموقع>
<نوع_السابقة>حرف جر</نوع_السابقة>
<الموقع>حرف جر واسم مجرور</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي>
<القيمة>-</القيمة>
<القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>

<!-- فعل - فاعل -->
<!-- verb - subject -->
<النوع>فعل</النوع>
<البناء>مبني للمعلوم</البناء>
<الفاعل>-</الفاعل>
<الموقع>فعل</الموقع>
<النوع>اسم</النوع>
<الموقع>فاعل</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي>
<القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>
</الجملة_الفعلية>

<!-- بداية الجملة الاسمية -->
<!-- start of the nominal sentence -->
<الجملة_الاسمية>

<!-- مبتدأ (غير ممنوع من الصرف) - خبر (اسم) -->
<!-- subject of a nominal sentence (not a diptote) - predicate (noun) -->
<معرفة>نعم</معرفة>
<الموقع>مبتدأ</الموقع>
<معرفة>لا</معرفة>
<الموقع>خبر</الموقع>
<كلمة />
<الموقع_الإعرابي الخاصية="العدد">مبتدأ</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="العدد">خبر</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">مبتدأ</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">خبر</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مبتدأ</الموقع_الإعرابي>
<القيمة>مرفوع</القيمة>

<!-- مبتدأ (جمع تكسير ممنوع من الصرف) - خبر (اسم) -->
<!-- subject (broken plural, diptote) - predicate (noun) -->
<معرفة>نعم</معرفة>
<الموقع>مبتدأ</الموقع>
<معرفة>لا</معرفة>
<الموقع>خبر</الموقع>
<الموقع_الإعرابي الخاصية="الجنس">مبتدأ</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">خبر</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مبتدأ</الموقع_الإعرابي>
<القيمة>ممنوع من الصرف</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">مبتدأ</الموقع_الإعرابي>
<القيمة>جمع تكسير لغير العاقل</القيمة>

<!-- مبتدأ مضاف - مضاف إليه - خبر (اسم) -->
<!-- subject in construct state - genitive complement - predicate (noun) -->
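
The constraint entries in Appendix B pair a grammatical position and a feature (الموقع_الإعرابي with its الخاصية attribute) with one or more allowed values (القيمة) and an optional relation (العلاقة). As a rough illustration of how such entries could be read and checked, here is a minimal Python sketch using the standard `xml.etree.ElementTree` module. It is not the authors' implementation. The element and attribute names are taken from the appendix with the extraction spacing normalised, but the `<قواعد>` root, the `<ضابط>` wrapper around each constraint, the default relation "يساوي" (equals), and the function names are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment. The leaf elements follow Appendix B;
# the <قواعد> root and the <ضابط> grouping are assumptions of this sketch.
SAMPLE = """
<قواعد>
  <ضابط>
    <الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي>
    <القيمة>مرفوع</القيمة>
  </ضابط>
  <ضابط>
    <الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي>
    <القيمة>محذوف النون</القيمة>
    <العلاقة>لا يساوي</العلاقة>
  </ضابط>
</قواعد>
"""


def load_constraints(xml_text):
    """Read each constraint as a dict of position, feature, values, negation."""
    root = ET.fromstring(xml_text)
    constraints = []
    for node in root.iter("ضابط"):
        slot = node.find("الموقع_الإعرابي")
        # Assumption: a missing relation means plain equality.
        relation = node.findtext("العلاقة", default="يساوي")
        constraints.append({
            "position": slot.text.strip(),
            "feature": slot.get("الخاصية"),
            "values": [v.text.strip() for v in node.findall("القيمة")],
            "negated": relation.strip() == "لا يساوي",
        })
    return constraints


def satisfies(word_features, constraint):
    """Check one word's feature dict against a single constraint."""
    matches = word_features.get(constraint["feature"]) in constraint["values"]
    return matches != constraint["negated"]


constraints = load_constraints(SAMPLE)
subject = {"الحالة_الإعرابية": "مرفوع", "التصنيف": "اسم"}
print(all(satisfies(subject, c) for c in constraints))
```

Here a subject in the nominative case (مرفوع) whose class is not محذوف النون passes both constraints, which mirrors the case and class checks listed for فاعل in the appendix.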


More information

Notes on The Sciences of the Artificial Adapted from a shorter document written for course (Deciding What to Design) 1

Notes on The Sciences of the Artificial Adapted from a shorter document written for course (Deciding What to Design) 1 Notes on The Sciences of the Artificial Adapted from a shorter document written for course 17-652 (Deciding What to Design) 1 Ali Almossawi December 29, 2005 1 Introduction The Sciences of the Artificial

More information

Opportunities for Writing Title Key Stage 1 Key Stage 2 Narrative

Opportunities for Writing Title Key Stage 1 Key Stage 2 Narrative English Teaching Cycle The English curriculum at Wardley CE Primary is based upon the National Curriculum. Our English is taught through a text based curriculum as we believe this is the best way to develop

More information

Introduction to HPSG. Introduction. Historical Overview. The HPSG architecture. Signature. Linguistic Objects. Descriptions.

Introduction to HPSG. Introduction. Historical Overview. The HPSG architecture. Signature. Linguistic Objects. Descriptions. to as a linguistic theory to to a member of the family of linguistic frameworks that are called generative grammars a grammar which is formalized to a high degree and thus makes exact predictions about

More information

Semi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.

Semi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17. Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link

More information

Machine Learning from Garden Path Sentences: The Application of Computational Linguistics

Machine Learning from Garden Path Sentences: The Application of Computational Linguistics Machine Learning from Garden Path Sentences: The Application of Computational Linguistics http://dx.doi.org/10.3991/ijet.v9i6.4109 J.L. Du 1, P.F. Yu 1 and M.L. Li 2 1 Guangdong University of Foreign Studies,

More information

Linguistics. Undergraduate. Departmental Honors. Graduate. Faculty. Linguistics 1

Linguistics. Undergraduate. Departmental Honors. Graduate. Faculty. Linguistics 1 Linguistics 1 Linguistics Matthew Gordon, Chair Interdepartmental Program in the College of Arts and Science 223 Tate Hall (573) 882-6421 gordonmj@missouri.edu Kibby Smith, Advisor Office of Multidisciplinary

More information

Learning Methods in Multilingual Speech Recognition

Learning Methods in Multilingual Speech Recognition Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex

More information

Writing a composition

Writing a composition A good composition has three elements: Writing a composition an introduction: A topic sentence which contains the main idea of the paragraph. a body : Supporting sentences that develop the main idea. a

More information

Language Acquisition Fall 2010/Winter Lexical Categories. Afra Alishahi, Heiner Drenhaus

Language Acquisition Fall 2010/Winter Lexical Categories. Afra Alishahi, Heiner Drenhaus Language Acquisition Fall 2010/Winter 2011 Lexical Categories Afra Alishahi, Heiner Drenhaus Computational Linguistics and Phonetics Saarland University Children s Sensitivity to Lexical Categories Look,

More information

THE VERB ARGUMENT BROWSER

THE VERB ARGUMENT BROWSER THE VERB ARGUMENT BROWSER Bálint Sass sass.balint@itk.ppke.hu Péter Pázmány Catholic University, Budapest, Hungary 11 th International Conference on Text, Speech and Dialog 8-12 September 2008, Brno PREVIEW

More information

Mercer County Schools

Mercer County Schools Mercer County Schools PRIORITIZED CURRICULUM Reading/English Language Arts Content Maps Fourth Grade Mercer County Schools PRIORITIZED CURRICULUM The Mercer County Schools Prioritized Curriculum is composed

More information

Evolution of Symbolisation in Chimpanzees and Neural Nets

Evolution of Symbolisation in Chimpanzees and Neural Nets Evolution of Symbolisation in Chimpanzees and Neural Nets Angelo Cangelosi Centre for Neural and Adaptive Systems University of Plymouth (UK) a.cangelosi@plymouth.ac.uk Introduction Animal communication

More information

Coast Academies Writing Framework Step 4. 1 of 7

Coast Academies Writing Framework Step 4. 1 of 7 1 KPI Spell further homophones. 2 3 Objective Spell words that are often misspelt (English Appendix 1) KPI Place the possessive apostrophe accurately in words with regular plurals: e.g. girls, boys and

More information

Florida Reading Endorsement Alignment Matrix Competency 1

Florida Reading Endorsement Alignment Matrix Competency 1 Florida Reading Endorsement Alignment Matrix Competency 1 Reading Endorsement Guiding Principle: Teachers will understand and teach reading as an ongoing strategic process resulting in students comprehending

More information

Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG

Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG Dr. Kakia Chatsiou, University of Essex achats at essex.ac.uk Explorations in Syntactic Government and Subcategorisation,

More information

GACE Computer Science Assessment Test at a Glance

GACE Computer Science Assessment Test at a Glance GACE Computer Science Assessment Test at a Glance Updated May 2017 See the GACE Computer Science Assessment Study Companion for practice questions and preparation resources. Assessment Name Computer Science

More information

CLASSIFICATION OF PROGRAM Critical Elements Analysis 1. High Priority Items Phonemic Awareness Instruction

CLASSIFICATION OF PROGRAM Critical Elements Analysis 1. High Priority Items Phonemic Awareness Instruction CLASSIFICATION OF PROGRAM Critical Elements Analysis 1 Program Name: Macmillan/McGraw Hill Reading 2003 Date of Publication: 2003 Publisher: Macmillan/McGraw Hill Reviewer Code: 1. X The program meets

More information

have to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words,

have to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words, A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion Walter Daelemans and Antal van den Bosch Proceedings ESCA-IEEE speech synthesis conference, New York, September 1994

More information

INTELLIGENT ACTIVE COACHING AN EXECUTABLE PLAN APPROACH

INTELLIGENT ACTIVE COACHING AN EXECUTABLE PLAN APPROACH INTELLIGENT ACTIVE COACHING AN EXECUTABLE PLAN APPROACH S. A. Gamalel-Din Al-Azhar University, Systems & Computers Engineering Dept. قرب. يو من رجال التعليم ا ن التعليم التفاعلى الذى يعتمد على وجود المدرس

More information

Specifying Logic Programs in Controlled Natural Language

Specifying Logic Programs in Controlled Natural Language TECHNICAL REPORT 94.17, DEPARTMENT OF COMPUTER SCIENCE, UNIVERSITY OF ZURICH, NOVEMBER 1994 Specifying Logic Programs in Controlled Natural Language Norbert E. Fuchs, Hubert F. Hofmann, Rolf Schwitter

More information

Getting into top colleges. Farrukh Azmi, MD, PhD

Getting into top colleges. Farrukh Azmi, MD, PhD Getting into top colleges Farrukh Azmi, MD, PhD But Why? The first revealed word of the Quran? Verily, in the creation of the heavens and of the earth, and the succession of night and day: and in the

More information

Basic Parsing with Context-Free Grammars. Some slides adapted from Julia Hirschberg and Dan Jurafsky 1

Basic Parsing with Context-Free Grammars. Some slides adapted from Julia Hirschberg and Dan Jurafsky 1 Basic Parsing with Context-Free Grammars Some slides adapted from Julia Hirschberg and Dan Jurafsky 1 Announcements HW 2 to go out today. Next Tuesday most important for background to assignment Sign up

More information

Natural Language Processing. George Konidaris

Natural Language Processing. George Konidaris Natural Language Processing George Konidaris gdk@cs.brown.edu Fall 2017 Natural Language Processing Understanding spoken/written sentences in a natural language. Major area of research in AI. Why? Humans

More information

A Framework for Customizable Generation of Hypertext Presentations

A Framework for Customizable Generation of Hypertext Presentations A Framework for Customizable Generation of Hypertext Presentations Benoit Lavoie and Owen Rambow CoGenTex, Inc. 840 Hanshaw Road, Ithaca, NY 14850, USA benoit, owen~cogentex, com Abstract In this paper,

More information

FOREWORD.. 5 THE PROPER RUSSIAN PRONUNCIATION. 8. УРОК (Unit) УРОК (Unit) УРОК (Unit) УРОК (Unit) 4 80.

FOREWORD.. 5 THE PROPER RUSSIAN PRONUNCIATION. 8. УРОК (Unit) УРОК (Unit) УРОК (Unit) УРОК (Unit) 4 80. CONTENTS FOREWORD.. 5 THE PROPER RUSSIAN PRONUNCIATION. 8 УРОК (Unit) 1 25 1.1. QUESTIONS WITH КТО AND ЧТО 27 1.2. GENDER OF NOUNS 29 1.3. PERSONAL PRONOUNS 31 УРОК (Unit) 2 38 2.1. PRESENT TENSE OF THE

More information

Project in the framework of the AIM-WEST project Annotation of MWEs for translation

Project in the framework of the AIM-WEST project Annotation of MWEs for translation Project in the framework of the AIM-WEST project Annotation of MWEs for translation 1 Agnès Tutin LIDILEM/LIG Université Grenoble Alpes 30 october 2014 Outline 2 Why annotate MWEs in corpora? A first experiment

More information

cambridge occasional papers in linguistics Volume 8, Article 3: 41 55, 2015 ISSN

cambridge occasional papers in linguistics Volume 8, Article 3: 41 55, 2015 ISSN C O P i L cambridge occasional papers in linguistics Volume 8, Article 3: 41 55, 2015 ISSN 2050-5949 THE DYNAMICS OF STRUCTURE BUILDING IN RANGI: AT THE SYNTAX-SEMANTICS INTERFACE H a n n a h G i b s o

More information

The College Board Redesigned SAT Grade 12

The College Board Redesigned SAT Grade 12 A Correlation of, 2017 To the Redesigned SAT Introduction This document demonstrates how myperspectives English Language Arts meets the Reading, Writing and Language and Essay Domains of Redesigned SAT.

More information

Guidelines for Writing an Internship Report

Guidelines for Writing an Internship Report Guidelines for Writing an Internship Report Master of Commerce (MCOM) Program Bahauddin Zakariya University, Multan Table of Contents Table of Contents... 2 1. Introduction.... 3 2. The Required Components

More information

ELA/ELD Standards Correlation Matrix for ELD Materials Grade 1 Reading

ELA/ELD Standards Correlation Matrix for ELD Materials Grade 1 Reading ELA/ELD Correlation Matrix for ELD Materials Grade 1 Reading The English Language Arts (ELA) required for the one hour of English-Language Development (ELD) Materials are listed in Appendix 9-A, Matrix

More information

Some problems of translation from English into Arabic

Some problems of translation from English into Arabic Kingdom of Saudi Arabia Ministry of Higher Education Qassim Private Colleges Department of Applied Linguistics Some problems of translation from English into Arabic A thesis presented to the Department

More information

MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY

MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY Chen, Hsin-Hsi Department of Computer Science and Information Engineering National Taiwan University Taipei, Taiwan E-mail: hh_chen@csie.ntu.edu.tw Abstract

More information

Linguistic Variation across Sports Category of Press Reportage from British Newspapers: a Diachronic Multidimensional Analysis

Linguistic Variation across Sports Category of Press Reportage from British Newspapers: a Diachronic Multidimensional Analysis International Journal of Arts Humanities and Social Sciences (IJAHSS) Volume 1 Issue 1 ǁ August 216. www.ijahss.com Linguistic Variation across Sports Category of Press Reportage from British Newspapers:

More information

SIX DISCOURSE MARKERS IN TUNISIAN ARABIC: A SYNTACTIC AND PRAGMATIC ANALYSIS. Chris Adams Bachelor of Arts, Asbury College, May 2006

SIX DISCOURSE MARKERS IN TUNISIAN ARABIC: A SYNTACTIC AND PRAGMATIC ANALYSIS. Chris Adams Bachelor of Arts, Asbury College, May 2006 SIX DISCOURSE MARKERS IN TUNISIAN ARABIC: A SYNTACTIC AND PRAGMATIC ANALYSIS by Chris Adams Bachelor of Arts, Asbury College, May 2006 A Thesis Submitted to the Graduate Faculty of the University of North

More information

Applications of memory-based natural language processing

Applications of memory-based natural language processing Applications of memory-based natural language processing Antal van den Bosch and Roser Morante ILK Research Group Tilburg University Prague, June 24, 2007 Current ILK members Principal investigator: Antal

More information

Age Effects on Syntactic Control in. Second Language Learning

Age Effects on Syntactic Control in. Second Language Learning Age Effects on Syntactic Control in Second Language Learning Miriam Tullgren Loyola University Chicago Abstract 1 This paper explores the effects of age on second language acquisition in adolescents, ages

More information

n 41, Juin Tome A - pp 89-98

n 41, Juin Tome A - pp 89-98 n 41, Juin 2014 -Tome A - pp 89-98 Investigating the Reading Difficulties of Magister Students of Physics vis-à-vis Their General English knowledge, University of Constantine Abstract This paper reports

More information

The Smart/Empire TIPSTER IR System

The Smart/Empire TIPSTER IR System The Smart/Empire TIPSTER IR System Chris Buckley, Janet Walz Sabir Research, Gaithersburg, MD chrisb,walz@sabir.com Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff Department of

More information

Content Language Objectives (CLOs) August 2012, H. Butts & G. De Anda

Content Language Objectives (CLOs) August 2012, H. Butts & G. De Anda Content Language Objectives (CLOs) Outcomes Identify the evolution of the CLO Identify the components of the CLO Understand how the CLO helps provide all students the opportunity to access the rigor of

More information

Impact of Controlled Language on Translation Quality and Post-editing in a Statistical Machine Translation Environment

Impact of Controlled Language on Translation Quality and Post-editing in a Statistical Machine Translation Environment Impact of Controlled Language on Translation Quality and Post-editing in a Statistical Machine Translation Environment Takako Aikawa, Lee Schwartz, Ronit King Mo Corston-Oliver Carmen Lozano Microsoft

More information

Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing. Grzegorz Chrupa la

Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing. Grzegorz Chrupa la Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing Grzegorz Chrupa la A dissertation submitted in fulfilment of the requirements for the award of Doctor of Philosophy (Ph.D.)

More information

Prediction of Maximal Projection for Semantic Role Labeling

Prediction of Maximal Projection for Semantic Role Labeling Prediction of Maximal Projection for Semantic Role Labeling Weiwei Sun, Zhifang Sui Institute of Computational Linguistics Peking University Beijing, 100871, China {ws, szf}@pku.edu.cn Haifeng Wang Toshiba

More information

LING 329 : MORPHOLOGY

LING 329 : MORPHOLOGY LING 329 : MORPHOLOGY TTh 10:30 11:50 AM, Physics 121 Course Syllabus Spring 2013 Matt Pearson Office: Vollum 313 Email: pearsonm@reed.edu Phone: 7618 (off campus: 503-517-7618) Office hrs: Mon 1:30 2:30,

More information

BULATS A2 WORDLIST 2

BULATS A2 WORDLIST 2 BULATS A2 WORDLIST 2 INTRODUCTION TO THE BULATS A2 WORDLIST 2 The BULATS A2 WORDLIST 21 is a list of approximately 750 words to help candidates aiming at an A2 pass in the Cambridge BULATS exam. It is

More information

Radius STEM Readiness TM

Radius STEM Readiness TM Curriculum Guide Radius STEM Readiness TM While today s teens are surrounded by technology, we face a stark and imminent shortage of graduates pursuing careers in Science, Technology, Engineering, and

More information

Thank you for encouraging your child to learn English and to take this YLE (Young Learners English) Flyers test.

Thank you for encouraging your child to learn English and to take this YLE (Young Learners English) Flyers test. TO: PARENT/GUARDIAN FROM: Class Teacher SUBJECT: University of Cambridge YLE (Young Learners: FLYERS) Examination (Grade 4) DATE: 18 th OF MARCH, 2014 Dear Parent/Guardian, Thank you for encouraging your

More information

Physics 270: Experimental Physics

Physics 270: Experimental Physics 2017 edition Lab Manual Physics 270 3 Physics 270: Experimental Physics Lecture: Lab: Instructor: Office: Email: Tuesdays, 2 3:50 PM Thursdays, 2 4:50 PM Dr. Uttam Manna 313C Moulton Hall umanna@ilstu.edu

More information

NATURAL LANGUAGE PARSING AND REPRESENTATION IN XML EUGENIO JAROSIEWICZ

NATURAL LANGUAGE PARSING AND REPRESENTATION IN XML EUGENIO JAROSIEWICZ NATURAL LANGUAGE PARSING AND REPRESENTATION IN XML By EUGENIO JAROSIEWICZ A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE

More information

Constraining X-Bar: Theta Theory

Constraining X-Bar: Theta Theory Constraining X-Bar: Theta Theory Carnie, 2013, chapter 8 Kofi K. Saah 1 Learning objectives Distinguish between thematic relation and theta role. Identify the thematic relations agent, theme, goal, source,

More information

CELTA. Syllabus and Assessment Guidelines. Third Edition. University of Cambridge ESOL Examinations 1 Hills Road Cambridge CB1 2EU United Kingdom

CELTA. Syllabus and Assessment Guidelines. Third Edition. University of Cambridge ESOL Examinations 1 Hills Road Cambridge CB1 2EU United Kingdom CELTA Syllabus and Assessment Guidelines Third Edition CELTA (Certificate in Teaching English to Speakers of Other Languages) is accredited by Ofqual (the regulator of qualifications, examinations and

More information

Literature and the Language Arts Experiencing Literature

Literature and the Language Arts Experiencing Literature Correlation of Literature and the Language Arts Experiencing Literature Grade 9 2 nd edition to the Nebraska Reading/Writing Standards EMC/Paradigm Publishing 875 Montreal Way St. Paul, Minnesota 55102

More information

Universiteit Leiden ICT in Business

Universiteit Leiden ICT in Business Universiteit Leiden ICT in Business Ranking of Multi-Word Terms Name: Ricardo R.M. Blikman Student-no: s1184164 Internal report number: 2012-11 Date: 07/03/2013 1st supervisor: Prof. Dr. J.N. Kok 2nd supervisor:

More information