An Arabic Semantic Parser and Meaning Analyzer
AbdulMalik Al-Salman, Yousef Al-Ohali, and Maha AlRabiah
Computer Science Department, KSU, Saudi Arabia
{Salman,

Abstract

The Arabic language is very rich in derivations, vocabulary, and grammatical structures. Determining the correct meaning of a word in a non-vowelized Arabic sentence is not a trivial task, since Arabic is very rich in polysemy. This paper attempts to resolve word sense ambiguity by building a semantic parser powered by a statistical semantic analyzer, which may aid in the improvement of machine translation, question answering and other Arabic NLP systems. Building the parser was done in three steps. The first step was to acquire the grammatical rules for Arabic that were covered in an Arabic grammar textbook, and to develop constraints that helped resolve part of the parsing ambiguity. The grammar and the constraints were then written in an XML format to make them readable and available for future use. The second step was to build the semantic parser that assigns a grammatical structure to the input sentence. The final step was to apply a statistical semantic technique to the resulting grammatical structures to determine the most accurate structure, the one that resolves the word sense ambiguity and determines the most accurate meaning of the word.

1. Introduction

Researchers work on many applications of natural language understanding [1], including information retrieval, knowledge acquisition, translation, summarization, and categorization. In natural language understanding, morphology serves as the basic layer on which the higher syntactic and semantic layers are built. It is the linguistic system that governs how the words of a language are built. There are two kinds of morphology: inflectional and derivational.
Inflectional morphology adds a suffix and/or a prefix to the root of a word, generating another word that retains the same grammatical category and the same basic meaning. Derivational morphology involves the derivation of new words from a finite set of roots using a set of predefined morphological `patterns' or `forms'. The new words may have completely different grammatical categories. The morphological pattern also represents the category of the word, and in some cases indicates its syntactic and semantic roles. The automation of morphological analysis has come a long way in English and also in Arabic.

In Arabic, a word may be a verb, a noun or a particle. Arabic is mainly a derivational language, where most words are derived from a finite set of roots using a finite set of morphological `forms' or `patterns'. According to [7], the total number of roots in Arabic varies slightly from dictionary to dictionary, but it remains around 10,000 on average. Around 70% of these are tri-literal while the rest are 4-letter roots, and the total number of morphological forms is around 900.

Several approaches and algorithms exist for morphological analysis and generation, such as "the typical conventional algorithm", "the sliding window approximate matching (SWAM)", "the finite state transducers (FSTs) approach" and "the two-level finite state machine approach" [1][7]. Many of these approaches are applicable to Arabic, and many Arabic morphological analyzers have been developed. For example, Al-Affendi developed "The Arabic Morphological Analyzer (AMA)" based on the SWAM algorithm. Kenneth R. Beesley built a system for morphological analysis and generation of Arabic using the finite-state approach. His work was part of the Xerox Research Center [3][4]. Buckwalter produced an Arabic morphological
analyzer system that converts the Arabic letters into Latin letters using Buckwalter's transliteration scheme [5]. Darwish developed a light Arabic stemmer based on removing common prefixes and suffixes from the word to reach the stem [6]. Al-Shalabi and Evens built a computational morphology system for Arabic that works only for non-vowelized words [2].

To examine how the grammatical structure of a sentence can be computed, two things must be considered: the grammar, which specifies the acceptable structures that can produce a correct sentence, and the parsing technique, which is the method of analyzing a sentence to determine its structure according to the grammar. For a computer to deal with a natural language, the structure of the language should be described in symbols and notations that the computer can process, specifying all the legal structures in that language. Several grammar formalisms exist, including Context-Free Grammars (CFGs), Augmented CFGs, Transition Network Grammars, Augmented Transition Networks, Definite Clause Grammars (DCG) [1] and Affix Grammars over a Finite Lattice (AGFL) [15]. Most of these formalisms have been successfully applied to Arabic, as in [8][9][10] and [11].

A parsing algorithm can be described as a procedure that searches through various ways of combining grammatical rules to find a combination that generates a tree representing a possible structure of the input sentence. Parsing techniques include top-down parsing, bottom-up parsing, bottom-up chart parsing, top-down chart parsing, top-down parsing with Recursive Transition Networks and recursive descent parsing [1].

Parsing Arabic sentences is a difficult task. The difficulty comes from several sources. One is that sentences are long and complex. The average length of a sentence is 20 to 30 words, and it may exceed 100 words [11]. Another difficulty comes from the sentence structure.
The Arabic sentence is complex and syntactically ambiguous due to the frequent use of grammatical relations, word order, phrases and conjunctions. The omission of diacritics (vowels) in written Arabic and the presence of elliptic personal pronouns (الضمائر المستترة) make things more difficult. Due to these difficulties, little work has been done on developing parsers for Arabic. Farouk [11] adapted a simple top-down parsing algorithm implemented in Prolog to parse Arabic sentences using DCG. Othman et al. developed a semantic bottom-up chart parser for Arabic using a Unification Based Grammar (UBG), implemented in Prolog [17][18]. Other successful parsers include AraParse [19], which uses the AGFL formalism.

Semantic analysis is the study of meaning communicated through language [20]. Modern approaches to semantic analysis are often grouped into two classes. The first class consists of logic-based or symbolic systems, which have the goal of producing a deep and rich semantic and pragmatic interpretation of a text. They generally use representations based on predicate logic, and include the complex knowledge structures and inference rules necessary to interpret connected texts. The second class adopts machine-learning techniques, such as statistical techniques, allowing systems to be trained directly from examples of input/output pairs. These systems forgo deep representations in favor of directly modeling the task to be performed. They tend to reformulate the task of understanding as a pattern-recognition problem.

In the logic-based approach, the meaning of a sentence is represented in a formal representation language using logical frames, which can be derived using first-order predicate calculus (FOPC) or lambda calculus. Semantic grammars can also be used as a method of semantic interpretation for a specific domain.
They can also be augmented to produce a logical form in the normal way, or the parse tree of the semantic grammar itself can be used as a logical form [1] [14].

The field of natural language processing has undergone a fundamental shift toward machine-learning methods, especially statistical methods. The availability of large annotated corpora has made possible a methodology based on training systems on labeled data and quantitatively evaluating their performance on held-out test data. Successful examples of systems built on statistical techniques include the Hidden Understanding Model (HUM), described in [21] and [22], which is based completely on trained statistical models derived from annotated corpora. Another example is Chill
[16] [23], which is a learning semantic parser that maps natural-language database queries (sentences) into executable Prolog queries (detailed logical forms).

Arabic semantic analysis, on the other hand, has received little attention and research. Until now, there exists no formal theory of semantics that is able to provide a complete and consistent account of all the phenomena of Arabic. Haddad et al. [12] [13] attempted to model the Arabic sentence using an FOPC representation and the lambda calculus. Although we did not come across any publications or applications of semantic grammars for Arabic, we think that they can be applied to it successfully. Machine-learning approaches to semantic representation can also be applied to Arabic in the same way they are applied to English, yet they require Arabic corpora in specific domains to train systems. To our knowledge, such Arabic corpora are unfortunately not widely available, which makes progress with these methods for Arabic very slow.

2. System Analysis & Design

Building the Arabic Semantic Parser and Analyzer system was done in four stages: building the morphological analyzer, creating an XML document of Arabic grammar, building the semantic parser and, finally, building the semantic analyzer. To understand the overall system architecture, see the data flow diagrams and the pseudo code (Figures 1, 2 and 3). In the following sub-sections we briefly describe the stages of building the system.

[Figure 1: Context Diagram. The user supplies an Arabic sentence to process 0, the Arabic Semantic Parser & Meaning Analyzer, which returns the sentence grammatical structure, the words' meanings and the sentence meaning probability.]
[Figure 2: Level-0 Data Flow Diagram. Labels recoverable from the diagram: external entity USER; processes numbered 1.0 to 4.0, with labels including Identify Word, Build Sentence Tree, Semantic Parser and Semantic Analyzer; data stores Morphological Analyser, Arabic Semantic Lexicon and Grammar XML; data flows Arabic Sentence, Word, Morphological Features, Semantic Features, Sentence Tree, DOM Tree, Parse Tree, Disambiguating Words, Words' Meanings, Sentence Grammatical Structures and Sentence Meaning Probability.]

Bottom-up chart parser:
1- Read the sentence.
2- Build the sentence tree:
   a. Create the root node, and let it be the parent node.
   b. While it is not the end of the sentence:
      i. Identify a word.
      ii. Search for the word in the Morphological Analyzer database, and for each matching word create a new object of the class Verb, Noun or Particle, depending on the word's type, storing the morphological and semantic features of that word in the object created.
      iii. Append the object(s) created as children of the parent node.
      iv. Set each of the child nodes as a parent, and continue with step (i).
3- Build the parse tree:
   a. Read the Grammar XML document, and convert it into a DOM (Document Object Model) tree.
   b. For each path in the sentence tree:
      i. Store the path in a sequence.
      ii. Determine the sentence type (verbal or nominal).
      iii. Parse the branch VPhrase or NPhrase in the DOM tree, depending on the sentence type.
      iv. For each rule in the branch, quit parsing if a node does not match, and move to the next rule.
   c. If the sequence matches a given rule, store it in the parse tree.
4- Get the words' meanings and calculate the probability:
   a. For each path in the parse tree:
      i. Retrieve the word from the semantic lexicon, using the stem.
      ii. Compare the supporting words with the rest of the words in the path. For each match, increment a score.
      iii. Sum the scores for each path (sentence).
   b. Calculate the probability for each path.
Figure 3: The System's Pseudo Code
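Step 2 of the pseudo code can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the system's actual code: the in-memory LEXICON map stands in for the Morphological Analyzer database, the class name SentenceTreeSketch and the analysis labels are hypothetical, and the words are written in a rough Latin transliteration (dhhb, hmd, ila, almsjd) of the example sentence used later in the paper. Instead of storing tree nodes, the sketch enumerates the root-to-leaf paths directly, which is what the parser consumes in step 3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SentenceTreeSketch {
    // Hypothetical stand-in for the Morphological Analyzer database:
    // each non-vowelized word maps to its candidate analyses.
    static final Map<String, List<String>> LEXICON = new HashMap<>();
    static {
        LEXICON.put("dhhb", Arrays.asList("active-verb", "passive-verb", "noun"));
        LEXICON.put("hmd", Arrays.asList("active-verb", "passive-verb", "noun"));
        LEXICON.put("ila", Collections.singletonList("particle"));
        LEXICON.put("almsjd", Collections.singletonList("noun"));
    }

    // Returns every root-to-leaf path of the sentence tree. Each word
    // multiplies the number of paths by its number of candidate analyses.
    static List<List<String>> paths(String sentence) {
        List<List<String>> paths = new ArrayList<>();
        paths.add(new ArrayList<String>()); // the root: a single empty path
        for (String word : sentence.trim().split("\\s+")) {
            List<String> analyses =
                LEXICON.getOrDefault(word, Collections.<String>emptyList());
            List<List<String>> extended = new ArrayList<>();
            for (List<String> prefix : paths) {
                for (String analysis : analyses) {
                    List<String> next = new ArrayList<>(prefix);
                    next.add(word + "/" + analysis);
                    extended.add(next);
                }
            }
            paths = extended; // a word with no analysis leaves no paths
        }
        return paths;
    }
}
```

With the assumed lexicon, SentenceTreeSketch.paths("dhhb hmd ila almsjd") yields nine paths of four words each, because the first two words have three candidate analyses each and the last two have one.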
2.1 Building the Arabic morphological analyzer

Parsers usually depend on the results generated by morphological analyzers, and this project is no exception. The morphological analyzer we used adopts an approach similar to the one proposed by Othman et al. [18], with some modifications. It depends on a semantic lexicon, the Semantic Lexicon Database, which stores the lexical and syntactic features of each word. Since developing a morphological analyzer is not in the scope of this project, we simulated the morphological analyzer by storing the morphological features of the words in the Semantic Lexicon Database (Table 1).

Table 1: Semantic Lexicon Database

Verbs
   Morphological features: stem, root, connected subject (subject), connected object (object), prefix and prefix category (prefix_category).
   Syntactic features: tense, voice, transitivity, subject_gender (sbj_gender), subject_number (sbj_number), object_gender (obj_gender), object_number (obj_number), the irab case (irab_case) and the verb's category.
   Semantic features: subject rationality (sbjrat) and object rationality (objrat), supporting stems, meaning.

Nouns
   Morphological features: stem, root, attached suffix (suffix), attached suffix gender (suffix_gender), attached suffix number (suffix_number), attached prefix (prefix) and attached prefix category (prefix_category).
   Syntactic features: gender, number, irab case (irab_case), noun's adjectivability, category and whether it is a definite noun (definite) or a perfect noun (perfect_noun).
   Semantic features: noun's rationality (rational), supporting stems, meaning.

Particles
   Morphological features: attached suffix (suffix), attached suffix gender (suffix_gender) and attached suffix number (suffix_number).
   Syntactic features: gender, number and category.

2.2 Creating an XML document of Arabic grammar

As mentioned earlier, a grammar is the group of symbols and notations that describe the legal structures of sentences in a language, and it is essential for every parser.
From this perspective, we made an effort to review all (or most) of the Arabic grammar rules in Arabic grammar textbooks, and to work out which of these rules can be represented in a form that a computer can process. This proved difficult, since Arabic is very rich in grammatical structures. In the end we chose to represent the grammar in Backus Naur Form (BNF), which was part of a previous work on building an Arabic parser [24], because it can easily be converted into a grammar that the computer can process. For that we selected a comprehensive Arabic grammar textbook [28], and tried to represent most of the rules in that book in BNF (see Appendix A). These rules were revised by a linguistic specialist, and an HTML version is also available.

The next step was to construct the grammar to be used by the parser. We chose to represent the grammar in an XML document (Grammar.xml) to make it extensible and available for other research and future work. The Grammar document is composed of two parts: a VPhrase (الجملة_الفعلية) containing verb phrases, and an NPhrase (الجملة_الاسمية) containing noun phrases. This separation between noun and verb phrases reduces the time needed to search the document for the structure of a given sentence. Both VPhrase and NPhrase contain a set of rules. Each rule (قاعدة) contains a set of words (كلمة) that represent the words in a
sentence and a set of constraints (ضابط), if available, on those words (see Figure 4 for the Document Type Definition (DTD), and Figure 5 for a snapshot of the document's DOM tree). The constraints impose restrictions on the semantic and syntactic features of the words, helping to reduce ambiguity.

The XML document fragment in Figure 6 specifies the grammar rule of a verb phrase consisting of a verb (فعل) and a subject (فاعل). The rule (قاعدة) has two words (كلمة). The first word has the type (نوع) verb "فعل". It is in the active voice (مبني للمعلوم), has no connected subject (ضمير متصل في محل رفع فاعل), and its grammar position should be "فعل". The second word has the type noun "اسم" and the grammar position "فاعل". The rule also has four constraints (ضابط). The first constraint specifies that the verb's subject gender (جنس_الفاعل) should be equal to the subject's gender (الجنس). The second constraint specifies that the verb's subject rationality (الفاعل_عاقل) should be equal to the subject's rationality (عاقل). The third constraint specifies that the subject's irab case (الحالة_الإعرابية) should be either nominative (مرفوع) or not specified (-). Finally, the last constraint specifies that if the subject is a sound masculine plural (جمع مذكر سالم), it should not be without a noon (محذوف النون). Appendix B lists the Arabic Grammar rules XML document.
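The four constraints of the verb-subject rule can be sketched as a simple feature check. This is only an illustration under stated assumptions: words are represented as plain maps keyed by the feature names of Table 1 (sbj_gender, sbjrat, gender, rational, irab_case, category), while the class name VerbSubjectConstraintSketch, the helper features() and the feature values such as "masculine" or "noon_dropped" are hypothetical and do not come from the system itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class VerbSubjectConstraintSketch {
    // Hypothetical helper: builds a feature map from key/value pairs.
    static Map<String, String> features(String... keyValues) {
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            map.put(keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    // Returns true when the verb/subject pair passes all four constraints.
    static boolean satisfies(Map<String, String> verb, Map<String, String> subject) {
        // 1. The verb's subject gender equals the subject's gender.
        boolean genderAgrees =
            Objects.equals(verb.get("sbj_gender"), subject.get("gender"));
        // 2. The verb's subject rationality equals the subject's rationality.
        boolean rationalityAgrees =
            Objects.equals(verb.get("sbjrat"), subject.get("rational"));
        // 3. The subject is nominative or its case is not specified ("-").
        String irabCase = subject.get("irab_case");
        boolean caseAllowed = "nominative".equals(irabCase) || "-".equals(irabCase);
        // 4. The subject's category is not "noon dropped" (assumed value name).
        boolean noonKept = !"noon_dropped".equals(subject.get("category"));
        return genderAgrees && rationalityAgrees && caseAllowed && noonKept;
    }
}
```

A masculine, rational proper noun with an unspecified case passes the check against a verb expecting a masculine, rational subject; changing the noun's gender to feminine makes the first constraint fail, so that reading of the sentence is rejected.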
<!ELEMENT قواعد_النحو (الجملة_الفعلية, الجملة_الاسمية)>
<!ELEMENT الجملة_الفعلية (قاعدة+)>
<!ELEMENT الجملة_الاسمية (قاعدة+)>
<!ELEMENT قاعدة (كلمة+, ضابط*)>
<!ELEMENT كلمة (النوع, الزمن?, البناء?, التعدي?, معرفة?, علم?, التصنيف?, الحالة_الإعرابية?, صفة?, السابقة?, نوع_السابقة?, الفاعل?, المفعول_به?, جنس_الفاعل?, عدد_الفاعل?, جنس_المفعول_به?, عدد_المفعول_به?, الفاعل_عاقل?, المفعول_به_عاقل?, الأداة?, اللاحقة?, عدد_اللاحقة?, جنس_اللاحقة?, الجنس?, العدد?, الموقع)>
<!ELEMENT ضابط (الموقع_الإعرابي+, القيمة*, العلاقة)>
<!ELEMENT الموقع_الإعرابي (#PCDATA)>
<!ATTLIST الموقع_الإعرابي الخاصية (الجذر | الفاعل | المفعول_به | السابقة | نوع_السابقة | الزمن | البناء | التعدي | جنس_الفاعل | عدد_الفاعل | جنس_المفعول_به | عدد_المفعول_به | الحالة_الإعرابية | التصنيف | الفاعل_عاقل | المفعول_به_عاقل | اللاحقة | عدد_اللاحقة | جنس_اللاحقة | الجنس | العدد | صفة | معرفة | علم | عاقل | الأداة | الفعل_مجرد | الاسم_مجرد) "الفاعل">
<!ELEMENT النوع (#PCDATA)>
<!ELEMENT صفة (#PCDATA)>
<!ELEMENT السابقة (#PCDATA)>
<!ELEMENT نوع_السابقة (#PCDATA)>
<!ELEMENT الزمن (#PCDATA)>
<!ELEMENT البناء (#PCDATA)>
<!ELEMENT التعدي (#PCDATA)>
<!ELEMENT معرفة (#PCDATA)>
<!ELEMENT علم (#PCDATA)>
<!ELEMENT الجنس (#PCDATA)>
<!ELEMENT العدد (#PCDATA)>
<!ELEMENT الفاعل (#PCDATA)>
<!ELEMENT المفعول_به (#PCDATA)>
<!ELEMENT جنس_الفاعل (#PCDATA)>
<!ELEMENT عدد_الفاعل (#PCDATA)>
<!ELEMENT الفاعل_عاقل (#PCDATA)>
<!ELEMENT جنس_المفعول_به (#PCDATA)>
<!ELEMENT عدد_المفعول_به (#PCDATA)>
<!ELEMENT المفعول_به_عاقل (#PCDATA)>
<!ELEMENT الأداة (#PCDATA)>
<!ELEMENT اللاحقة (#PCDATA)>
<!ELEMENT جنس_اللاحقة (#PCDATA)>
<!ELEMENT عدد_اللاحقة (#PCDATA)>
<!ELEMENT نوع_الاسم (#PCDATA)>
<!ELEMENT نوع_الفعل (#PCDATA)>
<!ELEMENT نوع_الأداة (#PCDATA)>
<!ELEMENT الحالة_الإعرابية (#PCDATA)>
<!ELEMENT الموقع (#PCDATA)>
<!ELEMENT العلاقة (#PCDATA)>
<!ELEMENT القيمة (#PCDATA)>
Figure 4: XML Document DTD
Figure 5: XML Document DOM Tree

<!-- فعل - فاعل -->
<كلمة>
<النوع>فعل</النوع>
<البناء>مبني للمعلوم</البناء>
<الفاعل>-</الفاعل>
<الموقع>فعل</الموقع>
</كلمة>
<كلمة>
<النوع>اسم</النوع>
<الموقع>فاعل</الموقع>
</كلمة>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<العلاقة>يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<العلاقة>يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي>
<القيمة>مرفوع</القيمة>
<العلاقة>يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي>
<القيمة>محذوف النون</القيمة>
<العلاقة>لا يساوي</العلاقة>
Figure 6: XML Grammar Rule Example
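Converting a grammar document into a DOM tree can be done with the standard Java XML API (javax.xml.parsers and org.w3c.dom). The sketch below is an assumption-laden illustration rather than the system's CreateDom class: it parses a small inline document instead of Grammar.xml, uses English stand-in tag names (Grammar, VPhrase, NPhrase, rule, word, type, position) in place of the Arabic element names of the DTD, and the class name GrammarDomSketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class GrammarDomSketch {
    // Hypothetical miniature grammar: one verbal rule and one nominal rule.
    static final String SAMPLE =
        "<Grammar>"
        + "<VPhrase><rule>"
        + "<word><type>verb</type><position>verb</position></word>"
        + "<word><type>noun</type><position>subject</position></word>"
        + "</rule></VPhrase>"
        + "<NPhrase><rule>"
        + "<word><type>noun</type><position>mubtada</position></word>"
        + "</rule></NPhrase>"
        + "</Grammar>";

    // Parses an XML string into a DOM Document.
    static Document load(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse grammar XML", e);
        }
    }

    // Counts the rules under the first element with the given tag name,
    // e.g. "VPhrase" for verbal sentences or "NPhrase" for nominal ones.
    static int countRules(Document grammar, String phraseTag) {
        NodeList phrases = grammar.getElementsByTagName(phraseTag);
        if (phrases.getLength() == 0) {
            return 0;
        }
        Element phrase = (Element) phrases.item(0);
        return phrase.getElementsByTagName("rule").getLength();
    }
}
```

Restricting the search to one branch, as countRules does, mirrors the reason given above for separating verb phrases from noun phrases: only the rules of the relevant sentence type need to be visited.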
2.3 Building the semantic parser

The semantic parser implements a bottom-up chart parser, which takes as input a "sentence tree" containing the sentence with all possible morphological combinations of its words. For example, the input sentence "ذهب حمد إلى المسجد" generates the sentence tree in Figure 7. This tree is passed to the semantic parser, which in turn parses every possible path in the tree against the Grammar XML document. This results in rejecting most of the paths and greatly reduces the degree of parsing ambiguity (Figure 8 shows the resulting parse tree). The rules in the XML document were written without the use of recursion. This was very helpful in speeding up the parsing process, since the parser does not have to descend recursively into the DOM tree while searching for a valid rule.

[Figure 7: The Semantic Tree of "ذهب حمد إلى المسجد". The root has three candidate nodes for ذهب; each of them has three candidate nodes for حمد, giving nine paths, and each path continues with إلى and then المسجد. The legend distinguishes four node types: noun (اسم), active verb (فعل مبني للمعلوم), passive verb (فعل مبني للمجهول) and particle (حرف).]
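Because the rules are non-recursive, checking a path against a rule reduces to comparing two flat sequences and giving up on a rule at the first mismatch. The sketch below illustrates only that idea; it is not a full chart parser and not the system's code. The two rules, their names, the word-type labels and the class name RuleMatchSketch are hypothetical, and the feature constraints of the real grammar are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RuleMatchSketch {
    // Hypothetical flat rules: each rule is a fixed sequence of word types.
    static final Map<String, List<String>> RULES = new LinkedHashMap<>();
    static {
        RULES.put("verb + subject + preposition + noun",
            Arrays.asList("active-verb", "noun", "particle", "noun"));
        RULES.put("mubtada + mudaf ilayh + preposition + noun",
            Arrays.asList("noun", "noun", "particle", "noun"));
    }

    // A path matches a rule when both have the same length and the same
    // word type at every position; the loop stops at the first mismatch.
    static boolean matches(List<String> rule, List<String> path) {
        if (rule.size() != path.size()) {
            return false;
        }
        for (int i = 0; i < rule.size(); i++) {
            if (!rule.get(i).equals(path.get(i))) {
                return false;
            }
        }
        return true;
    }

    // Returns the name of the first rule that accepts the path, or null.
    static String firstMatchingRule(List<String> path) {
        for (Map.Entry<String, List<String>> rule : RULES.entrySet()) {
            if (matches(rule.getValue(), path)) {
                return rule.getKey();
            }
        }
        return null;
    }

    // Keeps only the paths accepted by at least one rule.
    static List<List<String>> accepted(List<List<String>> paths) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> path : paths) {
            if (firstMatchingRule(path) != null) {
                kept.add(path);
            }
        }
        return kept;
    }
}
```

With these two assumed rules, a path that reads the first word as an active verb and a path that reads it as a noun are both kept, while a path that reads it as a passive verb is rejected.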
[Figure 8: The Parse Tree of "ذهب حمد إلى المسجد". Two paths remain under the root. In the first, ذهب is an active verb (فعل مبني للمعلوم), حمد is its subject (فاعل), إلى is a preposition (حرف جر) and المسجد is a genitive noun (اسم مجرور). In the second, ذهب is a noun acting as a mubtada that is the first term of a genitive construct (مبتدأ مضاف), حمد is a noun acting as the second term of the construct (مضاف إليه), and إلى (حرف جر) with المسجد (اسم مجرور) forms a prepositional phrase acting as the predicate (خبر - شبه الجملة).]

2.4 Building the semantic analyzer

Arabic is very rich in polysemy, i.e. words with the same pronunciation and spelling but totally different meanings, so determining the correct meaning of a word in a non-vowelized Arabic sentence is a difficult task [25]. It is related to the overall understanding of the sentence's meaning and the meanings of its words. This phase represents the final and most important stage of this research work: building a semantic analyzer that can identify the correct or most accurate meaning of a word in the sentence.

The idea adopted here is purely statistical. For each word (stem) in the semantic lexicon, we store its meaning (extracted from [26][27][29]) along with a group of words that appear frequently with it. The semantic analyzer takes the parse tree as input, and determines how many of the words supporting a particular word's meaning are present in the sentence. Then it computes the probability of the correctness of the meaning for the whole sentence. The sentence(s) with the highest probability have the most accurate meaning. In our previous example, the parse tree is input to the semantic analyzer, which in turn computes the probability of the correctness of both sentences and assigns meanings to words (see Figure 9).
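The scoring idea can be sketched as follows. The text does not give the exact probability formula, so the sketch makes a loudly labeled assumption: a path's probability is its score divided by the sum of the scores of all paths, and every probability is 0 when no supporting word is found anywhere. The class name MeaningScoreSketch, the Sense type, the transliterated stems and the supporting stems listed for each sense are all hypothetical illustrations, not entries from the actual lexicon.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MeaningScoreSketch {
    // One candidate meaning of a word, with the stems that support it.
    static final class Sense {
        final String stem;
        final String meaning;
        final Set<String> supportingStems;

        Sense(String stem, String meaning, String... supportingStems) {
            this.stem = stem;
            this.meaning = meaning;
            this.supportingStems = new HashSet<>(Arrays.asList(supportingStems));
        }
    }

    // Counts, over all words of a path, how many of the other words in the
    // path appear among the word's supporting stems.
    static int score(List<Sense> path) {
        int score = 0;
        for (Sense sense : path) {
            for (Sense other : path) {
                if (other != sense && sense.supportingStems.contains(other.stem)) {
                    score++;
                }
            }
        }
        return score;
    }

    // ASSUMPTION: probability = path score / sum of all path scores.
    static double[] probabilities(List<List<Sense>> paths) {
        int[] scores = new int[paths.size()];
        int total = 0;
        for (int i = 0; i < scores.length; i++) {
            scores[i] = score(paths.get(i));
            total += scores[i];
        }
        double[] result = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            result[i] = total == 0 ? 0.0 : (double) scores[i] / total;
        }
        return result;
    }

    // Hypothetical data: a verbal and a nominal reading of the same sentence.
    static List<List<Sense>> examplePaths() {
        List<Sense> verbReading = Arrays.asList(
            new Sense("dhhb", "to leave", "ila", "msjd"),
            new Sense("hmd", "proper noun"),
            new Sense("ila", "to"),
            new Sense("msjd", "mosque"));
        List<Sense> nounReading = Arrays.asList(
            new Sense("dhhb", "precious metal", "khatm", "fdda"),
            new Sense("hmd", "proper noun"),
            new Sense("ila", "to"),
            new Sense("msjd", "mosque"));
        return Arrays.asList(verbReading, nounReading);
    }
}
```

Under these assumptions the verbal reading scores 2, because two of its supporting stems occur in the sentence, and the nominal reading scores 0, so the normalized probabilities are 1.0 and 0.0. A sentence in which no supporting word occurs gets probability 0 on every path, which corresponds to a meaning that is not recognized.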
[Figure 9: The Output of the Semantic Analyzer. The parse tree of Figure 8 annotated with word meanings and with one probability per path; the two probabilities shown are 0 and 1. Meanings shown: غادر ("left") for ذهب as a verb, معدن نفيس ("precious metal") for ذهب as a noun, اسم علم ("proper noun") for حمد, and مكان تعبد المسلمين ("Muslims' place of worship") for المسجد.]

3. System Implementation

The Arabic Semantic Parser & Analyzer was implemented in Java using Borland's JBuilder9. Programming in Java makes the code portable and machine independent. Both databases were created using Microsoft SQL Server.

System Classes

The system has 7 classes (there are other classes used for the GUI, which we ignore here). We briefly describe each of these classes and its main functions, along with its UML diagram. The diagrams were generated using Borland's JBuilder 2006 Enterprise.

Main Class: As its name implies, this is the basic class that calls the other classes. It is mainly the class that glues everything together. Figure 10 shows a simple UML diagram of this class.

Figure 10: A UML Diagram of Class Main

Verb Class: This class represents a word of type verb, including all the required features of this word. It also includes a constructor that creates a new instance of this class, and initializes it with the values of the input SQL query result set. Figure 11 shows a simple UML diagram of this class.
Figure 11: A UML Diagram of Class Verb

Noun Class: This class represents a word of type noun, including all the required features of this word. It also includes a constructor that creates a new instance of this class, and initializes it with the values of the input SQL query result set. Figure 12 shows a simple UML diagram of this class.

Figure 12: A UML Diagram of Class Noun

Particle Class: This class represents a word of type particle, including all the required features of this word. It also includes a constructor that creates a new instance of this class, and initializes it with the values of the input SQL query result set. Figure 13 shows a simple UML diagram of this class.
Figure 13: A UML Diagram of Class Particle

ConnectToDB Class: This class establishes a connection to the Morphological Analyzer database through the functions Load_driver() and connection(). It also generates the "sentence tree" through the function ProcessSentence(), which reads an input sentence and then builds the equivalent tree of that sentence, creating appropriate objects from the Verb, Noun and Particle classes. Figure 14 shows a simple UML diagram of this class.

Figure 14: A UML Diagram of Class ConnectToDB

CreateDom Class: This class creates the DOM document of the XML file (Grammar.xml) through the function parsexmlfile(). It also uses the functions TraverseTree(), Traverse(), traverse_field() and traverse_const() to parse the DOM tree, looking for the sentence structure and creating the "parse tree". Figure 15 shows a simple UML diagram of this class.
Figure 15: A UML Diagram of Class CreateDom

GetMeaning Class: This class establishes a connection to the Arabic Semantic Lexicon database. Then it gets the meanings of the words in the "parse tree" and computes the score of each word through the function ComputeScore(). Then it computes the overall probability of sentence correctness through the function ComputeProbability(). Figure 16 shows a simple UML diagram of this class.

Figure 16: A UML Diagram of Class GetMeaning

4. Experimental Results

Testing the system was done in two stages: testing the parser, and then testing the semantic analyzer. To test the parser, for each rule we built five correct sentences and three incorrect ones from the words available in the lexicon. The results were satisfactory (see Tables 2 and 3). The parser succeeded in parsing most of the sentences correctly (Appendix C contains more detailed testing). Some of the sentences had more than one correct grammatical structure, although some of these structures may not be semantically correct. The sentence that was not correctly parsed is:

ضرب حمد محمد

where "حمد" is the object and "محمد" is the subject. Here the object precedes the subject, and both are rational. The problem is that the verb "ضرب" can have both a rational subject and
object. Therefore, there is no way to determine whether the first word is a subject or an object unless the words are vowelized, which is not the case in our example.

Table 2: Results of Testing the Semantic Parser

        Correct sentences                                Incorrect sentences
Rule    Number  Success  Failure  Success  Failure      Number  Success  Failure  Success  Failure
                rate     rate     %        %                    rate     rate     %        %
1       5       4/5      1/5      80%      20%          3       3/3      0/3      100%     0%
2       5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
3       5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
4       5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
5       5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
6       5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
7       5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
8       5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
9       5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
10      5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
11      5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
12      5       5/5      0/5      100%     0%           3       3/3      0/3      100%     0%
Total   60      59/60    1/60     98%      2%           36      36/36    0/36     100%     0%

To test the semantic analyzer, the sentences that passed the semantic parsing test were passed to the semantic analyzer. The meanings of fewer than half of these sentences were recognized, and most of these were recognized correctly (see Table 3). The meanings of the other sentences were not recognized because those sentences contained no words supporting the meanings.

Table 3: Results of Testing the Semantic Analyzer

Rule        Number of   Sentences with        Success  Failure  Sentences without
            sentences   recognized meanings   rate     rate     recognized meanings
1           4           4/4                   4/4      0/4      0/4
2           5           3/5                   3/3      0/3      2/5
3           5           1/5                   1/1      0/1      4/5
4           5           0/5                   -        -        5/5
5           5           3/5                   3/3      0/3      2/5
6           5           5/5                   4/5      1/5      0/5
7           5           0/5                   -        -        5/5
8           5           1/5                   1/1      0/1      4/5
9           5           0/5                   -        -        5/5
10          5           4/5                   4/4      0/4      1/5
11          5           1/5                   1/1      0/1      4/5
12          5           4/5                   4/4      0/4      1/5
Total       59          26/59                 25/26    1/26     33/59
Percentage              44%                   96%      4%       56%

5. Conclusion & Future Work
Syntactic and morphological analysis of Arabic has received great attention from researchers in the past years, and many successful morphological and parsing systems have been developed. However, semantic analysis has received little attention and research. Arabic is still in its infancy regarding semantic and discourse analysis, and a lot of effort and research is required to improve Arabic natural language understanding. The aim of this paper is to describe a system that is capable of resolving the ambiguity that arises in understanding the meaning of a sentence. The system succeeded in resolving the ambiguity resulting from polysemy. There are many avenues for enhancing this work, among them the following:
- Connecting the system to a real morphological analyzer that is capable of finding all the possible analyses of a given word.
- Completing the semantic lexicon by adding more entries.
- Completing the XML grammar rules. It currently contains 12 rules.
- Testing the system extensively.
- Building a system that collects the supporting words automatically from a large Arabic corpus.

English References
1. Allen James, Natural Language Understanding, Benjamin/Cummings Publishing Company, 1995, 2nd edition.
2. Al-Shalabi Riyad and Evens Martha, "A Computational Morphology System for Arabic". In proceedings of COLING-ACL 98, Montreal, Quebec, Canada.
3. Beesley Kenneth, "Arabic Morphology Using Only Finite-State Operations". In proceedings of the Workshop on Computational Approaches to Semitic languages, COLING-ACL 98, Montreal, Quebec, August, 1998, pp
4. Beesley Kenneth, "Finite-State Morphological Analysis And Generation of Arabic at Xerox Research: Status and plans in 2001". In proceedings of the Arabic Language Processing: Status and Prospect--39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France, 2001, pp
5. Buckwalter Tim, Buckwalter Arabic Morphological Analyzer Version 1.0.
Linguistic Data Consortium, University of Pennsylvania, LDC Catalog No.: LDC2002L
6. Darwish, K., "Building A Shallow Arabic Morphological Analyzer In One Day". In M. Rosner and S. Wintner (Eds.), Computational Approaches to Semitic Languages, an ACL'02 Workshop, Philadelphia, PA, 2002, pp
7. El-Affendi M. A., "An LVQ connectionist solution to the non-determinacy Problem in Arabic morphological analysis: a learning hybrid algorithm", Natural Language Engineering, vol. (8), Cambridge University Press, 2002, pp
8. Elnaggar Ayman, "A Phrase Structure Grammar of the Arabic Language". In Proceedings of the 13th COLING, Vol. 3, Helsinki, Finland, 1990, pp
9. El-Shishiny H., "A Formal Description of Arabic Syntax in Definite Clause Grammar". In Proceedings of the 13th COLING, Vol. 3, Helsinki, Finland, 1990, pp
10. Everhard Ditters, A Formal Grammar for the Description of Sentence Structure in Modern Standard Arabic. In proceedings of ACL/EACL01: Conference of the European Chapter, Workshop: Arabic Language Processing: Status and Prospects,
11. Farouk Ahmad, "Developing an Arabic Parser in a Multilingual Machine Translation System", M. Sc, Thesis, Computer and Information Science Department, Cairo University,
12. Haddad Bassam and Yaseen Mustafa, Towards Semantic Composition of ARABIC: A λ-DRT Based Approach, MT Summit IX Workshop: Machine Translation for Semitic Languages: Issues and Approaches, USA, September 23,
13. Haddad Bassam and Yaseen Mustafa, Towards Understanding Arabic: A Logical Approach for Semantic Representation, ACL/EACL01: Conference of the European Chapter, Workshop: Arabic Language Processing: Status and Prospects,
14. Jurafsky Daniel and Martin James, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice-Hall,
15. Koster C.H.A., "Affix Grammars for Natural Languages". In: H. Alblas and B. Melichar (Eds.), Attribute Grammars, applications and systems. SLNCS 545, Heidelberg, 1991, pp
16. Mooney Raymond, "Learning Semantic Parsers: An Important but Under-Studied Problem". In proceedings of the AAAI 2004 Spring Symposium on Language Learning: An Interdisciplinary Perspective, Stanford, CA, March 2004, pp
17. Othman E., Shaalan K., and Rafea A., "A Chart Parser for Analyzing Modern Standard Arabic Sentence". In proceedings of the MT Summit IX Workshop on Machine Translation for Semitic Languages: Issues and Approaches, New Orleans, Louisiana, U.S.A.,
18. Othman Eman, Shaalan Khaled, Rafea Ahmed, Towards Resolving Ambiguity In Understanding Arabic Sentence. In proceedings of the International Conference on Arabic Language Resources and Tools, NEMLAR, Egypt, 2004, pp
19. Ouersighni R., "A Major Offshoot of the DIINAR-MBC Project: AraParse, a Morphosyntactic Analyzer for Unvowelled Arabic Texts". In the proceedings of the Arabic NLP Workshop at ACL/EACL
20. Saeed John, Semantics, Blackwell Publishing, 2003, 2nd edition.
21. Schwartz R., Miller S., Stallard D. and Makhoul J., "Hidden understanding models for statistical sentence understanding". In proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Munich, 1997, pp
22. Schwartz R., Miller S., Stallard D., and Makhoul J., "Language Understanding Using Hidden Understanding Models". In proceedings of ICSLP, 1996, pp
23. Zelle, J. M., and Mooney, R. J., "Learning to parse database queries using inductive logic programming".
In Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), pp Arabic References ٢٤. " تغريد العفيصان سامية المهوس مها الربيعة و فاتن القحطاني. "ا داة لتحليل الجملة العربية". مشروع تخرج لدرجة البكالوريوس الرياض صفر ١٤٢١ ه. ٢٥ د.. ا حمد محمد المعتوق. "الا لفاظ المشتركة المعاني في اللغة العربية طبيعتها ا هميتها مصادرها" جامعة الملك فهد للبترول والمعادن الظهران. ٢٦. مجد الدين محمد بن يعقوب الفيروز ا بادي. "القاموس المحيط" دار الفكر بيروت ١٤١٥ ه. ٢٧. ا حمد بن محمد بن علي الفيومي المقري. "المصباح المنير" المكتبة العصرية بيروت الطبعة الثانية ١٤٢٠ ه. ٢٨. ا حمد مختار عمر محمد حماسة عبداللطيف ومصطفى النحاس زهران. "النحو الا ساسي" دار الفكر العربي مصر ١٤١٧ ه. ٢٩. محمد بن ا بي بكر بن عبد القادر الرازي. "مختار الصحاح" مو سسة الرسالة بيروت الطبعة السابعة. Appendix A تراآيب الجملة العربية Arabic Grammar Rules جملة اسمية جملة فعلية. جملة مفيدة ----> جملة اسمية + حرف عطف + جملة اسمية جملة اسمية + حرف عطف + جملة فعليه (حرف استفهام جملة اسمية ----> ا ن وا خواتها +) مبتدا معرفة (+صفة معرفة)(+حال) (+حرف عطف + مبتدا معرفة)+ خبر(+ صفة) (+حال) (+حرف عطف + حرف استفهام + مبتدا نكرة (+ صفة) + خبر- شبه حرف نفي + مبتدا نكرة (+ صفة) + خبر- شبه جملة خبر) (+شبه جملة) خبر- شبه جملة + مبتدا معرفة حرف نفي + خبر+ مبتدا معرفة (حرف استفهام ا ن جملة (حرف استفهام ا ن وا خواتها +) (حرف يدخل على الفعل +) فعل ناسخ + مبتدا معرفة + خبر ) حرف وا خواتها +) خبر- شبه جملة + مبتدا نكرة (+صفة) يدخل على الفعل +) ا فعال الظن + فاعل + مفعول به ١ + مفعول به ٢ (حرف يدخل على الفعل +) ا فعال الشروع والرجاء والمقاربة + مبتدا (+"ا ن") + خبر- جملة فعلية فعل مضارع 16
17 ف " ف " + فاعل (+ جملة فعلية + حرف عطف + جملة فعلية جملة فعلية + حرف عطف + جملة اسمية فعل جملة فعلية ----> صفة) (+ بدل) + ) مفعول به (+ صفة) (+ بدل) (+ حال)) (+ مفعول مطلق ( (+ظرف (+ مضاف ا ليه) (+صفة)) ) حرف يدخل على الفعل +) فعل مضارع + فاعل (+ صفة) (+ بدل) + ) مفعول به (+ صفة) (+ بدل) (+حال) ( (+ شبه جملة) (+ (+ جار (+ بدل) ) حرف يدخل على الفعل+) فعل مضارع + ناي ب فاعل (+ صفة) مفعول مطلق ( (+ مفعول لا جله ( + فاعل (+ صفة) (+ بدل) + ) حرف يدخل على الفعل +) فعل لمفعولين ومجرور ( (+ مفعول مطلق ( (+ مفعول لا جله) ) حرف يدخل على الفعل الماضي+) فعل ما ض مفعول به (+ صفة) (+ بدل) + مفعول به (+ صفة) (+ بدل) (+ مفعول لا جله ( ) +) بدل) + ) مفعول به +) صفة) +) بدل) +) حال)) +) مفعول مطلق ( +) مفعول لا جله ( +) شبه جملة ( + فاعل +) صفة) (+جار و مجرور ( (+ مفعول مطلق ( (+ (+ بدل) حرف يدخل على الفعل الماضي+) فعل ما ض + ناي ب فاعل (+ صفة) (+ + فاعل (+ صفة) (+ بدل) (+ بدل) مفعول به (+ صفة) (حرف يدخل على الفعل +) فعل مضارع + مفعول لا جله ( + فاعل مفعول به (+ صفة) (+ بدل) (حرف يدخل على الفعل الماضي +) فعل ماض + مفعول مطلق ( (+ مفعول لا جله ( ) حرف يدخل على الفعل +) ا فعال ا خرى ١ + فاعل + مفعول به ١ مفعول مطلق ( (+ مفعول لا جله ( (+ صفة) (+ بدل) (+ ) " +) جواب الشرط حرف الشرط + فعل الشرط +( " +) جواب الشرط + مفعول به ٢ اسم الشرط + فعل الشرط +( حرف نفي +) جملة فعلية (+ مستثنى منه ( + حرف استثناء + مستثنى (حرف يدخل على الفعل +) فعل لثلاث مفاعيل + فاعل + + مفعول به ٢ جملة فعلية + ا فعال الاستثناء + مفعول به حرف نداء + منادى (+ جملة اسمية جملة + مفعول به ١ مفعول به فعلية ( حرف قسم + المقسم به + جواب القسم اسم غير معرفة. مبتدا نكرة ----> تعريف + اسم غير معرفة اسم علم اسم مبني مصدر مو ول مبتدا نكرة + مضاف ا ليه. مبتدا معرفة ----> تعريف + صفة. صفة معرفة ----> ----> مصدر مو ول اسم معرب ضمير اسم ا شارة اسم موصول الا عداد المركبة بعض الظروف وما ركب منها اسم خبر جارومجرور اسم استفهام جملة اسمية جملة فعلية. الفعل ----> تعريف + اسم غير معرفة اسم علم بدل اسم اسم غير معرفة + مضاف ا ليه. ----> فاعل مفعول به ناي ب فاعل ----> مصدر من الفعل: ضرب ا رمي ا... 
مفعول مطلق + صفة مفعول مطلق + مضاف ا ليه. مفعول مطلق ----> اسم غير معرفة. صفة اسم معرب مثنى ) + جار ومجرور ( اسم معرب جمع ) + جار و مجرور ----> اسم غير علم ) + جار و مجرور ( مفعول لا جله. ( مبتدا معرفة. مفعول به ----> ١ خبر. مفعول به ٢----> جملة فعلية شبه جملة. ----> ساخن ا بارد ا واو الحال + جملة اسمية حال اسم معرب مثنى اسم معرب جمع. ----> تعريف + اسم غير علم المستثنى منه اسم. ----> المستثنى ----> ) تعريف +) اسم معرب اسم مبني مصدر مو ول. اسم ----> اسم علم اسم غير علم اسم معرب مثنى اسم معرب جمع. اسم معرب اسم غير علم + "ين". اسم غير علم + "ان" ----> اسم معرب مثنى اسم غير علم + "ين" اسم غير علم + "ات" جمع تكسير لاسم غير علم. اسم غير علم + "ون" ----> اسم معرب جمع ----> امرا ة مدينة فتاة. اسم غير معرفة ----> امرا ة مدينة فتاة. اسم غير علم فاتن محمد مها سعيد تغريد عمر سامية. ----> اسم علم ضمير اسم ا شارة اسم موصول اسم شرط اسم استفهام الا عداد المركبة بعض الظروف وما ركب منها اسم اسم مبني ----> الفعل اسم شرط غير جازم. مصدر مو ول ----> مصدر مو ول من " ا ن و الفعل " مصدر مو ول من " ا ن و اسمها وخبرها". شبة جملة ----> جار ومجرور ظرف + مضاف ا ليه. حرف جر + اسم مجرور. جار و مجرور ----> اسم مجرور ----> اسم. اسم ا شارة ----> هذا هذه هو لاء ذاك ذلك تلك ا ولي ك هنا ههنا هناك هنالك هذان هاتان هذين هاتين. اسم موصول ----> الذي التي الذين اللاتي اللاي ي من ما اللذان اللتان اللذين اللتين. اسم شرط ----> من ما مهما متى ا يان ا ين ا ينما ا نى حيثما كيفما ا ي. ----> من ما متى ا ين كم كيف ا ي ا يان ا نى. اسم استفهام الا عداد المركبة ----> من ١١ ا لى ) ١٩ ما عدا ١٢) + تمييز. التمييز ----> اسم نكرة منصوب. ----> ظرف مكان ظرف زمان. ظرف ----> فوق تحت حول ا مام ا زاء. ظرف مكان ظرف زمان ----> شهرا صباحا لحظة ليلا صيفا يوم ا.. مضاف ا ليه ----> تعريف +( اسم غير علم اسم معرب مثنى اسم معرب جمع ( اسم علم.. اسم غير علم + مضاف ا ليه اسم غير علم اسم علم المنادى ----> ----> حيث ا مس الا ن ا ذ ليل نهار بين. بعض الظروف و ما ركب منها ----> مساجد مصانع. جمع تكسير لاسم غير علم لولا لوما. ----> ا ذا ا و لو كلما اسم شرط غير جازم ----> هيهات شتان سرعان ا ه ا ف ا مين عليك حذار صه ا يه حي. اسم الفعل 17
18 ل" لا " ن" ا " ت" ي " الضمير ----> ضمير رفع منفصل ضمير نصب منفصل ضمير رفع متصل ضمير نصب متصل ضمير جر متصل ضمير مستتر. ضمير رفع منفصل ----> ا نا نحن ا ن ت ا نت ا نتما انتم انتن هو هي هما هم هن. ضمير نصب منفصل ----> ا ياي ا يانا ا ياك ا ياكما ا ياكم ا ياكن ا ياه ا ياها ا ياهما ا ياهم ا ياهن. ضمير رفع متصل ----> تاء الفاعل نا ا لف الاثنين واو الجماعة ياء المخاطبة نون النسوة. ضمير نصب متصل ----> ياء المتكلم نا كاف المخاطب هاء الغاي ب. ضمير جر متصل ----> ياء المتكلم نا كاف المخاطب هاء الغاي ب. المقسم به ----> لفظ الجلالة اسم-صفة من صفات االله تعالى + مضاف ا ليه-لفظ الجلالة " + فعل مضارع + "نون التوكيد" + فاعل (+صفة) (+حال) (+مفعول به) "لقد" + فعل ماض + فاعل (+صفة) جواب القسم ----> " + جملة فعلية - فعل مضارع "ما" + جملة فعلية فعل ماض "ما" + (+حال) (+مفعول به) "ا ن " + مبتدا + "ل" + خبر جملة اسمية. ----> فعل مضارع فعل ماض فعل ا مر. فعل ----> ذهب جاء حضر. فعل ماض اذهب قم كل. ----> فعل ا مر " ( + فعل ماض. " " " <---- ) فعل مضارع ا فعال ا خرى ----> ١ ا فعال اليقين ا فعال التحويل. ---> ماعدا ماخلا ماحاشا. ا فعال الاستثناء ----> ظن خال حسب زعم جعل هب. ا فعال الظن ----> را ى علم وجد ا لفى تعلم ) بمعنى اعلم (. ا فعال اليقين ----> صير حول جعل رد اتخذ. ا فعال التحويل ----> كسا البس ا عطى منح سا ل منع. فعل لمفعولين ----> ا علم ا رى نب ا ا نبا خب ر ا خبر حد ث. ا فعال لثلاث مفاعيل ا فعال ناسخة ----> كان صار ليس ا صبح ا ضحى ظل ما زال ما دام بات ا مسى. ----> جملة فعلية. فعل الشرط ----> جملة اسمية فعل جامد + جملة اسمية فعل ا مر+ فاعل( + مفعول به ( لا الناهية + جملة فعلية اسم استفهام + جواب الشرط ( + جملة فعلية. لن قد س" " سوف حرف استفهام + جملة فعلية (حرف نفي جملة فعلية ----> نعم بي س حبذا. فعل جامد ا فعال الشروع والرجاء والمقاربة ----> ا فعال الشروع ا فعال الرجاء ا فعال المقاربة ا فعال الشروع ----> ا خذ ا نشا بدا جعل ا فعال الرجاء ----> عسى ا فعال المقاربة ----> ا وشك كاد حرف يدخل على الفعل ----> حرف يدخل على الفعل المضارع فعل يدخل على الفعل الماضي. حرف استفهام. 
حرف يدخل على الفعل المضارع ----> حرف نصب | حرف جزم | حرف نفي | قد | سين | سوف.
حرف يدخل على الفعل الماضي ----> قد | حرف عطف | حرف استفهام.
حرف الشرط ----> إن.
حرف جر ----> من | إلى | عن | على | في | الباء | الكاف | اللام | واو القسم | تاء القسم | رب | مذ | منذ | واو رب | عدا | خلا | حاشا.
إن وأخواتها ----> إن | أن | لكن | كأن | لعل | ليت | لا.
حرف نداء ----> ياء | أيا | هيا | أي | الهمزة.
حرف استثناء ----> إلا.
حرف عطف ----> الواو | الفاء | ثم | أو | أم | لكن | لا | بل | حتى.
حرف استفهام ----> الهمزة | هل.
حرف نصب ----> أن | لن | كي | إذن | لام التعليل | فاء السببية | حتى.
حرف قسم ----> الباء | التاء | الواو.
حرف جزم ----> لم | لما | لام الأمر | لا الناهية.
حرف نفي ----> ما | لا.
التعريف ----> ال.
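The particle rules that close Appendix A can be read as a lookup table from grammatical category to surface forms. The following is a minimal Python sketch under that reading, not the authors' implementation: the dictionary `RULES` copies only a few of the particle rules by hand, and the names `RULES` and `categories_of` are hypothetical. It illustrates why a token such as لا is ambiguous before parsing: more than one category lists it.

```python
# Hand-copied fragment of the Appendix A particle rules (illustrative only;
# the variable and function names are assumptions, not from the paper).
RULES = {
    "حرف عطف": ["الواو", "الفاء", "ثم", "أو", "أم", "لكن", "لا", "بل", "حتى"],
    "حرف نفي": ["ما", "لا"],
    "حرف استفهام": ["الهمزة", "هل"],
    "حرف استثناء": ["إلا"],
}


def categories_of(token, rules=RULES):
    """Return the set of categories whose right-hand side lists `token`."""
    return {category for category, forms in rules.items() if token in forms}


if __name__ == "__main__":
    # A token listed under several categories is ambiguous until the
    # sentence-level rules and constraints narrow it down.
    for token in ("لا", "هل", "كتاب"):
        print(token, sorted(categories_of(token)))
```

A token that returns more than one category is a candidate for the disambiguation step that the grammar constraints and the statistical analyzer are meant to perform.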
Appendix B
Arabic Grammar Rules XML Document
<?xml version="1.0" encoding="windows-1256"?>
<!DOCTYPE قواعد_النحو SYSTEM "Grammar.dtd">
<قواعد_النحو>
<الجملة_الفعلية>
<!-- Verb - subject - object (فعل - فاعل - مفعول به) -->
<النوع>فعل</النوع> <البناء>مبني للمعلوم</البناء> <التعدي>١</التعدي> <الفاعل>-</الفاعل> <الموقع>فعل</الموقع>
<السابقة>-</السابقة> <الموقع>فاعل</الموقع>
<السابقة>-</السابقة> <الموقع>مفعول به</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="المفعول_به_عاقل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="عاقل">مفعول به</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الزمن">فعل</الموقع_الإعرابي> <القيمة>أمر</القيمة> <العلاقة>لا يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي> <القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مفعول به</الموقع_الإعرابي> <القيمة>منصوب</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي> <القيمة>محذوف النون</القيمة> <العلاقة>لا يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="التصنيف">مفعول به</الموقع_الإعرابي> <القيمة>محذوف النون</القيمة> <العلاقة>لا يساوي</العلاقة>
<!-- Particle preceding a past-tense verb - past-tense verb - subject - object (حرف يدخل على الفعل الماضي - فعل ماضي - فاعل - مفعول به) -->
<النوع>أداة</النوع> <الموقع>حرف لا محل له من الإعراب</الموقع>
<النوع>فعل</النوع> <الزمن>ماضي</الزمن> <البناء>مبني للمعلوم</البناء> <التعدي>١</التعدي> <الفاعل>-</الفاعل> <الموقع>فعل</الموقع>
<السابقة>-</السابقة> <الموقع>فاعل</الموقع>
<السابقة>-</السابقة> <الموقع>مفعول به</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="المفعول_به_عاقل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="عاقل">مفعول به</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الأداة">حرف لا محل له من الإعراب</الموقع_الإعرابي> <القيمة>هل</القيمة> <القيمة>قد</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي> <القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مفعول به</الموقع_الإعرابي> <القيمة>منصوب</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">مفعول به</الموقع_الإعرابي> <القيمة>محذوف النون</القيمة> <العلاقة>لا يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي> <القيمة>محذوف النون</القيمة> <العلاقة>لا يساوي</العلاقة>
<!-- Verb - subject (attached pronoun) - object (فعل - فاعل (ضمير متصل) - مفعول به) -->
<النوع>فعل</النوع> <البناء>مبني للمعلوم</البناء> <التعدي>١</التعدي> <المفعول_به>-</المفعول_به> <الموقع>فعل والفاعل ضمير متصل</الموقع>
<السابقة>-</السابقة> <الموقع>مفعول به</الموقع>
<الموقع_الإعرابي الخاصية="الفاعل">فعل والفاعل ضمير متصل</الموقع_الإعرابي> <القيمة>ت</القيمة> <القيمة>نا</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مفعول به</الموقع_الإعرابي> <القيمة>-</القيمة> <القيمة>منصوب</القيمة>
<الموقع_الإعرابي الخاصية="المفعول_به_عاقل">فعل والفاعل ضمير متصل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="عاقل">مفعول به</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="التصنيف">مفعول به</الموقع_الإعرابي> <القيمة>محذوف النون</القيمة> <العلاقة>لا يساوي</العلاقة>
<!-- Verb - subject (attached pronoun) - object (attached pronoun) (فعل - فاعل (ضمير متصل) - مفعول به (ضمير متصل)) -->
<النوع>فعل</النوع> <البناء>مبني للمعلوم</البناء> <التعدي>١</التعدي> <الموقع>فعل والفاعل ضمير متصل والمفعول به ضمير متصل</الموقع>
<التصنيف>ظرف</التصنيف> <الموقع>ظرف</الموقع>
<الموقع_الإعرابي الخاصية="الفاعل">فعل والفاعل ضمير متصل والمفعول به ضمير متصل</الموقع_الإعرابي> <القيمة>ت</القيمة> <القيمة>نا</القيمة>
<الموقع_الإعرابي الخاصية="المفعول_به">فعل والفاعل ضمير متصل والمفعول به ضمير متصل</الموقع_الإعرابي> <القيمة>ه</القيمة> <القيمة>ها</القيمة> <القيمة>ك</القيمة> <القيمة>كما</القيمة> <القيمة>كم</القيمة> <القيمة>كن</القيمة> <القيمة>هما</القيمة> <القيمة>هم</القيمة> <القيمة>هن</القيمة>
<!-- Verb - subject - adjective - object (فعل - فاعل - صفة - مفعول به) -->
<النوع>فعل</النوع> <البناء>مبني للمعلوم</البناء> <التعدي>١</التعدي> <الفاعل>-</الفاعل> <الموقع>فعل</الموقع>
<السابقة>-</السابقة> <الموقع>فاعل</الموقع>
<علم>لا</علم> <صفة>نعم</صفة> <الموقع>صفة</الموقع>
<السابقة>-</السابقة> <الموقع>مفعول به</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="العدد">فاعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="العدد">صفة</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="الجنس">صفة</الموقع_الإعرابي>
<ضابط />
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="المفعول_به_عاقل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="عاقل">مفعول به</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي> <القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">صفة</الموقع_الإعرابي> <القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مفعول به</الموقع_الإعرابي> <القيمة>منصوب</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي> <القيمة>محذوف النون</القيمة> <العلاقة>لا يساوي</العلاقة>
<الموقع_الإعرابي الخاصية="التصنيف">مفعول به</الموقع_الإعرابي> <القيمة>محذوف النون</القيمة> <العلاقة>لا يساوي</العلاقة>
<!-- Verb - subject - prepositional phrase (فعل - فاعل - جار ومجرور) -->
<النوع>فعل</النوع> <البناء>مبني للمعلوم</البناء> <الموقع>فعل</الموقع>
<الموقع>فاعل</الموقع>
<النوع>أداة</النوع> <التصنيف>حرف جر</التصنيف> <الموقع>حرف جر</الموقع>
<معرفة>نعم</معرفة> <السابقة>-</السابقة> <الموقع>اسم مجرور</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي> <القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي> <القيمة>محذوف النون</القيمة> <العلاقة>لا يساوي</العلاقة>
<!-- Verb - subject - prepositional phrase with the particle ب or ت (فعل - فاعل - جار ومجرور (حرف الباء أو التاء)) -->
<النوع>فعل</النوع> <البناء>مبني للمعلوم</البناء> <الموقع>فعل</الموقع>
<الموقع>فاعل</الموقع>
<نوع_السابقة>حرف جر</نوع_السابقة> <الموقع>حرف جر واسم مجرور</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي> <القيمة>-</القيمة> <القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي> <القيمة>محذوف النون</القيمة> <العلاقة>لا يساوي</العلاقة>
<!-- Verb - subject (فعل - فاعل) -->
<النوع>فعل</النوع> <البناء>مبني للمعلوم</البناء> <الفاعل>-</الفاعل> <الموقع>فعل</الموقع>
<النوع>اسم</النوع> <الموقع>فاعل</الموقع>
<الموقع_الإعرابي الخاصية="جنس_الفاعل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="الجنس">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الفاعل_عاقل">فعل</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="عاقل">فاعل</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">فاعل</الموقع_الإعرابي> <القيمة>مرفوع</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">فاعل</الموقع_الإعرابي> <القيمة>محذوف النون</القيمة> <العلاقة>لا يساوي</العلاقة>
</الجملة_الفعلية>
<!-- Start of the nominal sentence (بداية الجملة الاسمية) -->
<الجملة_الاسمية>
<!-- Subject (not a diptote) - predicate (noun) (مبتدأ (غير ممنوع من الصرف) - خبر (اسم)) -->
<معرفة>نعم</معرفة> <الموقع>مبتدأ</الموقع>
<معرفة>لا</معرفة> <الموقع>خبر</الموقع>
<كلمة />
<الموقع_الإعرابي الخاصية="العدد">مبتدأ</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="العدد">خبر</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الجنس">مبتدأ</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="الجنس">خبر</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مبتدأ</الموقع_الإعرابي> <القيمة>مرفوع</القيمة>
<!-- Subject (broken plural, diptote) - predicate (noun) (مبتدأ (جمع تكسير ممنوع من الصرف) - خبر (اسم)) -->
<معرفة>نعم</معرفة> <الموقع>مبتدأ</الموقع>
<معرفة>لا</معرفة> <الموقع>خبر</الموقع>
<الموقع_الإعرابي الخاصية="الجنس">مبتدأ</الموقع_الإعرابي> <الموقع_الإعرابي الخاصية="الجنس">خبر</الموقع_الإعرابي>
<الموقع_الإعرابي الخاصية="الحالة_الإعرابية">مبتدأ</الموقع_الإعرابي> <القيمة>ممنوع من الصرف</القيمة>
<الموقع_الإعرابي الخاصية="التصنيف">مبتدأ</الموقع_الإعرابي> <القيمة>جمع تكسير لغير العاقل</القيمة>
<!-- Subject in construct state - genitive complement - predicate (noun) (مبتدأ مضاف - مضاف إليه - خبر (اسم)) -->
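The value constraints in the XML listing pair a syntactic position (الموقع الإعرابي) and a property (الخاصية) with one or more values (القيمة), optionally negated by the relation لا يساوي ("not equal"). The following is a minimal Python sketch of how such a constraint might be checked. The class `Constraint`, the functions `satisfies` and `check`, and the layout of the analysis dictionary are assumptions made for illustration; the listing does not show the parser's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    position: str          # syntactic position, e.g. "فاعل" (subject)
    prop: str              # property name, e.g. "الحالة_الإعرابية" (case)
    values: tuple          # allowed values, or forbidden ones when negated
    negated: bool = False  # True when the relation is "لا يساوي"


def satisfies(analysis, constraint):
    """Check one constraint against {position: {property: value}}."""
    actual = analysis.get(constraint.position, {}).get(constraint.prop)
    matched = actual in constraint.values
    return not matched if constraint.negated else matched


# Three value constraints transcribed by hand from the first
# verbal-sentence rule (verb - subject - object) in the listing.
VSO_CONSTRAINTS = [
    Constraint("فعل", "الزمن", ("أمر",), negated=True),       # tense is not imperative
    Constraint("فاعل", "الحالة_الإعرابية", ("مرفوع",)),        # subject is nominative
    Constraint("مفعول به", "الحالة_الإعرابية", ("منصوب",)),    # object is accusative
]


def check(analysis, constraints=VSO_CONSTRAINTS):
    """Return True only if every constraint holds for the analysis."""
    return all(satisfies(analysis, c) for c in constraints)
```

A candidate analysis that violates any constraint would be discarded before the statistical step. The agreement constraints in the listing (for example, verb gender matched against subject gender) are not modeled in this sketch.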
Evolution of Symbolisation in Chimpanzees and Neural Nets Angelo Cangelosi Centre for Neural and Adaptive Systems University of Plymouth (UK) a.cangelosi@plymouth.ac.uk Introduction Animal communication
More informationCoast Academies Writing Framework Step 4. 1 of 7