Improving coverage and parsing quality of a large-scale LFG for German
Christian Rohrer, Martin Forst
Institute for Natural Language Processing (IMS), University of Stuttgart
Azenbergstr Stuttgart, Germany
{rohrer,forst}@ims.uni-stuttgart.de

Abstract

We describe experiments in parsing the German TIGER Treebank. In parsing the complete treebank, 86.44% of the sentences receive full parses; 13.56% receive fragment parses. We discuss the methods used to enhance coverage and parsing quality, and we present an evaluation on a gold standard, to our knowledge the first one for a deep grammar of German. Considering the selection performed by our current version of a stochastic disambiguation component, we achieve an f-score of 84.2%, the upper and lower bounds being 87.4% and 82.3% respectively.

1. Introduction

For realistic applications we need grammars with broad coverage. The broader the coverage, however, the greater the number of possible readings per sentence and the lower the performance. When increasing coverage, we tried to include the most frequent constructions (based on a corpus study) and at the same time to restrict the grammar rules in order to avoid overgeneration. The restrictions are sometimes too strict, and we lose certain sentences, but the gain in performance clearly justifies the restrictions.

Besides quantity of analyses, one also wants quality. Quality can only be measured by evaluating against a gold standard. Once substantial coverage with high quality has been reached, the problem is to choose the intended reading. Disambiguation of competing syntactic analyses is one of the greatest challenges for computational linguistics. We present first results of experiments with a stochastic disambiguation model.

2. A Broad-Coverage LFG for German

The grammar was developed in the ParGram project (Butt et al., 2002). Besides achieving 50% coverage (Dipper, 2003), the grammar writers concentrated on phenomena discussed in theoretical syntax.
With the advent of treebanks and successful attempts to induce grammars from treebanks, we shifted our focus. In a new project (DLFG [1]), we are concentrating on coverage. The grammar now has 274 LFG-style rules, which compile into an automaton with 6,584 states and 22,241 arcs. The grammar uses several lexicons and a guessing mechanism for default lexical entries. The lexicons record mainly subcategorization information.

As a form of preprocessing, the grammar uses a cascade of finite-state transducers (Kaplan et al., 2004), mainly for tokenization and morphological analysis. The input sentences are thus processed by a tokenizer, a multi-word transducer, a morphology and a guesser before they are actually parsed. Later we will also include a named entity recognizer (NER). In the current experiments with the gold standard we simulate the NER by manual marking.

[1] Disambiguierung einer Lexikalisch-Funktionalen Grammatik für das Deutsche ("Disambiguation of a Lexical Functional Grammar for German"), a research project financed by the DFG (Deutsche Forschungsgemeinschaft, German Research Foundation), grant Ro 245/18-1.

3. Enhancing grammar coverage

3.1. Corpus-based enlargement of grammar coverage

In order to increase the coverage of the grammar, we first had to find out where the grammar was incomplete. We systematically created testsuites extracted from the TIGER Treebank. For instance, we extracted all NPs up to the head, or all NPs which are modified by a (subcategorized) subordinate clause or a verb phrase. We also extracted the trees associated with the corresponding strings in order to determine the frequency of a construction. Most of the examples where our grammar failed involved constructions with very limited frequency. Hence, once a grammar has achieved broad coverage, progress is slow.
There were, however, a few areas where adding new rules really helped to increase coverage:

Coordination

Coordination was one phenomenon of which only the basic instances were covered by the original grammar. We thus introduced new rules for several subtypes of asymmetric or otherwise special coordination.

Coordination of adverbs with PPs

Predicative constituents, as in "he is a Republican and proud of it", can be handled with a special coordination rule for predicative constituents that allows, e.g., DPs and APs to be coordinated. In analogy to this, we account for the coordination of ADVPs and PPs that function as modifiers with a special coordination rule [2], namely

ADVP → ADVP: ; CONJco PP: .

(1) hier und in Berlin
    here and in Berlin
    'here and in Berlin'

[2] For simplicity of presentation, we only present simplified versions of the newly introduced grammar rules.
Subject gap in finite constructions (SGF)

(2) Hierhin kam  Hans und hielt seinen Vortrag.
    here    came Hans and gave  his    talk
    'Hans came here and gave his talk.'

In these constructions, which have received a lot of linguistic attention since Höhle (1983), the shared subject is in the Mittelfeld of the first conjunct instead of being in the Vorfeld. This means that it is not distributed automatically into the second conjunct. We have implemented an analysis following Frank (2002), who treats SGF coordination as a marked case of CP coordination that can only occur given a very particular information structure. Unlike Frank (2002), we formulate the rule as a coordination of a CP and a Cbar, but this is a detail motivated by efficiency considerations:

CP → CP: ( SUBJ) = ( SUBJ); CONJco Cbar: .

Adverbs and PPs between conjunction and conjunct

Coordinated structures where an ADVP or a PP occurs left of the last conjunct, as illustrated in (3), have received much less attention in theoretical linguistics, nor are they, to our knowledge, accounted for in most deep grammars.

(3) Von  Monat zu Monat wächst das Angebot und mit  ihm auch die Nachfrage.
    from month to month grows  the offer   and with it  also the demand
    'The offer grows from month to month and so does the demand.'

However, they are relatively frequent in text corpora, so that coverage can be noticeably improved by the introduction of a rule for these constructions. We therefore formulated coordination rules of the following type, where, in the f-annotations, refers to the f-structure of the right sister:

DP → DP: ; CONJco ( { ADVP: ( ADJUNCT); PP: ( ADJUNCT) } ) DP: .

Parentheticals

4-5% of the sentences in the TIGER Corpus contain constituents marked as parentheticals. We introduce parenthetical constructions via a metarule macro. It allows insertion of a parenthetical between any two constituents on the right-hand side of a phrase-structure rule.
Reported speech without real verbum dicendi

In German newspaper text, sentences like the following occur relatively frequently:

(4) Die Fans waren zunächst irritiert, bewertet  Hans die Veränderung der    Band.
    the fans were  at first irritated, evaluates Hans the change      of the band
    'The fans were confused at first, says Hans, evaluating the change of the band.'

The first clause, Die Fans waren zunächst irritiert, represents reported speech, but the clause which introduces the reported speech does not contain a verb of saying. Bewerten does not subcategorize for a sentential complement. In our example, it takes a subject (Hans) and an object (die Veränderung der Band). The distribution of this kind of construction is the same as the distribution of reportive parentheticals headed by verbs that subcategorize for a COMP. Hence, in addition to COMP, we allow the reported speech before or around a reportive parenthetical to be projected to the semantic function REPORTEDSPEECH. The f-structure associated with (4) is illustrated in Figure 1.

[Figure 1: f-structure of (4)]

3.2. Corpus-based restriction of grammar rules

Rule specialization

In the original version of our grammar we tried to write rules that were as general as possible. For instance, a VP can function as an AP if the head verb is transformed into a participle. Instead of an unrestricted rule AP[+infl] → VP[+infl], with very negative effects on efficiency, we wrote a special rule VP-as-AP, where we limit the number and function of possible constituents and where we exclude recursion in the verbal complex.
This is motivated by the fact that, in the TIGER Corpus, there is not a single occurrence of an AP with a participle head dominating a VP. The exclusion of recursion in deverbal attributive APs has a very positive impact on the efficiency of the grammar because there are numerous forms that can be both an inflected past participle and a past tense form. Consider the following subordinate clause:

(5) Weil    er die Frau  die Aktien zu verkaufen überredete, ...
    because he the woman the shares to sell      convinced
    'Because he convinced the woman to sell the shares ...'

The form überredete can be both a past tense form and a past participle. As the original grammar allows infinitival VPs to be embedded in attributive deverbal APs, it can analyze the string die Aktien zu verkaufen überredete as an inflected AP, and this inflected AP can then be analyzed as a headless DP. This means that a large number of undesired c-structures are built which are only ruled out during the solution of the f-structure constraints. With respect to efficiency, it is a very attractive feature of the revised grammar that these erroneous c-structures are not built in the first place.
Restricting long distance dependencies

Solving the equations which account for long distance dependencies can be very time-consuming. We therefore simplified these equations based on a corpus study, e.g. for extraposed relative clauses.

Restricting rules by number of tokens

We restrict certain rules by limiting the number of tokens covered by the rule. E.g., subjectless insertions like wie früher berichtet ("as previously reported") have only very few words between "as" and "reported".

Generality of the steps taken to enhance grammar coverage

Our section on corpus-based improvement of grammar coverage may create the impression that we tailored the grammar too closely to the TIGER Corpus. We therefore parsed the 20,614 sentences of the NEGRA Corpus. 81.5% of the sentences obtained a full parse and 18.5% a partial parse. These results on the NEGRA Corpus are clearly not as good as the results on the TIGER Corpus, but with a grammar coverage of more than 80%, they show that coverage does not drop dramatically on unseen corpora and that at least most of the measures taken to improve coverage carry over to the unseen data.

4. Robustness

We augmented the standard grammar with a FRAGMENT grammar to collect as much information as possible in cases where a sentence does not get a full parse. The parser returns well-formed chunks like NPs, PPs, VPs, Ss, etc. The grammar has a fewest-chunk method for determining the least fragmented parse. It turned out that the quality of fragment parses can be improved by restricting complex rules (e.g. the S-rule) in the fragment grammar with respect to the standard grammar.

In order to cope with timeouts and memory problems, we use the SKIMMING technique (Riezler et al., 2002). When the amount of time or memory spent on a sentence exceeds a given threshold, XLE skims the constituents whose processing has not yet been completed, i.e. XLE does only a bounded amount of work per subtree. When skimming, we use a restricted version of our grammar.
This is achieved with the help of special OT marks (Frank et al., 2001), so-called SKIMMING NOGOOD marks, which turn off expensive rules like headless NPs, free datives, etc. during skimming.

5. Testing

5.1. Gold standard

We evaluated parse quality on manually validated dependency annotations for 1602 sentences from the TiGer Dependency Bank (Forst et al., 2004). The annotations from the TIGER Treebank were semi-automatically transformed into dependency triples, which were then corrected and extended by human annotators. The TiGer DB encodes the same type of dependency triples as the PARC 700 Dependency Bank (King et al., 2003). The grammatical relations and morphosyntactic features are the ones annotated in the TIGER Treebank, except for systematic changes meant to make the TiGer DB more suitable for parser evaluation.

Parsing quality

In tables 1 and 2, we give the results of two types of parse selection: (1) lower bound: a parse is chosen randomly from the set of parses; (2) upper bound: the parse with the best F-score according to the annotation scheme is chosen. F-score is defined as the harmonic mean of precision and recall (f = 2pr / (p + r)). We use the triple encoding and evaluation software of Crouch et al. (2002).

Table 1 shows that full parses achieve a noticeably higher f-score than partial parses; this shows that it is crucial to improve coverage to, say, at least 80% in order to parse free text with a reasonable quality. Table 2 gives the upper bound and the lower bound figures for the 1602 gold standard sentences broken down according to the grammatical relations and morphosyntactic features encoded.

Disambiguation

Table 3, finally, gives preliminary results for our stochastic disambiguation component. Two versions of the component are compared with each other and with the upper and lower bound. Both versions are based on maximum entropy models that are trained in a supervised manner on partially labelled data.
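The triple-based scoring and the two bounds described above can be illustrated with a minimal Python sketch. It is not the evaluation software of Crouch et al. (2002): the function names, the (relation, head, dependent) triple layout and the toy sentence are assumptions made for illustration, and scores are computed per sentence here, whereas the reported figures are aggregated over the whole test set.

```python
import random
from typing import FrozenSet, List, Tuple

# Hypothetical triple layout: (relation, head, dependent).
Triple = Tuple[str, str, str]
Parse = FrozenSet[Triple]


def prf(candidate: Parse, gold: Parse) -> Tuple[float, float, float]:
    """Return (precision, recall, f-score) of one parse against the gold triples."""
    matched = len(candidate & gold)
    p = matched / len(candidate) if candidate else 0.0
    r = matched / len(gold) if gold else 0.0
    # Harmonic mean of precision and recall: f = 2pr / (p + r).
    f = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f


def upper_bound(parses: List[Parse], gold: Parse) -> Parse:
    """Upper bound: pick the parse with the best f-score against the gold standard."""
    return max(parses, key=lambda parse: prf(parse, gold)[2])


def lower_bound(parses: List[Parse], rng: random.Random) -> Parse:
    """Lower bound: pick a parse at random from the set of parses."""
    return rng.choice(parses)


# Toy data (invented for illustration, loosely following example (2)).
gold: Parse = frozenset({
    ("sb", "kommen", "Hans"),
    ("mo", "kommen", "hierhin"),
    ("oa", "halten", "Vortrag"),
})
parse_a: Parse = frozenset({
    ("sb", "kommen", "Hans"),
    ("mo", "kommen", "hierhin"),
    ("oa", "halten", "Hans"),  # wrong object attachment
})
parse_b: Parse = gold

if __name__ == "__main__":
    best = upper_bound([parse_a, parse_b], gold)
    rand = lower_bound([parse_a, parse_b], random.Random(0))
    print("upper bound f-score:", prf(best, gold)[2])
    print("random selection f-score:", prf(rand, gold)[2])
```

A stochastic disambiguation component would replace the random choice with a model-based choice; its f-score is then expected to fall between the two bounds.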
The training material for both models was the parses of 3,817 sentences from the TIGER Corpus (excluding sentences 8,001 through 10,000). The "all properties" version uses both the kind of property described in Riezler et al. (2002) and a series of new properties that mainly encode information on the linear order of grammatical functions. The "only original properties" version only makes use of the former.

[Table 3: F-scores for selected grammatical relations in the 1602 TiGer DB examples broken down according to parse selection method. Columns: upper bound; all properties for disamb.; only original properties; lower bound. Rows: all, preds only, da, gr, oa, op, op loc, quant, sb, sbp.]

6. Discussion

6.1. Coverage

In order to get a full parse, the input sentence has to be well-formed. At least 1% of the sentences in the testsuite contain spelling mistakes, punctuation errors or grammatical errors. Furthermore, the TIGER annotators sometimes assign full structures to elliptical sentences that lack a clear syntactic head. In order to match the analyses annotated for them, our parser would have to do a lot of structure building, which would lead to overgeneration and inefficiency.
[Table 1: Upper bound and lower bound f-scores for grammatical relations and morphosyntactic features in the 1602 TiGer DB examples broken down according to parse quality. Rows: % of test set, upper bound, lower bound, avg. sentence length, avg. parse time in sec.]

Among the well-formed sentences which receive a partial parse we have to distinguish three types: (1) constructions for which our grammar contains rules, which, however, are turned off for efficiency reasons (e.g. coordination without an explicit conjunction), (2) constructions for which we do not have rules (e.g., special types of non-constituent coordination, certain parenthetical constructions, heavy ellipsis), (3) sentences which contain lexical material that is not in the lexicon and which our guesser cannot handle (e.g., problems of subcategorization, idioms and collocations). Subcategorization poses problems especially if a MWE as a whole subcategorizes for a sentential function like COMP although none of its parts does. This is the case with the MWE zu Protokoll geben, which subcategorizes for a COMP, whereas neither geben nor Protokoll subcategorizes for a COMP.

Parsing quality

As Table 1 shows, the results for the complete testsuite are quite good. Breaking them down according to parse quality shows that our upper bound for full parses is roughly identical to that of Riezler et al. (2002). Our values for the complete test set are better (87.4% vs. 84.1%) because more sentences of our testsuite receive a full parse. If we subtract the 55 sentences with an average length of 41.7 words that get a partial parse after skimming, we obtain for 96.6% of our testsuite an upper bound of 88.0% and a lower bound of 82.9%. The F-score of our non-skimmed fragment parses is surprisingly high. Only highly elliptical sentences get really bad values.
One explanation for our good values is our detailed subcategorization lexicons.

The figures in table 2 are more informative than the overall F-score. They illustrate that the f-scores for grammatical relations are not as good as those for morphosyntactic features. The lower values for case are due to syntactic ambiguity and are therefore not a purely morphological problem; to a limited extent this is also true for the feature num (number).

In the preds-only evaluation the values for the arguments sb (subject) and oa (accusative object) are better than those for da (dative object) and og (genitive object). So-called free datives are quite frequent in German and, as the name indicates, difficult to predict and to specify in the subcategorization lexicon. We guess free datives and, apparently, we sometimes go wrong. For genitive objects we get bad values because, for efficiency reasons, we require that the genitive be morphologically marked. Furthermore, genitive NPs may be attached to preceding NPs. The figures for sbp (logical subject in passives) are worse than those for grammatical subjects because the PP denoting the logical subject is introduced by von, which has many different functions.

Subcategorized PPs (and ADVPs) are annotated as op (oblique), op dir (directional argument), op loc (locative argument) and op manner (modal argument). The low f-score for subcategorized PPs indicates gaps in the subcategorization lexicon. In addition, this low score has a negative effect on the f-score of mo (modifiers or adjuncts). Predicative complements (pd) with the copula sein can be confused with stative passives. E.g., Er ist ihm übergeordnet is analyzed as a stative passive by our grammar and as pd by the annotators. The values for the subcategorized functions oc fin (finite complement clauses) and oc inf (non-finite argument VPs) differ.
The figures for clauses with the function oc fin are lower because clauses introduced by interrogative or relative pronouns in adverbial function can be interpreted as oc fin if the embedding clause contains a word which subcategorizes for such a clause. Furthermore, there is interference with rs (reported speech) and app cl (appositive clauses).

gl (genitive left) denotes possessives, and gr (genitive right) denotes genitive adjuncts and von PPs with genitive function. gl constructions are easy to identify because they always precede their head, whereas the analysis of gr ultimately is a semantic problem, at least when it is realized by a von PP. Comparative complements (cc) and relative clauses (rc), which are often extraposed, are difficult to attach to the corresponding head. Coordination (cj) is also notoriously difficult and achieves fairly low values.

Disambiguation

The figures in table 3 show that a selection performed by one of the versions of the stochastic disambiguation component clearly performs better than a random selection (lower bound). We also observe that the "all properties" version of the disambiguation component performs noticeably better than the "only original properties" version. In terms of overall f-score, the gain with respect to the lower bound doubles with the help of the additional properties; for the core grammatical functions, such as oa, sb etc., which are particularly important for the potential construction of a semantic representation on the basis of f-structures, this gain is considerably larger. For many of the grammatical functions, the additional properties allow the "all properties" f-score to be closer to the upper bound f-score than to the lower bound f-score. As this is not the case for the "only original properties" f-scores, we believe that property design will be particularly important for the further improvement of the stochastic disambiguation component. A further step that we plan to take, and that we hope will improve the results of the stochastic disambiguation regardless of the properties that are used for it, is the acquisition of more training data.

Table 2: Upper bound and lower bound precisions, recalls and F-scores for grammatical relations and morphosyntactic features in the 1602 TiGer DB examples

relation or feature | upper bound: precision, recall, f-score | lower bound: precision, recall, f-score
all: 61213/ / / /70482 = 88.1 = = 82.8 =
preds only: 22050/ / / /27328 = = = 76.2 =
ams: 0/2 = 0 0 0/2 = 0 0
app: 185/268 = /337 = /282 = /336 =
app cl: 23/27 = 85 23/77 = /26 = 85 22/77 =
cc: 17/23 = 74 17/46 = /20 = 70 14/45 =
cj: 1183/1412 = /1806 = /1412 = /1806 =
da: 118/190 = /162 = /226 = /162 =
det: 3655/3816 = /3938 = /3822 = /3930 =
gl: 292/316 = /317 = /305 = /316 =
gr: 804/928 = /902 = /897 = /899 =
measured: 9/20 = 45 9/24 = /20 = 45 9/24 =
mo: 4997/6878 = /6610 = /6946 = /6601 =
mod: 2087/2219 = /2228 = /2226 = /2227 =
name mod: 336/420 = /385 = /424 = /385 =
number: 370/469 = /424 = /456 = /423 =
oa: 923/1098 = /1191 = /1104 = /1189 =
oa2: 0/1 = 0 0
obj: 2916/3213 = /3180 = /3227 = /3174 =
oc fin: 151/212 = /226 = /211 = /226 =
oc inf: 340/379 = /411 = /387 = /411 =
og: 5/5 = 100 5/9 = /5 = 60 3/9 =
op: 267/389 = /526 = /377 = /526 =
op dir: 29/38 = 76 29/140 = /38 = 53 20/140 =
op loc: 35/52 = 67 35/59 = /44 = 52 23/59 =
op manner: 6/8 = 75 6/16 = /4 = 50 2/16 =
pd: 258/358 = /403 = /358 = /403 =
pred restr: 110/121 = /122 = /123 = /122 =
quant: 172/195 = /234 = /184 = /234 =
rc: 175/212 = /250 = /209 = /250 =
rs: 2/19 = 11 2/4 = /19 = 11 2/4 =
sb: 2549/3128 = /3274 = /3140 = /3272 =
sbp: 35/46 = 76 35/57 = /43 = 65 28/57 =
topic disloc: 1/16 = 6 1/3 = /18 = 0 0/3 = 0 0
case: 7941/9004 = /9098 = /8991 = /9085 =
circ form: 5/8 = 62 5/6 = /8 = 62 5/6 =
comp form: 99/115 = 86 99/160 = /111 = 86 96/160 =
coord form: 557/613 = /648 = /615 = /648 =
degree: 2346/2640 = /2488 = /2668 = /2486 =
det type: 3628/3780 = /3779 = /3772 = /3771 =
fut: 61/63 = 97 61/71 = /65 = 94 61/71 =
gend: 7207/7829 = /7875 = /7850 = /7864 =
mood: 2129/2254 = /2366 = /2253 = /2364 =
num: 8739/9495 = /9333 = /9510 = /9319 =
pass asp: 258/287 = /324 = /287 = /324 =
perf: 296/301 = /355 = /299 = /355 =
pers: 2392/2621 = /2800 = /2617 = /2796 =
precoord form: 7/8 = 88 7/9 = /7 = 86 6/9 =
pron form: 71/74 = 96 71/72 = /74 = 96 71/72 =
pron type: 1282/1689 = /1482 = /1700 = /1482 =
tense: 2145/2240 = /2360 = /2239 = /2358 =

Comparison with previous work

Our results are comparable to those reported by Riezler et al. (2002) and Cahill et al. (2005) for English. Our score is improved by the fact that we check some morphological information like gender, number or tense, which a good chunker could also identify correctly. In a preds-only evaluation, the figures are lower, but the same tendency is observed with other parsers that are evaluated on dependency-based gold standards.

Dubey and Keller (2003) induce a grammar from the NEGRA Treebank, a predecessor of TIGER. They report a labelled precision and recall of up to 74%. The results for induced grammars seem to be worse for German with its free word order than for English. This also holds for the German LFG induced from the TIGER Corpus (Cahill et al., 2005). The authors report an f-score of 71%. The evaluation is equivalent to ours, i.e. based on dependency triples obtained via conversion from TIGER graphs. The testsuite which functions as a gold standard, however, is fairly small. One of the reasons for the low f-score seems to be the lack of morphological information and the very flat structure of the TIGER graphs. Integrating morphological information would certainly improve the score. The flat structure of the NEGRA and TIGER Treebanks may also have a negative influence on the quality of the induced grammars.

Foth et al. (2005) describe a parsing system for unrestricted German text. Total coverage is achieved by means of defeasible, graded constraints. The authors report an f-score of 87% in an evaluation with the NEGRA Corpus. These are clearly the best results for German so far.
They are also better than those reported by Schiehlen (2003), who achieves an f-score of 81.7% on the NEGRA data. In support of our approach, we would like to mention that our grammar is fully reversible and comes with a full-fledged generator.

7. Conclusion

We have shown that a hand-crafted deep grammar can achieve good results on free text. The next step will be to refine our stochastic disambiguation component. Our grammar can also be used in generation, unlike other large-scale grammars of German.

8. References

Miriam Butt, Helge Dyvik, Tracy H. King, Hiroshi Masuichi, and Christian Rohrer. 2002. The Parallel Grammar Project. In Proceedings of COLING-2002 Workshop on Grammar Engineering and Evaluation, pages 1–7.

Aoife Cahill, Michael Burke, Martin Forst, Ruth O'Donovan, Christian Rohrer, Josef van Genabith, and Andy Way. 2005. Treebank-Based Multilingual Unification-Grammar Resources. Research in Language and Computation.

Richard Crouch, Ronald M. Kaplan, Tracy H. King, and Stefan Riezler. 2002. A comparison of evaluation metrics for a broad-coverage parser. In Proceedings of the LREC Workshop Beyond PARSEVAL - Towards improved evaluation measures for parsing systems, pages 67–74, Las Palmas, Spain.

Stefanie Dipper. 2003. Implementing and Documenting Large-scale Grammars - German LFG. Ph.D. thesis, IMS, University of Stuttgart. Arbeitspapiere des Instituts für Maschinelle Sprachverarbeitung (AIMS), Volume 9, Number 1.

Amit Dubey and Frank Keller. 2003. Probabilistic Parsing for German using Sister-Head Dependencies. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Sapporo, Japan.

Martin Forst, Núria Bertomeu, Berthold Crysmann, Frederik Fouvry, Silvia Hansen-Schirra, and Valia Kordoni. 2004. Towards a dependency-based gold standard for German parsers - The TiGer Dependency Bank. In Proceedings of the COLING Workshop on Linguistically Interpreted Corpora (LINC 04), Geneva.
Kilian Foth, Wolfgang Menzel, and Ingo Schröder. 2005. Robust parsing with weighted constraints. Natural Language Engineering, 11(1):1–25.

Anette Frank, Tracy Holloway King, Jonas Kuhn, and John T. Maxwell III. 2001. Optimality Theory Style Constraint Ranking in Large-Scale LFG Grammars. In Peter Sells, editor, Formal and Empirical Issues in Optimality Theoretic Syntax.

Anette Frank. 2002. A (Discourse) Functional Analysis of Asymmetric Coordination. In Proceedings of the 7th International LFG Conference (LFG 05), Athens, Greece. CSLI Publications.

Tilmann Höhle. 1983. Topologische Felder. Ph.D. thesis, University of Cologne.

Ronald M. Kaplan, John T. Maxwell, Tracy H. King, and Richard Crouch. 2004. Integrating Finite-state Technology with Deep LFG Grammars. In Proceedings of the ESSLLI 2004 Workshop on Combining Shallow and Deep Processing for NLP, Nancy, France.

Tracy Holloway King, Richard Crouch, Stefan Riezler, Mary Dalrymple, and Ronald M. Kaplan. 2003. The PARC 700 Dependency Bank. In Proceedings of the EACL Workshop on Linguistically Interpreted Corpora (LINC 03), Budapest.

Stefan Riezler, Tracy Holloway King, Ronald M. Kaplan, Richard Crouch, John T. Maxwell III, and Mark Johnson. 2002. Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics 2002, Philadelphia.

Michael Schiehlen. 2003. Combining Deep and Shallow Approaches in Parsing German. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Sapporo, Japan.
More informationTHE INTERNATIONAL JOURNAL OF HUMANITIES & SOCIAL STUDIES
THE INTERNATIONAL JOURNAL OF HUMANITIES & SOCIAL STUDIES PRO and Control in Lexical Functional Grammar: Lexical or Theory Motivated? Evidence from Kikuyu Njuguna Githitu Bernard Ph.D. Student, University
More informationControl and Boundedness
Control and Boundedness Having eliminated rules, we would expect constructions to follow from the lexical categories (of heads and specifiers of syntactic constructions) alone. Combinatory syntax simply
More informationThe College Board Redesigned SAT Grade 12
A Correlation of, 2017 To the Redesigned SAT Introduction This document demonstrates how myperspectives English Language Arts meets the Reading, Writing and Language and Essay Domains of Redesigned SAT.
More informationLinking Task: Identifying authors and book titles in verbose queries
Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,
More informationA Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many
Schmidt 1 Eric Schmidt Prof. Suzanne Flynn Linguistic Study of Bilingualism December 13, 2013 A Minimalist Approach to Code-Switching In the field of linguistics, the topic of bilingualism is a broad one.
More informationApplications of memory-based natural language processing
Applications of memory-based natural language processing Antal van den Bosch and Roser Morante ILK Research Group Tilburg University Prague, June 24, 2007 Current ILK members Principal investigator: Antal
More informationA relational approach to translation
A relational approach to translation Rémi Zajac Project POLYGLOSS* University of Stuttgart IMS-CL /IfI-AIS, KeplerstraBe 17 7000 Stuttgart 1, West-Germany zajac@is.informatik.uni-stuttgart.dbp.de Abstract.
More informationIndeterminacy by Underspecification Mary Dalrymple (Oxford), Tracy Holloway King (PARC) and Louisa Sadler (Essex) (9) was: ( case) = nom ( case) = acc
Indeterminacy by Underspecification Mary Dalrymple (Oxford), Tracy Holloway King (PARC) and Louisa Sadler (Essex) 1 Ambiguity vs Indeterminacy The simple view is that agreement features have atomic values,
More informationSwitched Control and other 'uncontrolled' cases of obligatory control
Switched Control and other 'uncontrolled' cases of obligatory control Dorothee Beermann and Lars Hellan Norwegian University of Science and Technology, Trondheim, Norway dorothee.beermann@ntnu.no, lars.hellan@ntnu.no
More information"f TOPIC =T COMP COMP... OBJ
TREATMENT OF LONG DISTANCE DEPENDENCIES IN LFG AND TAG: FUNCTIONAL UNCERTAINTY IN LFG IS A COROLLARY IN TAG" Aravind K. Joshi Dept. of Computer & Information Science University of Pennsylvania Philadelphia,
More informationTarget Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data
Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Ebba Gustavii Department of Linguistics and Philology, Uppsala University, Sweden ebbag@stp.ling.uu.se
More informationTheoretical Syntax Winter Answers to practice problems
Linguistics 325 Sturman Theoretical Syntax Winter 2017 Answers to practice problems 1. Draw trees for the following English sentences. a. I have not been running in the mornings. 1 b. Joel frequently sings
More informationBasic Parsing with Context-Free Grammars. Some slides adapted from Julia Hirschberg and Dan Jurafsky 1
Basic Parsing with Context-Free Grammars Some slides adapted from Julia Hirschberg and Dan Jurafsky 1 Announcements HW 2 to go out today. Next Tuesday most important for background to assignment Sign up
More informationInterfacing Phonology with LFG
Interfacing Phonology with LFG Miriam Butt and Tracy Holloway King University of Konstanz and Xerox PARC Proceedings of the LFG98 Conference The University of Queensland, Brisbane Miriam Butt and Tracy
More informationThe Discourse Anaphoric Properties of Connectives
The Discourse Anaphoric Properties of Connectives Cassandre Creswell, Kate Forbes, Eleni Miltsakaki, Rashmi Prasad, Aravind Joshi Λ, Bonnie Webber y Λ University of Pennsylvania 3401 Walnut Street Philadelphia,
More informationLoughton School s curriculum evening. 28 th February 2017
Loughton School s curriculum evening 28 th February 2017 Aims of this session Share our approach to teaching writing, reading, SPaG and maths. Share resources, ideas and strategies to support children's
More informationAn Interactive Intelligent Language Tutor Over The Internet
An Interactive Intelligent Language Tutor Over The Internet Trude Heift Linguistics Department and Language Learning Centre Simon Fraser University, B.C. Canada V5A1S6 E-mail: heift@sfu.ca Abstract: This
More informationDerivational: Inflectional: In a fit of rage the soldiers attacked them both that week, but lost the fight.
Final Exam (120 points) Click on the yellow balloons below to see the answers I. Short Answer (32pts) 1. (6) The sentence The kinder teachers made sure that the students comprehended the testable material
More informationPrediction of Maximal Projection for Semantic Role Labeling
Prediction of Maximal Projection for Semantic Role Labeling Weiwei Sun, Zhifang Sui Institute of Computational Linguistics Peking University Beijing, 100871, China {ws, szf}@pku.edu.cn Haifeng Wang Toshiba
More informationSyntax Parsing 1. Grammars and parsing 2. Top-down and bottom-up parsing 3. Chart parsers 4. Bottom-up chart parsing 5. The Earley Algorithm
Syntax Parsing 1. Grammars and parsing 2. Top-down and bottom-up parsing 3. Chart parsers 4. Bottom-up chart parsing 5. The Earley Algorithm syntax: from the Greek syntaxis, meaning setting out together
More informationChunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence.
NLP Lab Session Week 8 October 15, 2014 Noun Phrase Chunking and WordNet in NLTK Getting Started In this lab session, we will work together through a series of small examples using the IDLE window and
More informationBuilding an HPSG-based Indonesian Resource Grammar (INDRA)
Building an HPSG-based Indonesian Resource Grammar (INDRA) David Moeljadi, Francis Bond, Sanghoun Song {D001,fcbond,sanghoun}@ntu.edu.sg Division of Linguistics and Multilingual Studies, Nanyang Technological
More informationConstraining X-Bar: Theta Theory
Constraining X-Bar: Theta Theory Carnie, 2013, chapter 8 Kofi K. Saah 1 Learning objectives Distinguish between thematic relation and theta role. Identify the thematic relations agent, theme, goal, source,
More informationSINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,
More informationParsing of part-of-speech tagged Assamese Texts
IJCSI International Journal of Computer Science Issues, Vol. 6, No. 1, 2009 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814 28 Parsing of part-of-speech tagged Assamese Texts Mirzanur Rahman 1, Sufal
More informationAccurate Unlexicalized Parsing for Modern Hebrew
Accurate Unlexicalized Parsing for Modern Hebrew Reut Tsarfaty and Khalil Sima an Institute for Logic, Language and Computation, University of Amsterdam Plantage Muidergracht 24, 1018TV Amsterdam, The
More informationNational University of Singapore Faculty of Arts and Social Sciences Centre for Language Studies Academic Year 2014/2015 Semester 2
National University of Singapore Faculty of Arts and Social Sciences Centre for Language Studies Academic Year 2014/2015 Semester 2 LAG2201 German 2 Course Outline Course coordinators and lecturers A/P
More informationA Computational Evaluation of Case-Assignment Algorithms
A Computational Evaluation of Case-Assignment Algorithms Miles Calabresi Advisors: Bob Frank and Jim Wood Submitted to the faculty of the Department of Linguistics in partial fulfillment of the requirements
More informationThe Strong Minimalist Thesis and Bounded Optimality
The Strong Minimalist Thesis and Bounded Optimality DRAFT-IN-PROGRESS; SEND COMMENTS TO RICKL@UMICH.EDU Richard L. Lewis Department of Psychology University of Michigan 27 March 2010 1 Purpose of this
More informationChapter 4: Valence & Agreement CSLI Publications
Chapter 4: Valence & Agreement Reminder: Where We Are Simple CFG doesn t allow us to cross-classify categories, e.g., verbs can be grouped by transitivity (deny vs. disappear) or by number (deny vs. denies).
More informationcmp-lg/ Jul 1995
A CONSTRAINT-BASED CASE FRAME LEXICON ARCHITECTURE 1 Introduction Kemal Oazer and Okan Ylmaz Department of Computer Engineering and Information Science Bilkent University Bilkent, Ankara 0, Turkey fko,okang@cs.bilkent.edu.tr
More informationProject in the framework of the AIM-WEST project Annotation of MWEs for translation
Project in the framework of the AIM-WEST project Annotation of MWEs for translation 1 Agnès Tutin LIDILEM/LIG Université Grenoble Alpes 30 october 2014 Outline 2 Why annotate MWEs in corpora? A first experiment
More informationAnnotation Projection for Discourse Connectives
SFB 833 / Univ. Tübingen Penn Discourse Treebank Workshop Annotation projection Basic idea: Given a bitext E/F and annotation for F, how would the annotation look for E? Examples: Word Sense Disambiguation
More information11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation
tatistical Parsing (Following slides are modified from Prof. Raymond Mooney s slides.) tatistical Parsing tatistical parsing uses a probabilistic model of syntax in order to assign probabilities to each
More informationThe Role of the Head in the Interpretation of English Deverbal Compounds
The Role of the Head in the Interpretation of English Deverbal Compounds Gianina Iordăchioaia i, Lonneke van der Plas ii, Glorianna Jagfeld i (Universität Stuttgart i, University of Malta ii ) Wen wurmt
More informationENGBG1 ENGBL1 Campus Linguistics. Meeting 2. Chapter 7 (Morphology) and chapter 9 (Syntax) Pia Sundqvist
Meeting 2 Chapter 7 (Morphology) and chapter 9 (Syntax) Today s agenda Repetition of meeting 1 Mini-lecture on morphology Seminar on chapter 7, worksheet Mini-lecture on syntax Seminar on chapter 9, worksheet
More informationAQUA: An Ontology-Driven Question Answering System
AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.
More informationDevelopment of the First LRs for Macedonian: Current Projects
Development of the First LRs for Macedonian: Current Projects Ruska Ivanovska-Naskova Faculty of Philology- University St. Cyril and Methodius Bul. Krste Petkov Misirkov bb, 1000 Skopje, Macedonia rivanovska@flf.ukim.edu.mk
More informationInformation Status in Generation Ranking
Aoife Cahill nformation Status in Generation Ranking 1 / 57 nformation Status in Generation Ranking Aoife Cahill joint work with Arndt Riester Heidelberg Computational Linguistics Colloquium December 9,
More informationUsing dialogue context to improve parsing performance in dialogue systems
Using dialogue context to improve parsing performance in dialogue systems Ivan Meza-Ruiz and Oliver Lemon School of Informatics, Edinburgh University 2 Buccleuch Place, Edinburgh I.V.Meza-Ruiz@sms.ed.ac.uk,
More informationUniversal Grammar 2. Universal Grammar 1. Forms and functions 1. Universal Grammar 3. Conceptual and surface structure of complex clauses
Universal Grammar 1 evidence : 1. crosslinguistic investigation of properties of languages 2. evidence from language acquisition 3. general cognitive abilities 1. Properties can be reflected in a.) structural
More informationFreitag 7. Januar = QUIZ = REFLEXIVE VERBEN = IM KLASSENZIMMER = JUDD 115
DEUTSCH 3 DIE DEBATTE: GEFÄHRLICHE HAUSTIERE Debatte: Freitag 14. JANUAR, 2011 Bewertung: zwei kleine Prüfungen. Bewertungssystem: (see attached) Thema:Wir haben schon die Geschichte Gefährliche Haustiere
More informationCOMPUTATIONAL COMPLEXITY OF LEFT-ASSOCIATIVE GRAMMAR
COMPUTATIONAL COMPLEXITY OF LEFT-ASSOCIATIVE GRAMMAR ROLAND HAUSSER Institut für Deutsche Philologie Ludwig-Maximilians Universität München München, West Germany 1. CHOICE OF A PRIMITIVE OPERATION The
More informationTowards a MWE-driven A* parsing with LTAGs [WG2,WG3]
Towards a MWE-driven A* parsing with LTAGs [WG2,WG3] Jakub Waszczuk, Agata Savary To cite this version: Jakub Waszczuk, Agata Savary. Towards a MWE-driven A* parsing with LTAGs [WG2,WG3]. PARSEME 6th general
More informationUnderlying and Surface Grammatical Relations in Greek consider
0 Underlying and Surface Grammatical Relations in Greek consider Sentences Brian D. Joseph The Ohio State University Abbreviated Title Grammatical Relations in Greek consider Sentences Brian D. Joseph
More informationWriting a composition
A good composition has three elements: Writing a composition an introduction: A topic sentence which contains the main idea of the paragraph. a body : Supporting sentences that develop the main idea. a
More informationType-driven semantic interpretation and feature dependencies in R-LFG
Type-driven semantic interpretation and feature dependencies in R-LFG Mark Johnson Revision of 23rd August, 1997 1 Introduction This paper describes a new formalization of Lexical-Functional Grammar called
More informationPossessive have and (have) got in New Zealand English Heidi Quinn, University of Canterbury, New Zealand
1 Introduction Possessive have and (have) got in New Zealand English Heidi Quinn, University of Canterbury, New Zealand heidi.quinn@canterbury.ac.nz NWAV 33, Ann Arbor 1 October 24 This paper looks at
More informationELD CELDT 5 EDGE Level C Curriculum Guide LANGUAGE DEVELOPMENT VOCABULARY COMMON WRITING PROJECT. ToolKit
Unit 1 Language Development Express Ideas and Opinions Ask for and Give Information Engage in Discussion ELD CELDT 5 EDGE Level C Curriculum Guide 20132014 Sentences Reflective Essay August 12 th September
More informationUpdate on Soar-based language processing
Update on Soar-based language processing Deryle Lonsdale (and the rest of the BYU NL-Soar Research Group) BYU Linguistics lonz@byu.edu Soar 2006 1 NL-Soar Soar 2006 2 NL-Soar developments Discourse/robotic
More informationLTAG-spinal and the Treebank
LTAG-spinal and the Treebank a new resource for incremental, dependency and semantic parsing Libin Shen (lshen@bbn.com) BBN Technologies, 10 Moulton Street, Cambridge, MA 02138, USA Lucas Champollion (champoll@ling.upenn.edu)
More informationHindi Aspectual Verb Complexes
Hindi Aspectual Verb Complexes HPSG-09 1 Introduction One of the goals of syntax is to termine how much languages do vary, in the hope to be able to make hypothesis about how much natural languages can
More informationArgument structure and theta roles
Argument structure and theta roles Introduction to Syntax, EGG Summer School 2017 András Bárány ab155@soas.ac.uk 26 July 2017 Overview Where we left off Arguments and theta roles Some consequences of theta
More informationGrammars & Parsing, Part 1:
Grammars & Parsing, Part 1: Rules, representations, and transformations- oh my! Sentence VP The teacher Verb gave the lecture 2015-02-12 CS 562/662: Natural Language Processing Game plan for today: Review
More informationParallel Evaluation in Stratal OT * Adam Baker University of Arizona
Parallel Evaluation in Stratal OT * Adam Baker University of Arizona tabaker@u.arizona.edu 1.0. Introduction The model of Stratal OT presented by Kiparsky (forthcoming), has not and will not prove uncontroversial
More information1/20 idea. We ll spend an extra hour on 1/21. based on assigned readings. so you ll be ready to discuss them in class
If we cancel class 1/20 idea We ll spend an extra hour on 1/21 I ll give you a brief writing problem for 1/21 based on assigned readings Jot down your thoughts based on your reading so you ll be ready
More informationThe Interface between Phrasal and Functional Constraints
The Interface between Phrasal and Functional Constraints John T. Maxwell III* Xerox Palo Alto Research Center Ronald M. Kaplan t Xerox Palo Alto Research Center Many modern grammatical formalisms divide
More information! XLE: A First Walkthrough! Robustness techniques! Generation! Disambiguation! Applications: ! Provide detailed syntactic/semantic analyses
XLE: Grammar Development Platform Parser/Generator/Rewrite System ICON 2007 Miriam Butt (Universit( Universität Konstanz) Tracy Holloway King (PARC) Outline! What is a deep grammar and why would you want
More informationSom and Optimality Theory
Som and Optimality Theory This article argues that the difference between English and Norwegian with respect to the presence of a complementizer in embedded subject questions is attributable to a larger
More informationAdvanced Grammar in Use
Advanced Grammar in Use A self-study reference and practice book for advanced learners of English Third Edition with answers and CD-ROM cambridge university press cambridge, new york, melbourne, madrid,
More informationGuidelines for Writing an Internship Report
Guidelines for Writing an Internship Report Master of Commerce (MCOM) Program Bahauddin Zakariya University, Multan Table of Contents Table of Contents... 2 1. Introduction.... 3 2. The Required Components
More informationInleiding Taalkunde. Docent: Paola Monachesi. Blok 4, 2001/ Syntax 2. 2 Phrases and constituent structure 2. 3 A minigrammar of Italian 3
Inleiding Taalkunde Docent: Paola Monachesi Blok 4, 2001/2002 Contents 1 Syntax 2 2 Phrases and constituent structure 2 3 A minigrammar of Italian 3 4 Trees 3 5 Developing an Italian lexicon 4 6 S(emantic)-selection
More informationPhysics 270: Experimental Physics
2017 edition Lab Manual Physics 270 3 Physics 270: Experimental Physics Lecture: Lab: Instructor: Office: Email: Tuesdays, 2 3:50 PM Thursdays, 2 4:50 PM Dr. Uttam Manna 313C Moulton Hall umanna@ilstu.edu
More informationWelcome to the Purdue OWL. Where do I begin? General Strategies. Personalizing Proofreading
Welcome to the Purdue OWL This page is brought to you by the OWL at Purdue (http://owl.english.purdue.edu/). When printing this page, you must include the entire legal notice at bottom. Where do I begin?
More informationLNGT0101 Introduction to Linguistics
LNGT0101 Introduction to Linguistics Lecture #11 Oct 15 th, 2014 Announcements HW3 is now posted. It s due Wed Oct 22 by 5pm. Today is a sociolinguistics talk by Toni Cook at 4:30 at Hillcrest 103. Extra
More informationTHE VERB ARGUMENT BROWSER
THE VERB ARGUMENT BROWSER Bálint Sass sass.balint@itk.ppke.hu Péter Pázmány Catholic University, Budapest, Hungary 11 th International Conference on Text, Speech and Dialog 8-12 September 2008, Brno PREVIEW
More informationParticipate in expanded conversations and respond appropriately to a variety of conversational prompts
Students continue their study of German by further expanding their knowledge of key vocabulary topics and grammar concepts. Students not only begin to comprehend listening and reading passages more fully,
More informationNational Literacy and Numeracy Framework for years 3/4
1. Oracy National Literacy and Numeracy Framework for years 3/4 Speaking Listening Collaboration and discussion Year 3 - Explain information and ideas using relevant vocabulary - Organise what they say
More informationProof Theory for Syntacticians
Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 5, 2012 Logics for Linguistics Many different kinds of logic are directly applicable to formalizing theories in syntax
More informationPAGE(S) WHERE TAUGHT If sub mission ins not a book, cite appropriate location(s))
Ohio Academic Content Standards Grade Level Indicators (Grade 11) A. ACQUISITION OF VOCABULARY Students acquire vocabulary through exposure to language-rich situations, such as reading books and other
More informationDerivational and Inflectional Morphemes in Pak-Pak Language
Derivational and Inflectional Morphemes in Pak-Pak Language Agustina Situmorang and Tima Mariany Arifin ABSTRACT The objectives of this study are to find out the derivational and inflectional morphemes
More informationExploiting Phrasal Lexica and Additional Morpho-syntactic Language Resources for Statistical Machine Translation with Scarce Training Data
Exploiting Phrasal Lexica and Additional Morpho-syntactic Language Resources for Statistical Machine Translation with Scarce Training Data Maja Popović and Hermann Ney Lehrstuhl für Informatik VI, Computer
More informationToday we examine the distribution of infinitival clauses, which can be
Infinitival Clauses Today we examine the distribution of infinitival clauses, which can be a) the subject of a main clause (1) [to vote for oneself] is objectionable (2) It is objectionable to vote for
More informationChapter 9 Banked gap-filling
Chapter 9 Banked gap-filling This testing technique is known as banked gap-filling, because you have to choose the appropriate word from a bank of alternatives. In a banked gap-filling task, similarly
More informationEnsemble Technique Utilization for Indonesian Dependency Parser
Ensemble Technique Utilization for Indonesian Dependency Parser Arief Rahman Institut Teknologi Bandung Indonesia 23516008@std.stei.itb.ac.id Ayu Purwarianti Institut Teknologi Bandung Indonesia ayu@stei.itb.ac.id
More informationIntension, Attitude, and Tense Annotation in a High-Fidelity Semantic Representation
Intension, Attitude, and Tense Annotation in a High-Fidelity Semantic Representation Gene Kim and Lenhart Schubert Presented by: Gene Kim April 2017 Project Overview Project: Annotate a large, topically
More informationWeb as Corpus. Corpus Linguistics. Web as Corpus 1 / 1. Corpus Linguistics. Web as Corpus. web.pl 3 / 1. Sketch Engine. Corpus Linguistics
(L615) Markus Dickinson Department of Linguistics, Indiana University Spring 2013 The web provides new opportunities for gathering data Viable source of disposable corpora, built ad hoc for specific purposes
More informationSemantic Inference at the Lexical-Syntactic Level for Textual Entailment Recognition
Semantic Inference at the Lexical-Syntactic Level for Textual Entailment Recognition Roy Bar-Haim,Ido Dagan, Iddo Greental, Idan Szpektor and Moshe Friedman Computer Science Department, Bar-Ilan University,
More informationMultiple case assignment and the English pseudo-passive *
Multiple case assignment and the English pseudo-passive * Norvin Richards Massachusetts Institute of Technology Previous literature on pseudo-passives (see van Riemsdijk 1978, Chomsky 1981, Hornstein &
More informationHindi-Urdu Phrase Structure Annotation
Hindi-Urdu Phrase Structure Annotation Rajesh Bhatt and Owen Rambow January 12, 2009 1 Design Principle: Minimal Commitments Binary Branching Representations. Mostly lexical projections (P,, AP, AdvP)
More information