Information Status in Generation Ranking


Aoife Cahill (joint work with Arndt Riester)
Heidelberg Computational Linguistics Colloquium, December 9, 2010

Outline
1. Introduction
2. Information Status
3. Approximating Information Status
4. Generation Ranking
5. Predicting Information Status
6. Generation Ranking Revisited
7. Conclusion


Outlining the problem
- German is considered a relatively free word order language (with a rich case system).
- This notion dates from a time when discourse information did not play much of a role in linguistics.
- Our task: generating German strings from LFG f-structures.
- The problem: how do we choose the best string from the many grammatical strings output by the system?

Surface Realisation System
Lexical Functional Grammar f-structure: basic predicate argument structure
"Die Nato werde nicht von der EU geführt."
[Figure: f-structure of the sentence. PRED 'führen<[249:von], [21:Nato]>'; SUBJ 'Nato' (CASE nom, GEND fem, NUM sg, PERS 3; NSYN proper; SPEC DET 'die', DET-TYPE def); OBL-AG 'von<[283:EU]>' (PSEM dir, PTYPE sem) with OBJ 'EU' (CASE dat, GEND fem, NUM sg; NSYN proper; SPEC DET 'die', DET-TYPE def); ADJUNCT 'nicht' (ADJUNCT-TYPE neg); TNS-ASP (MOOD subjunctive, PASS-SEM dynamic_, TENSE pres); TOPIC [21:Nato]; CLAUSE-TYPE decl, PASSIVE +, STMT-TYPE decl, VTYPE main]

Surface Realisation System
A hand-crafted large-scale grammar (Rohrer and Forst, 2006) generates all possible (grammatical) strings.
Strings generated for "NATO is not led by the EU.":
- Die Nato werde von der EU nicht geführt.
- Nicht von der EU geführt werde die Nato.
- Nicht werde die Nato von der EU geführt.
- Nicht geführt werde die Nato von der EU.
- Von der EU werde die Nato nicht geführt.
- Von der EU geführt werde nicht die Nato.
- Geführt werde die Nato nicht von der EU.
- Geführt werde nicht von der EU die Nato.
- Geführt werde von der EU nicht die Nato.
- Die Nato werde nicht von der EU geführt.
- Nicht werde von der EU die Nato geführt.
- Nicht geführt werde von der EU die Nato.
- Von der EU nicht geführt werde die Nato.
- Von der EU werde nicht die Nato geführt.
- Von der EU geführt werde die Nato nicht.
- Geführt werde die Nato von der EU nicht.
- Geführt werde nicht die Nato von der EU.
- Geführt werde von der EU die Nato nicht.


Surface Realisation System (Cahill et al., 2007)
A log-linear ranking model chooses the most likely string.
Linguistically motivated feature types:
1. C-structure: number of NPs, number of children of PP
2. C- & F-structure: SUBJ precedes OBJ
3. Language model: tri-gram score
The model outperforms a basic tri-gram language model, but can be further improved.
Idea: capturing the influence of discourse information can help choose the best string.
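
The ranking step can be sketched roughly as follows. This is a minimal illustration of a log-linear ranker, not the system of Cahill et al. (2007): the feature names, feature values and weights below are invented for the example.

```python
import math

def rank_candidates(candidates, weights):
    """Return (best_string, probabilities) under a log-linear model.

    candidates: dict mapping each candidate string to a dict of feature values.
    weights: dict mapping feature names to (hypothetical) learned weights.
    """
    # Linear score w . f(s) for every candidate string.
    scores = {
        s: sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for s, feats in candidates.items()
    }
    # Softmax over the candidate set (subtract the max for numerical stability).
    top = max(scores.values())
    exps = {s: math.exp(v - top) for s, v in scores.items()}
    z = sum(exps.values())
    probs = {s: e / z for s, e in exps.items()}
    best = max(probs, key=probs.get)
    return best, probs

# Hypothetical feature values and weights, for illustration only.
candidates = {
    "Die Nato werde nicht von der EU geführt.": {"subj_first": 1, "lm_logprob": -12.0},
    "Geführt werde von der EU nicht die Nato.": {"subj_first": 0, "lm_logprob": -19.5},
}
weights = {"subj_first": 0.8, "lm_logprob": 0.3}
best, probs = rank_candidates(candidates, weights)
print(best)
```

Normalising over the candidate set of one f-structure, rather than over all strings, is what makes this a ranking model: only the relative scores of competing realisations matter.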


Information Status (IS) (Prince 1981, 1992)
- Means of discourse analysis
- Classifies (NP/PP/DP) constituents according to their givenness
- IS is marked in prosody (Baumann, 2006; Schweitzer et al., 2009) as well as in syntax
- Corpus of German news texts manually annotated for IS
Advantages with regard to earlier IS work:
- proper treatment of embedded phrases
- higher inter-annotator agreement on difficult texts
- closer to insights from semantic theory (e.g. semantic presuppositions)

IS Labels: Riester, Lorenz, Seemann (2010)
Full: BRIDGING, BRIDGING-CONTAINED, CATAPHOR, EXPLETIVE, GIVEN-EPITHET, GIVEN-PRONOUN, GIVEN-REFLEXIVE, GIVEN-REPEATED, GIVEN-SHORT, INDEF-GENERIC, INDEF-NEW, INDEF-PARTITIVE, INDEF-PARTITIVE-CONTAINED, INDEF-RESUMPTIVE, NULL, RELATIVE, SITUATIVE, UNUSED-KNOWN, UNUSED-TYPE, UNUSED-UNKNOWN
Collapsed: BRIDGING, CATAPHOR, EXPLETIVE, GIVEN, INDEF, NULL, RELATIVE, SITUATIVE, UNUSED

Most Important Classes
label | description | example
GIVEN | coreferential anaphor | Merkel ... sie
BRIDGING | non-coreferential but context dependent expression | Stuttgart ... der Bahnhof
UNUSED-KNOWN | discourse new, familiar definite | der Mond
UNUSED-UNKNOWN | discourse new, unfamiliar definite | das neue Gesetz zur Gesundheitsreform
SITUATIVE | deictic expression | am Dienstag
INDEF | indefinite | einige hundert Menschen

Grammaticality and markedness
Two grammatical sentences ("The army has even been able to recapture smaller territories."):
(1) Die Armee habe sogar kleinere Gebiete zurückerobern können. (ok)
(2) Kleinere Gebiete habe die Armee sogar zurückerobern können. (strongly marked)
A sentence is marked precisely if there are only few or very special contexts in which it is appropriate.

Capturing context
- Information status reflects context to a certain degree.
- IS labels are taken from the corpus:
(3) Die Armee (GIVEN-EPITHET) habe sogar kleinere Gebiete (INDEF-NEW) zurückerobern können.
"The army has even been able to recapture smaller territories."
- The givenness/novelty of an expression characterises the class of contexts in which the expression can occur.
- Compute the preferred order for each pair of IS labels.


Precedence of label pairs within a clause
X before Y (e.g. BRIDGING before UNUSED-UNKNOWN): less prominent order, B
Die Gespräche (BRIDGING) sollen heute in Jerusalem (UNUSED-KNOWN) fortgesetzt werden.
"The talks shall be continued in Jerusalem today."
Occurrences in corpus: 49
Y before X (e.g. UNUSED-UNKNOWN before BRIDGING): dominant order, A
So müsse dies die britische Regierung (UNUSED-KNOWN) den Bürgern (BRIDGING) klarmachen.
"Thus, the British Government should make this clear to the citizens."
Occurrences in corpus: 81
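
Counting such orderings over a labelled corpus could look roughly like this. It is a sketch under the assumption that each clause is available as a list of IS labels in surface order; the function name and the toy input are illustrative, not taken from the talk.

```python
from collections import Counter
from itertools import combinations

def count_label_orders(clauses):
    """Count how often label X precedes label Y within a clause.

    clauses: list of clauses, each a list of IS labels in surface order.
    """
    counts = Counter()
    for labels in clauses:
        # combinations() keeps surface order, so (x, y) means x came first.
        for x, y in combinations(labels, 2):
            if x != y:
                counts[(x, y)] += 1
    return counts

# Toy input: only the labels shown in the two example clauses.
clauses = [
    ["BRIDGING", "UNUSED-KNOWN"],
    ["UNUSED-KNOWN", "BRIDGING"],
]
counts = count_label_orders(clauses)
print(counts)
```

Run over the whole corpus, the two counts for a label pair give the figures A and B used to decide which order is dominant.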

Defining a measure
Asymmetry ratio
[Table: A (dominant order) | B | Asym. ratio B/A | Total]
Compute the asymmetry ratio for each pair of IS labels.
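
As a worked example of the measure, the sketch below applies the ratio B/A to the two corpus counts quoted for the BRIDGING / UNUSED-KNOWN example (49 and 81). Treating "Total" as A + B is an assumption, and the function name is hypothetical.

```python
def asymmetry_ratio(count_xy, count_yx):
    """Return (A, B, B/A, total) for the two orderings of a label pair."""
    a = max(count_xy, count_yx)   # A: count of the dominant order
    b = min(count_xy, count_yx)   # B: count of the less prominent order
    if a == 0:
        raise ValueError("label pair never co-occurs in a clause")
    return a, b, b / a, a + b     # total = A + B is an assumption

# Counts from the example: 49 occurrences of one order, 81 of the other.
a, b, ratio, total = asymmetry_ratio(49, 81)
print(a, b, round(ratio, 2), total)
```

A ratio close to 0 means one order almost always wins, so the pair is a strong ordering cue; a ratio close to 1 means the two orders are about equally frequent.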

Asymmetry tables (top)
[Table columns: dominant order | asym. ratio | freq.]
- UNUSED-KNOWN before CATAPHOR
- GIVEN-REPEATED before UNUSED-TYPE
- GIVEN-PRONOUN before SITUATIVE
- GIVEN-REFLEXIVE before INDEF-NEW
- GIVEN-PRONOUN before CATAPHOR
- GIVEN-PRONOUN before INDEF-NEW
- BRIDGING before INDEF-GENERIC
- GIVEN-SHORT before GIVEN-REPEATED
- GIVEN-PRONOUN before UNUSED-TYPE
- GIVEN-REFLEXIVE before UNUSED-TYPE
- GIVEN-EPITHET before UNUSED-TYPE
- UNUSED-KNOWN before UNUSED-TYPE
- EXPLETIVE before INDEF-NEW
- ...


The crucial problem
IS is an indicator for constituent order, but there is no reliable automatic annotation system for IS.
First attempt (Cahill and Riester, 2009): use morphosyntactic features correlated with IS.


Syntactic Features
We define an inventory of syntactic features that can appear under all IS labels and automatically mark up the corpus with them. The features include:
- is simple definite
- is simple definite description with a possessive modifier
- is definite description with adjectival modifier
- is definite description with a genitive argument
- is definite description with an (obligatory/referentially restricting) PP adjunct
- is definite description including a relative clause
- is definite description including an embedded proper name and (perhaps) a title or job description
- is a combination of position/title and proper name (without article)
- is a bare proper name
- ...

Morphosyntactic correlates of IS
Some IS categories directly derive from syntactic classes (1:1 correspondence):
- GIVEN-REFLEXIVE: is a reflexive pronoun (all items)
- EXPLETIVE: is an expletive, e.g. es (all items)

Morphosyntactic correlates of IS
Some IS categories are represented by various features, e.g. UNUSED-KNOWN:
feature | items | example
is a simple definite | 145 | the moon
is a name with a title | 55 | President Obama
is a bare noun | 54 | Africa
is definite with apposition | 36 | the German Chancellor, Angela Merkel
...

Syntactic Features and IS phrases
Extracting information from the corpus. We have a corpus that is:
- annotated with IS labels
- marked up with syntactic features
For each phrase annotated with an IS label, look at which syntactic features are present.
Collect statistics for each IS label type.


Syntactic Features associated with IS labels
GIVEN-PRONOUN (syntactic feature: count)
- S_PERS_PRON: 88
- S_DA_PRON: 56
- S_DEMON_PRON: 41
- S_GENERIC_PRON: 16
INDEF-NEW (syntactic feature: count)
- S_SIMPLE_INDEF: 203
- S_INDEF_ATTR: 95
- S_INDEF_NUM: 85
- S_INDEF_GENARG: 20
- S_INDEF_PPADJUNCT: 19
- ...

IS asymmetries with syntactic features
[Table columns: Label 1 | Label 2 | Ratio | Freq.; the syntactic features of each label are listed with their counts]
- UNUSED-KNOWN (S_BAREPROPER 166, S_SIMPLE_DEF 102, S_PROPER 85) | CATAPHOR (S_SIMPLE_DEF 14, S_DA_PRON 13)
- GIVEN-REPEATED (S_SIMPLE_DEF 28, S_BAREPROPER 23) | UNUSED-TYPE (S_SIMPLE_DEF 37, S_SIMPLE_INDEF 36)
- GIVEN-PRONOUN (S_PERS_PRON 88, S_DA_PRON 56, S_DEMON_PRON 41, S_GENERIC_PRON 16) | SITUATIVE (S_TEMP_ADV 62, S_SIMPLE_DEF 44, S_DEF_ATTR_ADJUNCT 23, S_SIMPLE_INDEF 19)
- ...


New Features
From each IS asymmetry, extract precedence patterns of the corresponding syntactic features.
GIVEN-PRONOUN: S_PERS_PRON 88, S_DA_PRON 56, S_DEMON_PRON 41, S_GENERIC_PRON 16
SITUATIVE: S_TEMP_ADV 62, S_SIMPLE_DEF 44, S_DEF_ATTR_ADJUNCT 23, S_SIMPLE_INDEF 19
Extracted patterns:
- S_PERS_PRON precedes S_TEMP_ADV
- S_PERS_PRON precedes S_SIMPLE_DEF
- S_PERS_PRON precedes S_DEF_ATTR_ADJUNCT
- S_PERS_PRON precedes S_SIMPLE_INDEF
- S_DA_PRON precedes S_TEMP_ADV
- S_DA_PRON precedes S_SIMPLE_DEF
- S_DA_PRON precedes S_DEF_ATTR_ADJUNCT
- S_DA_PRON precedes S_SIMPLE_INDEF
- S_DEMON_PRON precedes S_TEMP_ADV
- ...
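
The listed patterns follow the order of a cross product between the feature lists of the two labels, which can be sketched as below. That every combination is generated is an assumption (the slide list is truncated), and the function name is hypothetical.

```python
from itertools import product

def precedence_patterns(first_feats, second_feats):
    """Pair every syntactic feature of the first label with every feature of the second."""
    return [f"{a} precedes {b}" for a, b in product(first_feats, second_feats)]

# Feature lists for the asymmetry GIVEN-PRONOUN before SITUATIVE.
given_pronoun = ["S_PERS_PRON", "S_DA_PRON", "S_DEMON_PRON", "S_GENERIC_PRON"]
situative = ["S_TEMP_ADV", "S_SIMPLE_DEF", "S_DEF_ATTR_ADJUNCT", "S_SIMPLE_INDEF"]

patterns = precedence_patterns(given_pronoun, situative)
for pattern in patterns[:5]:
    print(pattern)
```

The point of this step is that the patterns refer only to syntactic features, so they can be checked on any parsed sentence without IS annotation.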

Improved Generation Ranking Model
We include these new features in our SVM model for generation ranking.
Feature types:
1. C-structure: number of NPs, number of children of PP
2. C- & F-structure: SUBJ precedes OBJ
3. Language model: tri-gram score
4. IS asymmetric syntactic patterns: S_PERS_PRON precedes S_TEMP_ADV
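
How a precedence pattern turns into a feature of one candidate string can be sketched as follows. This assumes each candidate comes with the syntactic feature of each of its phrases in surface order; the representation and the function name are illustrative, not the authors' implementation.

```python
def fired_patterns(phrase_feats, patterns):
    """Return the precedence patterns that hold in one candidate string.

    phrase_feats: syntactic feature of each phrase, in surface order.
    patterns: set of (earlier, later) feature pairs used as model features.
    """
    fired = set()
    for i, earlier in enumerate(phrase_feats):
        for later in phrase_feats[i + 1:]:
            if (earlier, later) in patterns:
                fired.add((earlier, later))
    return fired

patterns = {("S_PERS_PRON", "S_TEMP_ADV"), ("S_PERS_PRON", "S_SIMPLE_DEF")}

# Hypothetical mark-up of two candidate orders of the same clause.
candidate_a = ["S_PERS_PRON", "S_TEMP_ADV"]
candidate_b = ["S_TEMP_ADV", "S_PERS_PRON"]

print(fired_patterns(candidate_a, patterns))
print(fired_patterns(candidate_b, patterns))
```

Two realisations of the same f-structure contain the same phrases, so they differ only in which patterns fire; that difference is what the ranker can learn to reward.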


System Overview
[Figure: system overview diagram. Boxes: Machine Translation, Sentence Condensation, Summarisation; LFG F-Structure; Grammar; All Strings; Corpus Sentences; Language Model Features; Linguistically Motivated Features; IS features?; Ranking Model; Best String]

Experimental Setup
- Train the SVM ranking model on 7161 syntactically annotated sentences from TIGER.
- Tune model parameters on a development set of 55 sentences.
- Carry out the final evaluation on a test set of 260 sentences.


Results
- Evaluation on 260 sentences
- BLEU measures string similarity using n-grams
- Slightly different from Cahill and Riester (2009): uses SVM rank instead of a log-linear model, and asymmetries are calculated from more data, but the features are the same
[Table: BLEU and Exact Match (%) for the Baseline and IS Approx models]
Statistically significant improvement with the model including the new IS-inspired syntactic features.
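
The two metrics can be illustrated with a small sketch. Exact match is computed as defined; for BLEU only one building block is shown, the clipped n-gram precision, whereas full BLEU combines precisions for several n-gram orders and applies a brevity penalty. The function names are hypothetical, and the strings are the gold and baseline realisations from the talk's example sentences.

```python
from collections import Counter

def exact_match_rate(outputs, references):
    """Percentage of outputs identical to their reference string."""
    hits = sum(o == r for o, r in zip(outputs, references))
    return 100.0 * hits / len(references)

def ngram_precision(output, reference, n=2):
    """Clipped n-gram precision of one output against one reference."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    out, ref = ngrams(output.split()), ngrams(reference.split())
    total = sum(out.values())
    if total == 0:
        return 0.0
    # Each output n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(c, ref[g]) for g, c in out.items())
    return matched / total

gold = "Man hat aus der Affäre gelernt."
baseline = "Aus der Affäre hat man gelernt."

print(exact_match_rate([baseline], [gold]))
print(ngram_precision(baseline, gold))
```

Because the comparison here is case-sensitive and tokenised on whitespace, a reordered sentence shares few bigrams with the gold string even though it uses the same words.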


Example Sentences
"We have learnt from the scandal"
Gold: Man hat aus der Affäre gelernt. (One has from the scandal learnt.)
Baseline: Aus der Affäre hat man gelernt. (From the scandal has one learnt.)
New: Man hat aus der Affäre gelernt. (One has from the scandal learnt.)


Predicting Information Status?
- We showed that, for realisation ranking, approximating the information status labels through their morpho-syntactic features helped.
- But what if we could automatically label raw text with information status labels?

Supervised Learning Task
- Given: a corpus of manually annotated radio news (3454 sentences)
- Remove duplicates
- Divide into 10% development (129 sentences) and 90% training/test (1169 sentences)
- Parse with the XLE German grammar
- Task: sequence labelling
- Model: Conditional Random Field
- Features designed to capture the basic geometry of the expressions

Capturing the Geometry of Expressions
[Figures: example expressions labelled SITUATIVE, GIVEN-SHORT, GIVEN-PRONOUN, BRIDGING-CONTAINED and UNUSED-UNKNOWN]

Model Features
Starting point: morpho-syntactic features from previous work
Things we count:
- Words
- Specific syntactic categories: DP, NP, DP-APPOSS, LABELP, NAMEP, YEAR, A-CARD
- Children of the top category
- Maximum path length from top node to POS tags
- N-ary branching nodes (n > 1)

Model Features
Binary features:
- Coordination
- Coreferent
- More than 1 DP and NP
- Pronoun
- First/last label in the sentence
Other features:
- Determiner type (definite, indefinite, unknown)
- Syntactic category of the top-most node dominating the string
- Syntactic function of the substring
- POS tag at left/right edge of the substring
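
Some of the count-based features can be sketched on a toy tree. This is an illustration only: the nested-tuple tree format, the feature names and the example parse are assumptions, not the XLE output format or the authors' feature extractor.

```python
def geometry_features(tree):
    """Count-based features for a phrase given as a nested (label, content) tree.

    A leaf is (pos_tag, word); an inner node is (category, [child, ...]).
    """
    feats = {"words": 0, "branching_nodes": 0, "max_depth": 0}

    def walk(node, depth):
        label, content = node
        if isinstance(content, str):           # POS tag over a word
            feats["words"] += 1
            feats["max_depth"] = max(feats["max_depth"], depth)
            return
        if len(content) > 1:                   # n-ary branching node, n > 1
            feats["branching_nodes"] += 1
        for child in content:
            walk(child, depth + 1)

    walk(tree, 0)
    top_content = tree[1]
    # Number of children of the top category (0 if the phrase is a single leaf).
    feats["top_children"] = 0 if isinstance(top_content, str) else len(top_content)
    return feats

# Hypothetical parse of "der Mond"; category names are illustrative.
tree = ("DP", [("D", "der"), ("NP", [("N", "Mond")])])
feats = geometry_features(tree)
print(feats)
```

Features of this kind describe the shape of a phrase rather than its words, which lets the labeller separate, for instance, short pronominal phrases from long embedded descriptions.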


Evaluation
- Carry out 10-fold cross validation on our test/train data (1169 sentences, 3705 labels)
- Evaluate on both sets of labels: full (20) and collapsed (9)
Three baselines:
1. Random: randomly assign a label to each phrase
2. Most Frequent: always assign the most frequent label to each phrase
3. Informed: assign the most frequent label, given the morpho-syntactic features from previous experiments
[Table: accuracy (%) on the full and collapsed label sets for the Random, Most Frequent and Informed baselines]
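
Baselines 2 and 3 can be sketched as simple lookup predictors. The training pairs below are invented for illustration, and falling back to the globally most frequent label for unseen features is an assumption rather than something stated in the talk.

```python
from collections import Counter, defaultdict

def most_frequent_baseline(train_pairs):
    """Baseline 2: always predict the globally most frequent label."""
    label = Counter(lab for _, lab in train_pairs).most_common(1)[0][0]
    return lambda feature: label

def informed_baseline(train_pairs):
    """Baseline 3: predict the most frequent label for each syntactic feature."""
    by_feature = defaultdict(Counter)
    for feature, label in train_pairs:
        by_feature[feature][label] += 1
    table = {f: c.most_common(1)[0][0] for f, c in by_feature.items()}
    # Assumption: unseen features fall back to the globally most frequent label.
    fallback = Counter(lab for _, lab in train_pairs).most_common(1)[0][0]
    return lambda feature: table.get(feature, fallback)

# Toy (syntactic feature, IS label) pairs, invented for illustration.
train = [
    ("S_PERS_PRON", "GIVEN-PRONOUN"),
    ("S_PERS_PRON", "GIVEN-PRONOUN"),
    ("S_DA_PRON", "GIVEN-PRONOUN"),
    ("S_SIMPLE_INDEF", "INDEF-NEW"),
    ("S_SIMPLE_DEF", "UNUSED-KNOWN"),
    ("S_SIMPLE_DEF", "UNUSED-KNOWN"),
    ("S_SIMPLE_DEF", "BRIDGING"),
]
predict_mf = most_frequent_baseline(train)
predict_informed = informed_baseline(train)
print(predict_mf("S_SIMPLE_DEF"), predict_informed("S_SIMPLE_DEF"))
```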

CRF Model Prediction Results
[Table: accuracy (%) on the full and collapsed label sets for Random, Most Frequent, Informed and CRF]
The CRF increases accuracy on the full label set and gives a 16.39% increase in accuracy on the collapsed set.

Detailed CRF Prediction Results
[Table: total, precision, recall and F-score for each collapsed label: BRIDGING, CATAPHOR, EXPLETIVE, GIVEN, INDEF, NULL, RELATIVE, SITUATIVE, UNUSED]
High-level prediction could be used to suggest possible labels to annotators and possibly speed up the manual annotation process.
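
Per-label precision, recall and F-score follow the standard definitions and can be computed as in this sketch. The gold and predicted label lists are toy data, and the function name is hypothetical.

```python
from collections import Counter

def per_label_scores(gold, predicted):
    """Return {label: (precision, recall, f_score)} for parallel label lists."""
    tp, pred_n, gold_n = Counter(), Counter(predicted), Counter(gold)
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
    scores = {}
    for label in set(gold) | set(predicted):
        precision = tp[label] / pred_n[label] if pred_n[label] else 0.0
        recall = tp[label] / gold_n[label] if gold_n[label] else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f_score)
    return scores

# Toy labels from the collapsed set, invented for illustration.
gold = ["GIVEN", "GIVEN", "INDEF", "UNUSED"]
pred = ["GIVEN", "INDEF", "INDEF", "UNUSED"]
scores = per_label_scores(gold, pred)
for label in sorted(scores):
    print(label, scores[label])
```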


Detailed CRF Prediction Results
[Table: total, precision, recall and F-score for each full label: BRIDGING, BRIDGING-CONTAINED, CATAPHOR, EXPLETIVE, GIVEN-EPITHET, GIVEN-PRONOUN, GIVEN-REFLEXIVE, GIVEN-REPEATED, GIVEN-SHORT, INDEF-GENERIC, INDEF-NEW, INDEF-PARTITIVE, INDEF-PARTITIVE-CONTAINED, INDEF-RESUMPTIVE, NULL, RELATIVE, SITUATIVE, UNUSED-KNOWN, UNUSED-TYPE, UNUSED-UNKNOWN]


Confusion Matrix (Human Annotators), Riester, Lorenz, Seemann (2010)
[Table: confusion matrix over the 20 full labels, rows and columns A to T]

Confusion Matrix (Automatic System)
[Table: confusion matrix over the 20 full labels, rows and columns A to T]
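
A confusion matrix of this kind is a count of (gold label, predicted label) pairs, as in the sketch below. The label lists are toy data invented for illustration and the function names are hypothetical.

```python
from collections import Counter

def confusion_counts(gold, predicted):
    """Count (gold label, predicted label) pairs over parallel label lists."""
    return Counter(zip(gold, predicted))

def most_confused(counts, top=3):
    """Return the most frequent off-diagonal (gold, predicted) pairs."""
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return sorted(errors.items(), key=lambda item: (-item[1], item[0]))[:top]

# Toy labels, invented for illustration.
gold = ["BRIDGING", "BRIDGING", "UNUSED-KNOWN", "INDEF-NEW", "BRIDGING"]
pred = ["UNUSED-KNOWN", "BRIDGING", "UNUSED-KNOWN", "INDEF-GENERIC", "UNUSED-KNOWN"]

counts = confusion_counts(gold, pred)
print(most_confused(counts))
```

Building the same counts from two human annotations instead of gold and system labels makes it possible to compare system errors with annotator disagreements.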


Confusion Matrix
[Figure: excerpts of the confusion matrix highlighting the confused label pairs below]

Confusing BRIDGING with UNUSED-KNOWN
Human annotators have the same confusion 5/89 times.
(4) Die Behörden gaben eine Tsunami-Warnung für die Westküste heraus.
Gloss: The authorities gave a Tsunami-warning for the west coast out.
"The authorities gave a Tsunami-warning for the west coast"

Confusing INDEF-NEW with INDEF-GENERIC
Human annotators have the same confusion 20/144 times.
(5) Nach Angaben japanischer Medien kam ein Mensch ums Leben, viele Einwohner wurden verletzt.
Gloss: According to reports Japanese media came a person for life, many inhabitants were injured.
"According to Japanese media reports, one person died, many inhabitants were injured"

Confusing UNUSED-KNOWN with UNUSED-UNKNOWN
Human annotators have the same confusion 7/134 times.
(6) Der Kölner Erzbischof Meisner kritisiert die Familienpolitik der Bundesregierung.
Gloss: The Cologne Archbishop Meisner criticised the family politics of the federal government.
"The Archbishop of Cologne, Meisner, criticised the family policies of the federal government"


Addressing our underlying assumptions
1. Gold-standard co-reference information (D-GIVEN)
2. Gold-standard markables
- Real-world applications will not have access to this information.
- Test two automatic co-reference systems on the data.
[Table: accuracy (%) on the full and collapsed label sets with co-reference information from: Gold, None, Simple, Unsupervised]


Summary of Automatic IS Label Prediction
- Trained a CRF on manually annotated text.
- Results are high for the collapsed label set (81.65%) and well above baseline for the full label set (64.87%).
- Often the mistakes made by the automatic system are similar to the disagreements that human annotators have.
Q: How useful is it in practice?


An application for IS Label Prediction
- Revisit our earlier realisation ranking experiments.
- No need to use approximations of IS labels any more.
- Train the CRF on 1169 sentences of the manually annotated corpus (test/train).
- Automatically assign an IS label to every DP/NP in our TIGER training data (21,341 phrases).
- Extract IS label order patterns directly.

Even Newer Generation Ranking Model
We now include the IS label asymmetric patterns directly in the SVM ranking model.
Feature types:
1. C-structure: number of NPs, number of children of PP
2. C- & F-structure: SUBJ precedes OBJ
3. Language model: tri-gram score
4. IS asymmetric syntactic patterns: S_PERS_PRON precedes S_TEMP_ADV
4. IS label asymmetric patterns: D-GIVEN-SHORT precedes INDEF-NEW

Evaluation
Evaluate on 260 sentences.
[Table: BLEU and Exact Match (%) for Baseline, IS Approx, IS Label (full) and IS Label (collapsed)]
The difference between the IS Label (full) model and all other models is statistically significant.


Sample Improvement
(7) Im September forderten Demonstranten den Abzug der auf der Insel stationierten US-Soldaten.
Gloss: in September demanded 85,000 demonstrators the withdrawal of the 29,000 on the island stationed US soldiers.
"85,000 demonstrators demanded the withdrawal of the 29,000 US soldiers that were stationed on the island"
IS Approximations: Demonstranten forderten den Abzug der auf der Insel stationierten US-Soldaten im September.
IS Labels: Im September forderten Demonstranten den Abzug der auf der Insel stationierten US-Soldaten.


Conclusions
- We have shown that a realisation ranking system can benefit from information status.
- Approximating the information status markup using morpho-syntactic features works well.
- Using automatically assigned information status labels works better.
- We trained a CRF model to automatically predict an IS label for a phrase, given its parse.
- Prediction quality on a subset of more general labels is high (81.65%), and for the full label set it is well above the informed baseline (64.87%).

Outstanding Issues and Future Directions
- Investigate the integration of lexical (and other) resources to improve the classification of certain phrases.
- Currently we still only consider single sentences. Future work will also look at preceding context.
- Look into carrying out an experiment with human annotators, automatically suggesting labels for them.
- Continue working with colleagues to improve the automatic co-reference detection for our purposes, and also apply it to the TIGER training corpus.
- Investigate other parsers during feature extraction for the IS label prediction model.

Thank you!
This work was funded by the Collaborative Research Centre (SFB 732) at the University of Stuttgart.
