Toward a methodology of chunking: applications and extensions of Linear Unit Grammar Ray Carey University of Helsinki
|
|
- Scot Armstrong
- 5 years ago
- Views:
Transcription
1 Toward a methodology of chunking: applications and extensions of Linear Unit Grammar Ray Carey University of Helsinki
2 What & why ELFA corpus (2008, English as a Lingua Franca in Academic Settings 1 million words of naturally occurring academic ELF corpus-based fluency research (cf. Götz 2013) chunking as descriptive methodology chunk-based patterns of ordinary dysfluency description of (dys)fluency features among L2 users implications for language testing?
3 What kind of chunking? chunking as storage / production Ellis 2002, Bybee 2010 chunking as processing / parsing Hunston & Francis 2000, Mason 2007 Linear Unit Grammar (LUG) Sinclair & Mauranen 2006 oriented toward processing, with implications for storage robust parsing of spoken and written text
4 LUG: core concepts chunking is intuitive & pre-theoretical: prospection (suspension) completion (extension) Brazil 1995, A Grammar of Speech different analysts will have different intuitions much of LUG chunking is routinized & easily learned labeling chunks is systematic: organising (O) or incrementing message (M) prospection (M-) / completion (+M) linear analysis: what precedes & what likely follows? neither dependent on nor incompatible with traditional grammatical categories
5 Studies incorporating LUG Describing the LUG model Sinclair & Mauranen 2006; Mauranen 2013 Organizing chunks in ELF Mauranen 2009, Carey 2013 LUG and metadiscourse Smart 2014 LUG & discourse markers Huang 2013 LUG-based model of turn-taking & interruption Carey 2011 Applying LUG to literary analysis Stone 2011 Tone unit boundaries & LUG chunking boundaries Cheng, Greaves, Warren
6 Linear, real-time processing whatidefendisthater(.)wecanergivethe mtheermtheknowledgethisspecifickno wledgetheytheywant
7 Words are chunks too what i defend is that er (.) we can er give them the erm the knowledge this specific knowledge they they want
8 Provisional Unit Boundaries what i defend is that er (.) we can er give them the erm the knowledge this specific knowledge they they want
9 M: message-oriented O: organisation-oriented what i defend is that er (.) we can er give them the erm the knowledge this specific knowledge they they want
10 M: message-oriented O: organisation-oriented M what i defend is O that O er O (.) M we can O er M O M M M M give them the erm the knowledge this specific knowledge they they want
11 OT: text organising OI: interaction organising M what i defend is OT that OI er OI (.) M we can OI er M OI M M M M give them the erm the knowledge this specific knowledge they they want
12 5 types of M (message) chunks M- what i defend is OT that OI er OI (.) +M- we can OI er +MA give them the OI erm +M the knowledge MR this specific knowledge MF they MS they want MA = Message Adjustment / MF = Message Fragment MR = Message Revision / MS = Message Supplement
13 Toward LUG-annotated corpora Smart 2014: first LUG-annotated written corpus 40,000 words of IMDb message board discussions Two major challenges: consistency in judgments (systematization) takes forever to do manual annotation Partial automation no substitute for human intuition advantage of provisional annotations
14 Step 1: find the O chunks Following Sinclair & Mauranen 2006: O chunks are fixed & recurring, easy to find good place to start placing boundaries Data-driven lists of O chunks ELFA corpus n-grams: 2-6 words manually examining concordance lines single units: er, and, but, yeah, erm, mhm bigrams: high-frequency chunk boundaries
15 Step 2: guess the M chunks Message Fragments (MF) preliminary boundaries at repeats, false starts (wha-) Shallow noun phrase chunking: remaining data is part-of-speech tagged by TreeTagger Natural Language Tool Kit: regular expression chunker Bird, Loper & Klein 2009 data-driven regex patterns (based on 24 min. of data) Shallow NP combined with surrounding words, rough guesses at M chunk labels
16 Human analysis in XML Preliminary chunker outputs data in XML format data is overchunked by design: easier to merge than create new annotations XML is formatted using cascading stylesheet (CSS) analyzed & revised in Oxygen XML editor chunk boundaries are merged, altered chunk labels are assigned through drop-down menus
17 Working with XML through CSS
18 Measuring chunker accuracy Each token treated as an observation ends with chunk boundary or no boundary chunker output compared to gold standard true/false positives, true/false negatives calculated After 32,000 words (3.5 hours of data) accuracy: 87% (% of observations in agreement) recall: 92% (% of gold boundaries predicted) precision: 80% (% of predicted boundaries correct)
19 Is LUG reproducible among different analysts? inter-rater reliability test trained research assistant using 1400 words of text developed set of chunking guidelines (<2 pages) independently chunked 5210 words of spoken text 2k words from ELFA corpus, 3k words from Hong Kong Corpus of Spoken English (prosodic) Cohen s Kappa = >.81 almost perfect (Landis & Koch 1977) Smart 2014: test with 288 words, Kappa=
20 Relation between chunks and tone units? Hong Kong Corpus of Spoken English (prosodic) HKCSE: Cheng, Greaves, Warren ,000 words of spoken English (L1/L2), annotated for tone unit boundaries 3135 words for inter-rater chunking (1108 TUBs) How well does LUG predict tone unit boundaries? accuracy: 80% (% of observations in agreement) recall: 70% (% tone unit boundaries predicted) precision: 73% (% of predicted boundaries correct)
21 Conclusions LUG is a robust descriptive model of language can handle transcriptions of challenging spoken data judgments can be regularised and systematised good potential for inter-rater reproducibility supported by intonation patterns automatisation is within reach Ideal for corpus-based studies of (dys)fluency Message Fragments (MF), Message Adjustments (MA) inter-speaker distribution of (dys)fluency features
22 elfaproject.wordpress.com Bird, S., Loper, E. & Klein, E. (2009) Natural Language Processing with Python. O Reilly Media Inc. Brazil, D. (1995) A Grammar of Speech. Describing English Language. Oxford: OUP. Bybee, J. (2010) Language, Usage and Cognition. Cambridge: CUP. Carey, R. (2011) Interruption and Uncooperativeness in Academic ELF Group Work: An Application of Linear Unit Grammar. MA thesis, University of Helsinki, Finland. Carey, R. (2013) On the other side: formulaic organizing chunks in spoken and written academic ELF. Journal of English as a Lingua Franca, 2(2), Cheng, W., Greaves, C. & Warren, M. (2008) A corpus-driven study of discourse intonation: the Hong Kong corpus of spoken English (prosodic). Studies in Corpus Linguistics, vol. 32. Amsterdam: John Benjamins. ELFA (2008) The Corpus of English as a Lingua Franca in Academic Settings. Director: Anna Mauranen. Ellis, N.C. (2002) Frequency effects in language processing. Studies in second language acquisition, 24(2), Götz, S. (2013) Fluency in native and non-native English speech. Studies in Corpus Linguistics, vol. 53. Amsterdam: John Benjamins
23 elfaproject.wordpress.com Huang, Lan-fen (2013) The use of Linear Unit Grammar (LUG) in the investigation of discourse markers in spoken English. International Journal of Language Studies, 7(3), Hunston, S. & Francis, G. (2000) Pattern Grammar: a corpus-driven approach to the lexical grammar of English. Studies in Corpus Linguistics, vol. 4. Amsterdam: John Benjamins. Mason, O. (2007) From lexis to syntax: the use of multi-word units in grammatical description. Proceedings of the 26th Intl. Conference on Lexis and Grammar, Bonifacio, 2-6 October Landis, J.R. & Koch, G.G. (1977) The measurement of observer agreement for categorical data. Biometrics, 33(1), Mauranen, A. (2009) Chunking in ELF: Expressions for managing interaction. Intercultural Pragmatics, 6(2), Mauranen, A. (2013) Linear Unit Grammar. In The Encyclopedia of Applied Linguistics, Chapelle, C.A. (ed.). Oxford: Wiley-Blackwell. Sinclair, J.McH. & Mauranen, A. (2006) Linear Unit Grammar: Integrating Speech and Writing. Studies in Corpus Linguistics, vol. 25. Amsterdam: John Benjamins. Smart, C. (2014) The role of discourse reflexivity in a linear description of grammar and discourse: the case of IMDb message boards. Unpublished doctoral dissertation, University of Birmingham. Stone, L. (2011) Grammatical patterning in literary texts and Linear Unit Grammar in the dialogue of Hills like White Elephants by Ernest Hemingway. Innervate, 4 ( ),
Procedia - Social and Behavioral Sciences 154 ( 2014 )
Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 154 ( 2014 ) 263 267 THE XXV ANNUAL INTERNATIONAL ACADEMIC CONFERENCE, LANGUAGE AND CULTURE, 20-22 October
More informationChunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence.
NLP Lab Session Week 8 October 15, 2014 Noun Phrase Chunking and WordNet in NLTK Getting Started In this lab session, we will work together through a series of small examples using the IDLE window and
More informationReview in ICAME Journal, Volume 38, 2014, DOI: /icame
Review in ICAME Journal, Volume 38, 2014, DOI: 10.2478/icame-2014-0012 Gaëtanelle Gilquin and Sylvie De Cock (eds.). Errors and disfluencies in spoken corpora. Amsterdam: John Benjamins. 2013. 172 pp.
More informationThe Internet as a Normative Corpus: Grammar Checking with a Search Engine
The Internet as a Normative Corpus: Grammar Checking with a Search Engine Jonas Sjöbergh KTH Nada SE-100 44 Stockholm, Sweden jsh@nada.kth.se Abstract In this paper some methods using the Internet as a
More informationCorpus Linguistics (L615)
(L615) Basics of Markus Dickinson Department of, Indiana University Spring 2013 1 / 23 : the extent to which a sample includes the full range of variability in a population distinguishes corpora from archives
More informationCandidates must achieve a grade of at least C2 level in each examination in order to achieve the overall qualification at C2 Level.
The Test of Interactive English, C2 Level Qualification Structure The Test of Interactive English consists of two units: Unit Name English English Each Unit is assessed via a separate examination, set,
More informationEdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar
EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar Chung-Chi Huang Mei-Hua Chen Shih-Ting Huang Jason S. Chang Institute of Information Systems and Applications, National Tsing Hua University,
More informationDOES RETELLING TECHNIQUE IMPROVE SPEAKING FLUENCY?
DOES RETELLING TECHNIQUE IMPROVE SPEAKING FLUENCY? Noor Rachmawaty (itaw75123@yahoo.com) Istanti Hermagustiana (dulcemaria_81@yahoo.com) Universitas Mulawarman, Indonesia Abstract: This paper is based
More informationEnglish Language and Applied Linguistics. Module Descriptions 2017/18
English Language and Applied Linguistics Module Descriptions 2017/18 Level I (i.e. 2 nd Yr.) Modules Please be aware that all modules are subject to availability. If you have any questions about the modules,
More informationCEFR Overall Illustrative English Proficiency Scales
CEFR Overall Illustrative English Proficiency s CEFR CEFR OVERALL ORAL PRODUCTION Has a good command of idiomatic expressions and colloquialisms with awareness of connotative levels of meaning. Can convey
More informationIntra-talker Variation: Audience Design Factors Affecting Lexical Selections
Tyler Perrachione LING 451-0 Proseminar in Sound Structure Prof. A. Bradlow 17 March 2006 Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Abstract Although the acoustic and
More informationUsing dialogue context to improve parsing performance in dialogue systems
Using dialogue context to improve parsing performance in dialogue systems Ivan Meza-Ruiz and Oliver Lemon School of Informatics, Edinburgh University 2 Buccleuch Place, Edinburgh I.V.Meza-Ruiz@sms.ed.ac.uk,
More informationProcedia - Social and Behavioral Sciences 141 ( 2014 ) WCLTA Using Corpus Linguistics in the Development of Writing
Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 141 ( 2014 ) 124 128 WCLTA 2013 Using Corpus Linguistics in the Development of Writing Blanka Frydrychova
More informationAN INTRODUCTION (2 ND ED.) (LONDON, BLOOMSBURY ACADEMIC PP. VI, 282)
B. PALTRIDGE, DISCOURSE ANALYSIS: AN INTRODUCTION (2 ND ED.) (LONDON, BLOOMSBURY ACADEMIC. 2012. PP. VI, 282) Review by Glenda Shopen _ This book is a revised edition of the author s 2006 introductory
More informationParsing of part-of-speech tagged Assamese Texts
IJCSI International Journal of Computer Science Issues, Vol. 6, No. 1, 2009 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814 28 Parsing of part-of-speech tagged Assamese Texts Mirzanur Rahman 1, Sufal
More informationMandarin Lexical Tone Recognition: The Gating Paradigm
Kansas Working Papers in Linguistics, Vol. 0 (008), p. 8 Abstract Mandarin Lexical Tone Recognition: The Gating Paradigm Yuwen Lai and Jie Zhang University of Kansas Research on spoken word recognition
More informationLinguistic Variation across Sports Category of Press Reportage from British Newspapers: a Diachronic Multidimensional Analysis
International Journal of Arts Humanities and Social Sciences (IJAHSS) Volume 1 Issue 1 ǁ August 216. www.ijahss.com Linguistic Variation across Sports Category of Press Reportage from British Newspapers:
More informationDialog Act Classification Using N-Gram Algorithms
Dialog Act Classification Using N-Gram Algorithms Max Louwerse and Scott Crossley Institute for Intelligent Systems University of Memphis {max, scrossley } @ mail.psyc.memphis.edu Abstract Speech act classification
More informationIntroduction to HPSG. Introduction. Historical Overview. The HPSG architecture. Signature. Linguistic Objects. Descriptions.
to as a linguistic theory to to a member of the family of linguistic frameworks that are called generative grammars a grammar which is formalized to a high degree and thus makes exact predictions about
More informationThink A F R I C A when assessing speaking. C.E.F.R. Oral Assessment Criteria. Think A F R I C A - 1 -
C.E.F.R. Oral Assessment Criteria Think A F R I C A - 1 - 1. The extracts in the left hand column are taken from the official descriptors of the CEFR levels. How would you grade them on a scale of low,
More informationFormulaic Language and Fluency: ESL Teaching Applications
Formulaic Language and Fluency: ESL Teaching Applications Formulaic Language Terminology Formulaic sequence One such item Formulaic language Non-count noun referring to these items Phraseology The study
More informationSpoken English, TESOL and Applied Linguistics
Spoken English, TESOL and Applied Linguistics Also by Rebecca Hughes ENGLISH IN SPEECH AND WRITING: Investigating Language and Literature EXPLORING GRAMMAR IN CONTEXT (co-author) TEACHING AND RESEARCHING
More informationSINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,
More information2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases
POS Tagging Problem Part-of-Speech Tagging L545 Spring 203 Given a sentence W Wn and a tagset of lexical categories, find the most likely tag T..Tn for each word in the sentence Example Secretariat/P is/vbz
More informationCross Language Information Retrieval
Cross Language Information Retrieval RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Acknowledgment.............................................
More informationTo appear in The TESOL encyclopedia of ELT (Wiley-Blackwell) 1 RECASTING. Kazuya Saito. Birkbeck, University of London
To appear in The TESOL encyclopedia of ELT (Wiley-Blackwell) 1 RECASTING Kazuya Saito Birkbeck, University of London Abstract Among the many corrective feedback techniques at ESL/EFL teachers' disposal,
More informationAtypical Prosodic Structure as an Indicator of Reading Level and Text Difficulty
Atypical Prosodic Structure as an Indicator of Reading Level and Text Difficulty Julie Medero and Mari Ostendorf Electrical Engineering Department University of Washington Seattle, WA 98195 USA {jmedero,ostendor}@uw.edu
More informationA Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many
Schmidt 1 Eric Schmidt Prof. Suzanne Flynn Linguistic Study of Bilingualism December 13, 2013 A Minimalist Approach to Code-Switching In the field of linguistics, the topic of bilingualism is a broad one.
More informationSome Principles of Automated Natural Language Information Extraction
Some Principles of Automated Natural Language Information Extraction Gregers Koch Department of Computer Science, Copenhagen University DIKU, Universitetsparken 1, DK-2100 Copenhagen, Denmark Abstract
More informationAdvanced Grammar in Use
Advanced Grammar in Use A self-study reference and practice book for advanced learners of English Third Edition with answers and CD-ROM cambridge university press cambridge, new york, melbourne, madrid,
More informationThe Ups and Downs of Preposition Error Detection in ESL Writing
The Ups and Downs of Preposition Error Detection in ESL Writing Joel R. Tetreault Educational Testing Service 660 Rosedale Road Princeton, NJ, USA JTetreault@ets.org Martin Chodorow Hunter College of CUNY
More informationThe Effect of Discourse Markers on the Speaking Production of EFL Students. Iman Moradimanesh
The Effect of Discourse Markers on the Speaking Production of EFL Students Iman Moradimanesh Abstract The research aimed at investigating the relationship between discourse markers (DMs) and a special
More informationLEXICAL COHESION ANALYSIS OF THE ARTICLE WHAT IS A GOOD RESEARCH PROJECT? BY BRIAN PALTRIDGE A JOURNAL ARTICLE
LEXICAL COHESION ANALYSIS OF THE ARTICLE WHAT IS A GOOD RESEARCH PROJECT? BY BRIAN PALTRIDGE A JOURNAL ARTICLE Submitted in partial fulfillment of the requirements for the degree of Sarjana Sastra (S.S.)
More informationLanguage Acquisition Fall 2010/Winter Lexical Categories. Afra Alishahi, Heiner Drenhaus
Language Acquisition Fall 2010/Winter 2011 Lexical Categories Afra Alishahi, Heiner Drenhaus Computational Linguistics and Phonetics Saarland University Children s Sensitivity to Lexical Categories Look,
More informationLexical Collocations (Verb + Noun) Across Written Academic Genres In English
Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 182 ( 2015 ) 433 440 4th WORLD CONFERENCE ON EDUCATIONAL TECHNOLOGY RESEARCHES, WCETR- 2014 Lexical Collocations
More informationInternational Conference on Current Trends in ELT
Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Scien ce s 98 ( 2014 ) 52 59 International Conference on Current Trends in ELT Pragmatic Aspects of English for
More informationIndividual Component Checklist L I S T E N I N G. for use with ONE task ENGLISH VERSION
L I S T E N I N G Individual Component Checklist for use with ONE task ENGLISH VERSION INTRODUCTION This checklist has been designed for use as a practical tool for describing ONE TASK in a test of listening.
More informationVocabulary Usage and Intelligibility in Learner Language
Vocabulary Usage and Intelligibility in Learner Language Emi Izumi, 1 Kiyotaka Uchimoto 1 and Hitoshi Isahara 1 1. Introduction In verbal communication, the primary purpose of which is to convey and understand
More informationCreate Quiz Questions
You can create quiz questions within Moodle. Questions are created from the Question bank screen. You will also be able to categorize questions and add them to the quiz body. You can crate multiple-choice,
More informationCS Machine Learning
CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More informationLessons from a Massive Open Online Course (MOOC) on Natural Language Processing for Digital Humanities
Lessons from a Massive Open Online Course (MOOC) on Natural Language Processing for Digital Humanities Simon Clematide, Isabel Meraner, Noah Bubenhofer, Martin Volk Institute of Computational Linguistics
More informationCONTENUTI DEL CORSO (presentazione di disciplina, argomenti, programma):
1 DOCENTE: VIRDIS DANIELA FRANCESCA DENOMINAZIONE INSEGNAMENTO: LINGUA INGLESE 3 CORSO DI LAUREA: LINGUE E CULTURE PER LA MEDIAZIONE LINGUISTICA CFU: 12 / 9 / 6 CONTENUTI DEL CORSO (presentazione di disciplina,
More informationSpeech Recognition at ICSI: Broadcast News and beyond
Speech Recognition at ICSI: Broadcast News and beyond Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 The DARPA Broadcast News task Aspects of ICSI
More informationKeynote. Developments in English for Specific Purposes Research. Brian Paltridge University of Sydney
Keynote Developments in English for Specific Purposes Research Brian Paltridge University of Sydney This paper reviews trends and developments in English for specific purposes (ESP) research. Topics covered
More informationLinking Task: Identifying authors and book titles in verbose queries
Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,
More informationLearning Computational Grammars
Learning Computational Grammars John Nerbonne, Anja Belz, Nicola Cancedda, Hervé Déjean, James Hammerton, Rob Koeling, Stasinos Konstantopoulos, Miles Osborne, Franck Thollard and Erik Tjong Kim Sang Abstract
More informationProblems of the Arabic OCR: New Attitudes
Problems of the Arabic OCR: New Attitudes Prof. O.Redkin, Dr. O.Bernikova Department of Asian and African Studies, St. Petersburg State University, St Petersburg, Russia Abstract - This paper reviews existing
More informationLearning and Retaining New Vocabularies: The Case of Monolingual and Bilingual Dictionaries
Learning and Retaining New Vocabularies: The Case of Monolingual and Bilingual Dictionaries Mohsen Mobaraki Assistant Professor, University of Birjand, Iran mmobaraki@birjand.ac.ir *Amin Saed Lecturer,
More informationMulti-Lingual Text Leveling
Multi-Lingual Text Leveling Salim Roukos, Jerome Quin, and Todd Ward IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 {roukos,jlquinn,tward}@us.ibm.com Abstract. Determining the language proficiency
More informationLower and Upper Secondary
Lower and Upper Secondary Type of Course Age Group Content Duration Target General English Lower secondary Grammar work, reading and comprehension skills, speech and drama. Using Multi-Media CD - Rom 7
More informationWhat the National Curriculum requires in reading at Y5 and Y6
What the National Curriculum requires in reading at Y5 and Y6 Word reading apply their growing knowledge of root words, prefixes and suffixes (morphology and etymology), as listed in Appendix 1 of the
More informationTarget Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data
Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Ebba Gustavii Department of Linguistics and Philology, Uppsala University, Sweden ebbag@stp.ling.uu.se
More informationHandbook for Graduate Students in TESL and Applied Linguistics Programs
Handbook for Graduate Students in TESL and Applied Linguistics Programs Section A Section B Section C Section D M.A. in Teaching English as a Second Language (MA-TESL) Ph.D. in Applied Linguistics (PhD
More informationCELTA. Syllabus and Assessment Guidelines. Third Edition. University of Cambridge ESOL Examinations 1 Hills Road Cambridge CB1 2EU United Kingdom
CELTA Syllabus and Assessment Guidelines Third Edition CELTA (Certificate in Teaching English to Speakers of Other Languages) is accredited by Ofqual (the regulator of qualifications, examinations and
More informationEnhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities
Enhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities Yoav Goldberg Reut Tsarfaty Meni Adler Michael Elhadad Ben Gurion
More informationGACE Computer Science Assessment Test at a Glance
GACE Computer Science Assessment Test at a Glance Updated May 2017 See the GACE Computer Science Assessment Study Companion for practice questions and preparation resources. Assessment Name Computer Science
More informationDegree Qualification Profiles Intellectual Skills
Degree Qualification Profiles Intellectual Skills Intellectual Skills: These are cross-cutting skills that should transcend disciplinary boundaries. Students need all of these Intellectual Skills to acquire
More informationThe Language of Football England vs. Germany (working title) by Elmar Thalhammer. Abstract
The Language of Football England vs. Germany (working title) by Elmar Thalhammer Abstract As opposed to about fifteen years ago, football has now become a socially acceptable phenomenon in both Germany
More informationBigrams in registers, domains, and varieties: a bigram gravity approach to the homogeneity of corpora
Bigrams in registers, domains, and varieties: a bigram gravity approach to the homogeneity of corpora Stefan Th. Gries Department of Linguistics University of California, Santa Barbara stgries@linguistics.ucsb.edu
More informationLoughton School s curriculum evening. 28 th February 2017
Loughton School s curriculum evening 28 th February 2017 Aims of this session Share our approach to teaching writing, reading, SPaG and maths. Share resources, ideas and strategies to support children's
More informationThe Use of Drama and Dramatic Activities in English Language Teaching
The Crab: Journal of Theatre and Media Arts (Number 7/June 2012, 151-159) The Use of Drama and Dramatic Activities in English Language Teaching Chioma O.C. Chukueggu Abstract The purpose of this paper
More informationDeveloping a Language for Assessing Creativity: a taxonomy to support student learning and assessment
Investigations in university teaching and learning vol. 5 (1) autumn 2008 ISSN 1740-5106 Developing a Language for Assessing Creativity: a taxonomy to support student learning and assessment Janette Harris
More informationPrediction of Maximal Projection for Semantic Role Labeling
Prediction of Maximal Projection for Semantic Role Labeling Weiwei Sun, Zhifang Sui Institute of Computational Linguistics Peking University Beijing, 100871, China {ws, szf}@pku.edu.cn Haifeng Wang Toshiba
More information11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation
tatistical Parsing (Following slides are modified from Prof. Raymond Mooney s slides.) tatistical Parsing tatistical parsing uses a probabilistic model of syntax in order to assign probabilities to each
More informationSwitchboard Language Model Improvement with Conversational Data from Gigaword
Katholieke Universiteit Leuven Faculty of Engineering Master in Artificial Intelligence (MAI) Speech and Language Technology (SLT) Switchboard Language Model Improvement with Conversational Data from Gigaword
More informationWhy PPP won t (and shouldn t) go away
(and shouldn t) go IATEFL Birmingham 2016 jasonanderson1@gmail.com www.jasonanderson.org.uk speakinggames.wordpress.com Structure of my talk 1. Introduction 3. Why is it so enduring / popular? (i.e. Does
More informationProof Theory for Syntacticians
Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 5, 2012 Logics for Linguistics Many different kinds of logic are directly applicable to formalizing theories in syntax
More informationGCSE Mathematics B (Linear) Mark Scheme for November Component J567/04: Mathematics Paper 4 (Higher) General Certificate of Secondary Education
GCSE Mathematics B (Linear) Component J567/04: Mathematics Paper 4 (Higher) General Certificate of Secondary Education Mark Scheme for November 2014 Oxford Cambridge and RSA Examinations OCR (Oxford Cambridge
More informationProgressive Aspect in Nigerian English
ISLE 2011 17 June 2011 1 New Englishes Empirical Studies Aspect in Nigerian Languages 2 3 Nigerian English Other New Englishes Explanations Progressive Aspect in New Englishes New Englishes Empirical Studies
More informationNatural Language Processing. George Konidaris
Natural Language Processing George Konidaris gdk@cs.brown.edu Fall 2017 Natural Language Processing Understanding spoken/written sentences in a natural language. Major area of research in AI. Why? Humans
More informationGCSE. Mathematics A. Mark Scheme for January General Certificate of Secondary Education Unit A503/01: Mathematics C (Foundation Tier)
GCSE Mathematics A General Certificate of Secondary Education Unit A503/0: Mathematics C (Foundation Tier) Mark Scheme for January 203 Oxford Cambridge and RSA Examinations OCR (Oxford Cambridge and RSA)
More informationThe Potential of Corpus-Informed L2 Pedagogy. Jonathon Reinhardt University of Arizona
The Potential of Corpus-Informed L2 Pedagogy Jonathon Reinhardt University of Arizona Abstract Corpus linguistic methods have led to many revelations about the nature of language use and language learning
More informationLANGUAGE LEARNING MOOCS : REFLECTING ON THE CREATION OF TECHNOLOGY-BASED LEARNING MATERIALS IN A MOOLC" Research collaboration
LANGUAGE LEARNING MOOCS : REFLECTING ON THE CREATION OF TECHNOLOGY-BASED LEARNING MATERIALS IN A MOOLC" Research collaboration Context and problem Downes (2014) claims that the success of a MOOC is process-defined
More informationTHE EFFECTS OF TASK COMPLEXITY ALONG RESOURCE-DIRECTING AND RESOURCE-DISPERSING FACTORS ON EFL LEARNERS WRITTEN PERFORMANCE
THE EFFECTS OF TASK COMPLEXITY ALONG RESOURCE-DIRECTING AND RESOURCE-DISPERSING FACTORS ON EFL LEARNERS WRITTEN PERFORMANCE Zahra Talebi PhD candidate in TEFL, Faculty of Humanities, University of Payame
More informationHow to analyze visual narratives: A tutorial in Visual Narrative Grammar
How to analyze visual narratives: A tutorial in Visual Narrative Grammar Neil Cohn 2015 neilcohn@visuallanguagelab.com www.visuallanguagelab.com Abstract Recent work has argued that narrative sequential
More informationEvidence for Reliability, Validity and Learning Effectiveness
PEARSON EDUCATION Evidence for Reliability, Validity and Learning Effectiveness Introduction Pearson Knowledge Technologies has conducted a large number and wide variety of reliability and validity studies
More informationConstruction Grammar. University of Jena.
Construction Grammar Holger Diessel University of Jena holger.diessel@uni-jena.de http://www.holger-diessel.de/ Words seem to have a prototype structure; but language does not only consist of words. What
More information282 About the Authors
About the Authors Halina Chodkiewicz is Professor of Applied Linguistics at the Department of English, Maria Curie-Skłodowska University, Lublin, Poland. She teaches psycholinguistics, second language
More informationAssessing speaking skills:. a workshop for teacher development. Ben Knight
Assessing speaking skills:. a workshop for teacher development Ben Knight Speaking skills are often considered the most important part of an EFL course, and yet the difficulties in testing oral skills
More informationSpoken Language Parsing Using Phrase-Level Grammars and Trainable Classifiers
Spoken Language Parsing Using Phrase-Level Grammars and Trainable Classifiers Chad Langley, Alon Lavie, Lori Levin, Dorcas Wallace, Donna Gates, and Kay Peterson Language Technologies Institute Carnegie
More informationProject in the framework of the AIM-WEST project Annotation of MWEs for translation
Project in the framework of the AIM-WEST project Annotation of MWEs for translation 1 Agnès Tutin LIDILEM/LIG Université Grenoble Alpes 30 october 2014 Outline 2 Why annotate MWEs in corpora? A first experiment
More informationSemi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.
Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link
More informationThe Common European Framework of Reference for Languages p. 58 to p. 82
The Common European Framework of Reference for Languages p. 58 to p. 82 -- Chapter 4 Language use and language user/learner in 4.1 «Communicative language activities and strategies» -- Oral Production
More informationSyntactic and Lexical Simplification: The Impact on EFL Listening Comprehension at Low and High Language Proficiency Levels
ISSN 1798-4769 Journal of Language Teaching and Research, Vol. 5, No. 3, pp. 566-571, May 2014 Manufactured in Finland. doi:10.4304/jltr.5.3.566-571 Syntactic and Lexical Simplification: The Impact on
More informationInternational Conference on Education and Educational Psychology (ICEEPSY 2012)
Available online at www.sciencedirect.com Procedia - Social and Behavioral Sciences 69 ( 2012 ) 984 989 International Conference on Education and Educational Psychology (ICEEPSY 2012) Second language research
More informationSpecification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments
Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Cristina Vertan, Walther v. Hahn University of Hamburg, Natural Language Systems Division Hamburg,
More informationSEMAFOR: Frame Argument Resolution with Log-Linear Models
SEMAFOR: Frame Argument Resolution with Log-Linear Models Desai Chen or, The Case of the Missing Arguments Nathan Schneider SemEval July 16, 2010 Dipanjan Das School of Computer Science Carnegie Mellon
More informationThe role of the first language in foreign language learning. Paul Nation. The role of the first language in foreign language learning
1 Article Title The role of the first language in foreign language learning Author Paul Nation Bio: Paul Nation teaches in the School of Linguistics and Applied Language Studies at Victoria University
More informationLQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization
LQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization Annemarie Friedrich, Marina Valeeva and Alexis Palmer COMPUTATIONAL LINGUISTICS & PHONETICS SAARLAND UNIVERSITY, GERMANY
More informationAQUA: An Ontology-Driven Question Answering System
AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.
More informationDeveloping a TT-MCTAG for German with an RCG-based Parser
Developing a TT-MCTAG for German with an RCG-based Parser Laura Kallmeyer, Timm Lichte, Wolfgang Maier, Yannick Parmentier, Johannes Dellert University of Tübingen, Germany CNRS-LORIA, France LREC 2008,
More informationWonderworks Tier 2 Resources Third Grade 12/03/13
Wonderworks Tier 2 Resources Third Grade Wonderworks Tier II Intervention Program (K 5) Guidance for using K 1st, Grade 2 & Grade 3 5 Flowcharts This document provides guidelines to school site personnel
More informationLANGUAGE IN INDIA Strength for Today and Bright Hope for Tomorrow Volume 11 : 12 December 2011 ISSN
LANGUAGE IN INDIA Strength for Today and Bright Hope for Tomorrow Volume ISSN 1930-2940 Managing Editor: M. S. Thirumalai, Ph.D. Editors: B. Mallikarjun, Ph.D. Sam Mohanlal, Ph.D. B. A. Sharada, Ph.D.
More informationThe Smart/Empire TIPSTER IR System
The Smart/Empire TIPSTER IR System Chris Buckley, Janet Walz Sabir Research, Gaithersburg, MD chrisb,walz@sabir.com Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff Department of
More informationEvaluation of the coursebooks used in the Chungbuk Provincial Board. of Education Secondary School Teachers Training Sessions
Evaluation of the coursebooks used in the Chungbuk Provincial Board of Education Secondary School Teachers Training Sessions Yvette Murdoch March 2000 Module 3 assignment for: MA TEFL / TESL (ODL) Centre
More informationTHE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING
SISOM & ACOUSTICS 2015, Bucharest 21-22 May THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING MarilenaăLAZ R 1, Diana MILITARU 2 1 Military Equipment and Technologies Research Agency, Bucharest,
More informationThe Singapore Copyright Act applies to the use of this document.
Title Learning for listening: Metacognitive awareness and strategy use to develop listening comprehension Author(s) Zhang Donglan Source REACT, 2001(1), 21-26 Published by National Institute of Education
More informationThe taming of the data:
The taming of the data: Using text mining in building a corpus for diachronic analysis Stefania Degaetano-Ortlieb, Hannah Kermes, Ashraf Khamis, Jörg Knappen, Noam Ordan and Elke Teich Background Big data
More informationCommon Core State Standards for English Language Arts
Reading Standards for Literature 6-12 Grade 9-10 Students: 1. Cite strong and thorough textual evidence to support analysis of what the text says explicitly as well as inferences drawn from the text. 2.
More informationA Case Study: News Classification Based on Term Frequency
A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center
More information