Lexical loss as a shared linguistic innovation

1 Lexical loss as a shared linguistic innovation FIN-CLARIN seminar on Fenno-Ugric Computational Linguistics University of Helsinki, Juho Pystynen juho.pystynen@helsinki.fi

2 Why lexical loss? Loss of inherited linguistic material is a simple and commonplace linguistic innovation. Lexical material is a numerically rich source of data: any given language variety can be characterized by the presence of thousands of lexemes. If a language's history is known in some detail (back to a recent proto-language stage), often several hundreds of lexemes can be analyzed as lost vs. not lost.

3 Modelling loss (0) Loss is not a mirror image of the innovation of new vocabulary. Synonymy: words can persist in use even after the introduction of a new word of the same meaning: {a} > {a, b}. Multiple innovations: a word's "replacement" can itself also be lost later: {a} > {b} > {c}. Total loss: a lost word can end up replaced not by a new innovative word, but instead by an analytic expression or by a pre-existing synonym: {a} > {a, b} > {b}.

4 Modelling loss (1) Given a set of vocabulary in a (possibly reconstructed) proto-language, we can to a first approximation model lexical loss as the presence vs. absence of a reflex of a given proto-form: *{a, b, c, d, e, f} > {a, c, d, f}. A simple metric of total losses in a given descendant variety will then be the total percentage of lexical material preserved vs. lost. Again to a first approximation, we can model the loss process as essentially random. Much finer-grained sociolinguistic and corpus analysis would be possible: recognition percentage among a speaker community, frequency of usage, median age for the acquisition of a particular word, usage competition between synonyms, variation in what proto-states may be assumed for these factors, etc.
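
The presence-vs-absence model can be sketched in a few lines. This is only an illustration: the lexeme labels follow the slide's schematic example, and the function name `retention_rate` is a hypothetical helper, not part of any existing tool.

```python
# Schematic proto-vocabulary and the reflexes attested in one descendant
# variety (labels follow the slide's abstract example, not real data).
proto_lexemes = {"a", "b", "c", "d", "e", "f"}
descendant_reflexes = {"a", "c", "d", "f"}


def retention_rate(proto, reflexes):
    """Share of proto-lexemes that have a reflex in the descendant."""
    retained = proto & reflexes
    return len(retained) / len(proto)


rate = retention_rate(proto_lexemes, descendant_reflexes)
lost = sorted(proto_lexemes - descendant_reflexes)
print(f"retained: {rate:.0%}, lost: {lost}")
```

Treating each variety as a set of retained proto-lexemes keeps the later pairwise comparisons (shared retentions, shared losses) down to simple set operations.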

5 Modelling loss (1) When a language variety has for some reason not been documented in detail (extinct, endangered, remote, etc.), some losses may be "virtual": a lexical item still remains in use, but has not been recorded by researchers. Working with a percentage measure, this is modellable as simply an additional loss factor, "loss during documentation":
$p_{\text{observed loss}} = p_{\text{historical loss}} \cdot p_{\text{documentation loss}}$
At the low end, documentation loss is highly unlikely to be random, since early fieldwork surveys were often based on lists of basic vocabulary. 'Five', 'head', 'woman' are unlikely to be lost in the documentation process; 'multitude', 'pancreas', 'midwife' are more likely to be.
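
A minimal numeric sketch of documentation loss as an extra factor. It assumes that the two stages act independently and that the multiplicative factor applies to the share of vocabulary surviving each stage; this is one reading of the slide, which states the relation in terms of loss. The figures are invented.

```python
def observed_retention(historical_retention, documentation_retention):
    """Share of proto-lexemes that both survived historically and got recorded."""
    return historical_retention * documentation_retention


def observed_loss(historical_loss, documentation_loss):
    """Observed loss rate implied by two independent loss stages."""
    return 1.0 - observed_retention(1.0 - historical_loss, 1.0 - documentation_loss)


# Hypothetical rates: 20% of the vocabulary really lost,
# 10% of the surviving vocabulary never recorded.
hist_loss = 0.20
doc_loss = 0.10
obs = observed_loss(hist_loss, doc_loss)
print(f"observed loss: {obs:.0%}")
```

Under these assumptions the observed loss always overstates the historical loss whenever any documentation loss exists.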

6 Modelling loss (1) Documentation loss is intrinsically not observable in (a given set of) data. Consequence 1: observed total losses are a measure of the comparative data, not directly of history. Consequence 2: comparison of lexical losses is unlikely to be immediately useful between languages documented to significantly differing degrees.

7 Modelling loss (2) Modelling losses as a binary variable at the lexeme level often runs into difficult edge cases. An etymological comparison is never a strictly proven fact. Solution: apply probabilistic modelling here as well. Exact figures are not possible to derive, but rough ballpark figures can be applied:
A highly regular etymology: 100% probability
A plausible etymology with irregularities: 50–90% probability
A speculative etymology: 1–10% probability
Lack of etymology: 0% probability
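
The probabilistic coding can be sketched as a lookup from etymology quality to a retention probability. The category names, the specific values chosen inside the slide's ranges, and the four sample lexemes are all hypothetical.

```python
# Hypothetical point values picked from within the slide's rough ranges.
CONFIDENCE = {
    "regular": 1.0,       # highly regular etymology
    "irregular": 0.7,     # plausible, with irregularities (50-90%)
    "speculative": 0.05,  # speculative etymology (1-10%)
    "none": 0.0,          # no etymology
}

# Invented sample: proto-lexeme -> quality of its proposed reflex in one variety.
reflex_quality = {
    "lexeme_1": "regular",
    "lexeme_2": "irregular",
    "lexeme_3": "speculative",
    "lexeme_4": "none",
}

retention_probs = {lex: CONFIDENCE[q] for lex, q in reflex_quality.items()}
expected_retained = sum(retention_probs.values())
expected_rate = expected_retained / len(retention_probs)
print(f"expected retained lexemes: {expected_retained:.2f} ({expected_rate:.1%})")
```

Summing probabilities instead of counting yes/no reflexes means a doubtful etymology shifts the retention figure only slightly, rather than flipping a whole lexeme.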

8 An example dataset: Samoyedic The Samoyedic languages: a relatively compact and homogeneously documented language group. Eight reasonably well documented languages: Nganasan, Tundra Enets, Forest Enets, Tundra Nenets, Forest Nenets, Selkup, Kamass, Mator. No overshadowing major literary languages. Language boundaries fairly clear. Substantial reconstruction work is available. Status as a part of the larger Uralic family allows improved grounding.

9 An example dataset: Samoyedic A work-in-progress etymological database. Main lexical data source: the etymological dictionary of Janhunen (1977). Addenda from later studies, e.g. Helimski (1986, 1993), Aikio (2002, 2006). Thus far in humble spreadsheet form. 790 lexemes (and growing) with rough probabilistic encoding. Reconstruction, distribution of reflexes, further etymology.

10 An example dataset: Samoyedic

11 An example dataset: Samoyedic Basic retention percentages:
Nganasan: 61%
Selkup: 80%
Enets: 67%
Kamassian: 57%
Yurats: 17%
Koibal: 36%
Tundra Nenets: 87%
Mator: 44%
Forest Nenets: 78%

12 Modelling subgrouping We need to allow for the possibility that different observed loss rates also reflect different historical loss rates, and not merely different documentation losses. Within a family tree model, we can assign loss rates not just to languages, but more generally to branches. Could we, however, do the inverse: identify branches from losses?

13 Modelling subgrouping Isolated retention percentages provide no subgrouping information: for any arbitrary tree, we can always assign branch loss rates that multiply to the observed top node loss rates.
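
A small sketch of why isolated rates carry no subgrouping signal. It works with retention rates (the complement of loss rates), since those are the quantities that multiply along a path from root to leaf. The three varieties, their rates and the helper names are hypothetical.

```python
from math import isclose, prod

# Hypothetical observed retention rates for three varieties.
observed = {"A": 0.60, "B": 0.48, "C": 0.80}


def paths_with_subgroup(observed, subgroup, shared_rate):
    """Root-to-leaf branch rates when `subgroup` shares one ancestral branch."""
    if not max(observed[leaf] for leaf in subgroup) <= shared_rate <= 1.0:
        raise ValueError("shared_rate must lie between the subgroup's highest rate and 1")
    return {
        leaf: [shared_rate, rate / shared_rate] if leaf in subgroup else [rate]
        for leaf, rate in observed.items()
    }


def reproduces(paths, observed):
    """True if the branch rates on every path multiply to the observed rate."""
    return all(isclose(prod(branches), observed[leaf]) for leaf, branches in paths.items())


star = {leaf: [rate] for leaf, rate in observed.items()}
ab_first = paths_with_subgroup(observed, {"A", "B"}, shared_rate=0.80)
bc_first = paths_with_subgroup(observed, {"B", "C"}, shared_rate=0.90)

for name, paths in [("star", star), ("(A,B),C", ab_first), ("A,(B,C)", bc_first)]:
    print(name, reproduces(paths, observed))
```

All three incompatible trees fit the same three numbers, so the per-variety rates alone cannot choose between them.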

14 Modelling subgrouping We need to look at shared losses vs. retentions (between a given pair of varieties) on a word-by-word level to be able to locate common innovations. A shared loss (in the data) is, however, not automatically a common loss (in actual history). Indeed, for languages 1 and 2 with loss rates $p_1$ and $p_2$, we expect to see a shared loss rate $p_1 p_2$ already purely by chance.

15 Modelling subgrouping What we can do with ease is to calculate the expected shared loss, or retention, rates, and compare these with the attested rates. (With detailed statistical analysis, if we wish; for today's purposes a simple look at these metrics will, however, suffice.) With probabilistic etymological coding, for a single lexeme we have, at a pinch:
$p(\text{shared retention}) = p_1 p_2$, e.g. $(p_1, p_2) = (1, 1) \mapsto 1$; $(0, p) \mapsto 0$
$p(\text{shared loss}) = (1 - p_1)(1 - p_2)$, e.g. $(1, p) \mapsto 0$; $(0, 0) \mapsto 1$
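
The per-lexeme products above can be summed over a word list to give attested shared counts under probabilistic coding. A minimal sketch with invented probabilities for four lexemes; `shared_counts` is a hypothetical helper name.

```python
def shared_counts(probs_1, probs_2):
    """Sum per-lexeme shared-retention and shared-loss probabilities."""
    shared_ret = sum(p1 * p2 for p1, p2 in zip(probs_1, probs_2))
    shared_loss = sum((1 - p1) * (1 - p2) for p1, p2 in zip(probs_1, probs_2))
    return shared_ret, shared_loss


# Invented retention probabilities for the same four proto-lexemes
# in two varieties (1 = certain reflex, 0 = no reflex).
variety_1 = [1.0, 0.0, 1.0, 0.7]
variety_2 = [1.0, 0.5, 0.0, 0.7]

ret, loss = shared_counts(variety_1, variety_2)
print(f"shared retentions: {ret:.2f}, shared losses: {loss:.2f}")
```

With strictly binary coding (only 0 and 1) the two sums reduce to plain counts of lexemes retained in both or lost in both.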

16 Shared retentions
Nganasan – Selkup: predicted: 361 shared items; attested: 373 (103%)
Tundra Nenets – Forest Nenets: predicted: 505 shared items; attested: 565 (112%)
Yurats – Mator: predicted: 58 shared items; attested: 93 (160%)
Kamassian – Koibal: predicted: 150 shared items; attested: 260 (173%)
Main trend: generally elevated rates across the board.

17 Shared retentions Phenomenon 1: reconstructed vocabulary is not known independently of the descendants. Lexemes surviving in only one language are usually not reconstructible (exception: words with a wider Uralic pedigree). Lexemes surviving in zero languages are entirely unreconstructible. Observed retention rates are therefore actually slightly elevated, and loss rates slightly diminished.
$p_{L,\text{accurate}} = n_L / N$ ($n_L$ = number of lexemes attested in variety $L$; $N$ = total number of proto-lexemes)
$p_{L,\text{observed}} = n_L / (N - N_0)$ ($N_0$ = total number of unreconstructible proto-lexemes)
If $N_0 / N$ is small: $p_{L,\text{obs}} \approx n_L / N + n_L N_0 / N^2 = p_{L,\text{acc}} + p_{L,0}$, with $p_{L,0} = p_{L,\text{acc}} \cdot N_0 / N$
This gives an approximately linear error factor for retention rates and, in turn, a constant error term for the predicted/observed ratio.
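
A numeric illustration of this bias, using invented counts. The estimate below is the first-order expansion of $n_L / (N - N_0)$ in $N_0 / N$; all three function names are hypothetical.

```python
def accurate_rate(n_l, n_total):
    """Retention rate against the full (unknowable) proto-lexicon."""
    return n_l / n_total


def observed_rate(n_l, n_total, n_unreconstructible):
    """Retention rate against only the reconstructible proto-lexemes."""
    return n_l / (n_total - n_unreconstructible)


def first_order_estimate(n_l, n_total, n_unreconstructible):
    """Observed rate approximated as p_acc * (1 + N_0 / N)."""
    return accurate_rate(n_l, n_total) * (1 + n_unreconstructible / n_total)


# Invented figures: 1000 proto-lexemes, 50 of them unreconstructible,
# 600 attested in variety L.
N, N0, n_L = 1000, 50, 600

p_acc = accurate_rate(n_L, N)
p_obs = observed_rate(n_L, N, N0)
p_est = first_order_estimate(n_L, N, N0)
print(f"accurate {p_acc:.4f}, observed {p_obs:.4f}, first-order {p_est:.4f}")
```

Because the inflation factor depends only on $N_0 / N$, it is the same for every variety in the family under this sketch.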

18 Shared retentions Phenomenon 2: as covered before, documentation loss is likely to introduce a bias towards basic vocabulary, and this bias is constant with respect to languages. Substantially more poorly documented languages will appear closer to all other languages than expected. The effect will accumulate, showing more poorly documented languages as especially close to each other. The position of more poorly documented languages is not resolvable without a detailed model of documentation practices.

19 Shared retentions Naive approaches to quantitative lexical comparison often attempt to interpret a higher proportion of shared vocabulary as indicative of closer relationship. Innovative shared vocabulary may indeed constitute historically common innovations. However, historically common retentions are by contrast not indicative of common descent. In principle, statistically significant upticks in shared retentions could instead indicate unidentified family-internal loaning. Emerging biases among shared retention rates are, however, most likely to simply constitute methodological artifacts in the data.

20 Shared losses
Nganasan – Selkup: predicted: 73 shared losses; attested: 84 (115%)
Tundra Nenets – Forest Nenets: predicted: 29 shared losses; attested: 89 (307%)
Yurats – Mator: predicted: 372 shared losses; attested: 407 (109%)
Kamassian – Koibal: predicted: 233 shared losses; attested: 341 (146%)
Again, the main trend is generally elevated rates.

21 Shared losses The Nenets subgroup now clearly stands out in the material. Poorly recorded varieties now become distant rather than close: losses will concentrate among less basic vocabulary, which is likely to be lost during documentation. If non-basic vocabulary is a numerical majority, losses among it will also be less likely to co-occur. Elevated overall rates, however, are likely to indicate the existence of large subgroups: subgroups may have historically undergone common losses, while their complements may have missed out on lexical innovations. In principle this is investigable by iterative subgrouping: pool Nenets and Kamassian-Koibal (Km-Kb) each into a single variety, and repeat the count for new results?

22 Shared losses vs. retentions Next, let's consider the comparative data between the two Nenets varieties a bit closer. Four surface categories can be identified:
retained in both TN and FN: $n_{RR} = 565$
lost in both TN and FN: $n_{LL} = 89$
retained in TN, lost in FN: $n_{RL} = 104$
lost in TN, retained in FN: $n_{LR} = 31$
The surface retention and loss rates:
$p_{R,TN} = (n_{RR} + n_{RL}) / N$; $p_{R,FN} = (n_{RR} + n_{LR}) / N$
$p_{L,TN} = (n_{LR} + n_{LL}) / N$; $p_{L,FN} = (n_{RL} + n_{LL}) / N$
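
These counts are enough to recompute the chance expectations for the Nenets pair. The sketch assumes that $N$ is the sum of the four categories (789) and that the predicted count is simply the product of the two surface rates times $N$; the variable names are hypothetical.

```python
# Surface counts for Tundra Nenets (TN) vs. Forest Nenets (FN), from the slide.
n_rr, n_ll, n_rl, n_lr = 565, 89, 104, 31

# Assumption: N is the sum of the four categories.
n_total = n_rr + n_ll + n_rl + n_lr

# Surface retention and loss rates per variety.
p_r_tn = (n_rr + n_rl) / n_total
p_r_fn = (n_rr + n_lr) / n_total
p_l_tn = (n_lr + n_ll) / n_total
p_l_fn = (n_rl + n_ll) / n_total

# Counts expected purely by chance if the two varieties were independent.
pred_rr = p_r_tn * p_r_fn * n_total
pred_ll = p_l_tn * p_l_fn * n_total

print(f"shared retentions: predicted {pred_rr:.0f}, attested {n_rr}")
print(f"shared losses: predicted {pred_ll:.0f}, attested {n_ll}")
```

Rounded, the predictions come out at 505 shared retentions and 29 shared losses, in line with the predicted figures quoted for this pair earlier in the talk.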

23 Shared losses vs. retentions However, if a Nenets subgroup indeed exists, we can divide loss events into two sets: early common losses in Proto-Nenets vs. late losses occurring separately in TN vs. FN (some of which may again occur in parallel in both!). Retentions during the Proto-Nenets period will also exist in common. Moreover, a slightly elevated rate of common retentions is therefore indeed expected as well! But note the order of inference: shared losses → common subgroup → common retentions. Retentions themselves continue to not suffice as evidence for common ancestry.


Certified Six Sigma Professionals International Certification Courses in Six Sigma Green Belt

Certified Six Sigma Professionals International Certification Courses in Six Sigma Green Belt Certification Singapore Institute Certified Six Sigma Professionals Certification Courses in Six Sigma Green Belt ly Licensed Course for Process Improvement/ Assurance Managers and Engineers Leading the

More information

Intra-talker Variation: Audience Design Factors Affecting Lexical Selections

Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Tyler Perrachione LING 451-0 Proseminar in Sound Structure Prof. A. Bradlow 17 March 2006 Intra-talker Variation: Audience Design Factors Affecting Lexical Selections Abstract Although the acoustic and

More information

1GOOD LEADERSHIP IS IMPORTANT. Principal Effectiveness and Leadership in an Era of Accountability: What Research Says

1GOOD LEADERSHIP IS IMPORTANT. Principal Effectiveness and Leadership in an Era of Accountability: What Research Says B R I E F 8 APRIL 2010 Principal Effectiveness and Leadership in an Era of Accountability: What Research Says J e n n i f e r K i n g R i c e For decades, principals have been recognized as important contributors

More information

Disambiguation of Thai Personal Name from Online News Articles

Disambiguation of Thai Personal Name from Online News Articles Disambiguation of Thai Personal Name from Online News Articles Phaisarn Sutheebanjard Graduate School of Information Technology Siam University Bangkok, Thailand mr.phaisarn@gmail.com Abstract Since online

More information

have to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words,

have to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words, A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion Walter Daelemans and Antal van den Bosch Proceedings ESCA-IEEE speech synthesis conference, New York, September 1994

More information

CollaboFramework. Framework and Methodologies for Collaborative Research in Digital Humanities. DHN Workshop. Organizers:

CollaboFramework. Framework and Methodologies for Collaborative Research in Digital Humanities. DHN Workshop. Organizers: CollaboFramework Framework and Methodologies for Collaborative Research in Digital Humanities DHN Workshop Organizers: Sasha Mile Rudan (Oslo University, sasharu@ifi.uio.no) Sinisa Rudan (Belgrade University,

More information

Evaluation of a College Freshman Diversity Research Program

Evaluation of a College Freshman Diversity Research Program Evaluation of a College Freshman Diversity Research Program Sarah Garner University of Washington, Seattle, Washington 98195 Michael J. Tremmel University of Washington, Seattle, Washington 98195 Sarah

More information

content First Introductory book to cover CAPM First to differentiate expected and required returns First to discuss the intrinsic value of stocks

content First Introductory book to cover CAPM First to differentiate expected and required returns First to discuss the intrinsic value of stocks content First Introductory book to cover CAPM First to differentiate expected and required returns First to discuss the intrinsic value of stocks presentation First timelines to explain TVM First financial

More information

Reinforcement Learning by Comparing Immediate Reward

Reinforcement Learning by Comparing Immediate Reward Reinforcement Learning by Comparing Immediate Reward Punit Pandey DeepshikhaPandey Dr. Shishir Kumar Abstract This paper introduces an approach to Reinforcement Learning Algorithm by comparing their immediate

More information

The Evaluation of Students Perceptions of Distance Education

The Evaluation of Students Perceptions of Distance Education The Evaluation of Students Perceptions of Distance Education Assoc. Prof. Dr. Aytekin İŞMAN - Eastern Mediterranean University Senior Instructor Fahme DABAJ - Eastern Mediterranean University Research

More information

Model Ensemble for Click Prediction in Bing Search Ads

Model Ensemble for Click Prediction in Bing Search Ads Model Ensemble for Click Prediction in Bing Search Ads Xiaoliang Ling Microsoft Bing xiaoling@microsoft.com Hucheng Zhou Microsoft Research huzho@microsoft.com Weiwei Deng Microsoft Bing dedeng@microsoft.com

More information

CS 101 Computer Science I Fall Instructor Muller. Syllabus

CS 101 Computer Science I Fall Instructor Muller. Syllabus CS 101 Computer Science I Fall 2013 Instructor Muller Syllabus Welcome to CS101. This course is an introduction to the art and science of computer programming and to some of the fundamental concepts of

More information

On-the-Fly Customization of Automated Essay Scoring

On-the-Fly Customization of Automated Essay Scoring Research Report On-the-Fly Customization of Automated Essay Scoring Yigal Attali Research & Development December 2007 RR-07-42 On-the-Fly Customization of Automated Essay Scoring Yigal Attali ETS, Princeton,

More information

COMPUTATIONAL COMPLEXITY OF LEFT-ASSOCIATIVE GRAMMAR

COMPUTATIONAL COMPLEXITY OF LEFT-ASSOCIATIVE GRAMMAR COMPUTATIONAL COMPLEXITY OF LEFT-ASSOCIATIVE GRAMMAR ROLAND HAUSSER Institut für Deutsche Philologie Ludwig-Maximilians Universität München München, West Germany 1. CHOICE OF A PRIMITIVE OPERATION The

More information

Reading Horizons. A Look At Linguistic Readers. Nicholas P. Criscuolo APRIL Volume 10, Issue Article 5

Reading Horizons. A Look At Linguistic Readers. Nicholas P. Criscuolo APRIL Volume 10, Issue Article 5 Reading Horizons Volume 10, Issue 3 1970 Article 5 APRIL 1970 A Look At Linguistic Readers Nicholas P. Criscuolo New Haven, Connecticut Public Schools Copyright c 1970 by the authors. Reading Horizons

More information

Page 1 of 11. Curriculum Map: Grade 4 Math Course: Math 4 Sub-topic: General. Grade(s): None specified

Page 1 of 11. Curriculum Map: Grade 4 Math Course: Math 4 Sub-topic: General. Grade(s): None specified Curriculum Map: Grade 4 Math Course: Math 4 Sub-topic: General Grade(s): None specified Unit: Creating a Community of Mathematical Thinkers Timeline: Week 1 The purpose of the Establishing a Community

More information

University Library Collection Development and Management Policy

University Library Collection Development and Management Policy University Library Collection Development and Management Policy 2017-18 1 Executive Summary Anglia Ruskin University Library supports our University's strategic objectives by ensuring that students and

More information

arxiv: v1 [cs.cl] 2 Apr 2017

arxiv: v1 [cs.cl] 2 Apr 2017 Word-Alignment-Based Segment-Level Machine Translation Evaluation using Word Embeddings Junki Matsuo and Mamoru Komachi Graduate School of System Design, Tokyo Metropolitan University, Japan matsuo-junki@ed.tmu.ac.jp,

More information

Rule Learning With Negation: Issues Regarding Effectiveness

Rule Learning With Negation: Issues Regarding Effectiveness Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United

More information

Capitalism and Higher Education: A Failed Relationship

Capitalism and Higher Education: A Failed Relationship Capitalism and Higher Education: A Failed Relationship November 15, 2015 Bryan Hagans ENGL-101-015 Ighade Hagans 2 Bryan Hagans Ighade English 101-015 8 November 2015 Capitalism and Higher Education: A

More information

Stefan Engelberg (IDS Mannheim), Workshop Corpora in Lexical Research, Bucharest, Nov [Folie 1] 6.1 Type-token ratio

Stefan Engelberg (IDS Mannheim), Workshop Corpora in Lexical Research, Bucharest, Nov [Folie 1] 6.1 Type-token ratio Content 1. Empirical linguistics 2. Text corpora and corpus linguistics 3. Concordances 4. Application I: The German progressive 5. Part-of-speech tagging 6. Fequency analysis 7. Application II: Compounds

More information

CS Machine Learning

CS Machine Learning CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing

More information

What is Research? A Reconstruction from 15 Snapshots. Charlie Van Loan

What is Research? A Reconstruction from 15 Snapshots. Charlie Van Loan What is Research? A Reconstruction from 15 Snapshots Charlie Van Loan Warm-Up Question How do you evaluate the quality of a PhD Dissertation? The Skyline Factor It depends on the eye of the beholder. The

More information

Language Arts: ( ) Instructional Syllabus. Teachers: T. Beard address

Language Arts: ( ) Instructional Syllabus. Teachers: T. Beard  address Renaissance Middle School 7155 Hall Road Fairburn, Georgia 30213 Phone: 770-306-4330 Fax: 770-306-4338 Dr. Sandra DeShazier, Principal Benzie Brinson, 7 th grade Administrator Language Arts: (2013-2014)

More information

PROJECT PERIODIC REPORT

PROJECT PERIODIC REPORT D1.3: 2 nd Annual Report Project Number: 212879 Reporting period: 1/11/2008-31/10/2009 PROJECT PERIODIC REPORT Grant Agreement number: 212879 Project acronym: EURORIS-NET Project title: European Research

More information

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,

More information

Machine Learning and Data Mining. Ensembles of Learners. Prof. Alexander Ihler

Machine Learning and Data Mining. Ensembles of Learners. Prof. Alexander Ihler Machine Learning and Data Mining Ensembles of Learners Prof. Alexander Ihler Ensemble methods Why learn one classifier when you can learn many? Ensemble: combine many predictors (Weighted) combina

More information

Possessive have and (have) got in New Zealand English Heidi Quinn, University of Canterbury, New Zealand

Possessive have and (have) got in New Zealand English Heidi Quinn, University of Canterbury, New Zealand 1 Introduction Possessive have and (have) got in New Zealand English Heidi Quinn, University of Canterbury, New Zealand heidi.quinn@canterbury.ac.nz NWAV 33, Ann Arbor 1 October 24 This paper looks at

More information

Common Core State Standards

Common Core State Standards Common Core State Standards Common Core State Standards 7.NS.3 Solve real-world and mathematical problems involving the four operations with rational numbers. Mathematical Practices 1, 3, and 4 are aspects

More information

Designing a case study

Designing a case study Designing a case study Case studies are problem situations based on real life like situations, the outcome of the case is already known (at least to the lecturer). Cees van Westen International Institute

More information

Lecture 1: Basic Concepts of Machine Learning

Lecture 1: Basic Concepts of Machine Learning Lecture 1: Basic Concepts of Machine Learning Cognitive Systems - Machine Learning Ute Schmid (lecture) Johannes Rabold (practice) Based on slides prepared March 2005 by Maximilian Röglinger, updated 2010

More information