LING/CSC 439/539: Statistical Natural Language Processing

Communication #113, Tue/Thu 2:00-3:15
Last modified: August 21, 2017

Description of Course

This course focuses on building statistical models of natural language. We do this with two aims. First, these models have tremendous value in the practical/computational domain and are widely used in human language technology applications. Second, these models have significant appeal as theoretical models of how language is processed, or how grammars are organized. This is a highly interdisciplinary course, bringing together elements of both linguistics and computer science. Natural Language Processing (NLP) has a large applied component, and as such this course will have a considerable focus on project-based assignments rather than written ones.

Course Prerequisites or Co-requisites

The students taking this course must know how to program, and have a decent understanding of data structures such as hash maps and trees. Ideally, the students should have taken a calculus course. We will, however, cover the necessary math background in class.

Prerequisites: Ling 438/538, or CSC 483/583.
Recommended: Math 129 (Calc II)
Programming: Programming skills are required for this course. We will be using Python 2. Students unfamiliar with Python must have a working Python 2.x environment up and running and read through Chapter 1 in Natural Language Processing with Python (see below) within the first week.

Instructor and Contact Information

Instructor: Mihai Surdeanu
Email: msurdeanu@email.arizona.edu
Web: http://surdeanu.info/mihai
Office: Gould-Simpson 746
Office hours: Tue 12:30-2

Teaching assistant: Gustave Hahn-Powell
Email: hahnpowell@email.arizona.edu
Office: Gould-Simpson 903
Office hours: Wed 2-3

Teaching assistant: Patricia Lee
Email: pllee@email.arizona.edu
Office: TBA
Office hours: TBA

Course Format and Teaching Methods

The course will be delivered using in-person lectures. No lab sections will be offered but the instructor encourages additional discussion on the topics introduced in the lecture materials. These discussions will be managed on a Piazza site controlled by the instructor. The Piazza site is available here: https://piazza.com/arizona/fall2017/ling439539/home

Course Objectives and Expected Learning Outcomes

At the conclusion of this course students should understand fundamental statistical methods for the processing of natural language, including: (a) text classification, (b) sequence modeling and its applications to part-of-speech tagging, (c) algorithms for structured learning such as shift-reduce and applications to syntactic parsing, and (d) cross-lingual and mono-lingual alignment algorithms such as IBM Model 1 and their applications to machine translation and question answering.

Graduate students are expected to have an in-depth understanding of these techniques. For example, graduate students are expected to know how to code the underlying machine learning framework necessary for text classification, such as logistic regression.

Absence and Class Participation Policy

UA's policy concerning Class Attendance, Participation, and Administrative Drops is available at http://catalog.arizona.edu/policy/class-attendance-participation-and-administrative-drop

The UA policy regarding absences for any sincerely held religious belief, observance or practice will be accommodated where reasonable: http://policy.arizona.edu/human-resources/religious-accommodation-policy

Absences preapproved by the UA Dean of Students (or dean's designee) will be honored. See https://deanofstudents.arizona.edu/absences

Participating in the course and attending lectures and other course events are vital to the learning process. As such, attendance is required at all lectures and discussion section meetings. Students who miss class due to illness or emergency are required to bring documentation from their healthcare provider or other relevant, professional third parties. Failure to submit third-party documentation will result in unexcused absences.

Course Communications

Please use the email addresses above to contact the instructor or the TA. All course materials will be posted in D2L. Please use the Piazza site above to ask clarification questions about the material.

Required Texts or Readings

This course uses the following textbook:

Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. 6th printing with corrections, 2003. The MIT Press. http://nlp.stanford.edu/fsnlp/ (available for free electronically through UA library)

Additional research articles covered in class will be distributed by the instructor.

Highly recommended: for students not comfortable with natural language processing in Python, a companion reference such as the following:

Steven Bird, Ewan Klein, and Edward Loper. 2009. Natural Language Processing with Python. http://nltk.org/book/ (currently available for free electronically through the NLTK website)

Required or Special Materials

No special tools or supplies needed.

Assignments and Examinations: Schedule/Due Dates

Grading will be based on four assignments, two exams (midterm and final), a programming project, and overall in-class participation. Please note that all four assignments will have a considerable programming component. For the final programming project, students may propose an NLP topic that interests them, or implement one of the topics suggested by the instructor.

As a rule, work will not be accepted late except in case of documented emergency or illness. You may petition the professor in writing for an exception if you feel you have a compelling reason for turning work in late.

The due dates are as follows:

Task             Deadline
HW 1             August 27
HW 2             September 24
Midterm review   October 10
Midterm          October 12
HW 3             October 29
HW 4             November 26
Final review     December 5
Project          December 7

This course will have a comprehensive written final examination. Information on the final exam regulations and schedule:
https://www.registrar.arizona.edu/courses/final-examination-regulations-and-information
http://www.registrar.arizona.edu/schedules/finals.htm

Grading Scale and Policies

The grading scheme is as follows:

Component                Weight
Assignments              300 pts
Midterm exam             200 pts
Final exam               275 pts
Programming project      200 pts
In-class participation   25 pts
Total                    1000 pts

Grade   Point Range
A       900-1000
B       800-899
C       700-799
D       600-699
E       0-599

Undergraduate vs. Graduate Requirements

This course will be co-convened. To differentiate between graduate and undergraduate students, the instructor will require graduate students to implement more complex algorithms for the programming project, which might require additional reading of research articles. The instructor will provide the additional reading material and will guide the research process. Similarly, assignments and exams will have additional requirements/questions for graduate students. The overall grading scheme will be the same between graduate and undergraduate students (see the two tables above).

Requests for incomplete (I) or withdrawal (W) must be made in accordance with University policies, which are available at http://catalog.arizona.edu/policy/grades-and-grading-system#incomplete and http://catalog.arizona.edu/policy/grades-and-grading-system#withdrawal, respectively.

Scheduled Topics/Activities

The course will cover the topics listed below ("MS x" indicates the corresponding chapter in the Manning/Schütze textbook):

Week 1: Introduction, text categorization. Readings: MS 1, 16
Week 2: Crash course in ML for text categorization: kNN, perceptron, logistic regression, feed forward neural networks. Readings: Materials provided by instructor
Week 3: Crash course in ML, part 2
Week 4: Distributional similarity: count-based methods, word embeddings. Readings: Materials provided by instructor
Week 5: Probability theory. Readings: MS 2 + materials
Week 6: Probability theory, part 2
Week 7: N-gram models. Readings: MS 5, 6 + parts of MS 7, 8
Week 8: Midterm review and exam
Week 9: Sequence models: HMM, MEMM, LSTM, and applications to part-of-speech tagging and information extraction. Readings: MS 9, 10 + materials
Week 10: Sequence models, part 2
Week 11: Structured learning: shift-reduce algorithms, PCFG, tree LSTM, and applications to syntactic parsing. Readings: MS 11, 12 + materials
Week 12: Structured learning, part 2
Week 13: Alignment models and applications to machine translation and question answering. Readings: MS 13 + materials
Week 14: Alignment models, part 2
Week 15: Advanced techniques: question answering, reading comprehension, summarization. Readings: Materials provided by instructor

Classroom Behavior Policy

To foster a positive learning environment, students and instructors have a shared responsibility. We want a safe, welcoming, and inclusive environment where all of us feel comfortable with each other and where we can challenge ourselves to succeed. To that end, our focus is on the tasks at hand and not on extraneous activities (e.g., texting, chatting, reading a newspaper, making phone calls, web surfing, etc.).

Inclusive Excellence is a fundamental part of the University of Arizona's strategic plan and culture. As part of this initiative, the institution embraces and practices diversity and inclusiveness. These values are expected, respected and welcomed in this course.

Students are asked to refrain from disruptive conversations with people sitting around them during lecture. Students observed engaging in disruptive activity will be asked to cease this behavior. Those who continue to disrupt the class will be asked to leave lecture or discussion and may be reported to the Dean of Students.

Some learning styles are best served by using personal electronics, such as laptops and iPads. These devices can be distracting to other learners. Therefore, students who prefer to use electronic devices for note-taking during lecture should use one side of the classroom.

Threatening Behavior Policy

The UA Threatening Behavior by Students Policy prohibits threats of physical harm to any member of the University community, including to oneself. See http://policy.arizona.edu/education-and-student-affairs/threatening-behavior-students.

Elective Name and Pronoun Usage

This course supports elective gender pronoun use and self-identification; rosters indicating such choices will be updated throughout the semester, upon student request. As the course includes group work and in-class discussion, it is vitally important for us to create an educational environment of inclusion and mutual respect.

Accessibility and Accommodations

Our goal in this classroom is that learning experiences be as accessible as possible. If you anticipate or experience physical or academic barriers based on disability, please let me know immediately so that we can discuss options. You are also welcome to contact the Disability Resource Center (520-621-3268) to establish reasonable accommodations. For additional information on the Disability Resource Center and reasonable accommodations, please visit http://drc.arizona.edu.

If you have reasonable accommodations, please plan to meet with me by appointment or during office hours to discuss accommodations and how my course requirements and activities may impact your ability to fully participate.

Please be aware that the accessible table and chairs in this room should remain available for students who find that standard classroom seating is not usable.

Code of Academic Integrity

Students are encouraged to share intellectual views and discuss freely the principles and applications of course materials. However, graded work/exercises must be the product of independent effort unless otherwise instructed. Students are expected to adhere to the UA Code of Academic Integrity as described in the UA General Catalog. See http://deanofstudents.arizona.edu/academic-integrity/students/academic-integrity.

The University Libraries have some excellent tips for avoiding plagiarism, available at http://www.library.arizona.edu/help/tutorials/plagiarism/index.html.

Selling class notes and/or other course materials to other students or to a third party for resale is not permitted without the instructor's express written consent. Violations of this and other course rules are subject to the Code of Academic Integrity and may result in course sanctions. Additionally, students who use D2L or UA e-mail to sell or buy these copyrighted materials are subject to Code of Conduct Violations for misuse of student e-mail addresses. This conduct may also constitute copyright infringement.

UA Nondiscrimination and Anti-harassment Policy

The University is committed to creating and maintaining an environment free of discrimination; see http://policy.arizona.edu/human-resources/nondiscrimination-and-anti-harassment-policy

Our classroom is a place where everyone is encouraged to express well-formed opinions and their reasons for those opinions. We also want to create a tolerant and open environment where such opinions can be expressed without resorting to bullying or discrimination of others.

Department of Computer Science Code of Conduct

The Department of Computer Science is committed to providing and maintaining a supportive educational environment for all. We strive to be welcoming and inclusive, respect privacy and confidentiality, behave respectfully and courteously, and practice intellectual honesty. Disruptive behaviors (such as physical or emotional harassment, dismissive attitudes, and abuse of department resources) will not be tolerated. The complete Code of Conduct is available on our department web site. We expect that you will adhere to this code, as well as the UA Student Code of Conduct, while you are a member of this class.

Additional Resources for Students

UA Academic policies and procedures are available at http://catalog.arizona.edu/policies
Student Assistance and Advocacy information is available at http://deanofstudents.arizona.edu/student-assistance/students/student-assistance
Office of Diversity information is available at http://diversity.arizona.edu/
Campus Health information may be found here: http://www.health.arizona.edu/counseling-and-psych-services
OASIS Sexual Assault and Trauma Services: http://oasis.health.arizona.edu/hpps_oasis_program.htm

Subject to Change Statement

Information contained in the course syllabus, other than the grade and absence policy, may be subject to change with advance notice, as deemed appropriate by the instructor.