Course Syllabus

LHS 712 Natural Language Processing for Health
SYLLABUS

Class #: 32394
Instructor: V. G. Vinod Vydiswaran (vgvinodv@umich.edu)
Meeting schedule: Thursdays, 1:00-4:00 pm, 2813/2817 Medical Sciences-II Building
Dates: Jan 5th, 2017 to April 13th, 2017
Office Hour: Thursdays, 4:00-4:55 pm, 1161F North Ingalls Building (starting Jan 05)

Students in this course will learn advanced techniques to parse and collate information from text-rich health documents such as electronic health records, clinical notes, and peer-reviewed medical literature. In this elective, students will be able to delve deeper into challenges in recognizing medical entities in text documents, extracting clinical information, addressing ambiguity and polysemy, and building searchable interfaces to efficiently and effectively query and retrieve relevant patient data. Students will develop tools and techniques to analyze new genres of health information, and build resources to help in these tasks. Students will also participate in a semester-long project on addressing specific natural language processing challenges in real-life health data sets.

A. Learning objectives

By completion of the course, students will be able to:
1. Describe the role of medical natural language processing in improving health and healthcare.
2. Identify the major natural language processing challenges in health data.
3. Develop skills to process and extract information from health-related free-text data.
4. Apply state-of-the-art machine-learning techniques to extract information from medical text data.
5. Analyze and critique natural language processing tools currently available for medical text processing.
6. Explore the recent trends and open directions in the field of medical natural language processing.

B. Course Format and Grading

This course will be taught using multiple methods, including instructor and student-led discussions, in-class exercises, programming and reading assignments, and a longitudinal project. The instructor will give tutorials and lead discussions to introduce the basic principles and tasks of natural language processing on health data. After the first two weeks, each weekly session for the rest of the semester will start with an instructor tutorial about the methods, followed by a student-led tutorial on applications of the methods. Depending on the background of the cohort, the instructor will decide whether to give more tutorials about the methodology of mining particular genres of data, or let students with the right background lead the tutorials.

B.1. Grading

Grading will be based on:
Reading and critiquing peer-reviewed papers (15%)
Student-led presentation and discussion (10%)
Mid-term examination (20%)
Assignments based on NLP tools (25%)
Semester-long course project (30%)

B.1.1. One-page summaries (15%): Each week, starting week 2, students are expected to write a one-page summary based on the reading assignment for the previous week. The reading assignments will cover significant papers on the topics being discussed in class. The one-page summaries are expected to discuss the key concepts described in the paper, rather than merely stating what the paper is about. A summary of key contributions, potential limitations, suggested improvements, and ideas for future follow-up work based on the paper is encouraged. All summaries will be read by the instructor, but not graded or returned to the students. Students are expected to hand in at least ten summaries over the semester.

B.1.2. Student-led presentation and discussion (10%): Students are expected to lead one tutorial on the applications of the methods discussed in class. The students will lead the discussion both during and after their presentation on the topics assigned to them. In preparation for their presentation, students will be required to survey the state-of-the-art techniques from major conferences and journals for recent developments and applications of these methods. The discussion will focus on how to apply the methods to solve particular problems and build various applications. Each student will be in charge of at least one topic, depending on the size of the cohort. Students who are not presenting or leading the discussion will be required to actively participate in discussion and write a short survey on the assigned topic.

B.1.3. Mid-term exam (20%): The mid-term exam will be a take-home test and will be administered around Week 8 of the semester (early March 2017).

B.1.4. Assignments (25%): There will be 3-4 assignments during the semester based on specific health text processing tasks. The tasks will be closely related to the course material, with real-world data and gold-standard judgments provided. This may include an in-situ data mining challenge using online competition services such as Kaggle-in-Class (http://inclass.kaggle.com). Students can submit and resubmit their results to the competition site and get instant feedback (evaluation metrics) from the system. The task will likely be selected from one of the following: severity identification, disease mention detection, forum post classification, etc.

B.1.5. Course Project (30%): A course project is required. Individual projects are preferred. Small group projects are acceptable upon justification. The grading of group members will be adjusted according to their contribution to the project.

The course project will take the format of either a software system that applies existing data mining techniques to a specific type of data, or a research experiment documented in the form of a research paper. Examples of course projects include:
1. A de-identification tool for health records using conditional random fields
2. Retrieving information about relevant clinical trials for a given case
3. Comparing authorship networks and communities in different clinical specialties
4. Identifying high-quality consumer-centric resources

The grading for the course project will be split as follows:

1. Proposal (15%): A two-page proposal, describing the project topic, objectives, expected deliverable (software package, demo, and/or a technical report), and a list of team members and their expected contribution to the project.
Tentative deadline: Around Week 5 (early February)

2. Progress report (10%): A one-page summary of the progress and any hurdles to timely completion of the stated objectives. If there are any significant changes to the submitted proposal, the students should describe them in detail in the progress report. Consider this a checkpoint towards achieving the stated goals of the project. There are no penalties for changes to the proposal document; it may in fact be more prudent to recalibrate or clarify the expected outcomes at this stage.
Tentative deadline: Around Week 10 (mid-March)

3. Project Presentation (25%): Students will give a short presentation to showcase their project in class. The focus of this presentation is to demonstrate and describe what was done, report interesting observations, present key conclusions, and discuss potential limitations of the study. Students working in teams may choose to present as a group or elect one of the team members to present on their behalf. Students will not be penalized for choosing not to present individually, as long as the project itself is showcased.
Tentative schedule: Last lecture of the course (Thursday, April 13, 2017)

4. Final project deliverable and report (50%): Students are expected to submit their project deliverable, along with a brief report. The report should include the key observations and conclusions based on the project and suggest potential follow-up studies. Teams working on the project together must also describe individual contributions of the team members.
Tentative deadline: Thursday of Exam week (Thursday, April 20, 2017)

C. Policies

C.1. Late submission policy

Students have a 72-hour grace period for the entire semester. If necessary, students may use it to submit any of the assignments, homework, or the course project reports late without any effect on the overall grade. The grace period, however, cannot be used to submit the exams or quizzes late. A student may use it all on one assignment or spread it across any number of assignments. Once the grace period is used up, late submissions will not be graded.

C.2. Academic Conduct

C.2.1. Collaboration

As a general practice, the Department of Learning Health Sciences and the instructor strongly encourage collaboration while working on some assignments, such as homework problems and interpreting reading assignments. Active learning is effective. Collaboration with other students in the course will be especially valuable in summarizing the reading materials and picking out the key concepts. You must, however, write your homework submission on your own, in your own words, before turning it in. If you worked with someone on the homework before writing it, you must list any and all collaborators on your written submission. Read the instructions carefully and request clarification about collaboration when in doubt. Collaboration is almost always forbidden for take-home and in-class exams.

C.2.2. Plagiarism

All written submissions must be your own, original work. Original work for narrative questions is not mere paraphrasing of someone else's completed answer: you must not share written answers with each other at all. At most, you should be working from notes you took while participating in a study session. You may incorporate selected excerpts from publications by other authors, but they must be clearly marked as quotations and must be attributed. If you build on the ideas of prior authors, you must cite their work. You may obtain copy-editing assistance, and you may discuss your ideas with others, but all substantive writing and ideas must be your own, or be explicitly attributed to another. Please refer to the Rackham Graduate School's academic policies for the definition of plagiarism, cheating, and other academic misconduct; the consequences for intentional or unintentional plagiarism; and resources to help you avoid it. The policy handbook is available here: http://www.rackham.umich.edu/current-students/policies/academic-policies

C.3. Reasonable accommodations

The university will provide reasonable accommodations to qualified individuals with disabilities upon request. If you think you need an accommodation for a disability, please let the instructor know at your earliest convenience. Some aspects of this course, the assignments, the in-class activities, and the way we teach may be modified to facilitate your participation and progress. As soon as you make me aware of your needs, we can work with the Office of Services for Students with Disabilities (SSD) to help us determine appropriate accommodations. SSD (734-763-3000; http://www.umich.edu/sswd/) typically recommends accommodations through a Verified Individualized Services and Accommodations (VISA) form. I will treat any information that you provide in as confidential a manner as possible. For more information, see https://ssd.umich.edu/article/americans-disabilities-act-ada

It is also the University's policy that every reasonable effort be made to help students avoid negative academic consequences when their religious obligations conflict with academic requirements. Students who expect to miss classes, examinations, or other assignments as a consequence of their religious observance are requested to contact the instructor by the drop/add deadline. For more information, see https://www.provost.umich.edu/calendar/religious_holidays.html

D. Tentative Schedule

Please check this page periodically for a more detailed, accurate, and up-to-date schedule and reading list.

Week 1: Introduction (Jan 05)

Why Medical Natural Language Processing?; Challenges of Big Data in Health

A. NLP Essentials

Week 2: Dealing with words (Jan 12)
tokenization, normalization; word sense disambiguation; n-grams; statistical NLP
Tools: NLTK

Week 3: Processing sentences and corpora (Jan 19)
sentence boundary, syntax; part of speech tagging; negation detection and hedging; regular expressions
Tools: NegEx

B. NLP Tasks and Techniques

Week 4: Text classification (Jan 26)
Decision trees; Support Vector Machines; Naïve Bayes
Tools: Weka

Week 5: Information extraction (Feb 02)
Hidden Markov Models; Conditional Random Fields
Tools: cTAKES

Week 6: De-identification (Feb 09)
Named entity recognition; Protected health information; De-identification
Tools: MetaMap, MIST

Week 7: Information retrieval (Feb 16)
Vector space models; Probabilistic models; term weighting (tf-idf); index construction; ranking retrieved results
Tools: EMERSE, TREC Clinical Decision Support Task

Week 8: Question answering (Feb 23)
question classification; query construction; passage retrieval; answer extraction and ranking
Tools: Watson Health

Week 9: Advanced topics (Mar 09)
Summarization; Sentiment analysis; Challenges due to acronyms (polysemy, synonymy); Deep learning

C. Medical NLP Resources

Week 10: Medical ontologies (Mar 16)
UMLS; ICD code; SNOMED

Week 11: Medical NLP systems (Mar 23)
MetaMap; MedLEE; cTAKES

Week 12: Datasets and Shared Challenges (Mar 30)
i2b2 and MIMIC; NHANES and NAMCS; Healthdata.gov; TREC-Clinical Decision Support

Week 13: Research directions in Medical NLP and Biomedical informatics (Apr 06)

Week 14: Project presentations (Apr 13)

E. Suggested Readings

The readings of this course will be selected from the recent literature in major journals and conference proceedings in the field of medical informatics. They include, but are not limited to, the Journal of the American Medical Informatics Association (JAMIA), the Journal of Biomedical Informatics (JBI), the Journal of Medical Internet Research (JMIR), Bioinformatics, and conferences such as the Annual Meeting of the American Medical Informatics Association (AMIA). Some papers published in Computer Science venues that describe relevant methodologies for natural language tasks will also be selected. Such venues include the Association for Computational Linguistics (ACL), Empirical Methods in Natural Language Processing (EMNLP), and the Association for the Advancement of Artificial Intelligence (AAAI). Students are also encouraged to review and suggest relevant literature to add to the reading list.

E.1. Optional Textbook

The following is an optional textbook that could be used for supplemental reading.
1. Kevin B Cohen and Dina Demner-Fushman. Biomedical Natural Language Processing.
This book has a good introduction to various biomedical natural language processing tasks for those with a working knowledge of NLP.

F. Administrative notes

1. Regular class schedule: Thursdays, 1:00 pm to 4:00 pm, starting Jan 05
Note: The classes will run on Michigan time (start 10 minutes past the scheduled time).

2. Classroom: 2813/2817 Medical Sciences-II Building
For directions, see https://campusinfo.umich.edu/campusmap/102

3. Office hours: Thursdays, 4:00 pm to 4:55 pm, 1161F North Ingalls Building, starting Jan 05.

4. Course website: We'll be using Canvas for this course. https://ctools.umich.edu

5. Instructor: V. G. Vinod Vydiswaran, Ph.D.
Assistant Professor, Department of Learning Health Sciences, Medical School, University of Michigan
Assistant Professor (courtesy), School of Information, University of Michigan
1161F - NIB, 300 N. Ingalls Street, Ann Arbor, MI 48109
(734) 763-0080
vgvinodv@umich.edu (preferred mode to reach the instructor)
Note: Please begin the subject line with [LHS 712].