CIS 520 Machine Learning


CIS 520 Machine Learning
Shivani Agarwal & Lyle Ungar
Computer and Information Science
Lyle Ungar, University of Pennsylvania

Introductions
- Who am I?
- Who are you?
  - Why are you here?
- What will this course look like?
  - Lectures & Recitations
    - Slides, chalkboard, wiki & clickers
  - Homework
    - Math and MATLAB
    - Canvas and turnin
  - Exams
    - Midterm and final

Course goals
- Be familiar with all major ML methods
  - Regression (linear, logistic) & feature selection
  - Decision trees & random forests
  - Naive Bayes, Bayes Nets, Markov Nets, HMMs
  - SVM, kernels, PCA, CCA
  - Online learning: boosting
  - Deep learning
- Know their strengths and weaknesses
  - Know jargon, concepts, theory
  - Be able to modify and code algorithms
  - Be able to read current literature

Introductions (2)
- If you're waiting to get into this course
  - It won't happen :(
  - But the course will be offered again in the spring
- Alternate courses
  - CIS 419/519 Intro to Machine Learning
  - STAT 471/571/701 Modern Data Mining
  - CIS 545: Big Data Analytics

Administrivia
- Course wiki
  - Lecture notes
  - Resources
    - Grading scheme, academic integrity, office hours
  - Reading (including the Bishop textbook, free online)
    - Mostly for reading after lectures
    - But will sometimes add background info
- Canvas
  - Homework, grades
  - Lecture recordings
    - But don't count on them being useful
- Piazza
  - Look here first for answers!

Do you have Polleverywhere? A) Yes B) No

Working Together
Homework is mostly pair programming or pair problem solving.
If it is determined that code submitted by two students might have been copied:
A) Both will receive half credit
B) The person who copied will be referred to the Office of Student Conduct (OSC)
C) Both students will be referred to the Office of Student Conduct (OSC)
D) None of the above

Asking Questions
- Questions about homework should be
A) Asked during office hours
B) Emailed to the instructor or a TA
C) Asked on Piazza
D) Two of the above
E) None of the above

MATLAB
- We will use MATLAB
  - Free
- MATLAB is a better language than Python
A) True
B) False
- MATLAB and Octave are
A) Very different languages
B) Almost identical
C) Fully interchangeable except for the user interface
D) None of the above

Where is Machine Learning used?
https://alliance.seas.upenn.edu/~cis520/wiki/

Types of Learning
- Supervised: X, y
  - Given an observation x, what is the best label y?
- Unsupervised: X
  - Given a set of x's, cluster or summarize them
What kinds of learning are missing here?
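To make the distinction concrete, here is a minimal sketch in plain Python. The data, the label names, and the helper functions (`fit_centroids`, `predict`, `two_means`) are all made up for illustration and are not from the course materials. The supervised part uses the labels y to learn one centroid per class; the unsupervised part sees only the x's and groups them with a small two-cluster k-means loop.

```python
# Toy 1-D data; the values and labels are invented for illustration.
xs = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
ys = ["a", "a", "a", "b", "b", "b"]  # labels, used only in the supervised case


def fit_centroids(xs, ys):
    """Supervised: average the x values that share each label y."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: sum(v) / len(v) for label, v in groups.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def two_means(xs, iters=10):
    """Unsupervised: split xs into two clusters without using any labels.

    Assumes xs contains at least two distinct values.
    """
    c0, c1 = min(xs), max(xs)  # simple initialisation at the extremes
    for _ in range(iters):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return sorted([c0, c1])


centroids = fit_centroids(xs, ys)
print(predict(centroids, 1.1))  # supervised: predicts a label for a new x
print(two_means(xs))            # unsupervised: summarizes xs by two centers
```

On this toy data both approaches find roughly the same two groups, but only the supervised one can name them, because only it was given y.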

Types of Learning
- Supervised: X, y
  - P(y | x) - conditional probability estimation
  - min ||y_est(x) - y|| - optimization
- Unsupervised: X
  - P(x) - generative model
Are you familiar with regression as a conditional probability? A) Yes B) No
Are you familiar with regression as a minimization problem? A) Yes B) No
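The two views of regression can be compared on a tiny example. The sketch below is an illustration under stated assumptions, not the course's derivation: it assumes a one-parameter model y ≈ w·x, squared error as the loss, and Gaussian noise with a fixed standard deviation `sigma`. Under those assumptions the negative log-likelihood of P(y | x) equals a constant plus the squared error divided by 2·sigma², so the same w minimizes both. The data and the function names `sse` and `nll` are hypothetical.

```python
import math

# Toy data, invented for illustration: y is roughly equal to x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.1, 2.9]


def sse(w):
    """Optimization view: sum of squared errors of the model y_est(x) = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys))


def nll(w, sigma=1.0):
    """Probabilistic view: negative log-likelihood of y given x, assuming
    y = w * x + Gaussian noise with standard deviation sigma."""
    n = len(xs)
    return n * math.log(sigma * math.sqrt(2 * math.pi)) + sse(w) / (2 * sigma ** 2)


# Closed-form least-squares solution for a single weight with no intercept.
w_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

print(w_hat)
# The same w_hat beats nearby values under both criteria.
print(sse(w_hat) < sse(w_hat + 0.1), nll(w_hat) < nll(w_hat + 0.1))
```

Changing `sigma` rescales and shifts `nll` but does not move its minimizer, which is why the fixed-variance assumption lets the two views agree on w.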

Consider the Netflix problem
- Given a list of people and the ratings they have given movies, predict their ratings on other movies
- What type of learning is this?
A) supervised
B) unsupervised
C) something else
- How might you go about solving it?
If you have questions, raise your hand and I'll come around.
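One simple starting point, offered only as a sketch and not as the answer the slide has in mind, is a baseline that predicts the overall mean rating adjusted by how generous each user is and how well liked each movie is. The ratings, user names, movie ids, and the 1 to 5 rating scale below are all invented for illustration.

```python
# (user, movie, rating) triples; all values are made up for illustration.
ratings = [
    ("ann", "m1", 5), ("ann", "m2", 3),
    ("bob", "m1", 4), ("bob", "m3", 2),
    ("cat", "m2", 4), ("cat", "m3", 3),
]


def mean(values):
    return sum(values) / len(values)


# Global average rating over all observed triples.
mu = mean([r for _, _, r in ratings])


def offsets(index):
    """Average deviation from mu, grouped by user (index 0) or movie (index 1)."""
    groups = {}
    for row in ratings:
        groups.setdefault(row[index], []).append(row[2] - mu)
    return {key: mean(devs) for key, devs in groups.items()}


user_bias = offsets(0)
movie_bias = offsets(1)


def predict(user, movie):
    """Baseline guess: global mean plus user and movie offsets.

    Unseen users or movies get an offset of 0; the result is clipped to
    the assumed 1-5 rating scale.
    """
    guess = mu + user_bias.get(user, 0.0) + movie_bias.get(movie, 0.0)
    return min(5.0, max(1.0, guess))


print(predict("ann", "m3"))  # a (user, movie) pair with no observed rating
```

This treats the task as supervised regression on (user, movie) pairs; approaches that also exploit similarity between users or between movies would go beyond this baseline.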

Assessing code quality
- Given a bunch of student homework solutions and the ratings that graders gave them for coding style, estimate the ratings for future code.
- What type of learning is this?
A) supervised
B) unsupervised
C) something else
- How might you go about solving it?
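Much of the work in a problem like this is deciding what X and y are. The sketch below shows one hypothetical way to turn a source file into a numeric feature vector; the three features, the function name `style_features`, the sample submission, and the grader rating are all assumptions chosen for illustration, not features recommended by the course. Comments are assumed to start with `%`, as in MATLAB.

```python
def style_features(source, comment_prefix="%"):
    """Map source code text to [comment fraction, mean line length, max line length].

    Blank lines are ignored. The feature choice is illustrative only.
    """
    lines = [ln for ln in source.splitlines() if ln.strip()]
    if not lines:
        return [0.0, 0.0, 0.0]
    n_comments = sum(1 for ln in lines if ln.strip().startswith(comment_prefix))
    lengths = [len(ln) for ln in lines]
    return [
        n_comments / len(lines),
        sum(lengths) / len(lengths),
        float(max(lengths)),
    ]


# A made-up submission and a made-up grader rating.
submission = "% add two numbers\nfunction c = add(a, b)\nc = a + b;\nend\n"
X = [style_features(submission)]  # one row of features per submission
y = [4]                           # the grader's style rating for that row

print(X, y)
```

With many (features, rating) pairs collected this way, any supervised regression method could be fit to predict ratings for new code.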

ML vs. Statistics

TODO
- Join Piazza
  - Linked to from the course wiki
  - https://alliance.seas.upenn.edu/~cis520/wiki
- Install Polleverywhere (free)
- Install MATLAB (free from Penn)
- Go to Canvas
  - Do HW 0 (trivial LaTeX)

What you should know
- Turning a real-world problem into a well-posed ML problem is often hard
  - E.g. generate features/predictors, pick X and y
- Unsupervised vs. supervised
  - Generative P(x) vs. conditional P(y | x) models
- Canvas, Piazza, course wiki