Introduction to Machine Learning

1, DATA11002 Introduction to Machine Learning Lecturer: Teemu Roos TAs: Ville Hyvönen and Janne Leppä-aho Department of Computer Science, University of Helsinki (based in part on material by Patrik Hoyer and Jyrki Kivinen) November 2nd to December 15th 2017

2, Introduction Practical details of the course Lectures Exercises Exam Grading Course outline What is machine learning? Motivation & examples Definition Relation to other fields Examples

3, Practical details (1) Lectures: November 2nd (today) to December 15th, Thursdays at 2pm-4pm and Fridays at 10am-12pm in Exactum CK112 Lecturer: Teemu Roos (Exactum A322, teemu.roos@cs.helsinki.fi) Language: English Based on the course textbook (next slide) (previous instances of this course have used different textbooks)

4, Practical details (2) Textbook: authors: Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani title: An Introduction to Statistical Learning with Applications in R publisher: Springer (2013, first edition) web page: www-bcf.usc.edu/~gareth/isl/ we'll cover the whole book except splines and generalized additive models (GAMs) and include some additional Bayesian stuff

5, Practical details (3) Lecture material this set of slides (by Hoyer/Kivinen/Roos) is intended for use as part of the actual lectures, together with the blackboard etc. we will cover some topics in more detail than the textbook (and some less) in particular some additional detail is needed for homework problems both the selected parts of the textbook as well as additional material indicated on the course homepage are required material for the exam

6, Practical details (4) Exercises: Two kinds: mathematical exercises (pencil-and-paper) computer exercises (support given in R but Python is a good choice too) Problem set handed out every Friday, focusing on topics from that week's lectures Solutions returned at the exercise sessions NB: You get points only by attending your own exercise group (not group 99) Solutions can be returned by email only in exceptional circumstances, not including being busy at work: email janne.leppa-aho@helsinki.fi Language of exercise sessions: English Exercise points make up 40% of your total grade; you must get at least half the points to be eligible for the course exam.

7, Practical details (5) Exercises this week: This week we offered voluntary R tutorials Wednesday Nov 1st at 4pm (C222) and Thursday Nov 2nd at 12pm (D123) Instruction on R and its features used on this course Voluntary, no points awarded. Recommended for everyone not previously familiar with R. Bring your own laptop, with R (and possibly RStudio) installed.

8, Practical details (6) Course exam (these can sometimes change with short notice!): December 19th at 9:00am (NB: not 9:15am) Makes 60% of your course grade Must get a minimum of half the points of the exam to pass the course Pencil-and-paper problems, similar style as in exercises (also essay or explain problems) (Note: To be eligible to take a separate exam you need to first complete some programming assignments. These will be available on the course web page a bit later. However since you are here at the lecture, this probably does not concern you.) You may answer exam problems also in Finnish or Swedish.

9, Practical details (8) Prerequisites: Mathematics: Basics of probability theory and statistics, linear algebra (i.e., vectors and matrices) and real analysis (i.e., derivatives, etc.) Computer science: Good programming skills (but no previous familiarity with R necessary)

10, Related courses Various advanced Data Science courses: Advanced Course in Machine Learning (period IV) Introduction to Bayesian Inference (period II) Computational Statistics I-II (periods I-II) High Dimensional Statistics (period II) Introduction to Data Science (period I) Data Mining (self study, plus optional project) Deep Learning (period II) Seminar: Deep Learning for Natural Language Processing (periods III-IV) Seminar: Machine Learning Methods for Fossil Data Analysis (period II) Lots of courses at Aalto as well!

11, Practical details (9) Course material: Webpage (public information about the course): https://courses.helsinki.fi/en/data11002/119123177 NB: You should have signed up on the department registration system Help? Ask the assistants/lecturer at exercises/lectures Contact assistants/lecturer

12, Course outline Introduction Ingredients of machine learning task, models, data evaluation and model selection Supervised learning classification regression Unsupervised learning clustering dimension reduction

13, What is machine learning? Definition: machine = computer, computer program (in this course) learning = improving performance on a given task, based on experience / examples In other words instead of the programmer writing explicit rules for how to solve a given problem, the programmer instructs the computer how to learn from examples in many cases the computer program can even become better at the task than the programmer is!

14, Example 1: tic-tac-toe How to program the computer to play tic-tac-toe? Option A: The programmer writes explicit rules, e.g. if the opponent has two in a row, and the third is free, stop it by placing your mark there, etc. (lots of work, difficult, not at all scalable!) Option B: Go through the game tree, choose optimally (for non-trivial games, must be combined with some heuristics to restrict tree size) Option C: Let the computer try out various strategies by playing against itself and others, and noting which strategies lead to winning and which to losing (= "machine learning")
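Option B can be sketched for tic-tac-toe, where the game tree is small enough to search exhaustively without heuristics. This is a minimal sketch, assuming the board is encoded as a 9-character string of "X", "O" and spaces; the names `winner`, `value` and `best_move` are illustrative and not from the course material.

```python
from functools import lru_cache

# The eight winning lines, as index triples into the 9-character board string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def other(player):
    return "O" if player == "X" else "X"


def place(board, i, player):
    """Return a new board with the player's mark at index i."""
    return board[:i] + player + board[i + 1:]


@lru_cache(maxsize=None)
def value(board, player):
    """Game value with optimal play, from X's point of view.

    +1 means X wins, 0 a draw, -1 O wins; `player` is the side to move.
    """
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if " " not in board:
        return 0
    results = [value(place(board, i, player), other(player))
               for i, cell in enumerate(board) if cell == " "]
    # X picks the child with the highest value, O the lowest (minimax).
    return max(results) if player == "X" else min(results)


def best_move(board, player):
    """Return the index of an optimal move for the side to move."""
    moves = [i for i, cell in enumerate(board) if cell == " "]

    def score(i):
        return value(place(board, i, player), other(player))

    return max(moves, key=score) if player == "X" else min(moves, key=score)
```

For example, `best_move("XX OO    ", "X")` completes the top row. For larger games such as checkers or chess the same recursion would need a depth limit and a heuristic evaluation, as the slide notes.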

15, Arthur Samuel (1950s and 1960s): Computer program that learns to play checkers Program plays against itself thousands of times, learns which positions are good and which are bad (i.e. which lead to winning and which to losing) The computer program eventually becomes much better than the programmer.

16, Example 2: spam filter Programmer writes rules: If it contains "viagra" then it is spam. (difficult, not user-adaptive) The user marks which mails are spam, which are legit, and the computer learns by itself which words are predictive. Example mails: From: medshop@spam.com Subject: viagra cheap meds... (spam) From: my.professor@helsinki.fi Subject: important information here's how to ace the test... (non-spam) From: mike@example.org Subject: you need to see this how to win $1,000,000... (?)
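Learning which words are predictive from user-labelled mails can be illustrated with a small word-count classifier. The slide does not name a method; the sketch below uses naive Bayes with add-one smoothing as one common choice, and the training texts, the label names "spam" and "ham", and the function names are all made up for illustration.

```python
import math
from collections import Counter


def train(examples):
    """Count word occurrences per label from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    n_docs = Counter()
    for text, label in examples:
        counts[label].update(text.lower().split())
        n_docs[label] += 1
    return counts, n_docs


def classify(text, counts, n_docs):
    """Return the label with the higher naive Bayes log-score."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    total_docs = sum(n_docs.values())
    scores = {}
    for label in counts:
        total_words = sum(counts[label].values())
        # Log prior: fraction of training mails carrying this label.
        score = math.log(n_docs[label] / total_docs)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[label][word] + 1)
                              / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical user-labelled mails (invented, not from the slides).
EXAMPLES = [
    ("viagra cheap meds", "spam"),
    ("win money now cheap", "spam"),
    ("important information about the test", "ham"),
    ("lecture notes for the exam", "ham"),
]
COUNTS, N_DOCS = train(EXAMPLES)
```

With this toy data, `classify("cheap meds now", COUNTS, N_DOCS)` returns `"spam"`. Because the counts come from the user's own labels, the filter adapts to that user, unlike a fixed hand-written rule.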

17, Problem setup One definition of machine learning: A computer program improves its performance on a given task with experience (i.e. examples, data). So we need to separate Task: What is the problem that the program is solving? Performance measure: How is the performance of the program (when solving the given task) evaluated? Experience: What is the data (examples) that the program is using to improve its performance?

18, Related scientific disciplines (1) Artificial Intelligence (AI) Machine learning can be seen as one approach towards implementing intelligent machines (or at least machines that behave in a seemingly intelligent way). Artificial neural networks, computational neuroscience Inspired by and trying to mimic the function of biological brains, in order to make computers that learn from experience. Modern machine learning really grew out of the neural networks boom in the 1980s and early 1990s. Pattern recognition Recognizing objects and identifying people in controlled or uncontrolled settings, from images, audio, etc. Such tasks typically require machine learning techniques.

19, Availability of data These days it is very easy to collect data (sensors are cheap, much information is digital), store data (hard drives are big and cheap), and transmit data (essentially free on the internet). The result? Everybody is collecting large quantities of data. Businesses: shops (market-basket data), search engines (web pages and user queries), financial sector (stocks, bonds, currencies etc.), manufacturing (sensors of all kinds), social networking sites (Facebook, Twitter), anybody with a web server (hits, user activity) Science: genomes sequenced, gene expression data, experiments in high-energy physics, images of remote galaxies, global ecosystem monitoring data, drug research and development, public health data But how to benefit from it? Analysis is becoming key!

20, Big Data one definition: data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges (Oxford English Dictionary) 3V: volume, velocity, and variety (Doug Laney, 2001) a database may be able to handle a lot of data, but you can't implement a machine learning algorithm as an SQL query on this course we do not consider technical issues relating to extremely large data sets basic principles of machine learning still apply, but many algorithms may be difficult to implement efficiently

21, Related scientific disciplines (2) Data mining Trying to identify interesting and useful associations and patterns in huge datasets Focus on scalable algorithms Example: shopping basket analysis Statistics historically, introductory courses on statistics tend to focus on hypothesis testing and some other basic problems however there's a lot more to statistics than hypothesis testing there is a lot of interaction between research in machine learning, data mining and statistics

22, Example 3 Prediction of search queries The programmer provides a standard dictionary (words and expressions change!) Previous search queries are used as examples!
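Using previous search queries as examples can be sketched as a frequency-ranked prefix lookup. This is only a toy illustration, assuming the query log fits in memory as a list of strings; the function name `suggest` and the sample log are invented.

```python
from collections import Counter


def suggest(prefix, past_queries, k=3):
    """Return up to k past queries starting with prefix, most frequent first."""
    counts = Counter(q.strip().lower() for q in past_queries)
    prefix = prefix.lower()
    matches = [(q, c) for q, c in counts.items() if q.startswith(prefix)]
    # Sort by descending frequency, breaking ties alphabetically.
    matches.sort(key=lambda qc: (-qc[1], qc[0]))
    return [q for q, _ in matches[:k]]


# Hypothetical query log (invented for illustration).
LOG = [
    "machine learning",
    "machine learning",
    "machine translation",
    "mac mini",
    "weather helsinki",
]
```

Here `suggest("mac", LOG, k=2)` ranks "machine learning" first because it occurs twice in the log. Unlike a fixed dictionary, the suggestions change as new queries arrive.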

23, Example 4 Ranking search results: Various criteria for ranking results What do users click on after a given search? Search engines can learn what users are looking for by collecting queries and the resulting clicks.

24, Example 5 Detecting credit card fraud Credit card companies typically end up paying for fraud (stolen cards, stolen card numbers) Useful to try to detect fraud, for instance large transactions Important to be adaptive to the behaviors of customers, i.e. learn from existing data how users normally behave, and try to detect unusual transactions
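Learning how a customer normally behaves and flagging unusual transactions can be illustrated with a very simple statistical rule. The sketch below assumes only a list of one customer's past transaction amounts and flags amounts more than a chosen number of standard deviations from the mean; the threshold of 3 and the sample amounts are arbitrary illustrative choices, and real systems use far richer features.

```python
from statistics import mean, stdev


def is_unusual(history, amount, threshold=3.0):
    """Flag an amount that lies far from this customer's usual spending.

    `history` needs at least two past amounts so that a standard
    deviation can be estimated.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts identical: anything different is unusual.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical past transaction amounts for one customer.
HISTORY = [12.5, 30.0, 22.0, 18.0, 25.0, 15.0]
```

With this history, an amount of 900.0 is flagged while 24.0 is not. The rule is adaptive in the sense of the slide: the same amount may be normal for one customer and unusual for another.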

25, Example 6 Self-driving cars: Sensors (radars, cameras) superior to humans How to make the computer react appropriately to the sensor data?

26, Example 7 Character recognition: Automatically sorting mail (handwritten characters) Digitizing old books and newspapers into easily searchable format (printed characters)

27, Example 8 Recommendation systems ("collaborative filtering"): Amazon: Customers who bought X also bought Y... Netflix: Based on your movie ratings, you might enjoy... Challenge: One million dollars ($1,000,000) prize money recently awarded! (Figure: table of ratings on a 1 to 5 scale by users Linda, Jack, Bill, Lucy and John for the movies Seven, Fargo, Aliens, Leon and Avatar, with unknown ratings marked "?")
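The idea of filling in a missing rating from the ratings of similar users can be sketched as follows. This is a minimal user-based example, assuming ratings are stored as nested dictionaries and using cosine similarity over co-rated movies; the rating values are invented for illustration and are not the ones shown on the slide.

```python
import math


def similarity(a, b):
    """Cosine similarity of two users over the movies both have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in common))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in common))
    return dot / (norm_a * norm_b)


def predict(ratings, user, movie):
    """Similarity-weighted average of other users' ratings for the movie."""
    num = 0.0
    den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        s = similarity(ratings[user], other_ratings)
        num += s * other_ratings[movie]
        den += s
    return num / den if den else None


# Hypothetical ratings on a 1 to 5 scale (invented values).
RATINGS = {
    "Linda": {"Seven": 5, "Fargo": 4, "Aliens": 1},
    "Jack": {"Seven": 5, "Fargo": 5, "Aliens": 1, "Avatar": 2},
    "Bill": {"Seven": 1, "Fargo": 2, "Aliens": 5, "Avatar": 5},
}
```

`predict(RATINGS, "Linda", "Avatar")` lands closer to Jack's rating than to Bill's, because Linda's known ratings resemble Jack's. Cosine similarity on raw ratings is a crude choice; subtracting each user's mean rating first is a common refinement.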

28, Example 9 Machine translation: Traditional approach: Dictionary and explicit grammar More recently, statistical machine translation based on example data is increasingly being used

29, Example 10 Online store website optimization: What items to present, what layout? What colors to use? Can significantly affect sales volume Experiment, and analyze the results! (lots of decisions on how exactly to experiment and how to ensure meaningful results)
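One way to check whether an experiment gave a meaningful result is a two-proportion z-test on the conversion rates of two page variants. The slide does not prescribe a method; this is a sketch of one standard choice, with invented visitor and purchase counts, and it does not handle the degenerate case where nobody or everybody converts.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    Returns (z, p_value) for a two-sided test of equal rates,
    using the pooled normal approximation.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 1000 visitors per layout,
# 100 purchases with layout A and 150 with layout B.
Z, P_VALUE = two_proportion_z_test(100, 1000, 150, 1000)
```

A small p-value suggests the difference between layouts is unlikely to be chance alone, but it says nothing about whether the experiment itself was well designed (for example, whether visitors were assigned to layouts at random).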

30, Example 11 Mining chat and discussion forums Breaking news Detecting outbreaks of infectious disease Tracking consumer sentiment about companies / products

31, Example 12 Real-time sales and inventory management Picking up quickly on new trends (what's hot at the moment?) Deciding on what to produce or order

32, Example 13 Prediction of friends on Facebook, or prediction of who you'd like to follow on Twitter.

33, What about privacy? Users are surprisingly willing to sacrifice privacy to obtain useful services and benefits Regardless of what position you take on this issue, it is important to know what can and what cannot be done with various types of information (i.e. what the dangers are) Privacy-preserving data mining What type of statistics/data can be released without exposing sensitive personal information? (e.g. government statistics) Developing data mining algorithms that limit exposure of user data (e.g. Collaborative filtering with privacy, Canny 2002)

34, We're in this together. Let's do it!