Introduction to Machine Learning


1, 582631 Introduction to Machine Learning (5 credits). Lecturer: Jyrki Kivinen. Assistant: Johannes Verwijnen. Department of Computer Science, University of Helsinki. Based on material created by Patrik Hoyer and others. 28 October to 12 December 2014.

2, Introduction. What is machine learning? Motivation & examples; definition; relation to other fields; examples. Course outline and related courses. Practical details of the course: lectures, exercises, exam, grading.

3, What is machine learning? Definition: machine = computer or computer program (in this course); learning = improving performance on a given task, based on experience or examples. In other words, instead of the programmer writing explicit rules for how to solve a given problem, the programmer instructs the computer how to learn from examples. In many cases the computer program can even become better at the task than the programmer is!

4, Example 1: How to program the computer to play tic-tac-toe? Option A: The programmer writes explicit rules, e.g. if the opponent has two in a row and the third square is free, block it by placing your mark there, etc. (lots of work, difficult, not at all scalable!) Option B: Go through the game tree and choose optimally (for non-trivial games, this must be combined with some heuristics to restrict the tree size). Option C: Let the computer try out various strategies by playing against itself and others, noting which strategies lead to winning and which to losing (= machine learning).
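Option B can be made concrete for tic-tac-toe, where the game tree is small enough to search completely. The following is only a minimal sketch, not code from the course: the board encoding (a 9-character string, row by row, with ' ' for an empty square) and the function names `winner`, `value`, and `best_move` are assumptions made for this illustration.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board whose squares are indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, otherwise None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Game value from X's point of view (+1 X wins, 0 draw, -1 O wins)
    when `player` is to move and both sides play optimally."""
    won = winner(board)
    if won is not None:
        return 1 if won == 'X' else -1
    if ' ' not in board:
        return 0  # board full, nobody won
    other = 'O' if player == 'X' else 'X'
    results = [value(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == ' ']
    # X maximizes the value, O minimizes it.
    return max(results) if player == 'X' else min(results)


def best_move(board, player):
    """Return the index of an optimal move for `player`, or None if the board is full."""
    other = 'O' if player == 'X' else 'X'
    moves = [i for i, cell in enumerate(board) if cell == ' ']
    if not moves:
        return None

    def score(i):
        return value(board[:i] + player + board[i + 1:], other)

    return max(moves, key=score) if player == 'X' else min(moves, key=score)
```

Calling `value(' ' * 9, 'X')` searches the whole tree from the empty board. Caching repeated positions with `lru_cache` keeps this fast for tic-tac-toe; for larger games an exhaustive search like this is not feasible, which is where the heuristics mentioned in Option B come in.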

5, Arthur Samuel (1950s and 1960s): a computer program that learns to play checkers. The program plays against itself thousands of times and learns which positions are good and which are bad (i.e. which lead to winning and which to losing). The computer program eventually becomes much better than the programmer.

6, Example 2: spam filter. The programmer writes rules: "If it contains 'viagra' then it is spam." (difficult, not user-adaptive) The user marks which mails are spam and which are legitimate, and the computer learns by itself which words are predictive. [Figure: example e-mails with their labels. From: medshop@spam.com, Subject: viagra, "cheap meds..." (spam); From: my.professor@helsinki.fi, Subject: important information, "here's how to ace the test..." (non-spam); From: mike@example.org, Subject: you need to see this, "how to win $1,000,000..." (?)]
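The learning approach on this slide can be sketched in a few lines. The slide does not name an algorithm, so this example uses a naive Bayes word model with add-one smoothing as one common choice; the function names, the 'spam'/'ham' labels, and the tiny training set are all made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a message into lowercase words."""
    return text.lower().split()


def train(examples):
    """Count words per class. `examples` is a list of (text, label) pairs,
    where label is 'spam' or 'ham' (non-spam). Assumes both classes occur."""
    word_counts = {'spam': Counter(), 'ham': Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, doc_counts


def spam_score(model, text):
    """Log-odds that `text` is spam under naive Bayes with add-one smoothing.
    Positive means 'more likely spam'; words never seen in training are ignored."""
    word_counts, doc_counts = model
    vocab = set(word_counts['spam']) | set(word_counts['ham'])
    score = math.log(doc_counts['spam'] / doc_counts['ham'])
    for label, sign in (('spam', 1), ('ham', -1)):
        total = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:
                score += sign * math.log((word_counts[label][word] + 1) / total)
    return score


# Hypothetical messages marked by the user.
marked = [
    ("viagra cheap meds", 'spam'),
    ("win cheap meds now", 'spam'),
    ("important information about the test", 'ham'),
    ("lecture notes and test schedule", 'ham'),
]
model = train(marked)
```

With this training set, `spam_score(model, "cheap viagra")` is positive and `spam_score(model, "test schedule")` is negative. No rule about any particular word was written by hand; the predictive words come from the user's markings.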

7, Example 3: face recognition. Face recognition is hot (Facebook, Apple, security, ...). The programmer writes rules: "If short dark hair and big nose, then it is Mikko" (impossible! how do we judge the size of the nose?!). The computer is shown many (image, name) example pairs and learns which features of the images are predictive (difficult, but not impossible). [Figure: face images labeled patrik, antti, doris, patrik, ..., followed by an unlabeled image (?)]

8, Problem setup. One definition of machine learning: a computer program improves its performance on a given task with experience (i.e. examples, data). So we need to separate three things. Task: What is the problem that the program is solving? Performance measure: How is the performance of the program (when solving the given task) evaluated? Experience: What is the data (examples) that the program is using to improve its performance?

9, Related scientific disciplines (1). Artificial Intelligence (AI): Machine learning can be seen as one approach towards implementing intelligent machines (or at least machines that behave in a seemingly intelligent way). Artificial neural networks, computational neuroscience: Inspired by and trying to mimic the function of biological brains, in order to make computers that learn from experience. Modern machine learning really grew out of the neural networks boom in the 1980s and early 1990s. Pattern recognition: Recognizing objects and identifying people in controlled or uncontrolled settings, from images, audio, etc. Such tasks typically require machine learning techniques.

10, Availability of data. These days it is very easy to collect data (sensors are cheap, much information is digital), store data (hard drives are big and cheap), and transmit data (essentially free on the internet). The result? Everybody is collecting large quantities of data. Businesses: shops (market-basket data), search engines (web pages and user queries), financial sector (stocks, bonds, currencies etc.), manufacturing (sensors of all kinds), social networking sites (Facebook, Twitter), anybody with a web server (hits, user activity). Science: genomes sequenced, gene expression data, experiments in high-energy physics, images of remote galaxies, global ecosystem monitoring data, drug research and development, public health data. But how to benefit from it? Analysis is becoming key!

11, Big Data. One definition: "data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges" (Oxford English Dictionary). 3V: volume, velocity, and variety (Doug Laney, 2001). A database may be able to handle a lot of data, but you can't implement a machine learning algorithm as an SQL query. In this course we do not consider technical issues relating to extremely large data sets. The basic principles of machine learning still apply, but many algorithms may be difficult to implement efficiently.

12, Related scientific disciplines (2). Data mining: Trying to identify interesting and useful associations and patterns in huge datasets. Focus on scalable algorithms. Example: with on the order of 3 million people grocery shopping twice a week in just two main chains in Finland, each chain would collect hundreds of thousands of transaction receipts per day! Statistics: Traditionally, the focus has been on testing hypotheses based on theory. Statistics has contributed a lot to data mining and machine learning, and has also evolved by incorporating ideas derived from these fields.

13, Example 4: Prediction of search queries. The programmer provides a standard dictionary (words and expressions change!). Previous search queries are used as examples!
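Using previous queries as examples can be as simple as counting them. The sketch below suggests completions for a typed prefix by ranking past queries by frequency; the function names and the sample query log are hypothetical, and real search engines use far richer models.

```python
from collections import Counter


def build_model(past_queries):
    """Count how often each (normalized) query was issued."""
    return Counter(q.strip().lower() for q in past_queries)


def suggest(model, prefix, k=3):
    """Return up to k past queries starting with `prefix`,
    most frequent first (ties broken alphabetically)."""
    prefix = prefix.lower()
    matches = [(q, n) for q, n in model.items() if q.startswith(prefix)]
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:k]]


# Hypothetical query log.
log = ["machine learning", "Machine Learning", "machine translation", "matlab tutorial"]
model = build_model(log)
```

Here `suggest(model, "mach")` returns `['machine learning', 'machine translation']`. Unlike a fixed dictionary, the suggestions change automatically as new words and expressions show up in the log.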

14, Example 5: Ranking search results. Various criteria for ranking results. What do users click on after a given search? Search engines can learn what users are looking for by collecting queries and the resulting clicks.

15, Example 6: Detecting credit card fraud. Credit card companies typically end up paying for fraud (stolen cards, stolen card numbers). Useful to try to detect fraud, for instance large transactions. Important to be adaptive to the behaviors of customers, i.e. learn from existing data how users normally behave, and try to detect unusual transactions.
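One simple way to be adaptive to each customer, rather than flagging every large transaction, is to learn a per-customer profile from past amounts and flag amounts that deviate strongly from it. This is a minimal sketch under that assumption; the z-score rule, the threshold of 3 standard deviations, and the sample histories are illustrative choices, not the method of any actual card company.

```python
import statistics


def fit_profile(amounts):
    """Learn a customer's normal behavior: mean and sample standard deviation
    of past transaction amounts (needs at least two amounts)."""
    return statistics.mean(amounts), statistics.stdev(amounts)


def is_unusual(profile, amount, threshold=3.0):
    """Flag a transaction whose amount lies more than `threshold`
    standard deviations from the customer's mean."""
    mean, std = profile
    if std == 0:
        return amount != mean
    return abs(amount - mean) / std > threshold


# Hypothetical histories for two customers with very different habits.
small_spender = fit_profile([12.5, 8.0, 15.0, 9.5, 11.0, 14.0])
big_spender = fit_profile([900.0, 1100.0, 1000.0, 950.0, 1050.0])
```

A 950.00 transaction is flagged for the first customer but not for the second, which a fixed "large transaction" rule could not express.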

16, Example 7: Self-driving cars. Sensors (radars, cameras) are superior to humans. How to make the computer react appropriately to the sensor data?

17, Example 8: Character recognition. Automatically sorting mail (handwritten characters). Digitizing old books and newspapers into an easily searchable format (printed characters).

18, Example 9: Recommendation systems ("collaborative filtering"). Amazon: "Customers who bought X also bought Y..." Netflix: "Based on your movie ratings, you might enjoy..." Challenge: One million dollars ($1,000,000) prize money recently awarded! [Figure: table of ratings (1 to 5) given by the users Linda, Jack, Bill, Lucy, and John to the movies Seven, Fargo, Aliens, Leon, and Avatar, with some entries missing (?)]
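The idea of filling in a missing rating from the ratings of similar users can be sketched as follows. This is one simple user-based variant, not the method used by Amazon or Netflix; the similarity measure and the function names are assumptions, and the ratings below are invented for illustration (they are not the values from the slide's table).

```python
def similarity(a, b):
    """Similarity in (0, 1] between two users' rating dicts, based on the mean
    absolute difference over movies both have rated; 0.0 if none are shared."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_diff = sum(abs(a[m] - b[m]) for m in shared) / len(shared)
    return 1.0 / (1.0 + mean_diff)


def predict(ratings, user, movie):
    """Predict `user`'s rating of `movie` as the similarity-weighted average of
    other users' ratings for it. Returns None if no similar user has rated it."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or movie not in their_ratings:
            continue
        w = similarity(ratings[user], their_ratings)
        weighted_sum += w * their_ratings[movie]
        weight_total += w
    return weighted_sum / weight_total if weight_total > 0 else None


# Made-up ratings on a 1-5 scale; Linda has not rated Avatar.
ratings = {
    "Linda": {"Seven": 5, "Fargo": 4, "Aliens": 1},
    "Jack": {"Seven": 5, "Fargo": 5, "Aliens": 1, "Avatar": 2},
    "Bill": {"Seven": 1, "Fargo": 2, "Aliens": 5, "Avatar": 5},
}
```

In this toy data Linda's tastes match Jack's much more closely than Bill's, so `predict(ratings, "Linda", "Avatar")` lands nearer to Jack's rating of 2 than to Bill's rating of 5.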

19, Example 10: Machine translation. Traditional approach: dictionary and explicit grammar. More recently, statistical machine translation based on example data is increasingly being used.

20, Example 11: Online store website optimization. What items to present, what layout, what colors to use? These choices can significantly affect sales volume. Experiment, and analyze the results! (There are lots of decisions on how exactly to experiment and how to ensure meaningful results.)
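One standard way to check whether an experiment's result is meaningful is a two-proportion z-test comparing the conversion rates of two page variants. The slide does not prescribe a particular test, so treat this as a sketch under that assumption; the function name and the visitor counts in the usage note are hypothetical.

```python
import math


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    conv_*: number of visitors who bought; n_*: number of visitors shown the variant.
    Returns (z, p_value) for the two-sided test of 'both rates are equal'.
    Assumes the pooled rate is strictly between 0 and 1.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, `two_proportion_z(100, 1000, 150, 1000)` compares a 10% and a 15% conversion rate with 1000 visitors each; a small p-value suggests the difference is unlikely to be due to chance alone. The test relies on a normal approximation, so it is only reasonable with fairly large visitor counts.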

21, Example 12: Mining chat and discussion forums. Breaking news. Detecting outbreaks of infectious disease. Tracking consumer sentiment about companies and products.

22, Example 13: Real-time sales and inventory management. Picking up quickly on new trends (what's hot at the moment?). Deciding on what to produce or order (example: Jopo production moved from Taiwan to Finland for a quicker response to incoming sales data; YLE 10.6.2010).

23, Example 14: Prediction of friends on Facebook, or prediction of who you'd like to follow on Twitter.

24, What about privacy? Users are surprisingly willing to sacrifice privacy to obtain useful services and benefits. Regardless of what position you take on this issue, it is important to know what can and what cannot be done with various types of information (i.e. what the dangers are). Privacy-preserving data mining: What type of statistics/data can be released without exposing sensitive personal information? (e.g. government statistics) Developing data mining algorithms that limit exposure of user data (e.g. "Collaborative filtering with privacy", Canny 2002).

25, Course outline. Introduction. Data: data types and quality, preprocessing, similarity/distance measures, visualization. Supervised learning: classification, regression, evaluation and model selection. Unsupervised learning: clustering, anomaly detection.

26, What has changed: the course used to be 4 credits, now 5 credits. One more homework assignment; one extra week of lectures; more explanation on some parts that are seen as difficult; more on unsupervised learning?

27, Related courses. Various continuation courses at CS (spring 2015): Probabilistic Models (period III), Project in Practical Machine Learning (period III), Unsupervised Machine Learning (period IV), Data Mining (period IV), Big Data Frameworks (period IV). A number of other specialized courses at the CS department. A number of courses at maths+stats. Lots of courses at Aalto as well.

28, Practical details (1). Lectures: 28 October (today) to 12 December, Tuesdays and Fridays at 10:15-12:00 in Exactum C222. Lecturer: Jyrki Kivinen (Exactum B229a, jyrki.kivinen@cs.helsinki.fi). Language: English. Based on parts of the course textbook (next slide). Lecture slides available online soon after each lecture.

29, Practical details (2). Textbook: Tan, Steinbach, Kumar: Introduction to Data Mining (2005 or 2013 edition). This course covers (much of) chapters 1-5 and 8-10. There will be assigned reading each week. Although lectures and assigned reading from the textbook mostly overlap, the course requirements consist of the union of the two. Kumpula science library has a number of copies that can be borrowed and one reading room copy.

30, Practical details (3). Exercises: course assistant: Johannes Verwijnen. Learning by doing: mathematical exercises (pen-and-paper) and computer exercises (with Matlab, Octave or R). A problem set is handed out every Friday, focusing on topics from that week's lectures. The deadline for handing in your solutions is the next Friday at 23:59. In the exercise session on the day before the deadline (Thu 10:15-12:00), you can discuss the problems with the assistant and with other students. Attending exercise sessions is completely voluntary. Language of exercise sessions: English. Exercise points make up 40% of your total grade; you must get at least half the points to be eligible for the course exam. Details will appear on the course web page.

31, Practical details (4). Exercises this week: No regular exercise session this week. Instead: instruction on Matlab, Octave, and R. Choose either of the following: Tuesday 28 October (today) at 12:15 in B221, or Friday 31 October at 12:15 in B221. Voluntary, no points awarded. Recommended for everyone not previously familiar with Matlab, Octave, or R.

32, Practical details (5). Computer exercises: choose one of Matlab (dominant in computer science and engineering, commercial software), Octave (free clone of Matlab, mainly compatible), or R (dominant in statistics, free software). If you wish to use some other language, discuss it with the teaching assistant (Johannes).

33, Matlab, Octave, and R. Common features: environments for numerical/statistical calculations; scripts to automate (Matlab/Octave: .m files, R: .R files); native representations for matrices and vectors; standard programming constructs (variables, functions, loops, conditional statements); optimized for matrix and vector operations. Avoid explicit loops whenever possible! As always: use descriptive variable and function names; indent your code to show the structure; comment your code; write functions for any code snippets that you re-use.

34, Practical details (6). Course exam: 17 December at 9:00 (double-check a few days before the exam). Constitutes 60% of your course grade. Must get a minimum of half the points of the exam to pass the course. Pen-and-paper problems, in a similar style as in the exercises (also "essay" or "explain" problems). Note: To be eligible to take a separate exam you need to first complete some programming assignments. These will be available on the course web page a bit later. Answering the exam problems in Finnish (or Swedish) is OK.

35, Practical details (7). Grading. Exercises (typically 3 pen-and-paper problems and 1 programming problem per week): programming problem graded 0-15 points; pen-and-paper problems graded 0-3 points; attendance in first-week Matlab/Octave/R exercises: voluntary, no points. Exam (4-5 problems): pen-and-paper, 0-6 points per problem (tentative). Rescaling is done so that 40% of total points come from exercises and 60% from the exam. Half of all total points is required for the lowest grade, close to maximum total points for the highest grade. Note: You must get at least half the points of the exam, and at least half the points available from the exercises.
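The rescaling rule can be illustrated with a short calculation. The slide does not spell out the exact formula, so this sketch assumes a simple linear rescaling to a 100-point scale; the function name and the point totals in the usage note are hypothetical.

```python
def course_points(exercise_points, exercise_max, exam_points, exam_max):
    """Return (total, meets_minimums) under an assumed linear rescaling.

    total: points on a 0-100 scale, 40 from exercises and 60 from the exam.
    meets_minimums: True if at least half the points were obtained
    in the exercises and in the exam separately.
    """
    exercise_share = exercise_points / exercise_max
    exam_share = exam_points / exam_max
    total = 40.0 * exercise_share + 60.0 * exam_share
    meets_minimums = exercise_share >= 0.5 and exam_share >= 0.5
    return total, meets_minimums
```

For instance, with 120 exercise points and 30 exam points available, `course_points(60, 120, 15, 30)` gives a total of 50 with both minimums met, while full exercise points combined with 12 exam points gives a higher total but fails the exam minimum.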

36, Practical details (8). Prerequisites. Mathematics: basics of probability theory and statistics, linear algebra and real analysis. Computer science: basics of programming (but no previous familiarity with Matlab, Octave, or R necessary). Prerequisites quiz! For you to get a sense of how well you know the prerequisites; for me to get a sense of how well you (in aggregate!) know the prerequisites. Fully anonymous!

37, Practical details (9). Course material: webpage (public information about the course): http://www.cs.helsinki.fi/en/courses/582631/2014/s/k/1 Sign up in Ilmo (department registration system). Help? Ask the assistants/lecturer at exercises/lectures, or contact the assistants/lecturer separately.

38, Questions?