Lecture 1.1: Introduction CSC Machine Learning

Lecture 1.1: Introduction CSC 84020 - Machine Learning Andrew Rosenberg January 29, 2010

Today Introductions and Class Mechanics.

Background about me. Graduated from Columbia in 2009. Research: Speech and Natural Language Processing (Computational Linguistics), specifically analyzing the intonation of speech. Written papers on evaluation measures. All of my research has relied heavily on Machine Learning.

Background about you. Why are you taking this class? What is your background in, and comfort with, Calculus, Linear Algebra, and Probability and Statistics? What do you hope to get from this class?

Why does anyone care about Machine Learning?

What IS Machine Learning? Automatically identifying patterns in data. Automatically making decisions based on data.

Major Tasks of Machine Learning: Classification, Regression, Clustering.

Classification. Identify which of N classes a data point belongs to. $\vec{x}$ is a feature vector based on some entity $x$: $\vec{x} = (f_0(x), f_1(x), \ldots, f_{n-1}(x))^T$. Also, sometimes, $\vec{x} = (x_0, x_1, \ldots, x_{n-1})^T$.
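As a minimal sketch of how an entity x becomes a feature vector, the snippet below applies a list of feature functions to a text string. The three feature functions and the example string are hypothetical, chosen only for illustration; the slides do not name any particular features.

```python
# Hypothetical feature functions f_0, f_1, f_2 for a text entity.
# None of these come from the lecture; they only illustrate the idea.
def f0_length(text):
    return float(len(text))

def f1_num_words(text):
    return float(len(text.split()))

def f2_num_digits(text):
    return float(sum(ch.isdigit() for ch in text))

FEATURE_FUNCTIONS = [f0_length, f1_num_words, f2_num_digits]

def extract_features(entity):
    """Build the feature vector (f_0(x), ..., f_{n-1}(x)) for one entity."""
    return [f(entity) for f in FEATURE_FUNCTIONS]

vector = extract_features("call 911 now")
print(vector)
```

Everything downstream (classification, regression, clustering) then operates on the numeric vector rather than on the raw entity.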

Target Values. In supervised approaches, in addition to the data point x, we will also have some target value t. In classification, t represents the class of the data point. Goal of classification: identify a function y such that y(x) = t.
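One way to make "identify a function y such that y(x) = t" concrete is a nearest-centroid classifier. This is an assumed, deliberately simple technique, not one the slides name: it averages the training points of each class and predicts the class whose average is closest. The toy data and the names `fit_centroids` and `make_classifier` are hypothetical.

```python
import math

def fit_centroids(points, targets):
    """Average the feature vectors of each class t."""
    sums, counts = {}, {}
    for x, t in zip(points, targets):
        if t not in sums:
            sums[t] = [0.0] * len(x)
            counts[t] = 0
        sums[t] = [s + xi for s, xi in zip(sums[t], x)]
        counts[t] += 1
    return {t: [s / counts[t] for s in sums[t]] for t in sums}

def make_classifier(centroids):
    """Return y, a function mapping a feature vector to a class label."""
    def y(x):
        return min(centroids, key=lambda t: math.dist(x, centroids[t]))
    return y

# Toy training set: feature vectors x with their target classes t.
train_x = [[1.0, 1.0], [1.5, 2.0], [5.0, 5.0], [6.0, 5.5]]
train_t = ["red", "red", "blue", "blue"]

y = make_classifier(fit_centroids(train_x, train_t))
predictions = [y(x) for x in train_x]
print(predictions)
```

On real data y(x) = t will not hold for every point; the learned y only approximates the targets.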

Graphical Example of Classification

Regression. Regression is another supervised machine learning task. In classification, t was a discrete variable representing the class of the data point; in regression, t is a continuous variable. Goal of regression: identify a function y such that y(x) = t. If the goals of regression and classification are the same, what is the difference? Evaluation.
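A short sketch of what "the difference is evaluation" can look like in practice. The two metrics here, accuracy for classification and mean squared error for regression, are common choices but are an assumption; the slide does not say which measures the course uses.

```python
def accuracy(predicted, targets):
    """Classification: a prediction is either exactly right or wrong."""
    correct = sum(1 for y, t in zip(predicted, targets) if y == t)
    return correct / len(targets)

def mean_squared_error(predicted, targets):
    """Regression: a prediction is scored by how far it is from t."""
    return sum((y - t) ** 2 for y, t in zip(predicted, targets)) / len(targets)

# Discrete targets: three of the four predictions match.
class_acc = accuracy(["a", "b", "b", "a"], ["a", "b", "a", "a"])

# Continuous targets: near misses are penalized less than large errors.
reg_mse = mean_squared_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])

print(class_acc, reg_mse)
```

With discrete t there is no notion of "almost correct", while with continuous t exact equality is rarely achieved, so distance-based scores are the natural fit.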

Graphical Example of Regression

Clustering. Clustering is an unsupervised task, so we have no target information to learn from. Rather, the goal is to identify groups of data points that are similar to each other and dissimilar from the points in other groups. Technically, identify a partition of the data satisfying these two constraints: 1. Points in the same cluster should be similar. 2. Points in different clusters should be dissimilar. Now the tricky part: define "similar".
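A minimal sketch of one possible answer to "define similar": treat small Euclidean distance as similarity and partition the points with k-means. Both the distance measure and the algorithm are assumptions made for illustration; the slides leave the definition open. The toy points and initial centroids are hypothetical.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and re-averaging centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            [sum(dim) / len(cluster) for dim in zip(*cluster)] if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return labels, centroids

points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
labels, centroids = kmeans(points, centroids=[[0.0, 0.0], [10.0, 10.0]])
print(labels)
```

Swapping `math.dist` for a different distance function changes what counts as "similar" and can change the resulting partition.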

Graphical Example of Clustering

How do we do this? Mechanisms of Machine Learning: Feature Extraction and Statistical Estimation.

Mathematical Underpinnings. What math will we use? Probability and Statistics, Calculus, Linear Algebra.

Why do we need such complicated math? How much math? A lot. One common function we will use is the Gaussian distribution:

$$N(x \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2}(x - \mu)^2 \right\}$$

We will be differentiating and integrating over this function.
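To make "differentiating and integrating over this function" tangible, here is a small numerical sketch. It evaluates the univariate Gaussian density, checks with the trapezoid rule that it integrates to approximately 1, and checks with a central difference that its slope at the mean is approximately 0. The parameter values, step counts, and helper names are assumptions for illustration; in the course these results would be derived analytically.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """N(x | mu, sigma^2) with variance sigma2."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def integrate(f, lo, hi, steps=10000):
    """Trapezoid-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

def derivative(f, x, eps=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

mu, sigma2 = 1.0, 4.0  # assumed example parameters

def pdf(x):
    return gaussian_pdf(x, mu, sigma2)

area = integrate(pdf, mu - 20.0, mu + 20.0)  # covers ten standard deviations each side
slope_at_mean = derivative(pdf, mu)          # the density peaks at the mean
print(area, slope_at_mean)
```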

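Why do we need such complicated math? How much math? A lot. We also look at higher-dimensional Gaussians:

$$N(\vec{x} \mid \vec{\mu}, \Sigma) = \frac{1}{(2\pi)^{D/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (\vec{x} - \vec{\mu})^T \Sigma^{-1} (\vec{x} - \vec{\mu}) \right\}$$

We will be differentiating and integrating over this function, too.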
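A sketch of the higher-dimensional density for the special case D = 2, written with the standard library only so that the determinant and inverse of the covariance matrix are visible. Restricting to two dimensions and the example parameters are assumptions made to keep the example short; a general implementation would use a linear algebra library.

```python
import math

def gaussian_pdf_2d(x, mu, cov):
    """Multivariate Gaussian density for D = 2 with covariance matrix cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c                      # |Sigma|
    inv = [[d / det, -b / det],              # Sigma^{-1} for a 2x2 matrix
           [-c / det, a / det]]
    diff = [x[0] - mu[0], x[1] - mu[1]]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu)
    quad = sum(diff[i] * inv[i][j] * diff[j] for i in range(2) for j in range(2))
    dim = 2
    norm = (2.0 * math.pi) ** (dim / 2.0) * math.sqrt(det)
    return math.exp(-0.5 * quad) / norm

# Standard 2-D Gaussian evaluated at its mean: the density is 1 / (2 * pi).
density_at_mean = gaussian_pdf_2d([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(density_at_mean)
```

When the covariance matrix is diagonal, this density factors into a product of univariate Gaussians, one per dimension, which is a convenient sanity check.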

Policies and Structure Course website: http://eniac.cs.qc.cuny.edu/andrew/gcml/syllabus.html

Data Data Data. All of the work we will do in this class relies on the availability of data to process.
UCI: http://archive.ics.uci.edu/ml/
Netflix Prize: http://archive.ics.uci.edu/ml/datasets/netflix+prize
LDC (Linguistic Data Consortium): http://www.ldc.upenn.edu/

Bye. Next: Probability Review! Frequentists v. Bayesians. Bayes' Rule.