SNS College of Engineering. Machine Learning

About
Machine learning is a subfield of Artificial Intelligence (AI). The name derives from the idea that it deals with the construction and study of systems that can learn from data. It can be seen as a set of building blocks for making computers learn to behave more intelligently.

In other words
A computer program is said to learn from experience (E) with respect to some class of tasks (T) and a performance measure (P) if its performance at tasks in T, as measured by P, improves with E.
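The definition above can be made concrete with a small sketch. Everything here is an assumption for illustration: the task T is labeling a number as "low" or "high", the performance measure P is accuracy on a fixed test set, the experience E is a list of labeled examples, and the learner is a 1-nearest-neighbour rule. More experience does not always improve P in general; it does for this hand-picked data.

```python
def predict(x, experience):
    """1-nearest-neighbour: copy the label of the closest example seen so far."""
    _, nearest_label = min(experience, key=lambda example: abs(example[0] - x))
    return nearest_label


def accuracy(experience, test_set):
    """Performance measure P: fraction of test items labeled correctly."""
    correct = sum(predict(x, experience) == label for x, label in test_set)
    return correct / len(test_set)


# Experience E: labeled examples (hypothetical data; numbers below 5 are "low").
experience = [(0.0, "low"), (6.0, "high"), (4.0, "low"), (5.0, "high")]

# Task T is evaluated on a fixed test set.
test_set = [(1.0, "low"), (3.5, "low"), (4.0, "low"),
            (7.0, "high"), (8.0, "high"), (5.5, "high")]

p_small = accuracy(experience[:2], test_set)  # P after 2 examples
p_large = accuracy(experience, test_set)      # P after 4 examples
print(p_small, p_large)
```

With this data, P rises from 4 of 6 correct to 6 of 6 correct as E grows from two to four examples.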

Terminology
Features: the distinct traits that can be used to describe each item in a quantitative manner.
Samples: a sample is an item to process (e.g. classify). It can be a document, a picture, a sound, a video, a row in a database or CSV file, or anything else that can be described with a fixed set of quantitative traits.
Feature vector: an n-dimensional vector of numerical features that represents one sample.
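A minimal sketch of this terminology in Python. The fruit measurements, the feature names, and the `to_feature_vector` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical samples: every item is described by the same fixed set of traits.
samples = [
    {"weight_g": 150.0, "diameter_cm": 7.0, "redness": 0.9},
    {"weight_g": 120.0, "diameter_cm": 4.0, "redness": 0.1},
]

# The feature names fix the order of the values in every feature vector.
FEATURE_NAMES = ["weight_g", "diameter_cm", "redness"]


def to_feature_vector(sample):
    """Turn one sample into a list of numbers, one per feature."""
    return [float(sample[name]) for name in FEATURE_NAMES]


vectors = [to_feature_vector(sample) for sample in samples]
n_features = len(FEATURE_NAMES)

print(vectors)     # one feature vector per sample
print(n_features)  # number of features describing each sample
```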

Learning (Training)
Features: 1. Color: Reddish/Red 2. Type: Fruit 3. Shape, etc.
Features: 1. Color: Sky Blue 2. Type: Logo 3. Shape, etc.
Features: 1. Color: Yellow 2. Type: Fruit 3. Shape, etc.

Workflow

Categories
Supervised Learning
Unsupervised Learning
Semi-Supervised Learning
Reinforcement Learning

Supervised Learning
The correct classes of the training data are known.

Unsupervised Learning
The correct classes of the training data are not known.

Semi-Supervised Learning
A mix of supervised and unsupervised learning: the correct classes are known for only part of the training data.

Reinforcement Learning
Allows the machine or software agent to learn its behavior based on feedback from the environment. This behavior can be learnt once and for all, or keep on adapting as time goes by.
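Learning from feedback can be sketched with a very small agent. This is only one simple scheme (a greedy agent with optimistic initial value estimates), not the only form of reinforcement learning. The two-action environment, its fixed rewards, and the `run_agent` function are assumptions made up for illustration.

```python
def run_agent(rewards, steps=20, step_size=0.5, initial_value=1.0):
    """Greedy agent with optimistic initial values.

    `rewards[a]` is the (hypothetical, fixed) feedback the environment
    gives for action `a`. The agent does not know these numbers; it only
    sees the reward of the action it just took.
    """
    values = [initial_value] * len(rewards)  # estimated value of each action
    history = []
    for _ in range(steps):
        # Pick the action that currently looks best.
        action = max(range(len(values)), key=lambda a: values[a])
        reward = rewards[action]  # feedback from the environment
        # Move the estimate a step towards the observed reward.
        values[action] += step_size * (reward - values[action])
        history.append(action)
    return values, history


values, history = run_agent(rewards=[0.2, 0.8])
print(history)  # actions chosen over time
print(values)   # learned value estimates
```

The agent tries the first action once, finds it disappointing, and then settles on the second action, whose estimate converges towards its reward.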

Machine Learning Techniques

Techniques
classification: predict a class from observations
clustering: group observations into meaningful groups
regression (prediction): predict a value from observations

Classification
Classify a document into a predefined category. Documents can be text or images. A popular classifier is the Naive Bayes classifier.
Steps:
Step 1: Train the program (build a model) using a training set in which each document has a category, e.g. sports, cricket, news. The classifier computes, for each word, the probability that it makes a document belong to each category.
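The training step for a Naive Bayes text classifier can be sketched as follows. This is a minimal illustration, assuming whitespace tokenization and add-one (Laplace) smoothing; the four training sentences and the `train` and `classify` functions are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Build a model: documents per category and word counts per category."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, category in labeled_docs:
        words = text.lower().split()
        doc_counts[category] += 1
        word_counts[category].update(words)
        vocabulary.update(words)
    return doc_counts, word_counts, vocabulary


def classify(text, model):
    """Return the category with the highest log-probability for the text."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        # Log prior: how common the category is in the training set.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the product.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training set with one category per document.
training_set = [
    ("the team won the match", "sports"),
    ("the batsman scored a century", "cricket"),
    ("the minister announced a new policy", "news"),
    ("the bowler took five wickets", "cricket"),
]
model = train(training_set)
print(classify("the bowler scored", model))
print(classify("the minister announced", model))
```

Log-probabilities are summed instead of multiplying raw probabilities, which avoids numeric underflow on longer documents.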

Clustering
Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other groups. The groups are not predefined.
For example, these keywords: man's shoe, women's shoe, women's t-shirt

K-means Clustering
Partitions n observations into k clusters, in which each observation belongs to the cluster with the nearest mean; that mean serves as a prototype of the cluster.
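A minimal sketch of k-means (Lloyd's algorithm) on 2-D points. The data, the starting centroids, and the fixed iteration count are assumptions for illustration; practical implementations usually pick starting centroids at random and stop when assignments no longer change.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing cluster means."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each mean moves to the average of its points.
        # An empty cluster keeps its previous mean.
        centroids = [
            (sum(x for x, _ in cluster) / len(cluster),
             sum(y for _, y in cluster) / len(cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical observations forming two visible groups (k = 2).
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```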

Hierarchical clustering
A method of cluster analysis which seeks to build a hierarchy of clusters. There are two strategies:
Agglomerative: a "bottom up" approach. Each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy. Time complexity is O(n^3).
Divisive: a "top down" approach. All observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
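The agglomerative ("bottom up") strategy can be sketched as below. This is a naive illustration on 1-D points, assuming single linkage (cluster distance is the distance between the two closest members); the data and the `agglomerative` function are hypothetical, and other linkage rules are equally valid.

```python
def agglomerative(points, target_clusters):
    """Merge the two closest clusters until `target_clusters` remain.

    Assumes target_clusters >= 1.
    """
    clusters = [[p] for p in points]  # each observation starts in its own cluster
    merges = []                       # record of (left, right, distance) merges
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges


clusters, merges = agglomerative([1.0, 2.0, 9.0, 10.0, 25.0], target_clusters=2)
print(clusters)
for left, right, distance in merges:
    print(left, "+", right, "at distance", distance)
```

The list of merges is the hierarchy: reading it from first to last moves up from single observations to larger clusters.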

Regression
Regression is a measure of the relation between the mean value of one variable (e.g. output) and corresponding values of other variables (e.g. time and cost). Regression analysis is a statistical process for estimating the relationships among variables. Regression means to predict the output value using training data.
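A minimal sketch of regression with one input variable: fitting a straight line by ordinary least squares and using it to predict. The size and price figures are hypothetical and were chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: size in square metres, price in thousands.
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 210.0, 270.0, 330.0]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 100.0 + intercept  # predict a value for an unseen input
print(slope, intercept, predicted)
```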

Classification vs Regression
Classification means to group the output into a class using training data, e.g. predicting the type of a tumor (harmful or not harmful). If the output is discrete/categorical, it is a classification problem.
Regression means to predict the output value using training data, e.g. predicting a house price. If the output is a real number/continuous, it is a regression problem.

Let's see the usage in real life

Use-Cases
Spam Email Detection
Machine Translation (Language Translation)
Image Search (Similarity)
Clustering (K-Means): Amazon Recommendations
Classification: Google News

Use-Cases (contd.)
Text Summarization: Google News
Rating a Review/Comment: Yelp
Fraud Detection: Credit card providers
Decision Making: e.g. Bank/Insurance sector
Sentiment Analysis
Speech Understanding: iPhone with Siri
Face Detection: Facebook's photo tagging

It's not [Snapshot of Spam folder, annotated "Not a Spam" twice]

NER (Named Entity Recognition) http://nlp.stanford.edu:8080/ner/process

Similar/Duplicate Images
Remember features? (Feature Extraction)
Features can be: width, height, contrast, brightness, position, hue, colors
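Once features are extracted, finding similar or duplicate images reduces to comparing feature vectors. The sketch below assumes the features were already extracted and scaled to the range 0 to 1; the file names, the values, and the choice of Euclidean distance are all hypothetical.

```python
import math

FEATURES = ["width", "height", "contrast", "brightness", "hue"]

# Hypothetical, pre-extracted feature values, each scaled to 0..1.
images = {
    "beach.jpg": {"width": 0.80, "height": 0.60, "contrast": 0.50,
                  "brightness": 0.70, "hue": 0.55},
    "beach_copy.jpg": {"width": 0.80, "height": 0.60, "contrast": 0.52,
                       "brightness": 0.69, "hue": 0.55},
    "forest.jpg": {"width": 0.40, "height": 0.90, "contrast": 0.80,
                   "brightness": 0.30, "hue": 0.33},
}


def feature_vector(image_info):
    """Order the feature values consistently for every image."""
    return [image_info[name] for name in FEATURES]


def distance(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(name):
    """Return the other image whose feature vector is closest."""
    query = feature_vector(images[name])
    others = [other for other in images if other != name]
    return min(others, key=lambda other: distance(query, feature_vector(images[other])))


print(most_similar("beach.jpg"))
```

A near-zero distance suggests a duplicate; where to place that threshold depends on the features and is not specified here.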

Recommendations

Popular Frameworks/Tools
Weka
Carrot2
GATE
OpenNLP
LingPipe
Stanford NLP
Mallet: Topic Modelling
Gensim: Topic Modelling (Python)