Data Mining: Practical Machine Learning Techniques


Artificial Intelligence Data Mining: Practical Machine Learning Techniques Dae-Won Kim School of Computer Science & Engineering Chung-Ang University

AI Scope

1. Search-based optimization techniques for real-life problems: Hill climbing, Branch and bound, A*, Greedy algorithm, Simulated annealing, Tabu search, Genetic algorithm
2. Machine Learning / Pattern Recognition / Data Mining. Classification: Bayesian algorithm, Nearest-neighbor algorithm, Neural network. Clustering: Hierarchical algorithm, K-Means algorithm
3. Reasoning: logic, inference, and knowledge representation. Logical language: syntax and semantics. Inference algorithms: forward/backward chaining, resolution, and expert systems
4. Uncertainty based on probability theory
5. Planning, scheduling, robotics, and industry automation

Did you ever hear about Big Data?

Progress in digital data acquisition and storage technology has resulted in the growth of huge databases.

Data mining is the extraction of implicit, previously unknown, and potentially useful information from data.

We build algorithms that sift through databases automatically, seeking patterns.

Strong patterns, if found, will likely generalize to make accurate predictions on future data.

Algorithms need to be robust enough to cope with imperfect data and to extract patterns that are inexact but useful.

Machine learning provides the technical basis of data mining.

We will study simple machine learning methods, looking for patterns in data.

People have been seeking patterns in data since human life began.

In data mining, computer algorithms solve problems by analyzing data in databases.

Data mining is defined as the process of discovering patterns (knowledge) in data.

We start with a simple example.

Q: Tell me the name of this fish.

Algorithm??

We have 100 fish and have measured their lengths. (e.g., fish: x = [length]^T)

Our algorithm can measure the length of a new fish, and estimate its label.

Yes, it is a typical prediction task solved through a classification technique. But the result is often inexact and unsatisfactory.
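Such a single-feature classifier can be sketched in a few lines of Python. The threshold value (60.0) is hypothetical, chosen only for illustration; in practice it would be estimated from the 100 measured fish.

```python
# A single-feature classifier: compare the measured length against a
# fixed threshold. The threshold (60.0) is a hypothetical value.

def classify_by_length(length, threshold=60.0):
    """Predict a fish's label from its length alone."""
    return "Salmon" if length >= threshold else "Sea bass"
```

Because the length distributions of the two species overlap, any single threshold misclassifies some fish, which is exactly why the prediction is inexact.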

Next, we measured their lightness. (e.g., fish: x=[lightness])

Lightness is a better feature than length.

Let us use both lightness and width. (e.g., fish: x=[lightness, width])

Each fish is represented as a point (vector) in the 2D x-y coordinate space.

Everything is represented as an N-dimensional vector in a coordinate space.

The world is represented as a matrix.

We assume that you have learned the basic concepts of linear algebra.

The objective is to find a line that effectively separates two groups.

How do we find the line using simple high-school math?

We can build a complex nonlinear boundary to provide exact separation.
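A linear decision boundary in the (lightness, width) plane reduces to checking which side of the line w1*lightness + w2*width + b = 0 a point falls on. The coefficients below are hand-picked, hypothetical values, not fitted to any real data:

```python
# Classify a fish by which side of the line
#   w[0]*lightness + w[1]*width + b = 0
# it falls on. The coefficients are hypothetical, hand-picked values.

def linear_classify(lightness, width, w=(1.0, 0.05), b=-25.0):
    score = w[0] * lightness + w[1] * width + b
    return "Sea bass" if score > 0 else "Salmon"
```

The sign of the score tells us the predicted group; learning amounts to choosing w and b so that the line separates the training points well.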

The formal procedure is given as:

This shows a predictive task of data mining, often called pattern classification/recognition/prediction.

The act of taking in raw data and taking an action based on the category of the pattern.

We build a machine that can recognize or predict patterns.

Another famous task of data mining is the descriptive task. Cluster analysis is a well-known group-discovery algorithm.
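As a sketch of this descriptive task, here is a bare-bones two-cluster version of the K-Means algorithm (listed in the AI scope above) on one-dimensional data. The sample values used below are hypothetical, and the code assumes at least two distinct values:

```python
# Plain two-means clustering: alternate between assigning each point to
# its nearest center and recomputing each center as its group's mean.

def two_means(xs, iters=10):
    c1, c2 = min(xs), max(xs)  # initialize the two centers at the extremes
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return g1, g2
```

Unlike classification, no class labels are given here; the groups are discovered from the data itself.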

We will experience the basic issues in the prediction task (pattern classification) in forthcoming weeks.

Some terms should be defined.

Given training data set: an n x d pattern/data matrix, with d features (attributes, variables, dimensions, fields) as columns and n patterns (objects, observations, vectors, records) as rows:

  Fish    Lightness  Length  Weight  Width  Class Label
  Fish-1  10         70.3    6.0     36     Salmon
  Fish-2  10         75.5    8.8     128    Salmon
  Fish-3  29         51.1    9.4     164    Sea bass
  Fish-4  36         49.9    8.4     113    Sea bass

Each pattern is represented as a feature vector.
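The n x d pattern matrix from the fish example can be held as plain Python lists, one row per pattern, with the class labels kept in a parallel list:

```python
# n x d pattern matrix: rows are patterns, columns are the features
# lightness, length, weight, width (values from the fish table).

X = [
    [10, 70.3, 6.0, 36],   # Fish-1
    [10, 75.5, 8.8, 128],  # Fish-2
    [29, 51.1, 9.4, 164],  # Fish-3
    [36, 49.9, 8.4, 113],  # Fish-4
]
y = ["Salmon", "Salmon", "Sea bass", "Sea bass"]  # class labels

n, d = len(X), len(X[0])  # n = 4 patterns, d = 4 features
```

Each row X[i] is the feature vector of one pattern, and y[i] is its class label.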

The training pattern matrix is stored in a file or database.

Given labeled training patterns, the class groups are known a priori.

We construct algorithms to classify new data into the known groups.

Training data vs. Test data

Training data serve as the answers. We train the algorithms using the training data.

Test data are a set of new, unseen data. We predict their class labels using the learned algorithm.
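One way to sketch "predicting class labels using the learned algorithm" is the nearest-neighbor rule from the AI scope above: label a test pattern with the class of its closest training pattern. The feature values in the usage below are hypothetical.

```python
# Nearest-neighbor prediction: return the label of the training pattern
# with the smallest squared Euclidean distance to the test pattern x.

def nearest_neighbor(train_X, train_y, x):
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]
```

Here "learning" is trivial (the training data itself is the model); prediction is a distance comparison against every stored training pattern.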

Training data:

  # of data   # of features
  data index  feature-1  feature-2  ...  feature-n  class label
  data index  feature-1  feature-2  ...  feature-n  class label
  data index  feature-1  feature-2  ...  feature-n  class label

Test data:

  # of data   # of features
  data index  feature-1  feature-2  ...  feature-n
  data index  feature-1  feature-2  ...  feature-n
  data index  feature-1  feature-2  ...  feature-n

For example, we try to classify the tumor type of breast cancer patients.

Breast-cancer-training.txt:

  100 30
  Patient-1    165  52  210  cancer
  Patient-2    170  50  230  normal
  ...
  Patient-100  160  47  250  cancer

Breast-cancer-test.txt:

  10 30
  Patient-1   163  55  215
  Patient-2   155  50  240
  ...
  Patient-10  165  45  235

To evaluate the performance of prediction algorithms, we need a performance measure (Accuracy).

In general:

                       Gold Standard (Truth)
  Prediction Result    Positive        Negative
  Positive             True Positive   False Positive
  Negative             False Negative  True Negative

For suspicious patients with breast cancer:

                       Gold Standard (Truth)
  Prediction Result    Positive (Cancer)  Negative (Normal)
  Positive (Cancer)    True Positive      False Positive
  Negative (Normal)    False Negative     True Negative

Accuracy = (True Positive + True Negative) / (True Positive + False Positive + False Negative + True Negative)

For suspicious patients with breast cancer:

                       Gold Standard (Truth)
  Prediction Result    Positive (Cancer)  Negative (Normal)
  Positive (Cancer)    30                 5
  Negative (Normal)    10                 55

Accuracy = (30 + 55) / (30 + 5 + 10 + 55) = 0.85 (85%)
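The accuracy computation above fits in a small helper function taking the four confusion-matrix counts:

```python
# Accuracy from confusion-matrix counts: the fraction of predictions
# that agree with the gold standard.

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

# With the counts from the slide: accuracy(30, 5, 10, 55) -> 0.85
```

The same four counts also yield other measures (precision, recall) when the two kinds of error carry different costs, as they do in cancer screening.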