State of Machine Learning and Future of Machine Learning


State of Machine Learning and Future of Machine Learning (based on the vision of T.M. Mitchell)
Rémi Gilleron, Mostrare project, Lille university and INRIA Futurs
www.grappa.univ-lille3.fr/mostrare
Collège scientifique FT, January 9, 2007
Rémi Gilleron (INRIA Futurs), state and future of ML, collège scientifique FT 2007

Outline

Utopian View
Let us imagine:
- computers learning from medical records which treatments are more effective for new diseases
- houses learning from experience to optimize energy costs based on the particular usage patterns of their occupants
- personal software assistants learning from past usage the evolving interests of their users, in order to highlight relevant news
- personal software assistants learning from the Web, in order to organize a journey by using Web services

Application Success: Data Mining
Objective: extract knowledge from databases.
State: industrial data-mining systems are available, a toolbox of algorithms for extracting, transforming and loading data (ETL) plus reporting tools; but the core algorithms are Machine Learning algorithms.
Some applications: business intelligence, marketing.

Application Success: Speech Recognition
Definition: speech recognition is the process of converting a speech signal into a sequence of words.
State: commercial systems are available. They use machine learning techniques because training yields greater accuracy than programming by hand.
Two learning phases: (1) before purchase, in a speaker-independent fashion; (2) after purchase, in a speaker-dependent fashion.

Machine Learning: A Tentative Definition by Tom M. Mitchell
Central question: how can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?
Definition: learning = improving performance at some task through experience.

Outline

A Basic Task: Supervised Classification
Classifying objects into a finite set of groups.
The problem:
- Task: classifying real-valued vectors into two groups, i.e. searching for a boolean-valued function
- Experience: a training set of examples, which are pairs (input value, output value)
- Performance: ability to correctly classify unseen real-valued input vectors

A set of data records:

  A    B   ...  J    Group
  5    t   ...  1.5  Y
  15   f   ...  3.2  N
  4    t   ...  3.5  Y
  ...

A set of rules:
  If A <= 7 and B = t then Group = Y
  If B = f and J > 3 then Group = N
  ...

An unseen input: (A = 2, B = f, ..., J = 2)?
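Rules like those in the example can be learned automatically. A minimal sketch, on hypothetical toy data using only a single real-valued feature A, with a one-threshold "decision stump" learner (an illustration, not any particular algorithm from the talk):

```python
# Toy supervised classification: learn a single-feature threshold rule
# ("decision stump") from labeled examples, mirroring rules such as
# "If A <= 7 then Group = Y". Data below is made up for illustration.

def train_stump(examples):
    """examples: list of (x, label) with x a real number, label 'Y' or 'N'.
    Returns (threshold, label_below, label_above) minimizing training errors."""
    best = None
    xs = sorted({x for x, _ in examples})
    # Candidate thresholds: midpoints between consecutive feature values.
    thresholds = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    for t in thresholds:
        for below, above in (("Y", "N"), ("N", "Y")):
            errors = sum(1 for x, y in examples
                         if (below if x <= t else above) != y)
            if best is None or errors < best[0]:
                best = (errors, t, below, above)
    return best[1:]

def classify(stump, x):
    t, below, above = stump
    return below if x <= t else above

# Hypothetical training set: (value of A, Group).
train = [(5, "Y"), (4, "Y"), (3, "Y"), (15, "N"), (12, "N"), (9, "N")]
stump = train_stump(train)
print(classify(stump, 2))   # "Y": the unseen input falls below the threshold
```

The stump achieves zero empirical error on this separable toy set; the slides that follow show why zero empirical error is not the whole story.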

Why is it difficult (1)?
[Figure: two classes of points that are not linearly separable]
Searching in the set of linear functions: in the figure, the empirical error rate is 7/34 ≈ 20%. Algorithms exist for linear separation, but it seems that there is no good hypothesis in the set of linear functions.
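The empirical error rate here is simply the fraction of training examples a hypothesis misclassifies. A minimal sketch for a linear hypothesis, on made-up points (the 7/34 figure from the slide is not reproduced):

```python
# Empirical error rate of a linear hypothesis h(x) = sign(w.x + b) on a
# labeled sample. Toy data for illustration only.

def empirical_error(w, b, sample):
    """Fraction of (x, y) pairs with y in {-1, +1} that h misclassifies."""
    def h(x):
        s = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1 if s >= 0 else -1
    mistakes = sum(1 for x, y in sample if h(x) != y)
    return mistakes / len(sample)

sample = [((0, 0), -1), ((1, 0), -1), ((0, 1), +1), ((1, 1), +1),
          ((2, 0), +1)]           # last point sits on the "wrong" side
w, b = (0.0, 1.0), -0.5           # separator: predict +1 when x2 >= 0.5
print(empirical_error(w, b, sample))   # 1 mistake out of 5 = 0.2
```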

Why is it difficult (2)?
[Figure: the same points, separated by an arbitrarily complex boundary]
Searching in the set of all functions: in the figure, the empirical error rate is 0. But algorithms must deal with complexity issues, noisy examples seem influential, and the ability to correctly classify unseen examples is doubtful.

Why is it difficult (3)?
[Figure: the same points with a simple separating curve]
A trade-off between the empirical error rate and the complexity of the function: in the figure, a low empirical error rate of 2/34 ≈ 5%. The function is simple, and such a function would generalize well.

State of Supervised Classification
Theory. Why is it difficult? The true error rate (on correctly classifying new unseen inputs) depends on an unknown probability distribution.
- Statistical learning theory: empirical risk, VC dimension, structural risk and regularized risk, bounds
- PAC learning: polynomial time complexity, learnability
- The real situation is even worse
Off-the-shelf algorithms: support vector machines (SVMs), kernel methods, decision trees, logistic regression, neural networks, ...
Applications: a base task in many application domains: texts, biological sequences, medical records, ...
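The bounds from statistical learning theory mentioned above can be made concrete with the classical VC generalization bound (a standard textbook result, stated here for illustration): with probability at least 1 − δ over an i.i.d. sample of size n, every hypothesis h in a class of VC dimension d satisfies

```latex
R(h) \;\le\; R_{\mathrm{emp}}(h)
  \;+\; \sqrt{\frac{d\left(\ln\frac{2n}{d} + 1\right) + \ln\frac{4}{\delta}}{n}}
```

where R(h) is the true error rate and R_emp(h) the empirical error rate. The second term grows with the complexity d and shrinks with the sample size n, which is exactly the trade-off of the three "Why is it difficult?" slides.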

Discipline of Machine Learning
Theory, algorithms and applications.
Designing an ML system:
- Modeling the problem: defining the task (representation of inputs and outputs), defining the experience, defining the performance
- Defining an ML algorithm: set of hypotheses, regularization, solving an optimization problem, complexity issues
- Evaluation
The place of ML:
- an outgrowth of the intersection of Computer Science and Statistics
- human learning
- empirical sciences: biology, economics, social sciences, control theory, ...

Outline

Machine Learning within Computer Science
Application successes:
- data mining, text mining, Web mining
- speech recognition, computer vision, robot control
- accelerating empirical sciences
A growing niche in software engineering:
- the application is too complex to manually write a successful algorithm, and it is easy to collect training data
- the software must be customized to its operational environment

State of ML for Other Tasks
A large toolbox of algorithms:
- many off-the-shelf algorithms for classification and regression
- Hidden Markov Models and Conditional Random Fields for labeling sequences: texts (NER), biological sequences, ...
- algorithms for searching frequent patterns in sets of records
- algorithms for reinforcement learning (learning control strategies for autonomous agents)
With some restrictions:
- objects are often described by real-valued vectors
- sets of training data should be quite large
- the right model must be chosen
- it is difficult to reuse systems from one application to another
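As an illustration of "searching frequent patterns in sets of records", here is a naive support counter over toy records; real pattern-mining algorithms such as Apriori prune the search space, which this sketch does not attempt:

```python
# Naive frequent-itemset counting: keep itemsets whose support (number of
# records containing them) meets a minimum threshold. Toy records only;
# singletons and pairs are enough for a sketch.
from itertools import combinations

def frequent_itemsets(records, min_support):
    """Return {itemset: count} for itemsets occurring in >= min_support records."""
    counts = {}
    for record in records:
        for size in (1, 2):
            for itemset in combinations(sorted(record), size):
                counts[itemset] = counts.get(itemset, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}

records = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
print(frequent_itemsets(records, min_support=3))
# Each single item appears in 3 of the 4 records; no pair reaches support 3.
```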

Outline

Less Preprocessing and Less Postprocessing
Complex inputs: can we avoid encoding and deal with complex objects?
- sequences (texts, biological sequences)
- trees (HTML documents, phylogenetic trees)
- graphs (XML and Web documents, social networks, gene networks)
- combining views (texts and images)
Complex outputs: can we go beyond classification and regression?
- a hierarchy of groups, not necessarily defining partitions
- a set of sequences (annotation), a set of trees (parsing), ...

Less Training Data
Semi-supervised learning: can unlabeled data be helpful for supervised learning?
- labeling can be costly (labeling or annotating texts or images) or impossible (fraudulent actions, sick patients)
- co-training and EM algorithms should help
- one-class learning algorithms
Active learning: what is the best strategy for actively collecting training data?
- intelligently asking for the label of an unlabeled example
- intelligently choosing patients for drug testing
- intelligently exploring the domain for an autonomous robot
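"Intelligently asking for the label of an unlabeled example" is often implemented as uncertainty sampling. A minimal sketch, with a hypothetical one-dimensional logistic scorer standing in for a trained model:

```python
# Active-learning sketch: uncertainty sampling. Given a model that outputs
# P(label = 1 | x), request the label of the pool example the model is least
# sure about, i.e. whose probability is closest to 0.5. The fixed sigmoid
# scorer below is a stand-in, not a trained model.
import math

def predict_proba(w, b, x):
    """Logistic score in (0, 1) for a one-dimensional input."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def most_uncertain(pool, w, b):
    """Return the pool example whose predicted probability is closest to 0.5."""
    return min(pool, key=lambda x: abs(predict_proba(w, b, x) - 0.5))

pool = [-3.0, -0.1, 2.5, 4.0]     # unlabeled examples
w, b = 1.0, 0.0                   # decision boundary at x = 0
print(most_uncertain(pool, w, b)) # -0.1, the point nearest the boundary
```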

Less Effort in Designing ML Systems
Model selection: which learning algorithm should be used when?
- theory: characterize properties of learning algorithms (convergence properties, relative strengths and weaknesses)
- practice: choosing the right model under limited resources (for instance, limited training data or limited time)
Transfer learning: how to transfer what is learned for one task to other tasks?
- transferring a learned model from one family of genes to another
- combining learning for multi-objective tasks: for instance, annotation of texts with named entities, semantic roles, relations
- reusing expertise from one domain in another: for instance, from handwriting analysis systems to face recognition systems

Outline

Future of Machine Learning
Let us imagine:
- computers learning from medical records...
- houses learning from experience...
- personal software assistants learning from past usage...
- personal software assistants learning from the Web...
These lead to open questions:
- never-ending learners learn in a cumulative way
- self-supervised learners define their own experience
- intelligent learners may use their domain knowledge in learning
And to the relations:
- between Machine Learning and computer perception
- between Machine Learning and human learning