- Course Introduction - (academic year 2009-2010)

Short Course on Machine Learning for Web Mining
Course Introduction (academic year 2009-2010)
Roberto Basili (University of Roma, Tor Vergata)

Overview
- MLxWM: motivations and perspectives
- A tentative syllabus
- Introduction to Machine Learning

WM&R: Motivations
- What is Web Mining?
- Why IR?
- Why Machine Learning?
- What does IR contribute to Social Web practices?
- What are the prospects for adopting these technologies?

What is Web Mining?
Web Mining currently brings together a number of different technologies required to exploit the huge amount of information made available on the Web:
- Contents: data, but also people, locations, events and concepts
- Relations: links, Web structure; thematic, conceptual and interpersonal links
- Redundancies (duplicates, near-duplicates)
- Multilinguality
- Trends and social behaviours
- Opinions

Why IR?
- The size of the information space involved poses a localization problem
- Automatic access is possible only if a suitable notion of relevance is available
- Web search proceeds through the computation of a stochastic function, i.e. a mapping between information needs and useful data/resources
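
As a rough illustration of a "notion of relevance", the sketch below scores documents against a query by the cosine similarity of their term-frequency vectors. This is a simple deterministic stand-in chosen for illustration, not the stochastic function the slide refers to; the function names and the three sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it on whitespace."""
    return text.lower().split()


def cosine_relevance(query, document):
    """Cosine similarity between the term-frequency vectors of query and document."""
    q, d = Counter(tokenize(query)), Counter(tokenize(document))
    dot = sum(q[term] * d[term] for term in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0


def rank(query, documents):
    """Return (score, document) pairs sorted by decreasing relevance."""
    scored = [(cosine_relevance(query, doc), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical mini-collection standing in for "the Web".
DOCS = [
    "machine learning for web mining",
    "cooking recipes for pasta",
    "web search and information retrieval",
]
ranking = rank("web mining", DOCS)
for score, doc in ranking:
    print(f"{score:.3f}  {doc}")
```

The mapping from an information need (the query) to resources (the documents) is reduced here to a single score per document; a real search engine would combine many more signals.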

Machine Learning vs IR?
The information involved in Web search scenarios is heterogeneous and intrinsically uncertain, characterized by:
- Incompleteness
- Rich data models, complex formats and access modes
- Vague requirements
- Subjectivity
- Timeliness

ML vs. IR
The pervasiveness of uncertainty in the information distributed on the Web makes the search for globally exact (or exhaustive) solutions impractical.
"Finding diamonds in the rough" (Fan Chung, UCSD)

ML vs. IR
- ML offers a wide set of methods, algorithms and strategies able to locate and develop effective sub-optimal solutions
- In the learning process the data themselves suggest the proper representation (or mapping) that corresponds to a given hypothesized solution
- This hypothesis is expected to improve the overall performance of a base system:
  - Accuracy
  - Computational efficiency

Tentative Syllabus
- Introduction to Machine Learning: between statistics and knowledge engineering
- Automatic Classification: decision trees and performance evaluation
- Probabilistic Text Classification
- Sequence labeling tasks: Hidden Markov Models
- Introduction to PAC Learning: the VC dimension
- Support Vector Machines
- Kernel-based Learning: sequence kernels
- Kernel-based Learning: geometrical embeddings and kernels

Machine Learning
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." (Mitchell, 1997)
Critical issues: Task (T), Experience (E), Performance (P)
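
A toy sketch of the three ingredients in this definition may help. Here the task T is to label a number as "large" (1) or "small" (0), the experience E is a list of labelled examples, and the performance P is accuracy on a fixed test set. The learner, the data and all names are hypothetical, and the data are chosen so that P grows with E; real learners do not always improve monotonically.

```python
def learn_threshold(examples):
    """Place the threshold midway between the largest 'small' and the smallest 'large' example."""
    largest_small = max(x for x, label in examples if label == 0)
    smallest_large = min(x for x, label in examples if label == 1)
    return (largest_small + smallest_large) / 2


def accuracy(threshold, test_set):
    """Performance measure P: fraction of test points classified correctly."""
    correct = sum(1 for x, label in test_set if int(x > threshold) == label)
    return correct / len(test_set)


# Experience E: labelled examples, revealed two at a time (hypothetical data).
EXPERIENCE = [(0, 0), (6, 1), (2, 0), (8, 1), (4, 0), (7, 1)]
# Fixed test set for task T: the true label is 1 when x > 5.
TEST_SET = [(x + 0.5, int(x + 0.5 > 5)) for x in range(10)]

# P measured after 2, 4 and 6 examples of experience.
learning_curve = [
    accuracy(learn_threshold(EXPERIENCE[:n]), TEST_SET) for n in (2, 4, 6)
]
print(learning_curve)  # [0.8, 0.9, 1.0]
```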

Experience and Learning
In chess, for example, experience can be provided as:
- Data about past games won (or lost), which suggest the positive or negative impact of the strategies followed
- Suggestions (or guidance) provided by an external observer (also called an oracle)
- Self-observation, that is, the analysis of our own previous games according to an explicit model of the match, strategies and behaviours

Experience and Learning (2)
Three forms of learning:
- Experience-based or inductive learning (past matches plus a utility function, i.e. the success score)
- Supervised learning (matches annotated by the oracle)
- Knowledge-based learning, where an explicit task model is available and guides the development of suitable process and behaviour models

Unsupervised Learning
When neither an oracle nor a task model is available, methods to improve performance can still be developed:
- A better world/task model can be learned (knowledge acquisition/discovery)
- Better performance: some form of optimization can be promoted
- Caching vs. case-based reasoning

Unsupervised Learning
Example: the MP3 collection
- Clustering according to audio properties can be applied to develop a hierarchical organization
- Search efficiency increases, while the expressiveness of the system's knowledge base also improves
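
The MP3 example can be sketched as a small agglomerative (bottom-up) clustering: start with one cluster per track and repeatedly merge the two closest clusters, so that the merge history forms a hierarchy. The slide does not name a specific algorithm; single-linkage distance, the two audio features and the four tracks below are assumptions made for illustration.

```python
def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def single_linkage(cluster_a, cluster_b, features):
    """Distance between two clusters: that of their closest pair of members."""
    return min(distance(features[i], features[j]) for i in cluster_a for j in cluster_b)


def agglomerative(features):
    """Merge the two closest clusters repeatedly and return the merge history."""
    clusters = [(name,) for name in sorted(features)]
    merges = []
    while len(clusters) > 1:
        pairs = [(a, b) for i, a in enumerate(clusters) for b in clusters[i + 1:]]
        a, b = min(pairs, key=lambda p: single_linkage(p[0], p[1], features))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append(a + b)
    return merges


# Hypothetical audio features per track: (tempo in BPM / 100, spectral brightness).
TRACKS = {
    "ballad": (0.70, 0.20),
    "lullaby": (0.65, 0.15),
    "techno": (1.40, 0.90),
    "trance": (1.38, 0.85),
}
hierarchy = agglomerative(TRACKS)
for level, members in enumerate(hierarchy, start=1):
    print(level, members)
```

Each entry of the merge history is one internal node of the hierarchy; browsing the collection top-down through these nodes is what makes search more efficient than scanning a flat list.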

Unsupervised Learning
- The future interaction between the system and its operational environment is greatly improved
- The semantic transparency of the KBs with respect to traditional (and naive) users increases significantly

Machine Learning
Learning a function from examples:
- Continuous case: regression
- Discrete case: classification
Example: we need a discrete function able to distinguish two classes, cats vs. dogs:
f : X → {cats, dogs}
Given a set of examples E for the two classes:
- We can extract (visible) features (height, has_whiskers, type_of_coat, number_of_legs)
- The learning algorithm is applied to E and a function h (the hypothesis for f) is generated
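
To make the step from E to h concrete, here is a minimal sketch of a very simple learner: it picks the single feature whose values best predict the class on E (a "one rule" learner) and turns it into a hypothesis h. The slide does not prescribe this algorithm; the learner, the four training examples and the discretised feature values are all hypothetical.

```python
from collections import Counter, defaultdict


def learn_one_rule(examples, feature_names):
    """Return (feature, value -> label table) with the fewest errors on the examples."""
    best = None
    for feature in feature_names:
        counts = defaultdict(Counter)
        for features, label in examples:
            counts[features[feature]][label] += 1
        # Predict the majority label for each observed value of this feature.
        table = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(1 for features, label in examples if table[features[feature]] != label)
        if best is None or errors < best[0]:
            best = (errors, feature, table)
    return best[1], best[2]


def make_hypothesis(feature, table, default):
    """Build h: X -> {cats, dogs} from the learned rule."""
    return lambda features: table.get(features[feature], default)


FEATURES = ["height", "has_whiskers", "type_of_coat", "number_of_legs"]
# Hypothetical training set E.
E = [
    ({"height": "short", "has_whiskers": True, "type_of_coat": "short", "number_of_legs": 4}, "cats"),
    ({"height": "short", "has_whiskers": True, "type_of_coat": "long", "number_of_legs": 4}, "cats"),
    ({"height": "tall", "has_whiskers": False, "type_of_coat": "short", "number_of_legs": 4}, "dogs"),
    ({"height": "short", "has_whiskers": False, "type_of_coat": "long", "number_of_legs": 4}, "dogs"),
]
feature, table = learn_one_rule(E, FEATURES)
h = make_hypothesis(feature, table, default="cats")
print(feature, table)
```

On this toy E, number_of_legs carries no information (every example has four legs), while has_whiskers separates the two classes without error, so the learner selects it.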

Learning Algorithms and Function Classes
- Boolean functions (e.g., decision trees)
- Probability functions (e.g., Bayesian classifiers, Naive Bayes)
- Analytical functions in vector spaces (half-planes)
  - Linear case: perceptrons, Support Vector Machines
  - Non-linear case: k-NN, multilayer neural nets
- Geometrical approaches, space transformations: embeddings, spectral analysis
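
For the linear case, a perceptron is probably the shortest example of learning a separating half-plane. The sketch below uses the classic perceptron update rule on a small, hypothetical, linearly separable data set; the learning rate, epoch limit and data are assumptions for illustration.

```python
def predict(weights, bias, x):
    """Linear threshold unit: +1 on one side of the hyperplane, -1 on the other."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else -1


def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Perceptron rule: update the weights only on misclassified examples."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            if predict(weights, bias, x) != y:
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
                mistakes += 1
        if mistakes == 0:  # every example is on the correct side: stop early
            break
    return weights, bias


# Hypothetical, linearly separable data: label +1 when x1 + x2 is large.
DATA = [((0.0, 0.0), -1), ((1.0, 0.0), -1), ((0.0, 1.0), -1),
        ((2.0, 2.0), 1), ((3.0, 1.0), 1), ((1.0, 3.0), 1)]
weights, bias = train_perceptron(DATA)
print(weights, bias)
```

The learned weights and bias define the half-plane; on data that are not linearly separable the loop would simply stop at the epoch limit without reaching zero mistakes.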

Decision Trees (Cats vs. Dogs)
[Figure: decision tree. Internal nodes test "Height > 50 cm?", "Has fur (coat)?" and "Has whiskers?", each with Yes/No branches; the four leaves are Output: Dog, Output: Dog, Output: Cat, Output: Cat.]
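
A decision tree is just a nested sequence of tests, which can be written directly as code. The exact branch layout of the tree on this slide did not survive transcription, so the sketch below assumes one plausible reading: tall animals are dogs; short animals without fur are cats; short furry animals are cats if they have whiskers and dogs otherwise. The function name and parameters are hypothetical.

```python
def classify(height_cm, has_fur, has_whiskers):
    """Walk one assumed reading of the cats-vs-dogs tree from root to leaf."""
    if height_cm > 50:       # root test: Height > 50 cm?
        return "Dog"
    if not has_fur:          # second test: Has fur (coat)?
        return "Cat"
    if has_whiskers:         # third test: Has whiskers?
        return "Cat"
    return "Dog"


print(classify(height_cm=60, has_fur=True, has_whiskers=False))  # Dog
print(classify(height_cm=30, has_fur=True, has_whiskers=True))   # Cat
```

A learning algorithm for decision trees would choose these tests and their order from the examples instead of having them written by hand.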

MLxWM: Technological Perspectives
- Exponential growth of the problem size
- Increasing focus on heterogeneous (e.g. multimedia) data
- Social Web: Web 2.0
- Software systems are going to play an increasingly important role
- Software as a Service
- Personalization

The Long Tail
[Figure: chart by Maren Jinnett based on data compiled by the UK's Civil Aviation Authority (Wired Blog Network, Oct 2009)]

Social Web

Hype Cycle for Social Software 2008 (Source: Gartner[1])

References
- Mitchell, Tom M. 1997. Machine Learning. New York: McGraw-Hill.
- Frasconi, P., Sperduti, A., Starita, A. 2007. Kernel machines, neural networks and graphical models. Rivista AI*IA, Numero speciale per i 50 anni di IA.
- Andrew Ng (Stanford), video lectures on machine learning: http://academicearth.org/courses/machine-learning