Artificial Intelligence Introduction to Machine Learning

1 Artificial Intelligence: Introduction to Machine Learning. Artificial Intelligence, Chung-Ang University. Narration: Prof. Jaesung Lee

2 Introduction. Applications in which Machine Learning techniques play an important role: face recognition (dimensionality reduction is additionally used to find important features), face detection, and facial age estimation. Figures: Face recognition; Facial age estimation.

3 Introduction: AdaBoost structure. One of the most popular Machine Learning algorithms for applications with high variety. Good for reducing detection time while maintaining detection accuracy. Often used for automatic face focusing in digital cameras. Figure: Automatic face focusing.

4 Introduction: AdaBoost structure (facial age estimation). Extracting important features that are difficult for humans to find. Building a model, or property to be learned, that maps the facial features to a predicted age. Providing theoretical analysis and practical guidelines. Figures: Extracted features; A model built by AdaBoost.

5 Introduction. This is the Big Data era: fast development of technology, a burst in Internet usage, and many image sources. Examples of user-created content: Flickr, Facebook, and YouTube, which are online sources of images and texts. Figure: Explosion of user-created content.

6 Introduction. The power of Machine Learning will increase because: a large amount of accessible data means a large number of examples for training; a large number of training examples means a higher chance of accurate learning; a higher chance of accurate learning means that valuable but undiscovered knowledge can be found. Figure: Improving utility of Machine Learning.

7-8 What is Machine Learning? It is difficult to define precisely because it covers a broad range of processes. Two working definitions: modification of a behavioral tendency by experience (data); optimizing a performance criterion using example data and past experience. Figures: Modification of behavior by experience (data); Optimizing performance with regard to a criterion.

9 What is Machine Learning? Machine Learning concerns changes in systems that perform AI tasks: recognition, diagnosis, planning, robot control, and prediction. A constant program can never change in response to its environment! Figure: Relation to AI.

10 What is Machine Learning? An AI agent perceives and models its environment, and computes proper actions considering their consequences. The agent may change its internal processes or components according to its perceptions and actions; this is regarded as a sort of learning. Figure: AI agent.

11-17 Designing versus Learning (programming, i.e. designing, versus Machine Learning). There is no need to learn how to calculate the minimum credit points for graduation, because that rule is known and can be programmed directly. Learning is required when: (1) humans are unable to explain the hidden rule, so the rule has to be guessed from a large number of examples; (2) a hidden relationship, i.e. the underlying mapping, has to be extracted; (3) the system has to adapt to an unknown environment, e.g. navigation in an unknown environment; (4) too many examples are given (Big Data); (5) the environment is dynamic, e.g. weather forecasting.

18-26 Relationships with Other Disciplines. Machine Learning unifies several disciplines (figure: Relation to other disciplines). Statistics: guessing an unknown probability distribution and making decisions about new samples based on the estimate (figure: Machine Learning versus Statistics). Brain models: simplified models of biological neurons that approximate learning phenomena; popular example: Artificial Neural Network. Adaptive control theory: controlling a process whose parameters must be estimated during operation; popular example: robots, such as a task-adaptive robotic arm. Psychological models: studying the performance of living organisms in various learning tasks; popular example: Reinforcement Learning. Evolutionary models: mimicking the evolution of species, such as the adaptation of finches to their environment or behavior; popular example: Genetic Algorithm. Modeling and optimization: how to model the separating boundary and optimize the performance (figure: Class separation).

27-29 Universal Dataset. Universal set: contains all possible data pairs, which follow an unknown probability distribution. Training set: only a subset of the universal set can be considered because of limited memory or time; this subset is known as the training set, and its samples are assumed to be independently and identically distributed (i.i.d.). Validation set: a subset of the training set used for monitoring the training process of a Machine Learning algorithm. Figures: Universal set and training set; Dataset preparation.

30-32 What Are We Looking For? Possible learned properties: a line that forms the separating boundary between two groups; a relation among examples. Universal set: assumed to exist but unknown. Training set: a subset of the universal set, obtained in the data acquisition stage. Test set: unseen samples or future events, disjoint from the training set and the validation set, used for evaluating performance. Figures: Dataset preparation; Example of three labeled datasets (universal, training, and test set).
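The relationship between the training, validation, and test sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_dataset`, the 20% fractions, and the fixed seed are hypothetical choices, not something the slides prescribe. Here the validation set is carved out of the data that is not reserved for testing.

```python
import random

def split_dataset(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into disjoint train/validation/test lists."""
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("fractions must leave room for a training set")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                       # held out, never used for training
    validation = shuffled[n_test:n_test + n_val]   # used to monitor training
    train = shuffled[n_test + n_val:]              # used to fit the model
    return train, validation, test

train, validation, test = split_dataset(range(10))
print(len(train), len(validation), len(test))
```

Because the three lists are slices of one shuffled list, they are disjoint by construction, which matches the requirement that the test set shares no samples with the other two.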

33-35 No Free Lunch Rule. Possible learned properties (revisited): a line that forms the separating boundary between two groups; a relation among examples. No Free Lunch rule: assumptions are what make Machine Learning feasible, and they are needed for both the dataset and the properties. Assumption on the dataset: the training set and the test set come from the same distribution. Assumptions on the properties: what kind of function to learn (e.g. straight line vs. curved line) and how to model the property (e.g. Gaussian distribution vs. Poisson distribution). Can we expect any classification method to be superior or inferior overall? The answer is NO.

36-37 Learning Input-Output Functions. y = h(x): a short summarization of Machine Learning. x: vector-valued input, a member of the training set Ξ. y or h(x): predicted output. h: hypothesis function between x and y. H: set of all possible hypothesis functions.
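A small sketch may help make the roles of h, H, and Ξ concrete. It assumes a tiny, hand-written hypothesis set and a toy Boolean training set; both are hypothetical and chosen only for illustration. The learner simply picks the hypothesis in H that agrees with the most members of the training set.

```python
# Hypothetical training set Xi: (input vector, label) pairs for the concept "x1 AND x2".
training_set = [((0, 0), False), ((0, 1), False), ((1, 0), False), ((1, 1), True)]

# Hypothetical hypothesis set H: a few candidate Boolean functions of the input vector.
hypotheses = {
    "x1": lambda x: bool(x[0]),
    "x2": lambda x: bool(x[1]),
    "x1 and x2": lambda x: bool(x[0] and x[1]),
    "x1 or x2": lambda x: bool(x[0] or x[1]),
}

def agreement(h, data):
    """Count the training examples on which hypothesis h predicts the given label."""
    return sum(h(x) == y for x, y in data)

# Learning, in this sketch, is a search over H for the best-agreeing hypothesis.
best_name = max(hypotheses, key=lambda name: agreement(hypotheses[name], training_set))
print(best_name, agreement(hypotheses[best_name], training_set))
```

Real hypothesis sets are usually far too large to enumerate, so practical algorithms search H indirectly, for example by adjusting parameters.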

38 Types of Learning. Types of Machine Learning: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

39-41 Types of Learning: Supervised Learning. Learning from a labeled dataset. Find the relationships between the feature set and the label set, i.e. a hypothesis h that agrees with most members of the training set Ξ. Example hypothesis: if it walks, swims, and quacks like a duck, then it must be a duck (features: walks, swims, quacks; labels: duck, not duck). Regression: finding a curve that fits the points (Figure (a)). Classification: finding one or more discriminating vectors (Figure (b)).
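As a minimal sketch of regression, the following fits a straight line to labeled points by ordinary least squares. The slides only say "finding a curve that fits points"; the choice of a straight line, the function name `fit_line`, and the sample points are assumptions made for illustration.

```python
def fit_line(points):
    """Return (slope, intercept) of the least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical labeled dataset lying exactly on y = 2x + 1.
slope, intercept = fit_line([(0, 1), (1, 3), (2, 5), (3, 7)])
print(slope, intercept)
```

The learned pair (slope, intercept) is the hypothesis h; predicting for a new x is then just `slope * x + intercept`.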

42-43 Types of Learning: Unsupervised Learning. Learning from the intrinsic characteristics of a dataset. Popular purposes: clustering, probability density estimation, finding associations among features, and dimensionality reduction. The output can be used for other learning paradigms. Figures (a) and (b).
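Clustering, the first purpose listed, can be illustrated with a small k-means sketch on one-dimensional data. K-means is not named in the slides; it is swapped in here as a common example, and the function name, initial centers, and data are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Run a fixed number of k-means iterations on 1-D data from given initial centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Unlabeled data with two visible groups; no labels are used anywhere.
centers, clusters = kmeans_1d([1.0, 1.2, 0.8, 5.0, 5.2, 4.8], centers=[0.0, 6.0])
print(centers, clusters)
```

The resulting cluster index of each value could then serve as a feature or a provisional label for another learner, which is one way the output "can be used for other learning paradigms".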

44 Types of Learning: Semi-supervised Learning. Learning with a dataset that contains both labeled and unlabeled data. Figures (a) and (b).

45 Types of Learning: Reinforcement Learning. Learning from rewards or mistakes: an agent acts in an environment. Figure: Reinforcement Learning procedure.

46-49 Input Vectors. Input vectors are also called pattern vectors, feature vectors, samples, examples, or instances. Features are the components of the input vector; they are also called attributes, input variables, or components. An instance of (class, major, sex, advisor) can be (sophomore, history, male, Higgins). Boolean-valued features take one of two values, e.g. (High, Normal). Figure: An example dataset.
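Categorical instances such as (sophomore, history, male, Higgins) usually have to be turned into numbers before a learner can use them. One common option, not mentioned in the slides and shown here only as an assumption, is one-hot encoding; the second instance below is hypothetical.

```python
def one_hot_encode(instances):
    """Map tuples of categorical values to 0/1 vectors, one slot per (feature, value)."""
    n_features = len(instances[0])
    # Vocabulary of each feature: the sorted set of values seen at that position.
    vocab = [sorted({inst[i] for inst in instances}) for i in range(n_features)]
    encoded = []
    for inst in instances:
        vector = []
        for i, value in enumerate(inst):
            vector.extend(1 if value == v else 0 for v in vocab[i])
        encoded.append(vector)
    return vocab, encoded

instances = [
    ("sophomore", "history", "male", "Higgins"),   # instance from the slide
    ("freshman", "physics", "female", "Jones"),    # hypothetical second instance
]
vocab, encoded = one_hot_encode(instances)
print(vocab)
print(encoded)
```

Each slot of the encoded vector is itself a Boolean-valued feature, which ties this encoding back to the last point of the slide.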

50-53 Outputs. The output is also called the output value, label, class, category, or decision. The learner is a function estimator if the output is a real number, and a classifier if the output is a categorical value. Application: hand-written character recognition (input: a representation of the hand-written character; output: the category of the character). Vector-valued outputs: if the components are mutually exclusive, it is a multi-class problem; if not, it is a multi-label problem. Boolean outputs: True marks a positive instance, False a negative instance. Concept learning: both input and output are Boolean. Figure: An example dataset.

54 Preprocessing. Preprocessing steps for the desired Machine Learning task. Figure: Standard Machine Learning procedure.

55-56 Noise and Outliers. Corrupted values lead to outliers. Noise is divided into class noise (errors in the labels) and attribute noise (errors in the feature values). Figures: An example of an outlier; Class noise vs. attribute noise.

57 Missing Value. Ways to handle a missing value: filling in the median or mean value of the column; deleting the corresponding row; using the value from the previous row. Figure: Example of missing value.
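The three strategies can be sketched on a single column as follows. Representing a missing value as `None` and the function names are assumptions made for this illustration.

```python
from statistics import mean

def impute_mean(column):
    """Replace each missing value (None) with the mean of the known values."""
    fill = mean(v for v in column if v is not None)
    return [fill if v is None else v for v in column]

def drop_missing(column):
    """Delete the entries (rows) whose value is missing."""
    return [v for v in column if v is not None]

def forward_fill(column):
    """Replace each missing value with the value from the previous row.

    A missing value in the very first row stays None, since nothing precedes it.
    """
    filled, previous = [], None
    for v in column:
        if v is not None:
            previous = v
        filled.append(previous)
    return filled

column = [2.0, None, 4.0, 6.0]
print(impute_mean(column), drop_missing(column), forward_fill(column))
```

Swapping `mean` for `statistics.median` gives the median variant mentioned on the slide.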

58 Scaling. Bringing the values in different columns onto a common scale, by normalization or standardization. Figure: Normalization.
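A minimal sketch of both options, assuming the usual definitions: normalization as min-max scaling to the range [0, 1], and standardization as subtracting the mean and dividing by the (population) standard deviation. The column of heights is hypothetical.

```python
from statistics import mean, pstdev

def normalize(column):
    """Min-max normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in column]

def standardize(column):
    """Standardization: shift to zero mean and scale to unit standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]

heights = [150.0, 160.0, 170.0, 180.0, 190.0]
print(normalize(heights))
print(standardize(heights))
```

In practice the minimum, maximum, mean, and standard deviation would be computed on the training set only and then reused for the validation and test sets.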

59 Natural Language Processing for Texts. Encoding free text into vectors.
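One simple way to encode free text as vectors is a bag-of-words count. The slides do not name a specific encoding, so this choice, the whitespace tokenizer, and the two example sentences are assumptions for illustration.

```python
from collections import Counter

def bag_of_words(documents):
    """Encode each document as a vector of word counts over a shared vocabulary."""
    tokenized = [doc.lower().split() for doc in documents]  # naive whitespace tokenizer
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors

vocabulary, vectors = bag_of_words(["the cat sat", "the cat saw the dog"])
print(vocabulary)
print(vectors)
```

Every document becomes a vector of the same length, so the result can be fed to any learner that expects fixed-size input vectors.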

60-61 Imbalanced Datasets. The class distribution is skewed. Remedies: over-sampling or under-sampling, and synthesizing examples. Figures: Scatter plot of an imbalanced dataset; Over-/under-sampling.
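Random over-sampling, the simplest of these remedies, can be sketched as below: minority-class samples are duplicated at random until every class is as large as the largest one. The function name, seed, and toy data are hypothetical.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes have equal size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels

samples = [0.1, 0.2, 0.3, 0.4, 0.5, 9.0]
labels = ["neg", "neg", "neg", "neg", "neg", "pos"]
balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

Under-sampling would instead discard majority samples, and synthesizing examples would create new minority points rather than exact copies.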

62 Training. The main step of Machine Learning. Figure: Machine Learning procedures.

63-64 Policy. Training policy. Batch method: the entire training set is available, and all of it is used at once to compute the function. Online method: the members of the training set arrive one at a time, and the function is optimized incrementally; the next action is decided based on the current action. Figures: Batch Machine Learning; Online Machine Learning.
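The difference between the two policies can be shown with a deliberately tiny "model": estimating the mean of a stream of values. This is a sketch under that simplifying assumption; the class name `OnlineMean` and the data are hypothetical.

```python
def batch_mean(values):
    """Batch method: the whole training set is available and used at once."""
    return sum(values) / len(values)

class OnlineMean:
    """Online method: update the estimate incrementally as each value arrives."""

    def __init__(self):
        self.count = 0
        self.estimate = 0.0

    def update(self, value):
        self.count += 1
        # Move the estimate toward the new value by a shrinking step 1/count.
        self.estimate += (value - self.estimate) / self.count
        return self.estimate

stream = [4.0, 8.0, 6.0, 2.0]
learner = OnlineMean()
history = [learner.update(v) for v in stream]
print(history, batch_mean(stream))
```

The online learner never stores past values, yet after the last update its estimate matches the batch result; for more complex models the two policies generally only agree approximately.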

65 Feature Selection. Not all features contribute to learning! Feature selection is one of the model selection approaches and can be used to improve performance. Figure: Curse of dimensionality.

66-67 Evaluation. Evaluating learning performance by means of evaluation metrics. Popular evaluation metrics: mean squared error for regression tasks; accuracy, the fraction of correct predictions (one minus the error rate), for classification tasks. Figures: Usage of evaluation metrics; Confusion matrix for calculating accuracy.
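The metrics named on this slide can be written out directly. The following is a minimal sketch for a regression target and for a binary classification target; the function names and the toy predictions are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for Boolean labels."""
    cm = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, p in zip(y_true, y_pred):
        if t and p:
            cm["tp"] += 1
        elif not t and p:
            cm["fp"] += 1
        elif t and not p:
            cm["fn"] += 1
        else:
            cm["tn"] += 1
    return cm

def accuracy(cm):
    """Fraction of correct predictions, read off the confusion matrix."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())

mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
cm = confusion_matrix([True, True, False, False, True],
                      [True, False, False, True, True])
print(mse, cm, accuracy(cm))
```

The confusion matrix keeps the two kinds of error (false positives and false negatives) separate, which accuracy alone hides.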

68 Parameter Tuning. Parameter tuning controls the learning model. Popular approach: trial and error. Figure: Parameter tuning based on a trial-and-error strategy.
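Trial-and-error tuning can be sketched as a loop that tries each candidate parameter value and keeps the one that scores best on a validation set. The model here is a hypothetical one-parameter threshold classifier, and the candidate values and data are made up for illustration.

```python
def threshold_accuracy(threshold, data):
    """Accuracy of the rule 'predict True when x >= threshold' on (x, label) pairs."""
    return sum((x >= threshold) == label for x, label in data) / len(data)

# Hypothetical validation set and candidate parameter values.
validation = [(0.5, False), (1.5, False), (2.5, True), (3.5, True)]
candidates = [0.0, 1.0, 2.0, 3.0]

# Trial and error: evaluate every candidate, then keep the best-scoring one.
scores = {t: threshold_accuracy(t, validation) for t in candidates}
best_threshold = max(scores, key=scores.get)
print(scores, best_threshold)
```

Scoring on a validation set rather than the test set keeps the test set untouched for the final performance estimate.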

69-71 Overfitting, Underfitting, and the Bias-Variance Trade-off. Overfitting (high variance): good performance on the training set but poor performance on the test set. Underfitting (high bias): low performance on both sets. Figure: Overfit vs. underfit.

72 Model Stacking. Combining multiple algorithms. Figure: Example of model stacking.

73-75 Sample Applications. Sample applications based on Machine Learning: rule discovery for a printing industry problem; electric power load forecasting; automatic help desk assistant; planning and scheduling for a steel mill; classification of stars and galaxies. Figures: 3D printing rule discovery; Electric power load forecasting; Stars and galaxies classification; Steel mill planning.

76-77 Sample Applications. Successes of Machine Learning in the real world: Sharp's Japanese Kanji character recognition system (200 characters per second with 99% accuracy); Fujitsu's continuous steel casting monitoring system. Figures: Kanji character recognition; Steel casting monitoring.