Artificial Intelligence COMP-424


1 Lecture notes Page 1
Artificial Intelligence COMP-424
Lecture notes by Alexandre Tomberg
Prof. Joelle Pineau
McGill University
Winter 2009

2 Lecture notes Page 2 Table of Contents December :16 PM

I. History of AI
II. Search
   1. Uninformed Search Methods
   2. Informed Search
   3. Search for Optimization Problems
   4. Game Playing
   5. Constraint Satisfaction
III. Logic
   1. Knowledge Representation: Logic
   2. First Order Logic
   3. Planning
   4. Spatial Planning
IV. Probability
   1. Reasoning under Uncertainty
   2. Bayesian Networks
V. Machine Learning
   1. Machine Learning: Parameter Estimation
   2. Learning with Missing Values
   3. Supervised Learning
   4. Neural Nets
   5. Decision Trees
VI. Decision Theory
   1. Utility Theory
   2. Markov Decision Processes (MDPs)
   3. Reinforcement Learning

3 Lecture notes Page 3 History of AI January :03 AM

4 Lecture notes Page 4 Uninformed Search Methods January :06 AM

5 Lecture notes Page 5

6 Lecture notes Page 6 Generic Search Algorithm: Algorithm 1: BFS

7 Lecture notes Page 7 Algorithm 2: DFS Algorithm 3: Depth limited search Algorithm 4: Iterative Deepening
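The page above only names depth limited search and iterative deepening; its body is not preserved here. As a minimal sketch of how the two usually fit together, assuming a graph stored as an adjacency dictionary (the names `depth_limited_search`, `iterative_deepening`, and `GRAPH` are illustrative, not from the notes):

```python
def depth_limited_search(graph, node, goal, limit, path=None):
    """Depth-first search that refuses to go deeper than `limit` edges."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None
    for child in graph.get(node, []):
        if child in path:  # skip nodes already on the current path (no cycles)
            continue
        found = depth_limited_search(graph, child, goal, limit - 1, path)
        if found is not None:
            return found
    return None


def iterative_deepening(graph, start, goal, max_depth=20):
    """Run depth limited search with limits 0, 1, 2, ... until a path is found."""
    for limit in range(max_depth + 1):
        found = depth_limited_search(graph, start, goal, limit)
        if found is not None:
            return found
    return None


# Hypothetical example graph: G is reachable in 2 steps via C or 3 steps via B.
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": ["G"], "G": []}
shallow_path = iterative_deepening(GRAPH, "A", "G")
deep_path = depth_limited_search(GRAPH, "A", "G", 3)
```

With a fixed limit of 3 the depth-first order finds the longer route through B first, while iterative deepening returns the shallowest path because it exhausts every smaller limit before trying a larger one.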

8 Lecture notes Page 8 Informed Search January :02 AM

9 Lecture notes Page 9 Algorithms January :34 AM Algorithm #1: Best-First Search Algorithm #2: Heuristic Search

10 Lecture notes Page 10 Algorithm #3: A* search

11 Lecture notes Page 11

12 Lecture notes Page 12 Search for Optimization Problems January :05 AM

13 Lecture notes Page 13 Iterative Improvement Algorithms January :05 AM Algorithm #1: Hill Climbing Algorithm #2: Simulated Annealing

14 Lecture notes Page 14

15 Lecture notes Page 15 Genetic Algorithms January :06 AM

16 Lecture notes Page 16 Game Playing January :03 AM

17 Lecture notes Page 17 Minimax Search January :07 AM

18 Lecture notes Page 18 α-β Pruning January :44 AM
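The body of this page is not preserved, so the following is only a sketch of the usual form of minimax with α-β pruning, under the assumption that the game tree is given as nested lists whose leaves are numeric utilities for the maximizing player. The `visited` argument and the example `TREE` are illustrative additions used to show which leaves get evaluated.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node`, skipping branches that cannot change the result."""
    if not isinstance(node, list):  # leaf: a utility value
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # the min player above will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:  # the max player above already has something better
            break
    return value


# Hypothetical depth-2 tree: a max node over three min nodes.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
root_value = alphabeta(TREE, True, visited=seen)
```

In this tree the second min node is abandoned after its first leaf (2), since the root already has 3 guaranteed from the first branch; the leaves 4 and 6 are never evaluated.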

19 Lecture notes Page 19 Constraint Satisfaction January :10 AM

20 Lecture notes Page 20

21 Lecture notes Page 21

22 Lecture notes Page 22 Knowledge Representation: Logic January :10 AM

23 Lecture notes Page 23

24 Lecture notes Page 24

25 Lecture notes Page 25

26 Lecture notes Page 26

27 Lecture notes Page 27

28 Lecture notes Page 28 First Order Logic February :50 PM

29 Lecture notes Page 29

30 Lecture notes Page 30

31 Lecture notes Page 31

32 Lecture notes Page 32

33 Lecture notes Page 33

34 Lecture notes Page 34

35 Lecture notes Page 35 Planning February :11 AM

36 Lecture notes Page 36

37 Lecture notes Page 37

38 Lecture notes Page 38

39 Lecture notes Page 39 Partial Order Planning Algorithm February :55 PM

40 Lecture notes Page 40 Least Commitment Analysis

41 Lecture notes Page 41 Spatial Planning February :32 AM

42 Lecture notes Page 42

43 Lecture notes Page 43

44 Lecture notes Page 44

45 Lecture notes Page 45 Reasoning under Uncertainty February :13 PM If we know probabilities, what actions should we choose?

46 Lecture notes Page 46

47 Lecture notes Page 47

48 Lecture notes Page 48

49 Lecture notes Page 49

50 Lecture notes Page 50

51 Lecture notes Page 51 Bayesian Networks March :26 PM

52 Lecture notes Page 52

53 Lecture notes Page 53 Machine Learning: Parameter Estimation March :09 AM

54 Lecture notes Page 54 Statistical Parameter Fitting March :34 AM

55 Lecture notes Page 55 Maximum Likelihood Estimate (MLE) March :53 AM

56 Lecture notes Page 56

57 Lecture notes Page 57 Learning with Missing Values March :14 AM

58 Lecture notes Page 58
Basic EM algorithm:
1. Start with an initial parameter setting.
2. Repeat:
   a. Expectation Step: Complete the data by assigning values to missing items.
   b. Maximization Step: Compute the maximum log-likelihood and new parameters on the completed data.
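The two steps above can be sketched on a toy problem. This example is an assumption for illustration, not the one used in the notes: the data are 1-D points from a mixture of components with equal, fixed variance, and the missing item for each point is its component label. Under that assumption the most likely label is simply the nearest mean, and the maximum likelihood mean of a component is the average of its points.

```python
def hard_em(points, means, iterations=20):
    """Hard EM for a 1-D mixture with equal fixed variances (illustrative setup)."""
    means = list(means)
    labels = []
    for _ in range(iterations):
        # Expectation step: complete the data by giving every point its
        # most likely component, i.e. the one with the nearest mean.
        labels = [min(range(len(means)), key=lambda k: (x - means[k]) ** 2)
                  for x in points]
        # Maximization step: re-estimate each mean on the completed data.
        new_means = []
        for k in range(len(means)):
            members = [x for x, z in zip(points, labels) if z == k]
            # A component with no points keeps its previous mean.
            new_means.append(sum(members) / len(members) if members else means[k])
        if new_means == means:  # parameters stopped changing
            break
        means = new_means
    return means, labels


# Hypothetical data: two clusters around 1 and 5, deliberately poor initial means.
POINTS = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
fitted_means, fitted_labels = hard_em(POINTS, [0.0, 1.0])
```

Starting from means 0 and 1, the first iteration puts every point in the second component; the next iteration separates the two clusters and the means settle near 1 and 5.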

59 Lecture notes Page 59

60 Lecture notes Page 60 Soft EM for a general Bayes net:

61 Lecture notes Page 61 Machine Learning: Clustering March :21 PM

62 Lecture notes Page 62

63 Lecture notes Page 63 Supervised Learning March :55 AM

64 Lecture notes Page 64

65 Lecture notes Page 65

66 Lecture notes Page 66 Overfitting April :35 PM

67 Lecture notes Page 67

68 Lecture notes Page 68 Finding Parameters in General April :05 PM
Gradient Descent: Given w_0, for i = 0, 1, 2, ... do:
Repeat until necessary.
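The update rule inside the loop is not preserved on this page. The sketch below assumes the textbook rule, w_{i+1} = w_i - α ∇J(w_i), with a constant step size α and a stopping test on the size of the change; `gradient_descent`, `grad_J`, and the quadratic objective are illustrative choices, not taken from the notes.

```python
def gradient_descent(grad, w0, step=0.1, tol=1e-8, max_iter=10_000):
    """Follow the negative gradient from w0 until the update becomes tiny."""
    w = list(w0)
    for _ in range(max_iter):
        g = grad(w)
        # Assumed standard rule: move each weight against its partial derivative.
        new_w = [wi - step * gi for wi, gi in zip(w, g)]
        if max(abs(a - b) for a, b in zip(new_w, w)) < tol:
            return new_w
        w = new_w
    return w


def grad_J(w):
    """Gradient of the example objective J(w) = (w[0] - 3)^2 + (w[1] + 1)^2."""
    return [2 * (w[0] - 3), 2 * (w[1] + 1)]


w_star = gradient_descent(grad_J, [0.0, 0.0])
```

For this convex objective the iterates approach the minimizer (3, -1); on non-convex error surfaces the same loop is only guaranteed to approach a local stationary point, and the step size usually needs tuning.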

69 Lecture notes Page 69 Batch vs. Online Optimization April :38 PM

70 Lecture notes Page 70 What we should know:

71 Lecture notes Page 71 Neural Nets March :48 PM

72 Lecture notes Page 72

73 Lecture notes Page 73

74 Lecture notes Page 74

75 Lecture notes Page 75 Feed Forward Neural Networks April-15-09 10:48 AM
Forward pass: for layer k = 1, ..., K do:
1. Compute the output of all units in layer k.
2. Copy this output as the input to the next layer.
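The forward pass above can be sketched as a loop over layers. The representation is an assumption: each layer is a pair `(weights, biases)` where `weights[j]` holds the incoming weights of unit j, and every unit uses a sigmoid activation. The small network `NET` is a made-up example chosen so the result is easy to check by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward_pass(layers, x):
    """Propagate input x through the layers and return the last layer's output."""
    activation = list(x)
    for weights, biases in layers:
        # Compute the output of all units in this layer ...
        activation = [
            sigmoid(sum(w * a for w, a in zip(unit_weights, activation)) + b)
            for unit_weights, b in zip(weights, biases)
        ]
        # ... which then serves as the input to the next layer.
    return activation


# Hypothetical 2-2-1 network: zero hidden weights, output weights +1 and -1.
NET = [
    ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]
net_output = forward_pass(NET, [0.5, -0.2])
```

Both hidden units output sigmoid(0) = 0.5, so the output unit sees 0.5 - 0.5 = 0 and also returns 0.5.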

76 Lecture notes Page 76

77 Lecture notes Page 77
Backpropagation algorithm:
1. Forward pass: compute the output of the network going from input layer to output layer.
2. Backward pass: compute the gradient of the error for every weight inside the network going from output layer towards the input layer.
3. Update: update the weights using the standard rule:
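The three steps above can be sketched for the smallest interesting case. Everything specific here is an assumption: one hidden layer, sigmoid units, a single output, squared error E = 0.5 (out - y)^2, and the gradient descent update w ← w - α ∂E/∂w standing in for the rule that is not preserved on this page. The OR data set and the function name `train_step` are illustrative.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(W1, b1, W2, b2, x, y, lr=0.5):
    """One backpropagation step on example (x, y); returns the error before the update."""
    # Forward pass: input layer -> hidden layer -> output unit.
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    out = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2[0])
    # Backward pass: error gradient at the output, then at each hidden unit.
    delta_out = (out - y) * out * (1 - out)
    delta_h = [delta_out * W2[j] * h[j] * (1 - h[j]) for j in range(len(h))]
    # Update: assumed gradient descent rule, applied in place.
    for j in range(len(h)):
        W2[j] -= lr * delta_out * h[j]
        for i in range(len(x)):
            W1[j][i] -= lr * delta_h[j] * x[i]
        b1[j] -= lr * delta_h[j]
    b2[0] -= lr * delta_out
    return 0.5 * (out - y) ** 2


# Hypothetical training run on logical OR with a 2-2-1 network.
rng = random.Random(0)
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)]
b1 = [0.0, 0.0]
W2 = [rng.uniform(-0.5, 0.5) for _ in range(2)]
b2 = [0.0]
DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
errors = [sum(train_step(W1, b1, W2, b2, x, y) for x, y in DATA) for _ in range(2000)]
```

Note that the hidden-layer deltas are computed before the output weights are modified, so the backward pass uses the same weights as the forward pass. The list `errors` records the summed error per epoch and should shrink as training proceeds.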

78 Lecture notes Page 78

79 Lecture notes Page 79 Overfitting in Neural Net April :56 PM

80 Lecture notes Page 80 Decision Trees April :04 PM

81 Lecture notes Page 81

82 Lecture notes Page 82

83 Lecture notes Page 83

84 Lecture notes Page 84

85 Lecture notes Page 85 Utility Theory April :54 PM

86 Lecture notes Page 86 Utility Models:

87 Lecture notes Page 87 Maximizing Expected Utility (MEU) Principle April :21 PM

88 Lecture notes Page 88

89 Lecture notes Page 89 What we should know:

90 Lecture notes Page 90 Markov Decision Processes (MDPs) April :50 PM

91 Lecture notes Page 91

92 Lecture notes Page 92 Policies April :50 PM

93 Lecture notes Page 93

94 Lecture notes Page 94
Iterative Policy Evaluation Algorithm:
1. Start with some initial guess for the value function.
2. During iteration k, update the value function for all states as follows:
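The update formula itself is not preserved on this page. The sketch below assumes the usual Bellman backup for a fixed policy, V_{k+1}(s) = Σ_{s'} P(s' | s, π(s)) [r + γ V_k(s')], and a made-up three-state chain MDP. The dictionary layout `P[s][a] = [(probability, next_state, reward), ...]` is an illustrative convention, not one from the notes.

```python
def policy_evaluation(P, policy, gamma=0.9, tol=1e-10):
    """Iterate the Bellman backup for a fixed policy until the values stop changing."""
    V = {s: 0.0 for s in P}  # initial guess
    while True:
        new_V = {
            s: sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][policy[s]])
            for s in P
        }
        if max(abs(new_V[s] - V[s]) for s in P) < tol:
            return new_V
        V = new_V


# Hypothetical chain 0 -> 1 -> 2; entering state 2 pays reward 1, state 2 is absorbing.
MDP = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(1.0, 2, 1.0)]},
    2: {"stay": [(1.0, 2, 0.0)]},
}
V_go = policy_evaluation(MDP, {0: "go", 1: "go", 2: "stay"})
```

For the policy that always moves forward, state 1 is worth the immediate reward 1 and state 0 is worth that reward discounted once, 0.9.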

95 Lecture notes Page 95 Searching for a Good Policy April :47 PM

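96 Lecture notes Page 96
Policy Iteration Algorithm:
1. Start with an initial policy.
2. Repeat until the policy no longer changes:
   a. Compute the value function of the current policy using the policy evaluation algorithm.
   b. Compute an improved policy using the greedy policy update rule on that value function.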
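A minimal sketch of policy iteration, under stated assumptions: the MDP is a made-up three-state chain stored as `P[s][a] = [(probability, next_state, reward), ...]`, the evaluation step uses the standard Bellman backup for a fixed policy, and ties in the greedy step are broken in favour of the current action so the loop cannot oscillate. None of these names or choices come from the notes.

```python
def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following values V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def evaluate(P, policy, gamma, tol=1e-10):
    """Iterative policy evaluation for a fixed deterministic policy."""
    V = {s: 0.0 for s in P}
    while True:
        new_V = {s: q_value(P, V, s, policy[s], gamma) for s in P}
        if max(abs(new_V[s] - V[s]) for s in P) < tol:
            return new_V
        V = new_V


def policy_iteration(P, policy, gamma=0.9):
    policy = dict(policy)
    while True:
        V = evaluate(P, policy, gamma)
        improved = {}
        for s in P:
            best = max(P[s], key=lambda a: q_value(P, V, s, a, gamma))
            # Greedy update; keep the current action unless another is strictly better.
            gain = q_value(P, V, s, best, gamma) - q_value(P, V, s, policy[s], gamma)
            improved[s] = best if gain > 1e-12 else policy[s]
        if improved == policy:  # policy is stable
            return policy, V
        policy = improved


# Hypothetical chain 0 -> 1 -> 2; entering state 2 pays reward 1, state 2 is absorbing.
MDP = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(1.0, 2, 1.0)]},
    2: {"stay": [(1.0, 2, 0.0)]},
}
best_policy, best_V = policy_iteration(MDP, {0: "stay", 1: "stay", 2: "stay"})
```

Starting from the policy that never moves, the first improvement switches state 1 to "go", the second switches state 0, and the third pass finds nothing left to improve.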

97 Lecture notes Page 97
Value Iteration Algorithm:
1. Start with an initial value function estimate.
2. Repeat until the estimate converges:
   a. Update the value function estimate using:
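The update used by value iteration is not preserved on this page. The sketch below assumes the standard Bellman optimality backup, V_{k+1}(s) = max_a Σ_{s'} P(s' | s, a) [r + γ V_k(s')], a tolerance-based stopping test, and a made-up three-state chain MDP stored as `P[s][a] = [(probability, next_state, reward), ...]`.

```python
def value_iteration(P, gamma=0.9, tol=1e-10):
    """Repeat the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in P}  # initial value estimate
    while True:
        new_V = {
            s: max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in P[s].values()
            )
            for s in P
        }
        if max(abs(new_V[s] - V[s]) for s in P) < tol:
            return new_V
        V = new_V


# Hypothetical chain 0 -> 1 -> 2; entering state 2 pays reward 1, state 2 is absorbing.
MDP = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(1.0, 2, 1.0)]},
    2: {"stay": [(1.0, 2, 0.0)]},
}
V_star = value_iteration(MDP)
```

Unlike policy iteration, no explicit policy is kept during the loop; a greedy policy can be read off the final values afterwards by picking, in each state, the action whose backup attains the maximum.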

98 Lecture notes Page 98

99 Lecture notes Page 99

100 Lecture notes Page 100 Reinforcement Learning April :38 PM

101 Lecture notes Page 101

102 Lecture notes Page 102
TD (order 0) Learning Algorithm:
1. Initialize the value function.
2. Repeat until feeling sick of it:
   a. Pick a start state s.
   b. Repeat for every time step t:
      i. Choose an action a based on current policy π and current state s.
      ii. Take action a, observe reward r and new state s'.
      iii. Compute TD error: δ = r + γ V(s') - V(s)
      iv. Update the value function: V(s) = V(s) + α_s δ
      v. Update current state: s = s'
      vi. If s' is a terminal state, go to 2.
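The steps above translate almost line by line into code. The sketch makes a few assumptions that are not in the notes: values are initialized to 0, a constant learning rate `alpha` replaces the per-state rate, the outer loop runs for a fixed number of episodes, and the environment is a made-up deterministic chain exposed through the illustrative callables `step`, `policy`, and `is_terminal`.

```python
def td0(step, policy, start_state, is_terminal, episodes=200, alpha=0.5, gamma=0.9):
    """TD(0) value estimation for a fixed policy."""
    V = {}  # unseen states (including terminal ones) count as value 0
    for _ in range(episodes):
        s = start_state  # pick a start state
        while not is_terminal(s):
            a = policy(s)                      # choose action from the policy
            s2, r = step(s, a)                 # take it, observe s' and r
            delta = r + gamma * V.get(s2, 0.0) - V.get(s, 0.0)  # TD error
            V[s] = V.get(s, 0.0) + alpha * delta                # value update
            s = s2                             # move on; loop ends if s' is terminal
    return V


# Hypothetical chain 0 -> 1 -> 2 where entering the terminal state 2 pays reward 1.
def chain_step(s, a):
    s2 = s + 1
    return s2, (1.0 if s2 == 2 else 0.0)


V_td = td0(chain_step, lambda s: "go", 0, lambda s: s == 2)
```

Because this chain is deterministic, the estimates converge to the exact values V(1) = 1 and V(0) = 0.9; in a stochastic environment a constant learning rate leaves some residual noise, which is one reason a decaying or per-state rate is often used.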

103 Lecture notes Page 103 Reinforcement Learning for Control April :35 PM

104 Lecture notes Page 104
