Artificial Intelligence
COMP-424
Lecture notes by Alexandre Tomberg
Prof. Joelle Pineau
McGill University, Winter 2009
Table of Contents (December-03-08 12:16 PM)
I. History of AI
II. Search
   1. Uninformed Search Methods
   2. Informed Search
   3. Search for Optimization Problems
   4. Game Playing
   5. Constraint Satisfaction
III. Logic
   1. Knowledge Representation: Logic
   2. First Order Logic
   3. Planning
   4. Spatial Planning
IV. Probability
   1. Reasoning under Uncertainty
   2. Bayesian Networks
V. Machine Learning
   1. Machine Learning: Parameter Estimation
   2. Learning with Missing Values
   3. Supervised Learning
   4. Neural Nets
   5. Decision Trees
VI. Decision Theory
   1. Utility Theory
   2. Markov Decision Processes (MDPs)
   3. Reinforcement Learning
History of AI (January-06-09 10:03 AM)
Uninformed Search Methods (January-08-09 10:06 AM)
Generic Search Algorithm:
Algorithm 1: BFS
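The notes name BFS here without code; as a reference, here is a minimal Python sketch of breadth-first search. The explicit `neighbors` successor function and the path-queue representation are illustrative assumptions, not from the lecture.

```python
from collections import deque

def bfs(start, goal, neighbors):
    """Breadth-first search: expand nodes in order of increasing depth.

    `neighbors(n)` returns the successor states of n.
    Returns a shallowest path from start to goal, or None.
    """
    frontier = deque([[start]])          # FIFO queue of partial paths
    visited = {start}
    while frontier:
        path = frontier.popleft()        # shallowest unexpanded path first
        node = path[-1]
        if node == goal:
            return path
        for nxt in neighbors(node):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None
```

Because the frontier is a FIFO queue, BFS finds a path with the fewest edges (optimal when all step costs are equal).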
Algorithm 2: DFS
Algorithm 3: Depth limited search
Algorithm 4: Iterative Deepening
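The depth-limited and iterative-deepening variants named above can be sketched together: iterative deepening simply re-runs depth-limited DFS with limits 0, 1, 2, ... The graph representation below is an illustrative assumption.

```python
def depth_limited_dfs(node, goal, neighbors, limit, path=None):
    """DFS that refuses to expand below depth `limit`."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None                      # cut off: depth budget exhausted
    for nxt in neighbors(node):
        if nxt not in path:              # avoid cycles along the current path
            found = depth_limited_dfs(nxt, goal, neighbors, limit - 1, path)
            if found:
                return found
    return None

def iterative_deepening(start, goal, neighbors, max_depth=20):
    """Run depth-limited DFS with limits 0, 1, 2, ... up to max_depth."""
    for limit in range(max_depth + 1):
        found = depth_limited_dfs(start, goal, neighbors, limit)
        if found:
            return found
    return None
```

Iterative deepening keeps DFS's linear memory use while recovering BFS's guarantee of finding a shallowest solution.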
Informed Search (January-13-09 10:02 AM)
Algorithms (January-13-09 10:34 AM)
Algorithm #1: Best-First Search
Algorithm #2: Heuristic Search
Algorithm #3: A* search
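A minimal Python sketch of A*, which expands the node minimizing f(n) = g(n) + h(n); the `(successor, cost)` neighbor format is an illustrative assumption.

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A* search.

    `neighbors(n)` yields (successor, step_cost) pairs; `h` is an
    admissible heuristic (it never overestimates the cost-to-goal).
    Returns (path, cost) or (None, inf).
    """
    frontier = [(h(start), 0, start, [start])]   # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt, cost in neighbors(node):
            g2 = g + cost
            if g2 < best_g.get(nxt, float('inf')):   # found a cheaper route
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None, float('inf')
```

With h ≡ 0 this degenerates to uniform-cost search; a better admissible h prunes more of the frontier while preserving optimality.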
Search for Optimization Problems (January-15-09 10:05 AM)
Iterative Improvement Algorithms (January-15-09 10:05 AM)
Algorithm #1: Hill Climbing
Algorithm #2: Simulated Annealing
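The two algorithms named above differ only in whether worse moves are ever accepted: hill climbing never accepts them, while simulated annealing accepts them with probability exp(-ΔE/T), cooling T over time. A sketch of the annealing variant (the objective, neighbor function, and cooling schedule are illustrative assumptions):

```python
import math
import random

def simulated_annealing(f, x0, neighbor, T0=10.0, cooling=0.95, steps=500, seed=0):
    """Minimize f by random local moves, occasionally accepting worse ones."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    T = T0
    for _ in range(steps):
        x2 = neighbor(x, rng)
        fx2 = f(x2)
        delta = fx2 - fx
        # always accept improvements; accept worsenings with prob exp(-delta/T)
        if delta < 0 or rng.random() < math.exp(-delta / T):
            x, fx = x2, fx2
            if fx < fbest:
                best, fbest = x, fx
        T *= cooling                     # gradually lower the temperature
    return best, fbest
```

Setting T0 = 0 (so no uphill move is ever accepted) recovers stochastic hill climbing, which can get stuck in local optima.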
Genetic Algorithms (January-15-09 11:06 AM)
Game Playing (January-20-09 10:03 AM)
Minimax Search (January-20-09 10:07 AM)
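A minimal Python sketch of minimax on an explicit game tree: MAX picks the child of highest value, MIN the lowest. The dictionary-based tree and `evaluate` function are illustrative assumptions.

```python
def minimax(state, depth, maximizing, moves, evaluate):
    """Return the minimax value of `state`.

    `moves(state)` lists successor states; `evaluate` scores leaves.
    """
    succ = moves(state)
    if depth == 0 or not succ:
        return evaluate(state)           # leaf or depth cut-off
    if maximizing:
        return max(minimax(s, depth - 1, False, moves, evaluate) for s in succ)
    else:
        return min(minimax(s, depth - 1, True, moves, evaluate) for s in succ)
```

On the toy tree in the test below, MAX chooses the left branch: min(3, 5) = 3 beats min(2, 9) = 2.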
α-β Pruning (January-20-09 10:44 AM)
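α-β pruning computes the same value as plain minimax while skipping branches that cannot influence the root decision. A sketch (the same illustrative tree representation as above is assumed):

```python
def alphabeta(state, depth, alpha, beta, maximizing, moves, evaluate):
    """Minimax value of `state` with α-β pruning.

    α = best value MAX can guarantee so far; β = best for MIN.
    """
    succ = moves(state)
    if depth == 0 or not succ:
        return evaluate(state)
    if maximizing:
        value = float('-inf')
        for s in succ:
            value = max(value, alphabeta(s, depth - 1, alpha, beta, False,
                                         moves, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                    # β cut-off: MIN will never allow this line
        return value
    else:
        value = float('inf')
        for s in succ:
            value = min(value, alphabeta(s, depth - 1, alpha, beta, True,
                                         moves, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break                    # α cut-off: MAX already has better
        return value
```

With perfect move ordering, pruning lets the search reach roughly twice the depth of plain minimax in the same time.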
Constraint Satisfaction (January-22-09 10:10 AM)
Knowledge Representation: Logic (January-27-09 10:10 AM)
First Order Logic (February-18-09 7:50 PM)
Planning (February-03-09 10:11 AM)
Partial Order Planning Algorithm (February-18-09 8:55 PM)
Least Commitment Analysis
Spatial Planning (February-03-09 10:32 AM)
Reasoning under Uncertainty (February-18-09 9:13 PM)
If we know the probabilities, what actions should we choose?
Bayesian Networks (March-19-09 3:26 PM)
Machine Learning: Parameter Estimation (March-03-09 10:09 AM)
Statistical Parameter Fitting (March-03-09 10:34 AM)
Maximum Likelihood Estimate (MLE) (March-03-09 10:53 AM)
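A worked MLE example (not from the notes): for n coin flips with h heads, the likelihood L(p) = p^h (1-p)^(n-h) is maximized at p̂ = h/n, which the second function below checks numerically via the log-likelihood.

```python
import math

def bernoulli_mle(flips):
    """MLE for a coin: L(p) = p^h * (1-p)^(n-h) is maximized at p = h/n."""
    return sum(flips) / len(flips)

def log_likelihood(p, flips):
    """log L(p) = sum over flips of log p (heads) or log(1-p) (tails)."""
    return sum(math.log(p if x else 1 - p) for x in flips)
```

For the data [1, 1, 1, 0], p̂ = 3/4 and its log-likelihood beats any other candidate p, e.g. 0.5 or 0.9.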
Learning with Missing Values (March-10-09 10:14 AM)
Basic EM algorithm:
Start with an initial parameter setting.
Repeat:
  Expectation Step: Complete the data by assigning values to missing items.
  Maximization Step: Compute the maximum log-likelihood and new parameters on the complete data.
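The two steps above can be sketched on a toy problem (not from the notes): coin flips where some outcomes are missing. The E-step fills each missing flip with its expected value under the current parameter, and the M-step re-estimates the parameter as the MLE on the completed data.

```python
def hard_em_coin(observed, n_missing, p0=0.5, iters=50):
    """EM for a coin with missing flips (toy illustration).

    E-step: complete each missing flip with its expected value p.
    M-step: re-estimate p as the mean of the completed data (the MLE).
    """
    p = p0
    n = len(observed) + n_missing
    for _ in range(iters):
        filled = sum(observed) + n_missing * p    # E-step: expected completion
        p = filled / n                            # M-step: MLE on completed data
    return p
```

In this toy case each iteration is a contraction toward the fixed point p* = mean of the observed flips, so EM converges quickly; in richer models (mixtures, Bayes nets) the same alternation climbs the log-likelihood but may stop at a local optimum.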
Soft EM for a general Bayes net:
Machine Learning: Clustering (March-19-09 4:21 PM)
Supervised Learning (March-10-09 10:55 AM)
Overfitting (April-14-09 8:35 PM)
Finding Parameters in General (April-14-09 9:05 PM)
Gradient Descent: Given w_0, for i = 0, 1, 2, ... do: w_{i+1} = w_i - α_i ∇E(w_i). Repeat until the parameters stop changing (or the gradient is sufficiently small).
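The update rule can be sketched directly; the quadratic objective in the usage below is an illustrative assumption, chosen so the minimum is known.

```python
def gradient_descent(grad, w0, alpha=0.1, iters=100):
    """Apply w_{i+1} = w_i - alpha * grad(w_i) for a fixed number of steps."""
    w = w0
    for _ in range(iters):
        w = w - alpha * grad(w)          # step downhill along the gradient
    return w
```

For E(w) = (w - 2)^2, the gradient is 2(w - 2), and each step multiplies the distance to the minimum w* = 2 by (1 - 2α), so with α = 0.1 the iterates converge geometrically to 2.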
Batch vs. Online Optimization (April-14-09 9:38 PM)
What we should know:
Neural Nets (March-19-09 4:48 PM)
Feed Forward Neural Networks (April-15-09 10:48 AM)
Forward pass: for layer k = 1 ... K do:
  Compute the output of all units in layer k.
  Copy this output as the input to the next layer.
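The loop above can be sketched with plain lists (no NumPy); the sigmoid activation and the weights-as-nested-lists representation are illustrative assumptions.

```python
import math

def forward_pass(x, weights, biases):
    """Feed-forward pass: for each layer, output = sigmoid(W . input + b).

    `weights[k]` is a list of rows (one per unit in layer k);
    `biases[k]` is the list of that layer's biases.
    """
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    activation = x
    for W, b in zip(weights, biases):            # layer k = 1 ... K
        # compute the output of all units in layer k ...
        activation = [sigmoid(sum(w * a for w, a in zip(row, activation)) + bk)
                      for row, bk in zip(W, b)]
        # ... which becomes the input to the next layer
    return activation
```

With all weights and biases zero, every unit outputs sigmoid(0) = 0.5, which gives a quick sanity check.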
Backpropagation algorithm:
1. Forward pass: compute the output of the network, going from the input layer to the output layer.
2. Backward pass: compute the gradient of the error for every weight in the network, going from the output layer towards the input layer.
3. Update: update the weights using the standard rule: w ← w - α ∂E/∂w.
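Steps 1 and 2 can be sketched on the smallest possible network, a single chain yhat = sigmoid(w2 · sigmoid(w1 · x)) with squared error E = ½(yhat - y)². This toy architecture is an illustrative assumption; the test checks the backward pass against finite differences.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop_step(x, y, w1, w2):
    """One backprop pass for yhat = sigmoid(w2 * sigmoid(w1 * x)).

    Error E = 0.5 * (yhat - y)^2.  Returns (dE/dw1, dE/dw2).
    """
    # forward pass: input -> hidden -> output
    h = sigmoid(w1 * x)
    yhat = sigmoid(w2 * h)
    # backward pass: propagate the error gradient from output to input
    d_out = (yhat - y) * yhat * (1 - yhat)   # dE/d(output pre-activation)
    dw2 = d_out * h
    d_hid = d_out * w2 * h * (1 - h)         # dE/d(hidden pre-activation)
    dw1 = d_hid * x
    return dw1, dw2
```

Each backward step just applies the chain rule once more, reusing the quantities already computed for the layer above; that reuse is what makes backprop efficient.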
Overfitting in Neural Nets (April-15-09 12:56 PM)
Decision Trees (April-15-09 1:04 PM)
Utility Theory (April-15-09 1:54 PM)
Utility Models:
Maximizing Expected Utility (MEU) Principle (April-15-09 2:21 PM)
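The MEU principle says: choose the action a maximizing E[U | a] = Σ_o P(o | a) U(o). A minimal sketch; the umbrella scenario, outcome names, and utility numbers below are illustrative assumptions.

```python
def best_action(actions, prob, utility):
    """MEU: pick the action maximizing E[U] = sum_o P(o|a) * U(o).

    `prob[a]` is a distribution over outcomes given action a;
    `utility[o]` is the utility of outcome o.
    """
    def expected_utility(a):
        return sum(p * utility[o] for o, p in prob[a].items())
    return max(actions, key=expected_utility)
```

In the test, taking the umbrella guarantees utility 70, while leaving it yields 0.4·0 + 0.6·100 = 60, so MEU picks 'take'.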
What we should know:
Markov Decision Processes (MDPs) (April-15-09 2:50 PM)
Policies (April-15-09 2:50 PM)
Iterative Policy Evaluation Algorithm:
1. Start with some initial guess V_0.
2. During iteration k, update the value function for all states as follows:
   V_{k+1}(s) = Σ_a π(s,a) [ R(s,a) + γ Σ_{s'} T(s,a,s') V_k(s') ]
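The update above can be sketched with dictionaries; the tiny one-state MDP in the test (self-loop, reward 1, γ = 0.9, so V^π = 1/(1-γ) = 10) is an illustrative assumption.

```python
def policy_evaluation(states, actions, T, R, policy, gamma=0.9, iters=100):
    """Iterative policy evaluation.

    Repeats V(s) <- sum_a pi(s,a) [ R(s,a) + gamma * sum_s' T(s,a,s') V(s') ].
    T[s][a][s2] is the transition probability, R[s][a] the reward,
    policy[s][a] the probability of taking a in s.
    """
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        V = {s: sum(policy[s][a] * (R[s][a] +
                                    gamma * sum(T[s][a][s2] * V[s2] for s2 in states))
                    for a in actions)
             for s in states}
    return V
```

Each sweep is a γ-contraction, so the error shrinks by a factor of γ per iteration and V converges to V^π.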
Searching for a Good Policy (April-15-09 4:47 PM)
Policy Iteration Algorithm:
Start with an initial policy π_0.
Repeat until the policy stops changing (π_{k+1} = π_k):
  Compute V^{π_k} using the policy evaluation algorithm.
  Compute π_{k+1} using the greedy policy update rule on V^{π_k}.
Value Iteration Algorithm:
Start with an initial value function V_0.
Repeat until convergence (max_s |V_{k+1}(s) - V_k(s)| < ε):
  Update the value function estimate using:
  V_{k+1}(s) = max_a [ R(s,a) + γ Σ_{s'} T(s,a,s') V_k(s') ]
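A sketch of the value iteration loop; the one-state, two-action MDP in the test (rewards 1.0 and 0.5, both self-loops, γ = 0.9, so V* = 1/(1-γ) = 10 by always taking the better action) is an illustrative assumption.

```python
def value_iteration(states, actions, T, R, gamma=0.9, eps=1e-6):
    """Value iteration: V(s) <- max_a [ R(s,a) + gamma * sum_s' T(s,a,s') V(s') ].

    Stops once the largest per-state change falls below eps.
    """
    V = {s: 0.0 for s in states}
    while True:
        V2 = {s: max(R[s][a] + gamma * sum(T[s][a][s2] * V[s2] for s2 in states)
                     for a in actions)
              for s in states}
        if max(abs(V2[s] - V[s]) for s in states) < eps:
            return V2
        V = V2
```

Unlike policy iteration, no explicit policy is maintained; a greedy policy can be read off the converged V at the end.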
Reinforcement Learning (April-15-09 5:38 PM)
TD(0) Learning Algorithm:
1. Initialize the value function (e.g., V(s) = 0 for all s).
2. Repeat until convergence (or until feeling sick of it):
   a. Pick a start state s.
   b. Repeat for every time step t:
      i. Choose an action a based on the current policy π and current state s.
      ii. Take action a; observe reward r and new state s'.
      iii. Compute the TD error: δ = r + γ V(s') - V(s).
      iv. Update the value function: V(s) = V(s) + α δ.
      v. Update the current state: s = s'.
      vi. If s' is a terminal state, go back to step 2.
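The inner loop above can be sketched over pre-recorded trajectories; the `(s, r, s')` transition format, with `None` marking a terminal successor, is an illustrative assumption.

```python
def td0(episodes, gamma=0.9, alpha=0.1):
    """TD(0) value learning over recorded trajectories.

    `episodes` is a list of trajectories, each a list of (s, r, s')
    transitions; s' = None marks a terminal step (value 0).
    """
    V = {}
    for episode in episodes:
        for s, r, s2 in episode:
            v_next = 0.0 if s2 is None else V.get(s2, 0.0)
            delta = r + gamma * v_next - V.get(s, 0.0)   # TD error
            V[s] = V.get(s, 0.0) + alpha * delta         # move V(s) toward target
    return V
```

Each update nudges V(s) a fraction α toward the bootstrapped target r + γV(s'), so repeated visits pull the estimate toward the true expected return.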
Reinforcement Learning for Control (April-15-09 6:35 PM)