Iterative Improvement Search Methods

Kris Beevers
Intro to AI, 9/18/03
Ch. 4.3

Overview

- Blind/heuristic search methods are designed to explore a search space systematically and return a path to the goal as their solution.
- In many problems, the path to the goal is not relevant: all we care about is arriving at the solution.
- Iterative improvement algorithms:
  - The solution is just a state in the search space, not the path we took to get to it.
  - Start in some configuration in the state space, and try to improve it.
  - Often our goal is to maximize (or minimize) some objective (evaluation) function, i.e. optimization.
- Advantages of iterative improvement algorithms:
  - They use very little memory (usually a constant amount), because we only store the current state.
  - They often find reasonable solutions in large or infinite (continuous) state spaces for which systematic algorithms are unsuitable.

Objective Functions

- I.e. evaluation functions.
- An objective function returns a number given a state.
- Generally not an analytic function; there are well-developed numerical optimization techniques for analytic functions, especially those with an analytic derivative.
- State space landscape: draw and label a picture (p. 111).
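
To make "returns a number given a state" concrete, here is a minimal objective function sketch in Python (not from the original notes): the 8-queens problem, counting attacking pairs. The one-queen-per-column state representation is an illustrative assumption; lower values are better, and 0 is a goal.

```python
# Hypothetical 8-queens objective: state[i] is the row of the queen in column i.
# Lower is better; a state with value 0 is a goal (no two queens attack each other).
def num_attacking_pairs(state):
    attacks = 0
    n = len(state)
    for i in range(n):
        for j in range(i + 1, n):
            same_row = state[i] == state[j]
            same_diagonal = abs(state[i] - state[j]) == j - i
            if same_row or same_diagonal:
                attacks += 1
    return attacks

print(num_attacking_pairs((0, 4, 7, 5, 2, 6, 1, 3)))  # 0: a solution
print(num_attacking_pairs((0, 1, 2, 3, 4, 5, 6, 7)))  # 28: every pair attacks
```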

- Our state: a location on this landscape.
- Our objective (depending on the problem formulation) is usually to find either a global maximum or a global minimum of this function.
- Note that maximizing and minimizing are really equivalent: maximizing f is the same as minimizing -f.
- Usually any local maximum or minimum is a goal, but the optimal solution is a global max or min.
- An algorithm is complete if it always finds a goal when one exists (there might or might not be more than one goal).
- An algorithm is optimal if it always finds a global minimum/maximum.

Hill-climbing Search

- More specifically, steepest-ascent hill-climbing search.
- Algorithm to find a local maximum (see the sketch after this list):
  - Given an initial problem state, create a search node for that state; call it current.
  - Repeat:
    - Pick the successor of current with the highest objective value; call it neighbor.
    - If value(neighbor) <= value(current), return current. (Show slide)
    - Else, set current to neighbor.
- Does not maintain a search tree.
- Stops when it reaches a peak where no neighbor has a higher value.
- Note that if we are searching using some heuristic function that gives the cost from a state to the goal, we would try to minimize this function (gradient descent).
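
A minimal Python sketch of the steepest-ascent loop above. The queen_successors helper (move one queen anywhere within its column) is an illustrative assumption, and num_attacking_pairs is reused from the earlier sketch; we maximize its negation.

```python
import random

# Steepest-ascent hill climbing: repeatedly move to the best neighbor,
# stopping when no neighbor improves on the current state.
def hill_climb(initial, successors, value):
    current = initial
    while True:
        neighbors = successors(current)
        if not neighbors:
            return current
        neighbor = max(neighbors, key=value)
        if value(neighbor) <= value(current):
            return current  # peak: no neighbor has a higher value
        current = neighbor

# Hypothetical 8-queens successors: move one queen within its column.
def queen_successors(state):
    result = []
    for col in range(len(state)):
        for row in range(len(state)):
            if row != state[col]:
                s = list(state)
                s[col] = row
                result.append(tuple(s))
    return result

start = tuple(random.randrange(8) for _ in range(8))
best = hill_climb(start, queen_successors, lambda s: -num_attacking_pairs(s))
print(best, num_attacking_pairs(best))  # often stuck at a local maximum
```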

Simple Variations

- Stochastic hill climbing: choose at random from among the uphill moves.
  - Vary the probability of selection with the steepness of the uphill move.
  - Usually converges more slowly than steepest ascent, but for some landscapes it finds better solutions.
- First-choice hill climbing: implements stochastic hill climbing by randomly generating successors until one is generated that is better than the current state.
  - A good strategy when a state has many successors (so we don't have to generate them all).
- Random-restart hill climbing: to find a global maximum, perform hill climbing from randomly selected initial states and take the best solution; often performs very well for somewhat simple landscapes. (See the sketch after the next list.)

Problems With Hill-climbing Searches

- Local maxima: i.e. peaks that are lower than the global maximum. (slide)
- Ridges: a sequence of local maxima that aren't connected to each other. (slide)
- Plateaus: we might not be able to find our way off of the plateau.
- Step sizes:
  - The size of a step may be dictated by the problem (e.g. 8-queens, 8-puzzle) or may be variable (continuous search spaces).
  - A large step can converge more quickly, but a small step can find the maximum more accurately.
  - The direction of allowable steps affects efficiency and results (e.g. the ridge example).
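
Random restarts directly address the local-maximum problem above. A short sketch, assuming the hill_climb, queen_successors, and num_attacking_pairs helpers from the earlier sketches:

```python
import random

# Random-restart hill climbing: run hill_climb from several random initial
# states and keep the best result found.
def random_restart(num_restarts=50):
    value = lambda s: -num_attacking_pairs(s)
    best = None
    for _ in range(num_restarts):
        start = tuple(random.randrange(8) for _ in range(8))
        result = hill_climb(start, queen_successors, value)
        if best is None or value(result) > value(best):
            best = result
        if num_attacking_pairs(best) == 0:
            break  # global maximum reached: an attack-free board
    return best

print(random_restart())  # very likely a real 8-queens solution
```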
Simulated Annealing

- A hill-climbing algorithm that never makes a downhill move is guaranteed to be incomplete, because it can get stuck on a local maximum.
- So, it might make sense to move downhill sometimes. (Show slide with quote)
- Boltzmann probability distribution: the energy E of a system in equilibrium at temperature T is probabilistically distributed, P(E) ∝ e^(-E/kT).
  - There is a small chance of the system being in a high-energy state even at low temperature.
- Simulated annealing algorithm; at each step in an iterative improvement algorithm:
  - Pick a random step (instead of the best step).
  - If the step improves the evaluation function, take it.
  - Otherwise, take the step anyway with a probability that decreases exponentially with how much worse it is (and that is also affected by the current temperature T).
- Cooling schedule idea:
  - The temperature T affects the probability of taking a downward step.
  - Use some cooling schedule to determine how the temperature decreases (e.g. decrease T by a fixed amount every n iterations).
- The book has a more formal implementation of the algorithm (p. 116); a minimal sketch follows below.
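
A minimal sketch of the acceptance rule just described, maximizing an arbitrary value function. The geometric cooling schedule (multiply T by a constant each step) and random_queen_successor are illustrative assumptions, not the book's exact implementation; num_attacking_pairs is reused from the earlier sketches.

```python
import math
import random

# Simulated annealing: accept uphill moves always, downhill moves with
# probability e^(delta/T), where delta <= 0 is how much worse the move is.
def simulated_annealing(initial, random_successor, value,
                        t0=1.0, alpha=0.995, t_min=1e-3):
    current, t = initial, t0
    while t > t_min:
        neighbor = random_successor(current)
        delta = value(neighbor) - value(current)
        if delta > 0 or random.random() < math.exp(delta / t):
            current = neighbor
        t *= alpha  # geometric cooling schedule (an illustrative choice)
    return current

# Hypothetical 8-queens step: move one random queen to a random row.
def random_queen_successor(state):
    s = list(state)
    s[random.randrange(len(s))] = random.randrange(len(s))
    return tuple(s)

start = tuple(random.randrange(8) for _ in range(8))
print(simulated_annealing(start, random_queen_successor,
                          lambda s: -num_attacking_pairs(s)))
```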
Local Beam Search

- Rather than just keeping one node in memory at a time, keep track of k states.
- Algorithm (see the sketch after this list):
  - Begin with k randomly generated states.
  - Repeat:
    - Generate all successors of the current set of k states.
    - If any one is a goal, return success.
    - Otherwise, select the k best successors from the complete list.
- Not the same as running k random-restart searches in parallel! Here, useful information is passed among the k parallel search threads.
  - E.g. if one state generates good successors and the other states generate bad successors, we concentrate our resources on the good states.
- Potential problem: lack of diversity among the k states; the search might quickly become concentrated in one small region of the state space.
- Stochastic beam search: choose the k successors at random, with probability proportional to how good they are.
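
A compact local beam search sketch under the same illustrative 8-queens assumptions (queen_successors and num_attacking_pairs from the earlier sketches); k, the goal test, and the step bound are parameters.

```python
import random

# Local beam search: pool all successors of the current k states and keep
# the k best, so resources concentrate on the most promising states.
def local_beam_search(k, successors, value, is_goal, random_state,
                      max_steps=200):
    states = [random_state() for _ in range(k)]
    for _ in range(max_steps):
        candidates = [s for state in states for s in successors(state)]
        for s in candidates:
            if is_goal(s):
                return s
        states = sorted(candidates, key=value, reverse=True)[:k]
    return max(states, key=value)

result = local_beam_search(
    k=8,
    successors=queen_successors,
    value=lambda s: -num_attacking_pairs(s),
    is_goal=lambda s: num_attacking_pairs(s) == 0,
    random_state=lambda: tuple(random.randrange(8) for _ in range(8)),
)
print(result, num_attacking_pairs(result))
```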

Genetic Algorithms

- A variant of stochastic beam search.
- Generate successor states by combining two parent states (rather than by modifying a single state).
- The natural-selection analogy:
  - successors <-> offspring
  - states <-> organisms
  - evaluation function value <-> fitness
- Begin with a set of randomly generated states: the population.
- Usually represent each state as a string over a finite alphabet (0s and 1s, a set of predefined language elements, etc.).
- Each state is rated by the fitness function, which should return higher values for better states.
- Usually select states for reproduction with probability proportional to their fitness.
- Reproduction/mating: randomly choose crossover points in each parent; the offspring gets part of one parent and part of the other. (See the sketch after this list.)
- When the parents are quite different, the offspring can be very different from either parent state; since the population is often diverse early in the process, crossover frequently takes large steps in the state space early in the search and smaller steps later on.
- Each offspring is subject to mutation with a small independent probability.
- Problem: crossover can often destroy useful features.
  - We must engineer the representation of an organism carefully to minimize this type of problem.
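
An illustrative sketch of a simple genetic algorithm on bit strings, with fitness-proportional selection, single-point crossover, and per-bit mutation. The OneMax fitness (count of 1 bits) and all parameter values are toy assumptions, not from the original notes.

```python
import random

# Toy fitness: count the 1 bits (the "OneMax" problem). Higher is better.
def fitness(state):
    return sum(state)

# Fitness-proportional (roulette-wheel) selection of one parent.
def select(population):
    weights = [fitness(s) + 1e-9 for s in population]  # avoid all-zero weights
    return random.choices(population, weights=weights, k=1)[0]

# Single-point crossover: the offspring takes a prefix of one parent and a
# suffix of the other.
def crossover(a, b):
    point = random.randrange(1, len(a))
    return a[:point] + b[point:]

# Flip each bit independently with a small probability.
def mutate(state, rate=0.01):
    return tuple(bit ^ 1 if random.random() < rate else bit for bit in state)

def next_generation(population):
    return [mutate(crossover(select(population), select(population)))
            for _ in range(len(population))]

population = [tuple(random.randrange(2) for _ in range(20)) for _ in range(30)]
for _ in range(100):
    population = next_generation(population)
print(max(population, key=fitness))  # typically close to all 1s
```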