USING REINFORCEMENT LEARNING TO INTRODUCE ARTIFICIAL INTELLIGENCE IN THE CS CURRICULUM

Size: px
Start display at page:

Download "USING REINFORCEMENT LEARNING TO INTRODUCE ARTIFICIAL INTELLIGENCE IN THE CS CURRICULUM"

Transcription

1 USING REINFORCEMENT LEARNING TO INTRODUCE ARTIFICIAL INTELLIGENCE IN THE CS CURRICULUM Scott M. Thede Department of Computer Science DePauw University Phone: (765) ABSTRACT: There are many interesting topics in artificial intelligence that would be useful to stimulate student interest at various levels of the computer science curriculum. They can also be used to illustrate some basic concepts of computer science, such as arrays. One such topic is reinforcement learning teaching a computer program how to play a game or traverse an environment using a system of rewards and punishments. There are reinforcement learning projects that can be used at all levels of the computer science curriculum. This paper describes a few examples of reinforcement learning and how they might be used. INTRODUCTION TO REINFORCEMENT LEARNING: Reinforcement learning is a topic from artificial intelligence, specifically machine learning. It functions by allowing the computer program (or agent) to learn correct choices by repeatedly making choices and observing the results. For example, a computer game would learn to play a good game of chess by playing many games of chess, and favoring moves that in the past had led to a win, while avoiding moves that have led to a loss. There are many ways to implement a reinforcement learning program, but most work by keeping a value (usually called a utility value or Q value) associated with each state and/or move option, increasing its value when it leads to a win (or a reward) and decreasing its value when it leads to a loss (or a penalty). Reinforcement learning is typically taught in artificial intelligence courses, but it can be used in other courses in the computer science curriculum either to stimulate student interest or to spotlight a specific topic in the course or both. A simple version of reinforcement learning is 1

2 appropriate at the end of an introductory programming course knowledge of two-dimensional arrays is required. A SIMPLE REINFORCEMENT LEARNING ASSIGNMENT: A good first assignment for reinforcement learning is the implementation of a simple computer game. We typically use the game of NIM as our example. For those unfamiliar with the game, it involves two players with a pile of sticks between them (the number of sticks can vary we typically start with ten). Players alternate turns, which consist of taking a number of sticks from the pile (the number you can take varies from one to some maximum value, which can also vary we typically have a maximum of three). The player who takes the last stick is the winner. This is a particularly good example for reinforcement learning, because it is a simple game and easily defined. For game playing, the computer must calculate a utility value for each possible action that can be taken in each possible state. For a game like chess or checkers, this is an exceptionally large number of values. For NIM, with ten sticks in the starting pile and with players allowed to pick up a maximum of three sticks at a time, this amounts to thirty values. For example, it would calculate the value of picking up two sticks when there are six sticks in the pile. We can teach the computer to play NIM (or more accurately, create a program that teaches itself to play NIM) by creating an M x N array of numbers (typically integers, but any numerical variable type will work), where N is the number of sticks initially in the pile, and M is the maximum number of sticks a player can take at one time. 1 When it is the computer s turn to move, it should look at the column that corresponds to the number of sticks left in the pile. 1 This could become a more advanced assignment by designing the program to allow the user to choose the initial number of sticks and maximum number a player can take, thus requiring the use of dynamically allocated twodimensional arrays. It could also use a vector-based matrix class, if you are using C++ with the standard template library. 2

3 Whichever row in that column contains the largest value is the number of sticks the computer should remove from the pile. 2 The learning comes from the initial values of the array and the modifications the game program makes to the array after a game is completed. Initially, the array should contain all zeroes. Then, the program needs to keep track of what moves it made in which states during the game. For example, the program should remember that when there were ten sticks, it chose to take three, and when there were five sticks, it chose two. How this is remembered is not important we use an array of N elements, where the i-th element indicates the number of sticks chosen in state i. A linked list containing state / action pairs would also be appropriate. When the game concludes, the computer player has either won or lost. If it has lost, it goes to the spaces in the array indicated by its move list and decrements them if it has won, it increments them. In this way, it improves the values of moves that led to a win, and decreases the value of moves that lead to a loss. It is very fascinating to watch the computer make its moves, seeing it try out different moves, and eventually become a player that is impossible to beat. It takes a number of games for the computer to learn the game well enough to always win, so the assignment should also involve saving the current array to the disk when the human quits playing, and loading it again when the game is begun. The training may take longer than you want, 3 even with the saving and loading, so an interesting twist to the task is to let the computer play itself to train. Each computer player could share the array this would allow fast training. This would allow the concept to be applied to larger games for example, tic-tac-toe requires 177,147 values to be calculated (not all of which are legal position / move combinations, however). An array this size is large, but not too large to handle, but the number of games required to train is prohibitive if done by hand. Other questions can arise in the coding of the algorithm. For example, if two or more moves have the same value, how do you choose between them? It doesn t really matter random choice is one solution, or simple picking the lower number of sticks will also work (this might be 2 Of course, the rows are numbered beginning with zero, while the smallest number of sticks you can choose is one, so an appropriate mapping must be made. This sort of thing is frequently missed by students. 3 It took probably games with N = 10 and M = 3 before the computer won all the time. 3

4 a better choice than a random selection, since it allows more moves per game, which will allow faster training). The algorithm will not work if you are able to change the parameters N and M between games particularly M. Changing N can be handled as long as you can resize the array. Changing M will invalidate the data, so it will have to beginning learning all over again. We have used this assignment in our artificial intelligence course, but it is well-suited to a beginning level computer science course. It would be a particularly good assignment when introducing two-dimensional arrays at our school, this would be done either at the end of CS 1 or the beginning of CS 2. A MORE COMPLICATED ASSIGNMENT: The assignment mentioned in the previous section is a simple form of reinforcement learning. It uses an approach known as naïve updating, which updates each state based upon the final result of the state sequence. This is suitable for a simple assignment, but some might wish to go into greater detail. This would work especially well for a more mathematically oriented class. When using reinforcement learning for a game, we are estimating the value of a function Q( a, i ) that estimates the value of taking the action a when the game is in state i. At some point in the process, the program updates its estimates of the Q values using some rule. Once we have reached equilibrium (trained with enough games), the Q values will accurately reflect the actual value to the program to take action a in state i. In naïve updating, the update occurs only at the end of the game, and takes the form Q( a, i ) Q( a, i ) + reward received at end of sequence Initial values should be Q( a, i ) = 0. The reward received at the end is typically +1 for a win, -1 for a loss, and 0 for a draw (if possible), but other rewards are possible. Backgammon, for example, can have payoffs from 192 to Naïve updating is a very simple approach 4

5 there are other update approaches that converge faster to the actual values. For example, using an approach called temporal difference, or TD-learning, the updates occur after every move using the following update rule: Q( a, i ) Q( a, i ) + α[ R( i ) + max a Q( a, j ) Q( a, i ) ] In this update equation, R( i ) is the reward received in state i (for most game playing, this term is 0 and will disappear, since most games receive an award only at completion), and j is the state that will result from taking action a in state i. α is the learning rate parameter, which determines how fast the function will converge to an equilibrium value. So, this equation basically says the new value of Q( a, i ) is equal to the old value plus α times the reward received from state i plus the difference between the best move you can make next and the current value. In this way, moves that lead to good positions have their values gradually increased, while moves that lead to bad values are decreased. So, for a more challenging assignment, we can assign the students to use this equation to do the learning. The challenge is perhaps not in the programming, but in understanding the mathematical concepts behind the programming. In fact, we could have them do both versions, play them against each other, and see which one can learn faster. Experiments with various values of α are also possible. One problem with this sort of approach is that it requires tabulating the values for all possible actions in all possible states. For any non-trivial game, this table will be exceptionally large, and exceptionally slow to learn. We can address this issue by using a heuristic function to estimate the Q values. If we assume that the function Q( a, i ) can be expressed in this format Q( a, i ) = w 1 f 1 ( a, i ) + w 2 f 2 ( a, i ) + + w n f n ( a, i ) we can use reinforcement learning to learn the weights in the above equation. Of course, we need to choose appropriate f functions to combine in the Q function for example, tic-tac-toe 5

6 might have one of the functions be the number of rows that have two Os in a row, and another might be the number of rows containing an X. Other choices are of course possible. When learning the weights, we need to adjust our update rule a bit. For TD-learning, we can use the following update equation: w w + α [ R( i ) + max a Q( a, j ) Q( a, i ) ] w Q( a, i ) This performs a gradient descent in the weight space. This might be a bit advanced mathematically, but it could be written and presented as individual rewrite rules for each weight, if we didn t want to get into the linear algebra. LEARNING AN ENVIRONMENT: Another classic problem that reinforcement learning is applied to is that of learning an environment. The typical example of this is an agent that can wander throughout a world that is laid out in a grid pattern, achieving rewards and penalties on its way to a goal. The plan is that the agent will learn the environment and eventually find the best path to take from start to finish, collecting the most rewards along the way. For this particular problem, we are typically not interested in learning the value of taking certain actions in certain states we are more interested in simply the value of being in a certain state. We assume that the agent can move from state to state easily. This can be modeled using TDlearning with an equation similar to the one outlined in the previous section: U( i ) U( i ) + α [ R( i ) + U( j ) U( i ) ] Here, U( i ) is the utility value of being in state i. This update is applied immediately after the transition from state i to state j. In this case, R( i ) is most likely non-zero in many states. Some models could make it slightly negative in all non-reward states, to impose a penalty for the agent 6

7 wandering about in the environment. This depends on whether the goal of the agent is to find a path from start to finish, or simply to map the environment. This problem illustrates an issue that may not be obvious in the game playing problems. A typical approach for the agent to select its moves is to choose to move to the state with the highest utility value U( i ). This strategy seems to make sense, and is probably the best choice for a game playing agent after all, if the strategy is winning, keep using it. However, when exploring an environment, this may not be the optimal strategy. After all, there may be a better path somewhere else that we haven t explored yet. So, we should encourage our learning agent to explore the terrain more and use the utility values less. However, if we favor exploration over utility values, the agent will learn the utility values very well, but will not really make use of them. So just exploring the environment is not perfect either. What we need is a combination of strategies, where the agent is encouraged to explore early in the exploration, and encouraged to exploit the calculated utilities later in the exploration. The exact method to determine how to make this switch is not obvious, but could involve making the learning rate α vary with the number of explorations. Another option is that the perceived utility function values could be altered to seem larger if that state hasn t been explored, or explored very often. The options are wide open this is a good sort of assignment to give a class that enjoys working on open-ended problems. This could be a good research type problem as well. SUMMARY: In summary, the area of reinforcement learning is not only a very interesting area of artificial intelligence, but also an interesting area of computer science in general. It can be used to teach various points about computer science, from two-dimensional arrays to exploration agents. There are some fundamental underlying mathematical concepts that can be used to tie closer in students minds the areas of mathematics and computer science. And it is something that is fun for the students they can write a program that actually learns! 7

8 BIBLIOGRAPHY: Russel, Stuart and Norvig, Peter. Artificial Intelligence: A Modern Approach. Prentice Hall, Tesauro, Gerald. Temporal Difference Learning and TD-Gammon. Communications of the ACM, March 1995, Volume 38, Number 3. Contains the rules for NIM, as well as many other games and puzzles. 8

Axiom 2013 Team Description Paper

Axiom 2013 Team Description Paper Axiom 2013 Team Description Paper Mohammad Ghazanfari, S Omid Shirkhorshidi, Farbod Samsamipour, Hossein Rahmatizadeh Zagheli, Mohammad Mahdavi, Payam Mohajeri, S Abbas Alamolhoda Robotics Scientific Association

More information

Lecture 10: Reinforcement Learning

Lecture 10: Reinforcement Learning Lecture 1: Reinforcement Learning Cognitive Systems II - Machine Learning SS 25 Part III: Learning Programs and Strategies Q Learning, Dynamic Programming Lecture 1: Reinforcement Learning p. Motivation

More information

Reinforcement Learning by Comparing Immediate Reward

Reinforcement Learning by Comparing Immediate Reward Reinforcement Learning by Comparing Immediate Reward Punit Pandey DeepshikhaPandey Dr. Shishir Kumar Abstract This paper introduces an approach to Reinforcement Learning Algorithm by comparing their immediate

More information

TD(λ) and Q-Learning Based Ludo Players

TD(λ) and Q-Learning Based Ludo Players TD(λ) and Q-Learning Based Ludo Players Majed Alhajry, Faisal Alvi, Member, IEEE and Moataz Ahmed Abstract Reinforcement learning is a popular machine learning technique whose inherent self-learning ability

More information

Module 12. Machine Learning. Version 2 CSE IIT, Kharagpur

Module 12. Machine Learning. Version 2 CSE IIT, Kharagpur Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should

More information

An OO Framework for building Intelligence and Learning properties in Software Agents

An OO Framework for building Intelligence and Learning properties in Software Agents An OO Framework for building Intelligence and Learning properties in Software Agents José A. R. P. Sardinha, Ruy L. Milidiú, Carlos J. P. Lucena, Patrick Paranhos Abstract Software agents are defined as

More information

The Good Judgment Project: A large scale test of different methods of combining expert predictions

The Good Judgment Project: A large scale test of different methods of combining expert predictions The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania

More information

Artificial Neural Networks written examination

Artificial Neural Networks written examination 1 (8) Institutionen för informationsteknologi Olle Gällmo Universitetsadjunkt Adress: Lägerhyddsvägen 2 Box 337 751 05 Uppsala Artificial Neural Networks written examination Monday, May 15, 2006 9 00-14

More information

OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS

OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS Václav Kocian, Eva Volná, Michal Janošek, Martin Kotyrba University of Ostrava Department of Informatics and Computers Dvořákova 7,

More information

Exploration. CS : Deep Reinforcement Learning Sergey Levine

Exploration. CS : Deep Reinforcement Learning Sergey Levine Exploration CS 294-112: Deep Reinforcement Learning Sergey Levine Class Notes 1. Homework 4 due on Wednesday 2. Project proposal feedback sent Today s Lecture 1. What is exploration? Why is it a problem?

More information

How long did... Who did... Where was... When did... How did... Which did...

How long did... Who did... Where was... When did... How did... Which did... (Past Tense) Who did... Where was... How long did... When did... How did... 1 2 How were... What did... Which did... What time did... Where did... What were... Where were... Why did... Who was... How many

More information

Learning Optimal Dialogue Strategies: A Case Study of a Spoken Dialogue Agent for

Learning Optimal Dialogue Strategies: A Case Study of a Spoken Dialogue Agent for Learning Optimal Dialogue Strategies: A Case Study of a Spoken Dialogue Agent for Email Marilyn A. Walker Jeanne C. Fromer Shrikanth Narayanan walker@research.att.com jeannie@ai.mit.edu shri@research.att.com

More information

ISFA2008U_120 A SCHEDULING REINFORCEMENT LEARNING ALGORITHM

ISFA2008U_120 A SCHEDULING REINFORCEMENT LEARNING ALGORITHM Proceedings of 28 ISFA 28 International Symposium on Flexible Automation Atlanta, GA, USA June 23-26, 28 ISFA28U_12 A SCHEDULING REINFORCEMENT LEARNING ALGORITHM Amit Gil, Helman Stern, Yael Edan, and

More information

Lecture 1: Machine Learning Basics

Lecture 1: Machine Learning Basics 1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3

More information

(Sub)Gradient Descent

(Sub)Gradient Descent (Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include

More information

Laboratorio di Intelligenza Artificiale e Robotica

Laboratorio di Intelligenza Artificiale e Robotica Laboratorio di Intelligenza Artificiale e Robotica A.A. 2008-2009 Outline 2 Machine Learning Unsupervised Learning Supervised Learning Reinforcement Learning Genetic Algorithms Genetics-Based Machine Learning

More information

Designing a Computer to Play Nim: A Mini-Capstone Project in Digital Design I

Designing a Computer to Play Nim: A Mini-Capstone Project in Digital Design I Session 1793 Designing a Computer to Play Nim: A Mini-Capstone Project in Digital Design I John Greco, Ph.D. Department of Electrical and Computer Engineering Lafayette College Easton, PA 18042 Abstract

More information

SARDNET: A Self-Organizing Feature Map for Sequences

SARDNET: A Self-Organizing Feature Map for Sequences SARDNET: A Self-Organizing Feature Map for Sequences Daniel L. James and Risto Miikkulainen Department of Computer Sciences The University of Texas at Austin Austin, TX 78712 dljames,risto~cs.utexas.edu

More information

Speeding Up Reinforcement Learning with Behavior Transfer

Speeding Up Reinforcement Learning with Behavior Transfer Speeding Up Reinforcement Learning with Behavior Transfer Matthew E. Taylor and Peter Stone Department of Computer Sciences The University of Texas at Austin Austin, Texas 78712-1188 {mtaylor, pstone}@cs.utexas.edu

More information

Measurement. When Smaller Is Better. Activity:

Measurement. When Smaller Is Better. Activity: Measurement Activity: TEKS: When Smaller Is Better (6.8) Measurement. The student solves application problems involving estimation and measurement of length, area, time, temperature, volume, weight, and

More information

A Neural Network GUI Tested on Text-To-Phoneme Mapping

A Neural Network GUI Tested on Text-To-Phoneme Mapping A Neural Network GUI Tested on Text-To-Phoneme Mapping MAARTEN TROMPPER Universiteit Utrecht m.f.a.trompper@students.uu.nl Abstract Text-to-phoneme (T2P) mapping is a necessary step in any speech synthesis

More information

Using focal point learning to improve human machine tacit coordination

Using focal point learning to improve human machine tacit coordination DOI 10.1007/s10458-010-9126-5 Using focal point learning to improve human machine tacit coordination InonZuckerman SaritKraus Jeffrey S. Rosenschein The Author(s) 2010 Abstract We consider an automated

More information

LEGO MINDSTORMS Education EV3 Coding Activities

LEGO MINDSTORMS Education EV3 Coding Activities LEGO MINDSTORMS Education EV3 Coding Activities s t e e h s k r o W t n e d Stu LEGOeducation.com/MINDSTORMS Contents ACTIVITY 1 Performing a Three Point Turn 3-6 ACTIVITY 2 Written Instructions for a

More information

AN EXAMPLE OF THE GOMORY CUTTING PLANE ALGORITHM. max z = 3x 1 + 4x 2. 3x 1 x x x x N 2

AN EXAMPLE OF THE GOMORY CUTTING PLANE ALGORITHM. max z = 3x 1 + 4x 2. 3x 1 x x x x N 2 AN EXAMPLE OF THE GOMORY CUTTING PLANE ALGORITHM Consider the integer programme subject to max z = 3x 1 + 4x 2 3x 1 x 2 12 3x 1 + 11x 2 66 The first linear programming relaxation is subject to x N 2 max

More information

Testing A Moving Target: How Do We Test Machine Learning Systems? Peter Varhol Technology Strategy Research, USA

Testing A Moving Target: How Do We Test Machine Learning Systems? Peter Varhol Technology Strategy Research, USA Testing A Moving Target: How Do We Test Machine Learning Systems? Peter Varhol Technology Strategy Research, USA Testing a Moving Target How Do We Test Machine Learning Systems? Peter Varhol, Technology

More information

Python Machine Learning

Python Machine Learning Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled

More information

P-4: Differentiate your plans to fit your students

P-4: Differentiate your plans to fit your students Putting It All Together: Middle School Examples 7 th Grade Math 7 th Grade Science SAM REHEARD, DC 99 7th Grade Math DIFFERENTATION AROUND THE WORLD My first teaching experience was actually not as a Teach

More information

Assignment 1: Predicting Amazon Review Ratings

Assignment 1: Predicting Amazon Review Ratings Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for

More information

High-level Reinforcement Learning in Strategy Games

High-level Reinforcement Learning in Strategy Games High-level Reinforcement Learning in Strategy Games Christopher Amato Department of Computer Science University of Massachusetts Amherst, MA 01003 USA camato@cs.umass.edu Guy Shani Department of Computer

More information

Getting Started with Deliberate Practice

Getting Started with Deliberate Practice Getting Started with Deliberate Practice Most of the implementation guides so far in Learning on Steroids have focused on conceptual skills. Things like being able to form mental images, remembering facts

More information

Contents. Foreword... 5

Contents. Foreword... 5 Contents Foreword... 5 Chapter 1: Addition Within 0-10 Introduction... 6 Two Groups and a Total... 10 Learn Symbols + and =... 13 Addition Practice... 15 Which is More?... 17 Missing Items... 19 Sums with

More information

Number Line Moves Dash -- 1st Grade. Michelle Eckstein

Number Line Moves Dash -- 1st Grade. Michelle Eckstein Number Line Moves Dash -- 1st Grade Michelle Eckstein Common Core Standards CCSS.MATH.CONTENT.1.NBT.C.4 Add within 100, including adding a two-digit number and a one-digit number, and adding a two-digit

More information

Improving Action Selection in MDP s via Knowledge Transfer

Improving Action Selection in MDP s via Knowledge Transfer In Proc. 20th National Conference on Artificial Intelligence (AAAI-05), July 9 13, 2005, Pittsburgh, USA. Improving Action Selection in MDP s via Knowledge Transfer Alexander A. Sherstov and Peter Stone

More information

CS 100: Principles of Computing

CS 100: Principles of Computing CS 100: Principles of Computing Kevin Molloy August 29, 2017 1 Basic Course Information 1.1 Prerequisites: None 1.2 General Education Fulfills Mason Core requirement in Information Technology (ALL). 1.3

More information

Given a directed graph G =(N A), where N is a set of m nodes and A. destination node, implying a direction for ow to follow. Arcs have limitations

Given a directed graph G =(N A), where N is a set of m nodes and A. destination node, implying a direction for ow to follow. Arcs have limitations 4 Interior point algorithms for network ow problems Mauricio G.C. Resende AT&T Bell Laboratories, Murray Hill, NJ 07974-2070 USA Panos M. Pardalos The University of Florida, Gainesville, FL 32611-6595

More information

Understanding and Changing Habits

Understanding and Changing Habits Understanding and Changing Habits We are what we repeatedly do. Excellence, then, is not an act, but a habit. Aristotle Have you ever stopped to think about your habits or how they impact your daily life?

More information

Session 2B From understanding perspectives to informing public policy the potential and challenges for Q findings to inform survey design

Session 2B From understanding perspectives to informing public policy the potential and challenges for Q findings to inform survey design Session 2B From understanding perspectives to informing public policy the potential and challenges for Q findings to inform survey design Paper #3 Five Q-to-survey approaches: did they work? Job van Exel

More information

Learning Methods for Fuzzy Systems

Learning Methods for Fuzzy Systems Learning Methods for Fuzzy Systems Rudolf Kruse and Andreas Nürnberger Department of Computer Science, University of Magdeburg Universitätsplatz, D-396 Magdeburg, Germany Phone : +49.39.67.876, Fax : +49.39.67.8

More information

Laboratorio di Intelligenza Artificiale e Robotica

Laboratorio di Intelligenza Artificiale e Robotica Laboratorio di Intelligenza Artificiale e Robotica A.A. 2008-2009 Outline 2 Machine Learning Unsupervised Learning Supervised Learning Reinforcement Learning Genetic Algorithms Genetics-Based Machine Learning

More information

While you are waiting... socrative.com, room number SIMLANG2016

While you are waiting... socrative.com, room number SIMLANG2016 While you are waiting... socrative.com, room number SIMLANG2016 Simulating Language Lecture 4: When will optimal signalling evolve? Simon Kirby simon@ling.ed.ac.uk T H E U N I V E R S I T Y O H F R G E

More information

Hentai High School A Game Guide

Hentai High School A Game Guide Hentai High School A Game Guide Hentai High School is a sex game where you are the Principal of a high school with the goal of turning the students into sex crazed people within 15 years. The game is difficult

More information

Machine Learning and Data Mining. Ensembles of Learners. Prof. Alexander Ihler

Machine Learning and Data Mining. Ensembles of Learners. Prof. Alexander Ihler Machine Learning and Data Mining Ensembles of Learners Prof. Alexander Ihler Ensemble methods Why learn one classifier when you can learn many? Ensemble: combine many predictors (Weighted) combina

More information

Discriminative Learning of Beam-Search Heuristics for Planning

Discriminative Learning of Beam-Search Heuristics for Planning Discriminative Learning of Beam-Search Heuristics for Planning Yuehua Xu School of EECS Oregon State University Corvallis,OR 97331 xuyu@eecs.oregonstate.edu Alan Fern School of EECS Oregon State University

More information

Cooperative Game Theoretic Models for Decision-Making in Contexts of Library Cooperation 1

Cooperative Game Theoretic Models for Decision-Making in Contexts of Library Cooperation 1 Cooperative Game Theoretic Models for Decision-Making in Contexts of Library Cooperation 1 Robert M. Hayes Abstract This article starts, in Section 1, with a brief summary of Cooperative Economic Game

More information

University of Groningen. Systemen, planning, netwerken Bosman, Aart

University of Groningen. Systemen, planning, netwerken Bosman, Aart University of Groningen Systemen, planning, netwerken Bosman, Aart IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

Shockwheat. Statistics 1, Activity 1

Shockwheat. Statistics 1, Activity 1 Statistics 1, Activity 1 Shockwheat Students require real experiences with situations involving data and with situations involving chance. They will best learn about these concepts on an intuitive or informal

More information

The Task. A Guide for Tutors in the Rutgers Writing Centers Written and edited by Michael Goeller and Karen Kalteissen

The Task. A Guide for Tutors in the Rutgers Writing Centers Written and edited by Michael Goeller and Karen Kalteissen The Task A Guide for Tutors in the Rutgers Writing Centers Written and edited by Michael Goeller and Karen Kalteissen Reading Tasks As many experienced tutors will tell you, reading the texts and understanding

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview

More information

Navigating the PhD Options in CMS

Navigating the PhD Options in CMS Navigating the PhD Options in CMS This document gives an overview of the typical student path through the four Ph.D. programs in the CMS department ACM, CDS, CS, and CMS. Note that it is not a replacement

More information

AMULTIAGENT system [1] can be defined as a group of

AMULTIAGENT system [1] can be defined as a group of 156 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS PART C: APPLICATIONS AND REVIEWS, VOL. 38, NO. 2, MARCH 2008 A Comprehensive Survey of Multiagent Reinforcement Learning Lucian Buşoniu, Robert Babuška,

More information

CS Machine Learning

CS Machine Learning CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing

More information

Pair Programming. Spring 2015

Pair Programming. Spring 2015 CS4 Introduction to Scientific Computing Potter Pair Programming Spring 2015 1 What is Pair Programming? Simply put, pair programming is two people working together at a single computer [1]. The practice

More information

Curriculum Design Project with Virtual Manipulatives. Gwenanne Salkind. George Mason University EDCI 856. Dr. Patricia Moyer-Packenham

Curriculum Design Project with Virtual Manipulatives. Gwenanne Salkind. George Mason University EDCI 856. Dr. Patricia Moyer-Packenham Curriculum Design Project with Virtual Manipulatives Gwenanne Salkind George Mason University EDCI 856 Dr. Patricia Moyer-Packenham Spring 2006 Curriculum Design Project with Virtual Manipulatives Table

More information

Dublin City Schools Mathematics Graded Course of Study GRADE 4

Dublin City Schools Mathematics Graded Course of Study GRADE 4 I. Content Standard: Number, Number Sense and Operations Standard Students demonstrate number sense, including an understanding of number systems and reasonable estimates using paper and pencil, technology-supported

More information

Improving Conceptual Understanding of Physics with Technology

Improving Conceptual Understanding of Physics with Technology INTRODUCTION Improving Conceptual Understanding of Physics with Technology Heidi Jackman Research Experience for Undergraduates, 1999 Michigan State University Advisors: Edwin Kashy and Michael Thoennessen

More information

An empirical study of learning speed in backpropagation

An empirical study of learning speed in backpropagation Carnegie Mellon University Research Showcase @ CMU Computer Science Department School of Computer Science 1988 An empirical study of learning speed in backpropagation networks Scott E. Fahlman Carnegie

More information

Softprop: Softmax Neural Network Backpropagation Learning

Softprop: Softmax Neural Network Backpropagation Learning Softprop: Softmax Neural Networ Bacpropagation Learning Michael Rimer Computer Science Department Brigham Young University Provo, UT 84602, USA E-mail: mrimer@axon.cs.byu.edu Tony Martinez Computer Science

More information

Radius STEM Readiness TM

Radius STEM Readiness TM Curriculum Guide Radius STEM Readiness TM While today s teens are surrounded by technology, we face a stark and imminent shortage of graduates pursuing careers in Science, Technology, Engineering, and

More information

Learning to Schedule Straight-Line Code

Learning to Schedule Straight-Line Code Learning to Schedule Straight-Line Code Eliot Moss, Paul Utgoff, John Cavazos Doina Precup, Darko Stefanović Dept. of Comp. Sci., Univ. of Mass. Amherst, MA 01003 Carla Brodley, David Scheeff Sch. of Elec.

More information

We are strong in research and particularly noted in software engineering, information security and privacy, and humane gaming.

We are strong in research and particularly noted in software engineering, information security and privacy, and humane gaming. Computer Science 1 COMPUTER SCIENCE Office: Department of Computer Science, ECS, Suite 379 Mail Code: 2155 E Wesley Avenue, Denver, CO 80208 Phone: 303-871-2458 Email: info@cs.du.edu Web Site: Computer

More information

FF+FPG: Guiding a Policy-Gradient Planner

FF+FPG: Guiding a Policy-Gradient Planner FF+FPG: Guiding a Policy-Gradient Planner Olivier Buffet LAAS-CNRS University of Toulouse Toulouse, France firstname.lastname@laas.fr Douglas Aberdeen National ICT australia & The Australian National University

More information

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za

More information

9.85 Cognition in Infancy and Early Childhood. Lecture 7: Number

9.85 Cognition in Infancy and Early Childhood. Lecture 7: Number 9.85 Cognition in Infancy and Early Childhood Lecture 7: Number What else might you know about objects? Spelke Objects i. Continuity. Objects exist continuously and move on paths that are connected over

More information

a) analyse sentences, so you know what s going on and how to use that information to help you find the answer.

a) analyse sentences, so you know what s going on and how to use that information to help you find the answer. Tip Sheet I m going to show you how to deal with ten of the most typical aspects of English grammar that are tested on the CAE Use of English paper, part 4. Of course, there are many other grammar points

More information

The lab is designed to remind you how to work with scientific data (including dealing with uncertainty) and to review experimental design.

The lab is designed to remind you how to work with scientific data (including dealing with uncertainty) and to review experimental design. Name: Partner(s): Lab #1 The Scientific Method Due 6/25 Objective The lab is designed to remind you how to work with scientific data (including dealing with uncertainty) and to review experimental design.

More information

Mathematics process categories

Mathematics process categories Mathematics process categories All of the UK curricula define multiple categories of mathematical proficiency that require students to be able to use and apply mathematics, beyond simple recall of facts

More information

Causal Link Semantics for Narrative Planning Using Numeric Fluents

Causal Link Semantics for Narrative Planning Using Numeric Fluents Proceedings, The Thirteenth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE-17) Causal Link Semantics for Narrative Planning Using Numeric Fluents Rachelyn Farrell,

More information

Cognitive Thinking Style Sample Report

Cognitive Thinking Style Sample Report Cognitive Thinking Style Sample Report Goldisc Limited Authorised Agent for IML, PeopleKeys & StudentKeys DISC Profiles Online Reports Training Courses Consultations sales@goldisc.co.uk Telephone: +44

More information

CHAPTER 4: REIMBURSEMENT STRATEGIES 24

CHAPTER 4: REIMBURSEMENT STRATEGIES 24 CHAPTER 4: REIMBURSEMENT STRATEGIES 24 INTRODUCTION Once state level policymakers have decided to implement and pay for CSR, one issue they face is simply how to calculate the reimbursements to districts

More information

Cognitive Modeling. Tower of Hanoi: Description. Tower of Hanoi: The Task. Lecture 5: Models of Problem Solving. Frank Keller.

Cognitive Modeling. Tower of Hanoi: Description. Tower of Hanoi: The Task. Lecture 5: Models of Problem Solving. Frank Keller. Cognitive Modeling Lecture 5: Models of Problem Solving Frank Keller School of Informatics University of Edinburgh keller@inf.ed.ac.uk January 22, 2008 1 2 3 4 Reading: Cooper (2002:Ch. 4). Frank Keller

More information

School of Innovative Technologies and Engineering

School of Innovative Technologies and Engineering School of Innovative Technologies and Engineering Department of Applied Mathematical Sciences Proficiency Course in MATLAB COURSE DOCUMENT VERSION 1.0 PCMv1.0 July 2012 University of Technology, Mauritius

More information

WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT

WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT PRACTICAL APPLICATIONS OF RANDOM SAMPLING IN ediscovery By Matthew Verga, J.D. INTRODUCTION Anyone who spends ample time working

More information

Rigor is NOT a Four-Letter Word By Barbara R. Blackburn (Eye On Education, Inc., 2008)

Rigor is NOT a Four-Letter Word By Barbara R. Blackburn (Eye On Education, Inc., 2008) File: Instruction Rigor is NOT a Four-Letter Word By Barbara R. Blackburn (Eye On Education, Inc., 2008) The main ideas of the book are: S.O.S. (A Summary of the Summary ) ~ To increase the level of rigor

More information

Computer Science is more important than Calculus: The challenge of living up to our potential

Computer Science is more important than Calculus: The challenge of living up to our potential Computer Science is more important than Calculus: The challenge of living up to our potential By Mark Guzdial and Elliot Soloway In 1961, Alan Perlis made the argument that computer science should be considered

More information

Call Center Assessment-Technical Support (CCA-Technical Support)

Call Center Assessment-Technical Support (CCA-Technical Support) WHY DO AT&T AND ITS AFFILIATES TEST? At AT&T, we pride ourselves on matching the best jobs with the best people. To do this, we need to better understand your skills and abilities to make sure that you

More information

Welcome to SAT Brain Boot Camp (AJH, HJH, FJH)

Welcome to SAT Brain Boot Camp (AJH, HJH, FJH) Welcome to SAT Brain Boot Camp (AJH, HJH, FJH) 9:30 am - 9:45 am ALL STUDENTS: Basics: Moreno Multipurpose Room 9:45 am - 10:15 am Breakout Session #1 RED GROUP: SAT Math: Adame Multipurpose Room BLUE

More information

arxiv: v1 [cs.lg] 15 Jun 2015

arxiv: v1 [cs.lg] 15 Jun 2015 Dual Memory Architectures for Fast Deep Learning of Stream Data via an Online-Incremental-Transfer Strategy arxiv:1506.04477v1 [cs.lg] 15 Jun 2015 Sang-Woo Lee Min-Oh Heo School of Computer Science and

More information

On the Combined Behavior of Autonomous Resource Management Agents

On the Combined Behavior of Autonomous Resource Management Agents On the Combined Behavior of Autonomous Resource Management Agents Siri Fagernes 1 and Alva L. Couch 2 1 Faculty of Engineering Oslo University College Oslo, Norway siri.fagernes@iu.hio.no 2 Computer Science

More information

OCR for Arabic using SIFT Descriptors With Online Failure Prediction

OCR for Arabic using SIFT Descriptors With Online Failure Prediction OCR for Arabic using SIFT Descriptors With Online Failure Prediction Andrey Stolyarenko, Nachum Dershowitz The Blavatnik School of Computer Science Tel Aviv University Tel Aviv, Israel Email: stloyare@tau.ac.il,

More information

Spinal Cord. Student Pages. Classroom Ac tivities

Spinal Cord. Student Pages. Classroom Ac tivities Classroom Ac tivities Spinal Cord Student Pages Produced by Regenerative Medicine Partnership in Education Duquesne University Director john A. Pollock (pollock@duq.edu) The spinal column protects the

More information

Using Web Searches on Important Words to Create Background Sets for LSI Classification

Using Web Searches on Important Words to Create Background Sets for LSI Classification Using Web Searches on Important Words to Create Background Sets for LSI Classification Sarah Zelikovitz and Marina Kogan College of Staten Island of CUNY 2800 Victory Blvd Staten Island, NY 11314 Abstract

More information

The Indices Investigations Teacher s Notes

The Indices Investigations Teacher s Notes The Indices Investigations Teacher s Notes These activities are for students to use independently of the teacher to practise and develop number and algebra properties.. Number Framework domain and stage:

More information

The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, / X

The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, / X The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, 2013 10.12753/2066-026X-13-154 DATA MINING SOLUTIONS FOR DETERMINING STUDENT'S PROFILE Adela BÂRA,

More information

PREVIEW LEADER S GUIDE IT S ABOUT RESPECT CONTENTS. Recognizing Harassment in a Diverse Workplace

PREVIEW LEADER S GUIDE IT S ABOUT RESPECT CONTENTS. Recognizing Harassment in a Diverse Workplace 1 IT S ABOUT RESPECT LEADER S GUIDE CONTENTS About This Program Training Materials A Brief Synopsis Preparation Presentation Tips Training Session Overview PreTest Pre-Test Key Exercises 1 Harassment in

More information

A Comparison of Annealing Techniques for Academic Course Scheduling

A Comparison of Annealing Techniques for Academic Course Scheduling A Comparison of Annealing Techniques for Academic Course Scheduling M. A. Saleh Elmohamed 1, Paul Coddington 2, and Geoffrey Fox 1 1 Northeast Parallel Architectures Center Syracuse University, Syracuse,

More information

PreReading. Lateral Leadership. provided by MDI Management Development International

PreReading. Lateral Leadership. provided by MDI Management Development International PreReading Lateral Leadership NEW STRUCTURES REQUIRE A NEW ATTITUDE In an increasing number of organizations hierarchies lose their importance and instead companies focus on more network-like structures.

More information

Circuit Simulators: A Revolutionary E-Learning Platform

Circuit Simulators: A Revolutionary E-Learning Platform Circuit Simulators: A Revolutionary E-Learning Platform Mahi Itagi Padre Conceicao College of Engineering, Verna, Goa, India. itagimahi@gmail.com Akhil Deshpande Gogte Institute of Technology, Udyambag,

More information

PREP S SPEAKER LISTENER TECHNIQUE COACHING MANUAL

PREP S SPEAKER LISTENER TECHNIQUE COACHING MANUAL 1 PREP S SPEAKER LISTENER TECHNIQUE COACHING MANUAL IMPORTANCE OF THE SPEAKER LISTENER TECHNIQUE The Speaker Listener Technique (SLT) is a structured communication strategy that promotes clarity, understanding,

More information

QuickStroke: An Incremental On-line Chinese Handwriting Recognition System

QuickStroke: An Incremental On-line Chinese Handwriting Recognition System QuickStroke: An Incremental On-line Chinese Handwriting Recognition System Nada P. Matić John C. Platt Λ Tony Wang y Synaptics, Inc. 2381 Bering Drive San Jose, CA 95131, USA Abstract This paper presents

More information

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering

More information

Lecture 1: Basic Concepts of Machine Learning

Lecture 1: Basic Concepts of Machine Learning Lecture 1: Basic Concepts of Machine Learning Cognitive Systems - Machine Learning Ute Schmid (lecture) Johannes Rabold (practice) Based on slides prepared March 2005 by Maximilian Röglinger, updated 2010

More information

A Reinforcement Learning Variant for Control Scheduling

A Reinforcement Learning Variant for Control Scheduling A Reinforcement Learning Variant for Control Scheduling Aloke Guha Honeywell Sensor and System Development Center 3660 Technology Drive Minneapolis MN 55417 Abstract We present an algorithm based on reinforcement

More information

Test Effort Estimation Using Neural Network

Test Effort Estimation Using Neural Network J. Software Engineering & Applications, 2010, 3: 331-340 doi:10.4236/jsea.2010.34038 Published Online April 2010 (http://www.scirp.org/journal/jsea) 331 Chintala Abhishek*, Veginati Pavan Kumar, Harish

More information

Section 3.4. Logframe Module. This module will help you understand and use the logical framework in project design and proposal writing.

Section 3.4. Logframe Module. This module will help you understand and use the logical framework in project design and proposal writing. Section 3.4 Logframe Module This module will help you understand and use the logical framework in project design and proposal writing. THIS MODULE INCLUDES: Contents (Direct links clickable belo[abstract]w)

More information

Statewide Framework Document for:

Statewide Framework Document for: Statewide Framework Document for: 270301 Standards may be added to this document prior to submission, but may not be removed from the framework to meet state credit equivalency requirements. Performance

More information

Evidence for Reliability, Validity and Learning Effectiveness

Evidence for Reliability, Validity and Learning Effectiveness PEARSON EDUCATION Evidence for Reliability, Validity and Learning Effectiveness Introduction Pearson Knowledge Technologies has conducted a large number and wide variety of reliability and validity studies

More information

BMBF Project ROBUKOM: Robust Communication Networks

BMBF Project ROBUKOM: Robust Communication Networks BMBF Project ROBUKOM: Robust Communication Networks Arie M.C.A. Koster Christoph Helmberg Andreas Bley Martin Grötschel Thomas Bauschert supported by BMBF grant 03MS616A: ROBUKOM Robust Communication Networks,

More information

Operations and Algebraic Thinking Number and Operations in Base Ten

Operations and Algebraic Thinking Number and Operations in Base Ten Operations and Algebraic Thinking Number and Operations in Base Ten Teaching Tips: First Grade Using Best Instructional Practices with Educational Media to Enhance Learning pbskids.org/lab Boston University

More information

DIANA: A computer-supported heterogeneous grouping system for teachers to conduct successful small learning groups

DIANA: A computer-supported heterogeneous grouping system for teachers to conduct successful small learning groups Computers in Human Behavior Computers in Human Behavior 23 (2007) 1997 2010 www.elsevier.com/locate/comphumbeh DIANA: A computer-supported heterogeneous grouping system for teachers to conduct successful

More information

WHY SOLVE PROBLEMS? INTERVIEWING COLLEGE FACULTY ABOUT THE LEARNING AND TEACHING OF PROBLEM SOLVING

WHY SOLVE PROBLEMS? INTERVIEWING COLLEGE FACULTY ABOUT THE LEARNING AND TEACHING OF PROBLEM SOLVING From Proceedings of Physics Teacher Education Beyond 2000 International Conference, Barcelona, Spain, August 27 to September 1, 2000 WHY SOLVE PROBLEMS? INTERVIEWING COLLEGE FACULTY ABOUT THE LEARNING

More information