Machine Learning and Artificial Neural Networks (Ref: Negnevitsky, M. Artificial Intelligence, Chapter 6)


The Concept of Learning

Learning is the ability to adapt to new surroundings and solve new problems. Some classic definitions:

- "Modification of a behavioural tendency by expertise." (Webster 1984)
- "A learning machine, broadly defined, is any device whose actions are influenced by past experiences." (Nilsson 1965)
- "Any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population." (Simon 1983)
- "An improvement in information processing ability that results from information processing activity." (Tanimoto 1990)

Main types of learning

- Rote Learning (Memorisation)
- Taking Advice (Direct Instruction)
- Learning in Problem Solving
- Learning from Examples (Induction)
- Explanation-based Learning
- Discovery (Deduction)
- Analogy
- Neural Nets and Genetic Algorithms

Rote Learning (Memorisation)

Simple storage of computed data, i.e. data caching. Requires:
1. Organised storage of information, and
2. Generalisation.

Eg. Samuel's checkers program stores scores from its minimax search, so a particular sub-tree never needs to be re-evaluated at a later stage. A minimal caching sketch follows below.
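
As an illustration of rote learning as data caching (a sketch only, not Samuel's actual program; the position encoding and evaluation function below are made up), a position's score is stored the first time it is computed so the same sub-tree never has to be re-evaluated:

# Minimal illustration of rote learning as data caching: once a position has
# been evaluated, its score is stored so it never has to be recomputed.
from typing import Dict, Tuple

score_cache: Dict[Tuple[int, ...], float] = {}

def evaluate(position: Tuple[int, ...]) -> float:
    """Expensive evaluation (stands in for a minimax search)."""
    return sum(position) / len(position)

def cached_evaluate(position: Tuple[int, ...]) -> float:
    """Rote learning: look the position up before recomputing it."""
    if position not in score_cache:
        score_cache[position] = evaluate(position)   # store for later reuse
    return score_cache[position]

print(cached_evaluate((1, 0, -1, 2)))  # computed
print(cached_evaluate((1, 0, -1, 2)))  # retrieved from the cache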

Learning by Taking Advice (Direct Instruction)

Involves receiving direct instructions on how to respond to certain situations. In a machine, this amounts to straightforward procedural programming. In situations where the instructions do not correspond to direct procedures (eg. "take control of the centre of the board" in Chess), an interpreter is required to translate the instructions into concrete execution steps.

Learning in Problem Solving

Learning ways of solving problems from one's own experience, without an instructor/advisor. Does not involve an increase in knowledge, just better methods of using the knowledge.

The Utility Problem: learnt rules are good at directing the problem-solving process, but they incur a cost (utility) because the problem solver needs to store and consult those rules. This can be partially overcome by keeping a utility measure that tracks how useful each learnt rule is, and deleting rules when necessary.
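
As a hypothetical sketch of the utility-measure idea (the bookkeeping scheme below is an assumption, not a method from the notes), each learnt rule carries a usefulness score that falls with matching cost and rises when the rule fires; rules whose score drops below a cut-off are deleted:

# Hypothetical utility bookkeeping for learnt rules: each rule's utility rises
# when it successfully directs problem solving and falls by a matching cost
# every time the rule set is consulted. Low-utility rules are deleted.
rules = {"rule-A": 0.0, "rule-B": 0.0, "rule-C": 0.0}
MATCH_COST, SUCCESS_REWARD, CUTOFF = 0.05, 1.0, -0.5

def consult(fired_rule=None):
    """Charge every rule the cost of being matched; reward the one that fired."""
    for name in rules:
        rules[name] -= MATCH_COST
    if fired_rule is not None:
        rules[fired_rule] += SUCCESS_REWARD

def prune():
    """Delete rules whose utility has dropped below the cut-off."""
    for name in [n for n, u in rules.items() if u < CUTOFF]:
        del rules[name]

for _ in range(20):
    consult(fired_rule="rule-A")   # only rule-A ever proves useful
prune()
print(rules)                       # rule-B and rule-C have been removed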

Types of Learning in Problem Solving

1. Learning by Parameter Adjustment
   Use outcomes to adjust the weights of factors in an evaluation function (a small sketch follows after this list).
   Considerations: What are the initial weights? When should a weight increase? When should it decrease?

2. Learning with Macro-operators
   Rote-learning a sequence of operations found to be useful during problem solving.

3. Learning by Chunking
   Rote learning in the context of a Production System: rules which are useful and always fire together are chunked to form one large production.
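
As a hypothetical illustration of learning by parameter adjustment, the following sketch nudges the weights of a simple game-evaluation function according to game outcomes. The feature names, initial weights and update rule are assumptions for illustration only:

# Hypothetical sketch of learning by parameter adjustment: the weights of a
# board-evaluation function are nudged after each game according to its outcome.
weights = {"material": 0.5, "mobility": 0.5}   # initial weights (a design choice)
learning_rate = 0.1

def evaluate(features: dict) -> float:
    """Weighted sum of board features."""
    return sum(weights[name] * value for name, value in features.items())

def adjust(features: dict, outcome: float) -> None:
    """Strengthen weights of features present in won games (outcome = +1),
    weaken them after lost games (outcome = -1)."""
    for name, value in features.items():
        weights[name] += learning_rate * outcome * value

features = {"material": 1.0, "mobility": 0.2}
print(evaluate(features))            # score before learning
adjust(features, outcome=+1)         # a won game: reinforce these features
print(weights, evaluate(features))   # weights and score after adjustment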

Learning from Examples (Induction)

Classification: the process of assigning, to a particular input, the name of a class to which it belongs. Classify by looking at many different examples of a class and generalising their common features. The class structure must be defined before the classification process begins.

Explanation-Based Learning

Extract the concept behind the information within one example, and generalise it to other instances. Requires domain-specific knowledge. In general, the inputs to EBL programs are:
1. A Training Example
2. A Goal Concept
3. An Operationality Criterion
4. A Domain Theory (or Knowledge Base)

DISCOVERY (DEDUCTION)

Much like Learning in Problem Solving, this involves gleaning information without the use of a teacher. It focuses more on extracting knowledge than on strategies/operations for problem solving.

ANALOGY

Eg: "Last month, the stock market was like a rollercoaster."

Transformational Analogy - transform solutions previously found into solutions to new problems.
[Diagram: a previously solved problem is mapped onto the new problem, and the old solution is transformed into the solution of the new problem.]
Eg. Transforming a PASCAL program into a C program.

Derivational Analogy - use the methods (derivations) of previously solved problems to derive methods of solving new problems.
[Diagram: the old derivation that produced the solution to a previously solved problem guides a new derivation that produces the solution to the new problem.]
Eg. Learning ways of deconstructing lists in PROLOG by studying example clauses.

NEURAL NETS AND GENETIC LEARNING

Learning by iterative improvement: start with an initial (possibly random) solution, then improve on it step by step.

Genetic Learning: based on evolution and natural selection - evolve new solutions from old ones, then select the new solutions that perform well.
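
To make the evolve-then-select loop concrete, here is a minimal genetic-learning sketch under toy assumptions (bit-string solutions, mutation as the only genetic operator, and a fitness function that simply counts 1-bits):

import random

# Minimal sketch of genetic learning: evolve new candidate solutions from old
# ones by mutation, then keep (select) the fittest. The fitness function here
# (count of 1-bits) is a toy assumption purely for illustration.
def fitness(bits):
    return sum(bits)

def mutate(bits):
    """Flip one randomly chosen bit to produce a new candidate."""
    child = bits[:]
    i = random.randrange(len(child))
    child[i] = 1 - child[i]
    return child

population = [[random.randint(0, 1) for _ in range(10)] for _ in range(6)]
for generation in range(50):
    offspring = [mutate(random.choice(population)) for _ in range(6)]
    # Selection: keep the fittest individuals from parents and offspring.
    population = sorted(population + offspring, key=fitness, reverse=True)[:6]

print(max(fitness(ind) for ind in population))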

How the Brain Works

A neural network can be defined as a model of reasoning based on the human brain. The brain consists of a densely interconnected set of nerve cells, or basic information-processing units, called neurons. The human brain incorporates nearly 10 billion neurons and 60 trillion connections (synapses) between them. The brain can be considered a highly complex, non-linear, parallel information-processing system. Information is stored and processed simultaneously throughout the whole network, rather than at specific locations. Learning is a fundamental and essential characteristic of biological neural networks. The ease with which they learn led to attempts to emulate a biological neural network in a computer.

Artificial Neural Networks

An Artificial Neural Network (ANN) consists of a number of very simple processors, also called neurons, which are analogous to the biological neurons in the brain. The neurons are connected by weighted links that pass signals from one neuron to another. The output signal is transmitted through the neuron's outgoing connection, which splits into a number of branches transmitting the same signal. The outgoing branches terminate at the incoming connections of other neurons in the network.

[Figure: architecture of a typical artificial neural network.]

The neuron as a simple computing element

The neuron computes the weighted sum of the input signals and compares the result with a threshold value, θ. If the net input is less than the threshold, the neuron output is -1. But if the net input is greater than or equal to the threshold, the neuron becomes activated and its output attains the value +1. The neuron uses the following transfer (activation) function:

    X = x1·w1 + x2·w2 + ... + xn·wn
    Y = +1 if X >= θ,   Y = -1 if X < θ

This type of activation function is called a sign function.
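
A minimal sketch of such a neuron in Python, with the sign activation described above (the input values, weights and threshold are illustrative only):

# A single neuron with a sign activation function: the weighted sum of the
# inputs is compared against a threshold theta.
def sign_neuron(inputs, weights, theta):
    """Return +1 if the net input reaches the threshold, -1 otherwise."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return +1 if net >= theta else -1

# Example values are illustrative only.
print(sign_neuron([0.6, 0.1], weights=[0.4, 0.9], theta=0.3))   # +1
print(sign_neuron([0.1, 0.1], weights=[0.4, 0.9], theta=0.3))   # -1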

[Figure: activation functions of a neuron.]

In 1958, Frank Rosenblatt introduced a training algorithm that provided the first procedure for training a simple ANN: the perceptron. The perceptron is the simplest form of a neural network. It consists of a single neuron with adjustable synaptic weights and a hard limiter.

Perceptron

The weighted sum of the inputs is applied to the hard limiter, which produces an output equal to +1 if its input is positive and -1 if it is negative. The aim of the perceptron is to classify inputs x1, x2, ..., xn into one of two classes, say A1 and A2. In the case of an elementary perceptron, the n-dimensional space is divided by a hyperplane into two decision regions. The hyperplane is defined by the linearly separable function:

    x1·w1 + x2·w2 + ... + xn·wn - θ = 0

The perceptron learns its classification tasks by making small adjustments to the weights to reduce the difference between the actual and desired outputs of the perceptron. The initial weights are randomly assigned and then updated to obtain an output consistent with the training examples. If at iteration p the actual output is Y(p) and the desired output is Yd(p), then the error is given by:

    e(p) = Yd(p) - Y(p),   where p = 1, 2, 3, ...

Iteration p here refers to the pth training example presented to the perceptron. If the error e(p) is positive, we need to increase the perceptron output Y(p); if it is negative, we need to decrease Y(p). The perceptron learning rule was first proposed by Rosenblatt in 1960:

    wi(p+1) = wi(p) + α · xi(p) · e(p),   where p = 1, 2, 3, ...

α is the learning rate, a positive constant less than unity.

Perceptron's Training Algorithm

Step 1: Initialisation
o Set the initial weights w1, w2, ..., wn and the threshold θ to random numbers in the range [-0.5, 0.5].

Step 2: Activation
o Activate the perceptron by applying inputs x1(p), x2(p), ..., xn(p) and desired output Yd(p).
o Calculate the actual output at iteration p = 1:

    Y(p) = step[ x1(p)·w1(p) + x2(p)·w2(p) + ... + xn(p)·wn(p) - θ ]

  where n is the number of perceptron inputs, and step is a step activation function.

Step 3: Weight training
o Update the weights of the perceptron:

    wi(p+1) = wi(p) + Δwi(p)

  where Δwi(p) is the weight correction at iteration p.
o The weight correction is computed by the delta rule:

    Δwi(p) = α · xi(p) · e(p)

Step 4: Iteration
o Increase iteration p by one, go back to Step 2, and repeat the process until convergence.

Examples of perceptron learning: the logical operation AND
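
A minimal Python sketch of the training algorithm (Steps 1-4) applied to the AND operation follows. The learning rate, random seed and 0/1 step activation are assumptions, and the threshold is updated like an extra weight with a fixed input of -1:

import random

# Minimal sketch of the perceptron training algorithm applied to logical AND.
random.seed(0)
alpha = 0.1                                           # learning rate (assumption)
w = [random.uniform(-0.5, 0.5) for _ in range(2)]     # Step 1: initial weights
theta = random.uniform(-0.5, 0.5)                     # and threshold in [-0.5, 0.5]

training_set = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def step(net):
    """Step activation: 1 if the net input is non-negative, else 0."""
    return 1 if net >= 0 else 0

for epoch in range(100):
    errors = 0
    for x, y_d in training_set:
        # Step 2: activation - actual output for this training example.
        y = step(sum(xi * wi for xi, wi in zip(x, w)) - theta)
        e = y_d - y                                   # error
        # Step 3: weight training - delta rule.
        w = [wi + alpha * xi * e for wi, xi in zip(w, x)]
        theta = theta - alpha * e                     # threshold as weight on input -1
        errors += abs(e)
    if errors == 0:                                   # Step 4: iterate until convergence
        break

print(w, theta)
for x, y_d in training_set:
    print(x, step(sum(xi * wi for xi, wi in zip(x, w)) - theta))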

Multilayer Neural Networks (MLNN)

A multilayer perceptron is a feedforward neural network with one or more hidden layers. The network consists of an input layer of source neurons, at least one hidden layer of computational neurons, and an output layer of computational neurons. The input signals are propagated in a forward direction on a layer-by-layer basis. Neurons in the hidden layer cannot be observed through the input/output behaviour of the network, and there is no obvious way to know what the desired output of the hidden layer should be.

Backpropagation Neural Network (BPNN)

Learning in a multilayer network proceeds in the same way as for a perceptron: a training set of input patterns is presented to the network, the network computes its output pattern, and if there is an error (a difference between the actual and desired output patterns) the weights are adjusted to reduce this error.

In a BPNN, the learning algorithm has two phases. First, a training input pattern is presented to the network input layer. The network propagates the input pattern from layer to layer until the output pattern is generated by the output layer. If this pattern is different from the desired output, an error is calculated and then propagated backwards through the network from the output layer to the input layer. The weights are modified as the error is propagated.

Training of the Backpropagation Algorithm

Step 1: Initialisation
Set all the weights and threshold levels of the network to random numbers uniformly distributed inside a small range:

    ( -2.4 / Fi , +2.4 / Fi )

where Fi is the total number of inputs of neuron i in the network. The weight initialisation is done on a neuron-by-neuron basis.

Step 2: Activation
Activate the BPNN by applying inputs x1(p), x2(p), ..., xn(p) and desired outputs yd,1(p), yd,2(p), ..., yd,n(p).

Calculate the actual outputs of the neurons in the hidden layer:

    yj(p) = sigmoid[ Σi xi(p)·wij(p) - θj ],   i = 1, ..., n

where n is the number of inputs of neuron j in the hidden layer, and sigmoid is the sigmoid activation function.

Calculate the actual outputs of the neurons in the output layer:

    yk(p) = sigmoid[ Σj xjk(p)·wjk(p) - θk ],   j = 1, ..., m

where m is the number of inputs of neuron k in the output layer.

Step 3: Weight Training
Update the weights in the BPNN by propagating backward the errors associated with the output neurons.

Calculate the error gradient for the neurons in the output layer:

    δk(p) = yk(p) · [1 - yk(p)] · ek(p),   where ek(p) = yd,k(p) - yk(p)

Calculate the weight corrections:

    Δwjk(p) = α · yj(p) · δk(p)

Update the weights at the output neurons:

    wjk(p+1) = wjk(p) + Δwjk(p)

Calculate the error gradient for the neurons in the hidden layer:

    δj(p) = yj(p) · [1 - yj(p)] · Σk δk(p)·wjk(p)

Calculate the weight corrections:

    Δwij(p) = α · xi(p) · δj(p)

Update the weights at the hidden neurons:

    wij(p+1) = wij(p) + Δwij(p)

Step 4: Iteration

Increase iteration p by one, go back to Step 2 and repeat the process until the selected error criterion is satisfied.
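
A minimal Python sketch of the whole algorithm, applied to the XOR problem with one hidden layer of two neurons, is given below. The network size, learning rate, number of epochs and random seed are assumptions; thresholds are updated like weights with a fixed input of -1:

import math
import random

# Minimal backpropagation sketch for a 2-2-1 network trained on XOR.
# NOTE: with an unlucky seed the network can get stuck in a local minimum;
# more epochs or a different seed may be needed.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(1)
alpha = 0.5                                                                # learning rate

# Step 1: initialise weights and thresholds to small random numbers.
w_ih = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)]  # input -> hidden
t_h = [random.uniform(-0.5, 0.5) for _ in range(2)]                       # hidden thresholds
w_ho = [random.uniform(-0.5, 0.5) for _ in range(2)]                      # hidden -> output
t_o = random.uniform(-0.5, 0.5)                                           # output threshold

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]               # XOR

for epoch in range(20000):
    for x, y_d in data:
        # Step 2: forward pass through hidden and output layers.
        y_h = [sigmoid(sum(x[i] * w_ih[i][j] for i in range(2)) - t_h[j]) for j in range(2)]
        y_o = sigmoid(sum(y_h[j] * w_ho[j] for j in range(2)) - t_o)

        # Step 3: backward pass - error gradients, then weight corrections.
        delta_o = y_o * (1 - y_o) * (y_d - y_o)
        delta_h = [y_h[j] * (1 - y_h[j]) * delta_o * w_ho[j] for j in range(2)]
        for j in range(2):
            w_ho[j] += alpha * y_h[j] * delta_o
            for i in range(2):
                w_ih[i][j] += alpha * x[i] * delta_h[j]
            t_h[j] += alpha * (-1) * delta_h[j]       # threshold as weight on input -1
        t_o += alpha * (-1) * delta_o
    # Step 4: iterate (here simply for a fixed number of epochs).

for x, _ in data:
    y_h = [sigmoid(sum(x[i] * w_ih[i][j] for i in range(2)) - t_h[j]) for j in range(2)]
    print(x, round(sigmoid(sum(y_h[j] * w_ho[j] for j in range(2)) - t_o), 2))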