Introduction of connectionist models

Introduction to ANNs
Markus Dambek (Uni Bremen)
20 December 2010

1. Introduction
2. Information processing in biology
3. The artificial neuron
4. The perceptron
5. The multilayer perceptron
6. Classification
7. Development

About what?
Connectionism is a set of approaches in the fields of artificial intelligence, cognitive psychology, cognitive science, neuroscience and philosophy of mind that models mental or behavioral phenomena as the emergent processes of interconnected networks of simple units.
There are many forms of connectionism, but the most common forms use artificial neural networks (ANNs).

ANNs
Artificial neural networks (ANNs) are relatively new computational tools that are used extensively to solve many complex real-world problems.
The attractiveness of ANNs comes from their remarkable information processing characteristics, chiefly nonlinearity.
ANNs are drastic abstractions of their biological counterparts; the idea of ANNs is not to replicate the operation of the biological systems.

Capabilities
- nonlinearity
- noise insensitivity
- high parallelism
- learning
- generalization

Information processing in biology

The Neuron (figure)

An impulse, in the form of an electric signal, travels within the dendrites and through the cell body towards the pre-synaptic membrane of the synapse.
Upon arrival at the membrane, a neurotransmitter (chemical) is released from the vesicles in quantities proportional to the strength of the incoming signal.
The neurotransmitter diffuses within the synaptic gap and eventually into the dendrites of neighboring neurons.
Depending on the threshold of the receiving neuron, this forces it to generate a new electrical signal.

Neural Networks
Such neurons are interconnected in networks of billions of neurons, processing incoming information into motor actions and new information.

The artificial neuron

The artificial neuron (figure)

Analogy
- connections = dendrites and axons
- connection weights = synapses
- threshold = soma activity

Components
- Input: $x_i$, weighted by $w_{ij}$
- Net input function: $net_j = \sum_k x_k w_{kj}$
- Activation function: $a_j = f_{act}(net_j, \theta_j)$
- Output function: $x_j = f_{out}(a_j)$ (mostly $f_{out} = \mathrm{id}$)
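The three components can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the step function used as $f_{act}$ is an assumed choice, since the slide leaves the activation function open.

```python
from typing import Callable, Sequence


def net_input(x: Sequence[float], w: Sequence[float]) -> float:
    """Net input function: net_j = sum_k x_k * w_kj."""
    return sum(xk * wk for xk, wk in zip(x, w))


def step_activation(net: float, theta: float) -> float:
    """Assumed f_act: fire (1.0) if the net input exceeds the threshold theta."""
    return 1.0 if net > theta else 0.0


def identity(a: float) -> float:
    """f_out = id, the usual choice named on the slide."""
    return a


def neuron_output(
    x: Sequence[float],
    w: Sequence[float],
    theta: float,
    f_act: Callable[[float, float], float] = step_activation,
    f_out: Callable[[float], float] = identity,
) -> float:
    """x_j = f_out(f_act(net_j, theta_j))."""
    return f_out(f_act(net_input(x, w), theta))


# Two inputs with weight 1 each and threshold 1.5
print(neuron_output([1, 1], [1, 1], 1.5))
print(neuron_output([0, 1], [1, 1], 1.5))
```

Passing a different `f_act` (for example a sigmoid) changes the neuron's behavior without touching the net input or output function.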

Activation (figure)

The perceptron

Perceptron (figure)

Perceptron
- unidirectional
- two layers: input layer and output layer

Learning
Delta rule
- Change the weights according to their contribution to the error.
- The error denotes the difference between the perceptron's output and the expected output (teaching output).
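A minimal sketch of delta-rule learning for a single threshold unit follows. The update formula $\Delta w_i = \eta \, (t - o) \, x_i$, the learning rate `eta`, and the treatment of the threshold as a trainable parameter are assumptions made for illustration; the slide itself only states the principle.

```python
def predict(weights, theta, x):
    """Threshold unit: output 1 if the weighted sum exceeds theta, else 0."""
    net = sum(w * xi for w, xi in zip(weights, x))
    return 1 if net > theta else 0


def train_delta_rule(samples, eta=0.5, epochs=100):
    """Adjust weights in proportion to (teaching output - actual output) * input."""
    weights = [0.0] * len(samples[0][0])
    theta = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, theta, x)
            if error != 0:
                mistakes += 1
                weights = [w + eta * error * xi for w, xi in zip(weights, x)]
                # The threshold behaves like a weight on a constant input of -1.
                theta -= eta * error
        if mistakes == 0:  # every sample is classified correctly
            break
    return weights, theta


# Teaching data for logical AND: (inputs, teaching output)
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, theta = train_delta_rule(and_samples)
print(weights, theta)
print([predict(weights, theta, x) for x, _ in and_samples])
```

Only inputs that were active (non-zero) when an error occurred have their weights changed, which is the sense in which weights change "according to their contribution to the error".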

Learning (figure)

AND-Net (figure)

AND-Net
x1  x2  x1*1 + x2*1  > θ = 1.5  output
0   0   0            false      0
0   1   1            false      0
1   0   1            false      0
1   1   2            true       1

OR-Net (figure)

OR-Net
x1  x2  x1*1 + x2*1  > θ = 0.5  output
0   0   0            false      0
0   1   1            true       1
1   0   1            true       1
1   1   2            true       1
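The AND and OR tables differ only in the threshold. The following sketch recomputes both tables with the same unit (both weights fixed to 1, as on the slides); the function names are hypothetical.

```python
from itertools import product


def threshold_unit(x1, x2, theta, w1=1, w2=1):
    """Output 1 if x1*w1 + x2*w2 exceeds the threshold theta, else 0."""
    return 1 if x1 * w1 + x2 * w2 > theta else 0


def truth_table(theta):
    """Evaluate the unit on all four binary input combinations."""
    return [(x1, x2, threshold_unit(x1, x2, theta))
            for x1, x2 in product((0, 1), repeat=2)]


print("AND (theta = 1.5):", truth_table(1.5))
print("OR  (theta = 0.5):", truth_table(0.5))
```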

XOR-Net (figure)

Capabilities (figures)

The multilayer perceptron

Multilayer-Perceptron (figure)

Multilayer-Perceptron
- Adding hidden layers makes it possible to represent even more complex functions.
- But the delta rule operates on the difference between the perceptron's output and the teaching output.
- We have no teaching output for the hidden neurons.
- We need to modify the learning algorithm.

Many different approaches:
- Backpropagation
- Hopfield
- Adaptive resonance theory
- ...

Learning: supervised vs. unsupervised vs. reinforcement
Most of them can be divided into three different classes of learning algorithms:
- supervised learning
- unsupervised learning
- reinforcement learning

Error-correction learning
- The error-correction learning (ECL) rule is used in supervised learning.
- The arithmetic difference (error) between the ANN solution and the corresponding correct answer is used to modify the connection weights.
- The result is a gradual reduction of the overall network error.

Boltzmann learning
- The Boltzmann learning (BL) rule is a stochastic rule.
- It is similar to ECL; however, each neuron generates an output (or state) based on a Boltzmann statistical distribution.
- This renders learning extremely slow.

Hebbian learning
- The Hebbian learning (HL) rule, developed based on neurobiological experiments, is the oldest learning rule.
- It postulates that if neurons on both sides of a synapse are activated synchronously and repeatedly, the synapse's strength is selectively increased.
- Learning is done locally by adjusting the synapse weight based on the activities of the neurons.
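As a rough sketch, the Hebbian postulate is often written as $\Delta w = \eta \cdot x_{pre} \cdot x_{post}$. This formula, the learning rate `eta`, and the function name below are assumptions for illustration and are not taken from the slide.

```python
def hebbian_update(weight, pre_activity, post_activity, eta=0.1):
    """Strengthen the synapse in proportion to the joint activity of both neurons."""
    return weight + eta * pre_activity * post_activity


# Repeated synchronous activation increases the weight step by step.
w = 0.0
for _ in range(5):
    w = hebbian_update(w, pre_activity=1.0, post_activity=1.0)
print(w)

# If one of the two neurons is silent, the weight stays unchanged.
print(hebbian_update(0.3, pre_activity=1.0, post_activity=0.0))
```

The update uses only the activities of the two neurons joined by the synapse, which is what makes the rule local.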

Competitive learning
- In the competitive learning (CL) rule, all neurons are forced to compete among themselves.
- Only one neuron is activated in a given iteration, and all the weights attached to it are adjusted.
- The CL rule is speculated to exist in many biological systems.
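One common way to realize this competition is a winner-take-all step, sketched below. The choice of winner (the neuron whose weight vector is closest to the input) and the update $w \leftarrow w + \eta \, (x - w)$ are assumptions for illustration; the slide does not fix either detail.

```python
def winner_index(weights, x):
    """Return the index of the neuron whose weight vector is closest to input x."""
    def squared_distance(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda j: squared_distance(weights[j]))


def competitive_step(weights, x, eta=0.5):
    """Move only the winning neuron's weights toward the input; return the winner."""
    j = winner_index(weights, x)
    weights[j] = [wi + eta * (xi - wi) for wi, xi in zip(weights[j], x)]
    return j


# Two competing neurons with two weights each
weights = [[0.0, 0.0], [1.0, 1.0]]
winner = competitive_step(weights, [0.8, 1.0])
print(winner, weights)
```

All other neurons keep their weights, so each neuron gradually specializes on one region of the input space.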

Backpropagation
- Generalization of the delta rule
- Error-correction learning
Idea: represent the network error as a function of all weights: $E(W_j) = E(w_{j1}, w_{j2}, \ldots, w_{jn})$
(figure: error over two weights $w_1$ and $w_2$)

Backpropagation
Objective: find the global minimum of the error function.
Method: gradient descent.
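Gradient descent can be illustrated on a toy error function over two weights. The quadratic `error` below, the step size `eta`, and all names are hypothetical; the function is not the error of a real network and was chosen only because its minimum at $(1, -2)$ is known.

```python
def error(w1, w2):
    """Hypothetical bowl-shaped error function with its minimum at (1, -2)."""
    return (w1 - 1.0) ** 2 + (w2 + 2.0) ** 2


def gradient(w1, w2):
    """Partial derivatives of the error with respect to w1 and w2."""
    return 2.0 * (w1 - 1.0), 2.0 * (w2 + 2.0)


def gradient_descent(w1, w2, eta=0.1, steps=200):
    """Repeatedly move the weights a small step against the gradient."""
    for _ in range(steps):
        g1, g2 = gradient(w1, w2)
        w1 -= eta * g1
        w2 -= eta * g2
    return w1, w2


w1, w2 = gradient_descent(5.0, 5.0)
print(w1, w2, error(w1, w2))
```

On this bowl-shaped function the only minimum is the global one. On the error function of a real network, gradient descent can also stop in a local minimum.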

Backpropagation (figure)

BP-XOR-Net (figure)

BP-XOR-Net
x1  x2  x1*1 + x2*1  > θ = 1.5  x3  x1*1 + x2*1 + x3*(-2)  > θ = 0.5  output
0   0   0            false      0   0                       false      0
0   1   1            false      0   1                       true       1
1   0   1            false      0   1                       true       1
1   1   2            true       1   0                       false      0
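The table can be checked with a short sketch that hard-codes the weights and thresholds shown above. Here `x3` is the hidden neuron; the function names are hypothetical, and the weights are fixed by hand rather than learned by backpropagation.

```python
def step(net, theta):
    """Threshold activation: 1 if the net input exceeds theta, else 0."""
    return 1 if net > theta else 0


def xor_net(x1, x2):
    """Two-stage network from the table: a hidden AND unit inhibits the output."""
    x3 = step(x1 * 1 + x2 * 1, 1.5)                   # hidden neuron (acts as AND)
    return step(x1 * 1 + x2 * 1 + x3 * (-2), 0.5)     # output neuron


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))
```

The hidden neuron fires only for the input (1, 1), and its weight of -2 then pulls the output neuron's net input back below the threshold of 0.5.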

Classification

Hopfield networks
- This network is a symmetric, fully connected two-layer recurrent network.
- When presented with an incomplete or noisy pattern, the network responds by retrieving an internally stored pattern that most closely resembles the presented pattern.
- Efficient in solving optimization problems.
- Learning is done by setting each weight connecting two neurons to the product of the inputs of these two neurons.
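The learning rule and the retrieval of a noisy pattern can be sketched as follows. Bipolar states (+1/-1), the absence of self-connections, and the sign-based sequential update are common conventions assumed here; the slides do not specify them, and the function names are hypothetical.

```python
def train_hopfield(patterns):
    """Set each weight w[i][j] to the sum over patterns of p[i] * p[j] (no self-connections)."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, sweeps=5):
    """Update the neurons one after another until the state settles on a stored pattern."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            net = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if net >= 0 else -1
    return s


stored = [1, -1, 1, -1, 1, -1]
w = train_hopfield([stored])
noisy = [1, -1, 1, -1, 1, 1]  # last element flipped
print(recall(w, noisy))
```

Because each weight is a product of two pattern elements, the weight matrix is symmetric, which matches the description of the network as symmetric.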

Adaptive resonance theory
- Adaptive resonance theory (ART) networks are trained by unsupervised learning.
- The ART network consists of two fully interconnected layers, an input layer and an output layer.
- The feedforward weights are used to select a winning output neuron (cluster) and serve as the long-term memory for the network.
- The feedback weights serve as the short-term memory for the network.
- Can be used for pattern recognition, completion, and classification.

Kohonen networks
- These networks are two-layer networks that transform n-dimensional input patterns into lower-dimensional data.
- Similar patterns project onto points in close proximity to one another.
- Kohonen networks are trained in an unsupervised manner to form clusters within the data.
- Used for pattern recognition, classification and data compression.

Recurrent networks
- In a recurrent network, the outputs of some neurons are fed back to the same neurons or to neurons in preceding layers.
- This enables a flow of information in both forward and backward directions.
- It provides the ANN with a dynamic memory.
- Training recurrent networks requires special algorithms.

Counterpropagation networks
- These networks are trained by hybrid learning to create a self-organizing look-up table.
- A response is the average for those feature vectors closest to it in the input data space.
- As input is presented, unsupervised learning is carried out to create a Kohonen map of the input data.
- Meanwhile, supervised learning is used to associate an appropriate output vector with each point on the map.
- Useful for function approximation and classification.

Backpropagation networks (feedforward)
- A backpropagation (BP) network is an MLP.
- The term backpropagation refers to the way the error computed at the output side is propagated backward from the output layer, to the hidden layer, and finally to the input layer.
- Uses supervised learning.
- Can be used for data modeling, classification, forecasting, control, data and image compression, and pattern recognition.

General Issues with Backpropagation (figure)

ANNs or expert systems
- The decision as to whether to use ANNs, expert systems (ESs), or theoretical modeling for an arbitrary problem depends on the availability of the theory and the data.
- For a problem with abundant data but unclear theory, ANNs can be a perfect tool.
- When both the data and theory are inadequate, a human expert's opinion should be sought, followed by coding this knowledge into a set of ES rules.
- When the problem is rich in both data and theory, it may be possible to derive a physical model.
- When both theory and data are abundant but a physical model is hard to formulate, the modeler can also use ANNs.

Graduation (figure)

Phases in ANN development (figure)

Problem Definition
- The problem definition and formulation (phase 1) relies heavily on an adequate understanding of the problem.
- The benefits of ANNs over other techniques should be evaluated before final selection of the modeling technique.

System Design
- Determination of the type of ANN and learning rule that fit the problem.
- Involves data collection, data preprocessing to fit the type of ANN, and partitioning the data into three distinct subsets (training, test, and validation subsets).
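The partitioning step can be sketched as a shuffled three-way split. The fractions (60 % / 20 % / 20 %), the fixed random seed, and the function name are assumptions for illustration; the slides do not prescribe any proportions.

```python
import random


def partition(data, train_frac=0.6, test_frac=0.2, seed=0):
    """Shuffle the data and split it into training, test, and validation subsets."""
    items = list(data)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_train = round(len(items) * train_frac)
    n_test = round(len(items) * test_frac)
    training = items[:n_train]
    test = items[n_train:n_train + n_test]
    validation = items[n_train + n_test:]  # whatever remains
    return training, test, validation


training, test, validation = partition(range(20))
print(len(training), len(test), len(validation))
```

In the terminology of these slides, the training and test subsets are used while training the network, and the validation subset is held back to examine generalization afterwards.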

System Realization
- Training of the network utilizing the training and test subsets.
- Assessing the network performance by analyzing the prediction error.
- Selection of the various parameters (e.g., network size, learning rate, number of training cycles, acceptable error).
- If possible, splitting the problem into smaller sub-problems and designing an ensemble of networks.

System Verification
- Examination of the best network for its generalization capability using the validation subset.
- Comparison of the performance of the ANN-based model to those of other approaches.

System Implementation
- Embedding the obtained network in an appropriate working system.
- Final testing of the integrated system.

System Maintenance
- Updating the developed system as changes in the environment or the system variables occur.
- Involves a new development cycle.

Thank you for your attention
