
Artificial Neural Networks Mohamed M. El Wakil mohamed@elwakil.net 1

Agenda Natural Neural Networks Artificial Neural Networks XOR Example Design Issues Applications Conclusion 2

Artificial Neural Networks? A technique for solving problems by constructing software that works like our brains! 3

Artificial Neural Networks? A technique for solving problems by constructing software that works like our brains! But, how do our brains work? 4

The Brain is... A massively parallel information processing system 5

A HUGE network of processing elements A typical brain contains a network of 10 billion neurons 6

A processing element (a.k.a. a neuron) Cell body Axon Synapse Dendrites 7

A processing element (a.k.a. a neuron) Cell body: Processor Axon: Output Synapse: Link Dendrites: Input A neuron is connected to other neurons through about 10,000 synapses 8

How does it work? (1) (Diagram: cell body, axon, synapse, dendrites) 9 A neuron receives input from other neurons (typically many thousands). Inputs are combined (i.e. summed).

How does it work? (2) 10 Once input exceeds a critical level, the neuron discharges a spike: an electrical pulse that travels from the cell body, down the axon, to the next neuron(s).

How does it work? (3) The axon endings almost touch the dendrites or cell body of the next neuron. 11

How does it work? (3) Transmission of an electrical signal from one neuron to the next is effected by neurotransmitters. 12

How does it work? (3) 13 Neurotransmitters are chemicals which are released from the first neuron and which bind to the second.

How does it work? (4) 14 This link is called a synapse. The strength of the signal that reaches the next neuron depends on factors such as the amount of neurotransmitter available.

Neuron Abstraction 15

So An artificial network is an imitation of a human neural network. 16

So An artificial network is an imitation of a human neural network. An artificial neuron is an imitation of a human neuron. 17

So An artificial network is an imitation of a human neural network. An artificial neuron is an imitation of a human neuron. Let's see an artificial neural network. 18

Cell Body 19

Dendrites 20

Axon/Spike 21

Input Processing Output 22

Not all inputs are equal! 23

Not all neurons are equal! 24

The signal is not passed down to the next neuron verbatim. Transfer Function 25

Three neurons 26

Input of a neuron = output of other neurons 27

Layer: A set of neurons that receive the same input 28

Three types of layers: Input, Hidden, and Output 29

The output is a function of the input; it is affected by the weights and the transfer functions. 30
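
A minimal sketch of that computation in Python (the sigmoid transfer function, names, and values here are illustrative assumptions, not from the slides):

```python
import math

def neuron_output(inputs, weights, bias=0.0):
    """A single artificial neuron: weighted sum of inputs, passed through a transfer function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid transfer function

# Two inputs with different weights: not all inputs are equal
print(neuron_output([0.5, 1.0], [0.8, -0.3]))
```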

A powerful tool An ANN can compute any computable function, by the appropriate selection of the network topology and weight values. 31

A powerful tool An ANN can compute any computable function, by the appropriate selection of the network topology and weight values. Also, ANNs can learn from experience! 32

A powerful tool An ANN can compute any computable function, by the appropriate selection of the network topology and weight values. Also, ANNs can learn from experience! Specifically, by trial and error 33

Learning by trial and error A continuous process of: Trial Evaluate Adjust

Learning by trial and error A continuous process of: Processing an input to produce an output (i.e. trial) Evaluating this output (i.e. comparing it to the expected output) Adjusting the processing accordingly

Learning by trial and error A continuous process of: Processing an input to produce an output (i.e. trial) Evaluating this output (i.e. comparing it to the expected output) Adjusting the processing accordingly In terms of ANN: Compute the output function of a given input.

Learning by trial and error A continuous process of: Processing an input to produce an output (i.e. trial) Evaluating this output (i.e. comparing it to the expected output) Adjusting the processing accordingly In terms of ANN: Compute the output function of a given input. Compare the actual output with the expected one.

Learning by trial and error A continuous process of: Processing an input to produce an output (i.e. trial) Evaluating this output (i.e. comparing it to the expected output) Adjusting the processing accordingly In terms of ANN: Compute the output function of a given input. Compare the actual output with the expected one. Adjust the weights.

An ANN that learns how to approximate XOR Three Layers 39

An ANN that learns how to approximate XOR Input Three Layers: Layer 1: Input Layer, with two neurons 40

An ANN that learns how to approximate XOR Hidden Three Layers: Layer 1: Input Layer, with two neurons Layer 2: Hidden Layer, with three neurons 41

An ANN that learns how to approximate XOR Output Three Layers: Layer 1: Input Layer, with two neurons Layer 2: Hidden Layer, with three neurons Layer 3: Output Layer, with one neuron 42

How does it work? Set initial values of the weights randomly. Input: truth table of the XOR. 43

How does it work? Set initial values of the weights randomly. Input: truth table of the XOR. Do: Read input (e.g. 0 and 0) 44

How does it work? Set initial values of the weights randomly. Input: truth table of the XOR. Do: Read input (e.g. 0 and 0) Compute an output (e.g. 0.75343) 45

How does it work? Set initial values of the weights randomly. Input: truth table of the XOR. Do: Read input (e.g. 0 and 0) Compute an output (e.g. 0.75343) Compute the error (the distance between the desired output and the actual output; Error = 0.75343) Adjust the weights accordingly. 46

How does it work? Set initial values of the weights randomly. Input: truth table of the XOR. Do: Read input (e.g. 0 and 0) Compute an output (e.g. 0.75343) Compare it to the expected (i.e. desired) output (Diff = 0.75343) Modify the weights accordingly. Loop until a condition is met: Condition: certain number of iterations Condition: error threshold 47
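
A runnable Python sketch of this whole loop, using the 2-3-1 topology from the XOR slides; the sigmoid transfer function, the learning rate of 0.5, and the exact weight-update rule (the back-propagation rule shown later in the deck) are assumptions:

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# XOR truth table: two inputs, one desired output
DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

# 2 input -> 3 hidden -> 1 output neuron.
# Each hidden neuron holds [w1, w2, bias]; the output neuron holds 3 weights + bias.
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
w_out = [random.uniform(-1, 1) for _ in range(4)]

def forward(x):
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    out = sigmoid(sum(h * w for h, w in zip(hidden, w_out[:3])) + w_out[3])
    return hidden, out

LR = 0.5  # learning rate: an assumption, the slides do not fix one
for _ in range(20000):
    total_error = 0.0
    for x, target in DATA:
        hidden, out = forward(x)
        err = target - out                 # desired minus actual output
        delta_out = out * (1 - out) * err  # error term of the output neuron
        # Error factor of each hidden neuron: the output delta weighted by
        # the weight connecting it to the output neuron.
        for j in range(3):
            delta_h = hidden[j] * (1 - hidden[j]) * delta_out * w_out[j]
            w_hidden[j][0] += LR * delta_h * x[0]
            w_hidden[j][1] += LR * delta_h * x[1]
            w_hidden[j][2] += LR * delta_h
        for j in range(3):
            w_out[j] += LR * delta_out * hidden[j]
        w_out[3] += LR * delta_out
        total_error += err ** 2
    if total_error < 0.01:  # stopping condition: error threshold
        break

for x, target in DATA:
    print(x, target, round(forward(x)[1], 3))
```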

JOONE (Java Object Oriented Neural Engine) 48

Design Issues Initial weights (small random values in [-1,1]) Transfer function (How are the inputs and the weights combined to produce output?) Error estimation Weights adjusting (a.k.a. training, teaching) Number of neurons Data representation Size of training set 49

Transfer Functions Linear: The output is proportional to the total weighted input, e.g. Y = B*x or Y = B + x. Threshold: The output is set at one of two values, depending on whether the total weighted input is greater than or less than some threshold value. Non-linear: The output varies continuously but not linearly as the input changes, e.g. the sigmoidal Y = 1/(1 + e^(-x)) or the Gaussian Y = e^(-x^2 / (2a^2)). 50
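
As a sketch, the three families might look like this in Python (parameter names are illustrative; the slide's garbled non-linear formula is read here as the Gaussian e^(-x^2/2a^2)):

```python
import math

def linear(x, b=1.0):
    return b * x  # output proportional to the total weighted input

def threshold(x, t=0.0):
    return 1.0 if x > t else 0.0  # one of two values, depending on the threshold

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))  # smooth non-linear transfer

def gaussian(x, a=1.0):
    return math.exp(-x**2 / (2 * a**2))  # bell-shaped non-linear transfer
```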

Error Estimation RMSE (Root Mean Square Error) is commonly used, or a variation of it: D(Y_actual, Y_desired) = sqrt((Y_desired - Y_actual)^2) 51
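
A direct Python rendering of this measure over a set of outputs (the example values are illustrative):

```python
import math

def rmse(desired, actual):
    """Root mean square error between desired and actual outputs."""
    return math.sqrt(sum((d - a) ** 2 for d, a in zip(desired, actual)) / len(desired))

print(rmse([0, 1, 1, 0], [0.05, 0.9, 0.92, 0.1]))  # about 0.085 for a well-trained net
```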

Weights Adjusting After each iteration, weights should be adjusted to minimize the error. Approaches: trying all possible weights, Hebbian learning, artificial evolution, back propagation. 52

Back Propagation N is a neuron. N_w is one of N's input weights. N_out is N's output. N_w = N_w + ΔN_w ΔN_w = N_out * (1 - N_out) * N_ErrorFactor N_ErrorFactor = N_ExpectedOutput - N_ActualOutput This works only for the last layer, as we can know the actual output and the expected output. What about the hidden layer? 53

Weights Adjusting Error Factor of X1 = (ΔY1_w * X1_w of Y1) + (ΔY2_w * X1_w of Y2) + (ΔY3_w * X1_w of Y3) 54
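
A sketch of these two error-factor rules in Python, following the slide's notation; note that practical back propagation usually also multiplies the weight change by the input value and a learning rate, as in the XOR sketch above:

```python
def weight_delta(n_out, error_factor):
    # Slide rule: ΔN_w = N_out * (1 - N_out) * N_ErrorFactor
    return n_out * (1 - n_out) * error_factor

def output_error_factor(expected, actual):
    # Output layer: the error factor is simply expected minus actual.
    return expected - actual

def hidden_error_factor(next_deltas, weights_from_x1):
    # Hidden neuron X1: sum over next-layer neurons Yi of
    # (delta of Yi) * (weight from X1 to Yi), as on the slide.
    return sum(d * w for d, w in zip(next_deltas, weights_from_x1))
```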

Number of neurons Many neurons: Higher accuracy Slower Risk of overfitting: memorizing rather than understanding; the network will be useless with new problems. Few neurons: Lower accuracy Inability to learn at all Optimal number: Trial and error! Adaptive techniques. 55

Data representation Usually input/output data needs preprocessing: Pictures: Pixel intensity Smells: Molecule concentrations Text: A pattern, e.g. 0 0 1 for Chris, 0 1 0 for Becky Numbers: Decimal or binary 56
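
A small Python sketch of two such encodings (the pixel-scaling convention is an assumption; the text patterns are the slide's own):

```python
# One-hot text patterns, as in the slide's example
PATTERNS = {"Chris": [0, 0, 1], "Becky": [0, 1, 0]}

def normalize_pixels(pixels, max_value=255):
    # Scale raw pixel intensities into [0, 1] before feeding the network
    return [p / max_value for p in pixels]

print(PATTERNS["Chris"], normalize_pixels([0, 128, 255]))
```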

Size of training set No one-fits-all formula Some heuristics: Five to ten times as many training samples as the number of weights. Greater than W / (1 - a), where W: number of weights, a: desired accuracy. Greater than (W / (1 - a)) * log(n / (1 - a)), where n: number of nodes. 57
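
Plugging the 2-3-1 XOR network into these heuristics (reading the garbled bounds as W/(1-a) and (W/(1-a))*log(n/(1-a)), the usual textbook form of this rule; the accuracy value is an illustrative choice):

```python
import math

W = 13   # weights in the 2-3-1 XOR net: 2*3 + 3*1 = 9, plus 4 biases
n = 6    # nodes: 2 input + 3 hidden + 1 output
a = 0.1  # desired accuracy level (illustrative)

print(5 * W, 10 * W)                          # five to ten times the number of weights
print(W / (1 - a))                            # W / (1 - a), about 14.4
print((W / (1 - a)) * math.log(n / (1 - a)))  # about 27
```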

Application Areas Function approximation Classification Clustering Pattern recognition (radar systems, face identification) 58

Applications Electronic Nose NETtalk 59

Electronic Nose Developed by Dr. Amy Ryan at JPL, NASA. Goal: Detect low concentrations of dangerous gases. Ammonia is dangerous at a concentration of a few parts per million (ppm); humans can't sense it until it reaches about 50 ppm. An E-Nose can differentiate between Pepsi and Coca-Cola. Can you? 60

Electronic Nose An E-Nose uses a collection of 16 different polymer films. These films are specially designed to conduct electricity. When a substance is absorbed into these films, the films expand slightly, and that changes how much electricity they conduct. 61

Electronic Nose Because each film is made of a different polymer, each one reacts to each substance, or analyte, in a slightly different way. An artificial neural network combines the differences in conductivity to detect the substance that caused them. 62

NETtalk Experiment Carried out in the mid-1980s by Terrence Sejnowski and Charles Rosenberg. Goal: teach a computer how to pronounce words. The network is fed words, phonemes, and articulatory features. It really learns. Listen! 63

Advantages / Disadvantages Advantages: Adapt to unknown situations. Powerful: can model complex functions. Ease of use: learns by example; very little domain-specific expertise needed from the user. Disadvantages: Forgets. Not exact. Large complexity of the network structure.

Status of Neural Networks Most of the reported applications are still in the research stage. No formal proofs, but they seem to have useful applications that work.

Conclusion Artificial Neural Networks are an imitation of the biological neural networks, but much simpler ones. ANNs can learn, via trial and error. Many factors affect the performance of ANNs, such as the transfer functions, size of the training sample, network topology, and the weights adjusting algorithm. No formal proofs, but they seem to have useful applications. 66

References BrainNet II: Creating A Neural Network Library; Electronic Nose; NETtalk, Wikipedia; A Primer on Artificial Neural Networks, NeuriCam; Artificial Neural Networks Application, Peter Andras; Introduction to Artificial Neural Networks, Nicolas Galoppo von Borries; Artificial Neural Networks, Torsten Reil; What is a "neural net"?; Introduction to Artificial Neural Network, Jianna J. Zhang, Bellingham AI Robotics Society, Bellingham WA; Elements of Artificial Neural Networks, Kishan Mehrotra, Chilukuri K. Mohan and Sanjay Ranka, MIT Press, 1996. Thanks to all folks who share their material online 67


Artificial Neural Networks Mohamed M. El Wakil mohamed@elwakil.net 69

Learning Paradigms Supervised learning Unsupervised learning Reinforcement learning 70

Supervised learning This is what we have seen so far! A network is fed with a set of training samples (inputs and corresponding output), and it uses these samples to learn the general relationship between the inputs and the outputs. This relationship is represented by the values of the weights of the trained network. 71

Unsupervised learning No desired output is associated with the training data! Learning by doing! Faster than supervised learning Used to discover structure within data: Clustering Compression 72

Reinforcement learning Like supervised learning, but: Weights adjusting is not directly related to the error value. The error value is used to randomly shuffle the weights! Slow, due to randomness. 73
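
A toy Python sketch of that idea (the function name and perturbation rule are illustrative assumptions; the slide only says the error drives a random shuffle):

```python
import random

def shuffle_step(weights, error, scale=0.1):
    # Perturb each weight randomly; the error only controls how big the
    # shuffle is, not the direction of the change (unlike supervised learning).
    return [w + random.uniform(-scale, scale) * error for w in weights]

print(shuffle_step([0.5, -0.2, 0.9], error=0.75))
```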