Chapter 2: Artificial Neural Network

2.1 Introduction

The Artificial Neural Network is inspired by the neuron structure of the human brain. The brain learns from experience and adapts accordingly, which is beyond the scope of conventional computers. Modelling the brain also promises a less technical way to develop solutions that reduce human intervention, so the implementation of neural networks in computing offers a key advance. Computers do some things well, such as performing complex mathematics, but they have trouble recognizing even simple patterns: they are unable to analyze and generalize patterns observed in the past and convert them into actions for the future. The advanced study of neural networks provides an understanding of human-like thinking mechanisms in computing. This research focuses on the way brains store information as patterns [17]. Some of these patterns are very complex, for example those that allow us to recognize individual faces. This process of storing information as patterns, analyzing those patterns, and then solving problems defines a new field in computing. Solving a specific problem with a neural network involves both creating the network and training it, and through neural networks we can model behaviours such as learning, forgetting, and reacting. Although the exact working of the human brain is still a mystery, some aspects of this processor are known. In particular, the most basic element of the human brain is a specific type of cell which, unlike the rest of the body, does not appear to regenerate. Because this type of cell is the only part of the body that is not slowly replaced, it is
assumed that these cells provide us with our abilities to remember, think, and apply previous experiences to our every action. These cells, all 100 billion of them, are known as neurons, and each of these neurons can connect with up to 200,000 other neurons [17]. The power of a neural network, like that of the human brain, comes from the interconnection of neurons, their complex control mechanisms, and their subsystems; it also comes from learning and genetic programming. Neurons pass information between themselves through electrochemical pathways, and they are classified into different categories depending on the classification method used. Current artificial systems are still not as capable as the human brain: artificial neural networks can replace only the most basic elements of this complicated and powerful organ. For the developer trying to solve problems, however, neural computing was never about replacing human brains; it is about machines and a new way to solve problems [16].

Let us take an overview of the human neuron, the fundamental element of a neural network. A biological neuron receives inputs, performs various nonlinear operations on them, and then generates a final output. A typical nerve cell has four parts: dendrites, which accept the inputs; the soma, which processes the inputs; the axon, which turns the processed inputs into outputs; and synapses, which provide contact between neurons. Biological neurons are not simple; they are structurally far more complex than this description suggests. Biology provides a better understanding of neurons, and network designers can continue to improve their systems by building upon man's understanding of the biological brain. Artificial Neural Networks (ANNs) are computational tools that have found extensive acceptance in many disciplines for modeling complex real-world problems [17].
ANNs may be defined as structures comprised of strongly interconnected adaptive simple elements called neurons that are capable of performing massive computations for knowledge representation. The usefulness of ANNs comes from characteristics of the biological system such as nonlinearity, high parallelism, robustness, fault and failure tolerance, learning capability, the ability to handle imprecise and fuzzy information, and the ability to generalize [20]. Artificial models possess the following characteristics:

(i) Nonlinearity allows a better fit to the data.
(ii) Noise insensitivity makes accurate prediction possible even with uncertain data and measurement errors.
(iii) High parallelism implies hardware failure tolerance and fast processing.
(iv) Learning and adaptivity allow the system to update its internal architecture in response to a changing environment.
(v) Generalization makes it possible to apply the model to unlearned data.

The main objective of ANN-based computing is to develop mathematical algorithms that enable Artificial Neural Networks to learn by imitating information processing and knowledge acquisition in the human brain. ANN-based models can provide practically accurate solutions for precisely formulated problems, and also for processes that are understood only through field observations and experimental data. In microbiology, ANNs have been utilized in a variety of applications ranging from modelling and classification to pattern recognition and multivariate data analysis (Basheer and Hajmeer, 2000).
One of the recently emerged applications of ANNs is digital image processing. Interest in digital image processing stems from two principal application areas: the improvement of pictorial information for human interpretation, and the processing of image data for storage, transmission, and representation for autonomous machine perception. An image may be defined as a two-dimensional function f(x, y), where x and y are spatial coordinates and the amplitude of f at any pair of coordinates (x, y) is the intensity or grey level of the image at that point. When the amplitude values are all finite, discrete quantities, the image is called a digital image. The field of digital image processing refers to processing digital images by means of a digital computer. A digital image is composed of a finite number of elements, each having a particular location and value; these elements are referred to as picture elements, image elements, pels, or pixels. The areas of application of digital image processing are wide and varied (Gonzalez and Woods, 2002).

2.2 Phases of Neural Network

A neural network consists of three groups, layers, or phases: the input phase, the hidden phase, and the output phase. The input units represent the raw data that is provided to the network. The activity of the hidden phase is determined by the input data and by the weights on the connections between the input and hidden units. The output phase depends on the activity of the hidden units and the weights between the hidden and output units. Networks of different types can be distinguished by the activity of their layers. A simple type of network is one in which the hidden units are free to construct their own
representation of the input. The weights between the input and hidden units determine when each hidden unit is active, and so by adjusting these weights a hidden unit can choose what it represents. There are also single-layer and multilayer architectures. A single-layer network generally consists of only inputs and outputs, with the inputs fed to the outputs through a series of weights. In a multilayer network the units are arranged in distinct layers: input, hidden, and output [28].

Figure 2.1 General neural network architecture

2.3 Activation functions in neural network

Activation functions are also referred to as threshold functions or transfer functions. They are used to transform the activation level of a neuron (unit) into an output signal. A number of activation functions are used in neural networks; the main types are the identity function, the step function, the piecewise linear function, and the sigmoid function.
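The three-phase flow described in Section 2.2, from input units through weighted connections to hidden units and on to output units, can be sketched in plain Python. The layer sizes (3 inputs, 4 hidden units, 2 outputs), the random weights, and the use of a sigmoid activation are illustrative assumptions, not values taken from the text.

```python
import math
import random

def sigmoid(v):
    # S-shaped squashing function applied by each unit
    return 1.0 / (1.0 + math.exp(-v))

def layer(inputs, weights, biases):
    # Each row of `weights` holds one unit's incoming connection weights.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, w_ih, b_h, w_ho, b_o):
    hidden = layer(x, w_ih, b_h)      # hidden activity: inputs and input->hidden weights
    return layer(hidden, w_ho, b_o)   # output activity: hidden units and hidden->output weights

random.seed(0)
x = [0.5, -1.0, 0.25]                                                  # 3 illustrative inputs
w_ih = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]   # 4 hidden units
b_h = [0.0] * 4
w_ho = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]   # 2 output units
b_o = [0.0] * 2
y = forward(x, w_ih, b_h, w_ho, b_o)
print(len(y))  # 2 outputs, each in the open interval (0, 1)
```

Adjusting the weight matrices `w_ih` and `w_ho` is exactly what lets a hidden unit "choose what it represents", as described above.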
2.3.1 Identity activation function: The identity activation function is also called the linear activation function. If the identity activation function is used throughout the network, it is easily shown that the network is equivalent to fitting a linear regression model of the form

y_i = w_i1 x_1 + w_i2 x_2 + ... + w_ik x_k + b_i

where x_1, ..., x_k are the k network inputs, y_i is the i-th network output, and w_i1, ..., w_ik, b_i are the coefficients in the regression equation. As a result, it is uncommon to find a neural network with the identity activation used in all its perceptrons.

2.3.2 Sigmoid activation function: In artificial neural networks, sigmoid functions are used to introduce nonlinearity into the model. A neural network element computes a linear combination of its input signals and applies a sigmoid function to the result. The sigmoid function satisfies a simple relation between itself and its derivative that makes the derivative computationally easy to evaluate, which is one reason sigmoid functions are popular in neural networks:

φ(v) = 1 / (1 + exp(−av))

Derivatives of the sigmoid function are usually employed in learning algorithms. The graph of the sigmoid function is S-shaped. The sigmoid is a strictly increasing function that exhibits a balance between linear and nonlinear behaviour, and it is a common activation function in the creation of artificial neural networks. The sigmoid function is a unipolar function.
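The property relating the sigmoid to its derivative mentioned above is φ'(v) = a·φ(v)·(1 − φ(v)), so the derivative needed by a learning algorithm can be computed from the already-evaluated function value. A minimal sketch (the slope parameter a = 1 is an illustrative default):

```python
import math

def sigmoid(v, a=1.0):
    # phi(v) = 1 / (1 + exp(-a*v)), a unipolar, strictly increasing function
    return 1.0 / (1.0 + math.exp(-a * v))

def sigmoid_derivative(v, a=1.0):
    # The convenient identity: phi'(v) = a * phi(v) * (1 - phi(v)),
    # so no separate exponential evaluation is required.
    s = sigmoid(v, a)
    return a * s * (1.0 - s)

print(sigmoid(0.0))             # 0.5  (midpoint of the S-shaped curve)
print(sigmoid_derivative(0.0))  # 0.25 (maximum slope, at v = 0)
```

Note how the derivative is largest near v = 0 and vanishes in the saturated tails, which is why learning slows when units saturate.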
2.3.3 Step function: The step function is a unipolar function, also called the threshold function:

φ(v) = 1 if v ≥ 0
       0 if v < 0

The output of neuron k using a threshold function is

y_k = 1 if v_k ≥ 0
      0 if v_k < 0

where v_k is the induced local field of the neuron,

v_k = Σ_{j=1}^{m} w_kj x_j + b_k

In this model the output of the neuron takes the value 1 if the induced local field of the neuron is non-negative, and 0 otherwise.

2.3.4 Piecewise linear function: The piecewise linear function is a unipolar function which can be defined as

φ(v) = 1 if v ≥ +1/2
       v if +1/2 > v > −1/2
       0 if v ≤ −1/2

where the amplification factor inside the linear region of operation is assumed to be unity. Two particular situations of the piecewise linear function are:

1. If the linear region of operation is maintained without running into saturation, a linear combiner arises.
2. If the amplification factor of the linear region is made infinitely large, it reduces to a threshold function.
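The step and piecewise linear functions above translate directly into code. The example weights, inputs, and bias passed to the threshold neuron are illustrative assumptions:

```python
def step(v):
    # Threshold (unipolar step) function: 1 if v >= 0, else 0
    return 1 if v >= 0 else 0

def piecewise_linear(v):
    # phi(v) = 1 for v >= +1/2, v inside the linear region, 0 for v <= -1/2
    if v >= 0.5:
        return 1.0
    if v <= -0.5:
        return 0.0
    return v

def neuron_output(x, w, b):
    # y_k = step(v_k), with induced local field v_k = sum_j w_kj * x_j + b_k
    v_k = sum(wj * xj for wj, xj in zip(w, x)) + b
    return step(v_k)

print(step(-0.1), step(0.0))                      # 0 1
print(piecewise_linear(0.25))                     # 0.25 (inside the linear region)
print(neuron_output([1, 1], [0.5, 0.5], -0.75))   # v_k = 0.25 >= 0, so output 1
```

Replacing `step` with `piecewise_linear` in `neuron_output` shows the second special case above: as the linear region is made steeper, the piecewise linear function approaches the step function.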
2.4 Learning Rules in neural network

There are many types of learning rules in neural networks; they are generally divided into two categories:

1. Supervised Learning
2. Unsupervised Learning

2.4.1 Supervised Learning

In supervised learning a training set is available: the learning rule is provided with a set of examples of proper network behaviour, given as pairs of inputs and expected outputs. In this type of learning the network parameters are adjusted, step by step, on the basis of an error signal. The learning rule is given a set of examples (the training set) of proper network behaviour

{x_1, d_1}, {x_2, d_2}, ..., {x_n, d_n}

where x_n is an input to the network and d_n is the corresponding desired target output. For each input the network generates an output, and the learning rule is used to change the biases and weights of the network so as to move the network outputs closer to the desired targets. We assume that in supervised learning, at every instant of time when an input is applied, the desired response d of the system is provided. The distance between the actual and the desired response serves as an error measure and is used to correct the network parameters externally. For instance, in learning classifications of input patterns or circumstances with known responses, the error can be used to alter the weights so that the error decreases. This mode of learning requires a training set, i.e. a set of input and output patterns.
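The error-driven update described above can be sketched with a simple perceptron-style rule: the difference between the desired response d and the actual output moves the weights and bias a small step in the correcting direction. The learning rate, epoch count, and the AND-gate training set are illustrative assumptions, not details from the text.

```python
# Minimal sketch of supervised, error-correction learning.

def predict(x, w, b):
    # Threshold neuron: fires 1 when the induced local field is non-negative
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0

def train(training_set, lr=0.1, epochs=20):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, d in training_set:            # pairs {x_n, d_n} from the training set
            error = d - predict(x, w, b)     # distance between desired and actual response
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error                  # bias corrected by the same error signal
    return w, b

# Illustrative training set: the AND gate, with known desired outputs d
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train(data)
print([predict(x, w, b) for x, _ in data])   # [0, 0, 0, 1]
```

Each pass through the data nudges the parameters so the error decreases, which is exactly the step-by-step adjustment based on an error signal described above.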
2.4.2 Unsupervised learning

Unsupervised learning is also known as self-organized learning. In unsupervised learning no target output is available; the weights and biases are altered in response to the network input only. Clustering is used for pattern recognition in this setting. Because the desired response is not known, explicit error information cannot be used to improve network behaviour. Since no information is available about the correctness or incorrectness of responses, learning must be accomplished by observing responses to inputs about which we have marginal or no knowledge. Unsupervised algorithms work on redundant raw data that carry no labels regarding class membership or associations. In this mode of learning the network must discover for itself any existing patterns, separating properties, regularities, and so on, and while discovering these the network changes its parameters. Unsupervised learning is called learning without a teacher because a teacher does not have to be involved in every training step, although a teacher still has to set goals for the desired output. Learning with feedback is also important in neural networks; learning from feedback is called incremental learning, and it is very important in unsupervised learning.
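Clustering, named above as the typical unsupervised task, can be illustrated with a minimal k-means-style sketch: no desired outputs are supplied, and the cluster centres organize themselves from the unlabeled raw data alone. The one-dimensional data, the initial centres, and the choice of two clusters are all illustrative assumptions.

```python
# Minimal sketch of unsupervised learning via clustering (k-means style).

def kmeans_1d(data, centres, iterations=10):
    for _ in range(iterations):
        # Assignment step: each unlabeled point joins its nearest centre.
        groups = [[] for _ in centres]
        for x in data:
            nearest = min(range(len(centres)), key=lambda i: abs(x - centres[i]))
            groups[nearest].append(x)
        # Update step: each centre moves to the mean of its group.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres

data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]       # two obvious groups, but no labels given
print(kmeans_1d(data, centres=[0.0, 6.0]))  # centres settle near 1.0 and 5.0
```

No error signal from a teacher is used at any step; the structure of the data itself drives the parameter changes, which is the essence of self-organized learning.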