Introduction to Artificial Neural Networks (ANN)
Parag Verma
28th Dec, 2019
Introduction to ANN
Whenever we discuss Machine Learning (ML), there usually comes a point where we talk about model training. This is the crux of any ML model, as it is the part that unearths the hidden nuances in the data and derives insights. From linear regression to random forest, each algorithm has a core routine that skims through the data to identify generalisable patterns. Neural networks are no different in the sense that they have an inherent algorithm that trains on the dataset and derives results. However, there is a lot of excitement about neural networks because they can solve many problems with a high level of accuracy and perform particularly well on text and image analysis. In this blog we will look at the essential building blocks of a basic neural network.
Neurons as the basic Unit
Neural networks are modelled on neurons, which facilitate the transfer of messages within the nervous system. Each neuron receives some input (stimulus), performs some computation and passes the result along the chain to the next neuron. As shown in the figure below, millions of such neurons are connected to coordinate various activities within the human body.
An artificial neural network uses the same concept in the sense that it sets up a unit to mimic the neuron, performs some processing in it and then conveys the message/signal to the next neuron (layer). On a conceptual level, one can think of the diagram below to understand the basic building blocks of a neural network.
Let's discuss the neural network in the light of its most basic configuration, known as the Perceptron.
Nodal Representation of ANN
The simplest ANN configuration (also known as a Perceptron) is shown below.
A Perceptron has the following components:
- One or more weighted inputs
- A bias
- An activation function
- A single output
The Perceptron receives one or more inputs, multiplies them by weights, adds the bias component and then passes the result through an activation function to generate the output. The bias and weights are tunable (adjustable) parameters of the neuron and are set based on some learning criterion.
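The computation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the choice of a unit step activation, and the hand-picked weights and bias (chosen so that the neuron behaves like a logical AND) are all assumptions made for the example.

```python
import numpy as np

def step(z, threshold=0.0):
    """Unit step activation: 1 if z is above the threshold, otherwise 0."""
    return 1 if z > threshold else 0

def perceptron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = np.dot(inputs, weights) + bias
    return step(z)

# Hypothetical, hand-picked parameters that make the neuron act like AND.
weights = np.array([1.0, 1.0])
bias = -1.5

samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [perceptron_output(np.array(x), weights, bias) for x in samples]
print(outputs)  # [0, 0, 0, 1]
```

In practice the weights and bias are not picked by hand like this; they are learned from data.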
Need of a Mapping function
To keep the output of a neuron within a certain boundary, we need a restricting function. This restriction is provided by an activation function, which is simply a mapping between the neuron's input and its output.
Need of a bias component
If all the inputs are zero, the weights have no effect because every product is zero. To mitigate this, a bias term is added, so the neuron can still produce a non-zero value before activation.
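A small sketch makes the point concrete. It assumes a logistic (sigmoid) activation and uses arbitrary example weights; the `neuron` helper is hypothetical and exists only for this illustration.

```python
import math

def neuron(inputs, weights, bias=0.0):
    """Weighted sum plus bias, passed through a logistic activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

zero_inputs = [0.0, 0.0]

# Without a bias, very different weights give the same output for zero inputs.
without_bias = [neuron(zero_inputs, w) for w in ([0.1, 0.2], [5.0, -3.0])]

# With a bias, the neuron can move away from that fixed value.
with_bias = neuron(zero_inputs, [5.0, -3.0], bias=2.0)

print(without_bias)  # [0.5, 0.5]
print(with_bias)
```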
Different Types of Activation Function
Activation functions are components that use a mathematical equation to determine the output of a neuron. They determine whether a neuron should be activated (fired) or not, depending on its relevance to the model's prediction. Some of them also normalise the output to a range such as 0 to 1 or -1 to 1.
Identity function
It maps the input to the same output value, so the output is always equal to the input.
Unit Step function
If the value of the input is above a certain threshold, the output is 1; otherwise it is 0. It is very useful in classification problems.
Sigmoid function
It is also called the S-shaped function. Commonly used sigmoid functions are the logistic function (from logistic regression) and the hyperbolic tangent function.
- Binary Sigmoid: Output values vary between 0 and 1
- Bipolar Sigmoid: Output values vary between -1 and 1. It is closely related to the hyperbolic tangent function (tanh), which has the same range
ReLU function
It stands for rectified linear unit (ReLU). It is one of the most widely used activation functions. It outputs 0 for negative values of x and x itself for all other values.
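The activation functions listed above can be written compactly with NumPy. This is a sketch for illustration only: the function names are my own, the step threshold of 0 is an assumed default, and `np.tanh` is used here as the bipolar sigmoid.

```python
import numpy as np

def identity(x):
    """Output equals input."""
    return x

def unit_step(x, threshold=0.0):
    """1 where x is above the threshold, otherwise 0."""
    return np.where(x > threshold, 1.0, 0.0)

def binary_sigmoid(x):
    """Logistic function: output between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))

def bipolar_sigmoid(x):
    """Hyperbolic tangent: output between -1 and 1."""
    return np.tanh(x)

def relu(x):
    """0 for negative x, x otherwise."""
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
for fn in (identity, unit_step, binary_sigmoid, bipolar_sigmoid, relu):
    print(fn.__name__, fn(x))
```

Running all five on the same small input makes it easy to compare their output ranges side by side.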
Feedforward and Feedback ANN
There are two main types of artificial neural networks:
Feedforward neural networks: A feedforward neural network is a network which is not recursive. Neurons in one layer are only connected to neurons in the next layer, and there is no feedback or cycle. In a feedforward network, signals travel in only one direction, towards the output layer.
Feedback neural networks: Feedback neural networks contain cycles. Signals can travel in both directions because loops are introduced in the network. The feedback cycles can cause the network's behaviour to change over time based on its input. Feedback neural networks are also known as recurrent neural networks (RNN).
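The difference can be sketched with a single layer. In the snippet below, which is an illustration under assumed values rather than a full network, the feedforward step depends only on the current input, while the feedback step also depends on its own previous output. The weight matrices `w_in` and `w_rec`, the tanh activation and the input vector are all hypothetical.

```python
import numpy as np

def feedforward_step(x, w_in):
    """Output depends only on the current input."""
    return np.tanh(w_in @ x)

def feedback_step(x, h_prev, w_in, w_rec):
    """Output depends on the current input and the previous output (a loop)."""
    return np.tanh(w_in @ x + w_rec @ h_prev)

# Hypothetical fixed weights and input, chosen only for the demonstration.
w_in = np.array([[0.5, -0.3], [0.8, 0.2]])
w_rec = np.array([[0.1, 0.4], [-0.2, 0.3]])
x = np.array([1.0, 2.0])

# Feedforward: presenting the same input twice gives the same output.
ff_first = feedforward_step(x, w_in)
ff_second = feedforward_step(x, w_in)

# Feedback: the second presentation sees the state left by the first one.
h0 = np.zeros(2)
h1 = feedback_step(x, h0, w_in, w_rec)
h2 = feedback_step(x, h1, w_in, w_rec)

print(np.allclose(ff_first, ff_second))  # True
print(np.allclose(h1, h2))               # False
```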
Working of an ANN
To understand the working of an ANN, let's represent it as shown below.
The figure represents the flow of input through the network along with the various mathematical functions it is subjected to. The step-by-step working of an ANN is given below:
Step 1: Input values are passed to the neurons. They are then multiplied by weights and a bias is added. The bias is often drawn as an extra input fixed at 1 with its own adjustable weight. Neurons from one layer are connected to the neurons in the next layer through weighted connections. These weights are adjusted over the course of training so that the network becomes more accurate.
Step 2: x1 ... xn represent either the input data or the output from the previous layer of neurons. w1 ... wn represent the weights of the connections to the next layer of neurons. The value of each neuron in the next layer is computed through the summation block shown in the figure.
Step 3: The summed value is then fed to the activation function. As mentioned before, the bias allows the input to the activation function to be non-zero even when all the inputs are 0.
Step 4: The activation function introduces non-linearity. Without an activation function, the ANN would behave like a linear regression, however many layers it has. Linear models only work well on data that follows a straight-line relationship. When data gets more complicated, the activation function gives the neural network the flexibility to handle it. Many activation functions, such as the sigmoid, also compress a neuron's value into a smaller range. All the summed-up inputs are passed through the activation function, and the resulting value is passed on to the next neuron. In this way the network forward propagates the processed input to the next layer.
Step 5: After forward propagation, the predicted values are compared with the actual values. The error between the predicted and actual values is minimized through a process called gradient descent. By working backwards from the output layer through every connection (backpropagation), calculating the gradients and updating each connection's weight accordingly, a more accurate neural network is formed.
Step 6: The above process is repeated over several iterations. The number of iterations is a setting chosen before training, while the learning rate specifies how large each weight update is and therefore how fast the neural network learns. Once those iterations have passed, the neural network has finished learning from the training data and is ready to be tested.
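The six steps can be put together in a short training loop. The sketch below is deliberately minimal and rests on several assumptions that are not part of the description above: it trains a single sigmoid neuron (so working backwards through the network reduces to one layer), it uses the logical OR of two inputs as toy data, it measures the error with cross-entropy, and the learning rate and number of iterations are arbitrary example values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical training data: logical OR of two binary inputs.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0, 1.0])

weights = np.zeros(2)   # tunable parameters, started at zero
bias = 0.0
learning_rate = 0.5     # assumed size of each update
n_iterations = 5000     # assumed number of passes over the data
eps = 1e-12             # avoids log(0)

losses = []
for _ in range(n_iterations):
    # Steps 1-4: weighted sum plus bias, then activation (forward propagation).
    predictions = sigmoid(X @ weights + bias)

    # Step 5: compare predictions with actual values (cross-entropy error).
    loss = -np.mean(y * np.log(predictions + eps)
                    + (1 - y) * np.log(1 - predictions + eps))
    losses.append(loss)

    # Gradients of the error with respect to the weights and the bias.
    error = predictions - y
    grad_weights = X.T @ error / len(y)
    grad_bias = np.mean(error)

    # Gradient descent: move each parameter against its gradient.
    weights -= learning_rate * grad_weights
    bias -= learning_rate * grad_bias

# Step 6: after the chosen number of iterations, test the trained neuron.
predicted_labels = (sigmoid(X @ weights + bias) > 0.5).astype(int)
print(predicted_labels)
```

A network with hidden layers follows the same loop, except that the gradient calculation is repeated layer by layer from the output back to the input.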
Applications of an ANN
ANN is widely used in the following fields:
- Natural language processing (NLP)
- Anomaly detection
- Pattern Recognition
- Image processing
List of Datasets for Practice
https://hofmann.public.iastate.edu/data_in_r_sortable.html
https://vincentarelbundock.github.io/Rdatasets/datasets.html