
A New Adaptive Activation Function Of Neural Network With Application To Deep Learning Research

Posted on: 2021-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Liu
Full Text: PDF
GTID: 2428330623978278
Subject: Operational Research and Cybernetics
Abstract/Summary:
The concept of deep learning originates from research on classical artificial neural networks. It is a machine learning method based on representation learning of data, and it has played a positive role in the development of artificial intelligence. By combining low-level features into more abstract high-level representations of attribute categories or features, deep learning discovers distributed feature representations of data. Deep learning is also a new field within machine learning research; its motivation is to build and simulate neural networks that analyze and learn the way the human brain does, imitating the brain's mechanisms for interpreting data. The convolutional neural network, a class of feedforward neural network with convolutional computation and a deep structure, is one of the representative algorithms of deep learning. In convolutional neural networks, a function is usually required to perform a non-linear mapping of the input features; functions serving this role are collectively called activation functions. This thesis studies and discusses activation functions within the TensorFlow framework.

A survey of previously proposed activation functions shows that the hyperbolic tangent function (Tanh) and the Sigmoid function are commonly used in neural network models. Their values change rapidly near zero and more slowly away from zero. This shape is consistent with the activation and inhibition states of nerve cells under stimulation, and such functions were widely applied in early research. However, when the input is very large or very small, the gradient of the neuron approaches zero, so the weights cannot be updated effectively by the error back-propagation algorithm; this vanishing gradient is one of the reasons early neural networks could not be made deeper. This thesis therefore proposes a new activation function to improve the convergence of deep learning networks.

Building on existing activation functions, this thesis constructs a smooth activation function with parameters for deep learning neural networks and makes those parameters adjust adaptively: for different data sets and network structures, an online correction formula for the parameters is derived from the error back-propagation algorithm, which makes the function broadly applicable. Comparative experiments against commonly used activation functions such as ReLU, Leaky_ReLU, and Sigmoid show that the proposed activation function significantly improves image-processing accuracy on different data sets and alleviates the vanishing-gradient, non-smoothness, and overfitting issues that arise during training. To verify the general applicability of the new function, numerical experiments were carried out on multiple data sets and on other deep convolutional neural networks; compared with other activation functions, the converged accuracy improved significantly.
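To make the saturation argument concrete, the following is a minimal NumPy sketch (an illustration, not code from the thesis) that evaluates the Sigmoid derivative sigma'(x) = sigma(x) * (1 - sigma(x)), which peaks at 0.25 at x = 0 and approaches zero as |x| grows, which is exactly the vanishing-gradient behavior described above:

    import numpy as np

    def sigmoid(x):
        # Logistic function: sigma(x) = 1 / (1 + exp(-x))
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_grad(x):
        # Derivative: sigma'(x) = sigma(x) * (1 - sigma(x)),
        # bounded above by 0.25 and vanishing for large |x|.
        s = sigmoid(x)
        return s * (1.0 - s)

    for x in [0.0, 2.0, 5.0, 10.0]:
        print(f"x = {x:5.1f}  sigma'(x) = {sigmoid_grad(x):.6f}")
    # At x = 0 the gradient is 0.25; by x = 10 it is about 0.000045,
    # so back-propagated error signals through saturated units die out.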
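The abstract does not state the functional form of the proposed adaptive activation function, only that it is smooth, parameterized, and that its parameters are corrected online by the error back-propagation algorithm. As a sketch of that mechanism only, the following shows a trainable parametric activation as a TensorFlow/Keras layer; the Swish-like form x * sigmoid(beta * x) and the scalar parameter beta are assumptions made for demonstration, not the thesis's actual function:

    import tensorflow as tf

    class AdaptiveActivation(tf.keras.layers.Layer):
        # Hypothetical parametric activation f(x) = x * sigmoid(beta * x).
        # The trainable scalar beta is updated by back-propagation together
        # with the network weights, mimicking the "online correction of
        # parameters" described in the abstract (the exact form is unknown).
        def build(self, input_shape):
            self.beta = self.add_weight(
                name="beta", shape=(), initializer="ones", trainable=True)

        def call(self, inputs):
            return inputs * tf.sigmoid(self.beta * inputs)

    # Usage in a small convolutional network:
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, input_shape=(28, 28, 1)),
        AdaptiveActivation(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

Because beta is smooth and trainable, the shape of the non-linearity adapts to each data set and network structure during training; this is the kind of adaptivity the thesis describes, while the reported accuracy improvements of course pertain to the thesis's own function.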
Keywords/Search Tags: deep learning, convolutional neural network, activation function, TensorFlow, vanishing gradient, feedforward neural network, supervised learning, overfitting