Applied Research On Gradient Descent Algorithm In Deep Learning

Posted on: 2020-03-15 | Degree: Master | Type: Thesis
Country: China | Candidate: T X Wang | Full Text: PDF
GTID: 2428330578961310 | Subject: Computer Science and Technology
Abstract/Summary:
With the development of data, artificial intelligence technology has become deeply rooted in people's minds, and deep learning has emerged alongside it. The success of deep learning lies in its ability to solve a series of problems in neural networks. Training deep neural networks and convolutional neural networks usually depends on gradient descent algorithms. The gradient algorithms used in deep learning can also be applied to many other problems, such as linear regression models, matrix decomposition, and blockchain consensus algorithms.

The main problem of blockchain technology is consensus between nodes. Byzantine faulty nodes are often present among the miner nodes and disturb the trading system. Techniques for handling such nodes are called Byzantine fault-tolerant (or Byzantine resilience) techniques. In Byzantine consensus problems, many Byzantine nodes are often present among the normal nodes. In recent years, Byzantine fault tolerance has attracted the attention of researchers and has become a hot topic in industry. Byzantine nodes make the exchange of information between nodes unreliable, which interferes with the normal operation of the algorithm and may even paralyze the server.

Using the stochastic gradient descent algorithm from deep learning is one effective way to deal with the Byzantine consensus problem. Within the last two years, Blanchard first proposed using the stochastic gradient method to deal with the Byzantine fault model and introduced the classical Byzantine model. Based on this model, Blanchard added the Krum aggregation rule to the iterative process, ensuring that under the classical Byzantine model the entire parameter server system is robust. Building on Blanchard's classical Byzantine model, Cong Xie proposed a more general one, the general Byzantine model, which extends the classical model and emphasizes that Byzantine failures can take arbitrary form. Cong Xie also proposed a median-based aggregation rule, the geometric median, analyzed its convergence, and showed in experiments with attacks from Byzantine nodes that the rule is robust.

Unfortunately, the use of deep learning gradient descent algorithms to solve the Byzantine fault problem is still immature, both in China and abroad, and many aspects merit further investigation. This thesis carries out research on that basis, mainly including the following work:

(1) The thesis summarizes the concepts of gradient learning methods for deep learning, introduces the different application environments of BGD, MBGD and SGD, introduces optimization algorithms that improve on SGD, and discusses gradient learning algorithms in deep learning together with traditional machine learning.
(2) The thesis applies the stochastic gradient descent method from deep learning to the Byzantine fault problem, starting from the Byzantine fault model and the iterative process and introducing the concept of Byzantine resilience.
(3) Building on the median-based Byzantine aggregation rules, a new Byzantine aggregation rule based on the trimmed mean is proposed.
(4) Finally, the convergence of the trimmed-mean aggregation rule is analyzed in both the strongly convex and the non-strongly convex settings, and node attack experiments show that the rule is robust.
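The abstract does not define its trimmed-mean rule, so the following is only a minimal sketch of one common form, a coordinate-wise trimmed mean, and may differ from the thesis's actual definition. The function names `trimmed_mean` and `plain_mean`, the trimming parameter `b`, and the toy gradient values are all assumptions made for illustration. The idea shown: for each coordinate, the parameter server sorts the values reported by the workers, discards the `b` smallest and `b` largest, and averages the rest, so up to `b` arbitrary (Byzantine) values per coordinate cannot drag the aggregate far.

```python
from typing import List, Sequence


def trimmed_mean(gradients: Sequence[Sequence[float]], b: int) -> List[float]:
    """Coordinate-wise trimmed mean (illustrative assumption, not the thesis's exact rule).

    For every coordinate, drop the b smallest and b largest reported values
    and average what remains.
    """
    n = len(gradients)
    if n == 0:
        raise ValueError("need at least one gradient")
    if b < 0 or 2 * b >= n:
        raise ValueError("need 0 <= b < n / 2")
    dim = len(gradients[0])
    aggregated = []
    for j in range(dim):
        column = sorted(g[j] for g in gradients)  # values of coordinate j
        kept = column[b:n - b]                    # trim b values from each end
        aggregated.append(sum(kept) / len(kept))
    return aggregated


def plain_mean(gradients: Sequence[Sequence[float]]) -> List[float]:
    """Ordinary averaging, shown only for comparison."""
    n = len(gradients)
    return [sum(g[j] for g in gradients) / n for j in range(len(gradients[0]))]


# Hypothetical toy data: four honest workers and one Byzantine worker.
honest = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.0, 2.0]]
byzantine = [[1000.0, -1000.0]]
all_grads = honest + byzantine

robust = trimmed_mean(all_grads, b=1)  # stays close to the honest gradients
naive = plain_mean(all_grads)          # dominated by the Byzantine vector
print("trimmed mean:", robust)
print("plain mean:  ", naive)
```

In this toy run the plain mean is pulled to roughly (200.8, -198.4) by the single faulty worker, while the trimmed mean stays near the honest values. In an SGD loop, the aggregated vector would take the place of the averaged gradient in the usual update step.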
Keywords/Search Tags:deep learning, stochastic gradient descent, Byzantine aggregation rules, trimmed mean