In recent years, with the development of big data, processor computing power, and algorithm models, deep learning algorithms have achieved better results than traditional machine learning algorithms in many applications. To improve the generality of models and obtain better training results in practical applications, neural network structures have become increasingly complex: the number of network layers and parameters keeps growing, and the models and data sets used for deep learning training keep getting larger. The growth in data set size and network depth makes the training process consume large amounts of storage and computing resources and lengthens training time, creating a higher demand for computing power. Parallel computer systems such as multi-core and many-core processors and clusters can provide the computing power needed to train deep learning models and can effectively address their slow training. The design of deep learning algorithms for high-performance parallel computing platforms has therefore become a research hotspot.

Targeting the image recognition problem in machine learning, this paper studies parallel optimization methods for image recognition on cluster systems, building on existing image recognition methods. The parameter update mechanism of the distributed stochastic gradient descent algorithm is improved by introducing a parameter server mechanism. On the one hand, the gradients computed by the worker nodes are sparsified to reduce communication between the worker nodes and the parameter server nodes. On the other hand, instead of sending model parameters from the parameter server nodes to the worker nodes, only the sparsified gradient accumulation values greater than a threshold are sent, further reducing that communication. In addition, two methods for handling momentum loss are introduced in turn to improve the accuracy of the image recognition model. The main work of this paper is as follows:

(1) An ASGD-based gradient sparsification accelerated training algorithm (GS-ASGD) is proposed. A gradient compression ratio is first set, and a threshold is then determined from it; only those gradients computed by the worker nodes that exceed the threshold are uploaded to the parameter server nodes, reducing communication between the worker nodes and the parameter server nodes. A momentum correction strategy is also introduced to improve the training accuracy of the model. Experimental results on the CIFAR-10 data set show that, compared with ASGD, the proposed algorithm greatly improves the training speed of the model while also improving its accuracy.
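As an illustration, the following is a minimal NumPy sketch of the worker-side mechanism described in (1). All identifiers here (`sparsify_gradient`, `SparsifiedWorker`, the `compression_ratio` default) are illustrative assumptions rather than the paper's actual implementation, and the momentum correction is sketched as the common local-velocity-accumulation scheme; the paper's exact formulation may differ.

```python
import numpy as np

def sparsify_gradient(grad, compression_ratio):
    """Keep only the largest-magnitude entries of a gradient vector.

    The threshold is derived from the compression ratio: with a ratio of
    0.01, roughly the top 1% of entries by absolute value survive and are
    uploaded; the rest stay in the local accumulation buffer.
    """
    flat = grad.ravel()
    k = max(1, int(flat.size * compression_ratio))
    threshold = np.partition(np.abs(flat), -k)[-k]  # k-th largest magnitude
    mask = np.abs(flat) >= threshold
    indices = np.nonzero(mask)[0]
    return indices, flat[indices], mask

class SparsifiedWorker:
    """Worker that applies momentum locally before sparsifying, so entries
    withheld from an upload keep gathering momentum instead of losing it
    (one way to realize a momentum correction)."""

    def __init__(self, num_params, compression_ratio=0.01, momentum=0.9):
        self.velocity = np.zeros(num_params)  # local momentum buffer
        self.residual = np.zeros(num_params)  # not-yet-sent accumulation
        self.ratio = compression_ratio
        self.momentum = momentum

    def step(self, grad):
        self.velocity = self.momentum * self.velocity + grad
        self.residual += self.velocity
        indices, values, mask = sparsify_gradient(self.residual, self.ratio)
        self.residual[mask] = 0.0  # sent entries restart from zero
        self.velocity[mask] = 0.0
        return indices, values     # sparse message uploaded to the server

# Example: only ~1% of the 10,000 gradient entries are uploaded per step.
worker = SparsifiedWorker(num_params=10_000)
idx, vals = worker.step(np.random.randn(10_000))
```

Clearing the velocity of the entries that were just uploaded prevents the same momentum from being applied again on the next step, which is one way the correction avoids the momentum loss that plain sparsification would cause.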
(2) A GS-ASGD-based model parameter sparsification strategy is proposed. Building on GS-ASGD, the communication in which the parameter server nodes send model parameters to the worker nodes is optimized: instead of the full model parameters, the parameter server nodes send only the sparsified gradient accumulation values greater than the threshold. Each worker node then subtracts the product of these gradient accumulation values and the learning rate from its local model parameters to obtain the updated model parameters. Compared with GS-ASGD, this strategy further reduces communication between the worker nodes and the parameter server nodes. At the same time, a sparsification-aware momentum is introduced to avoid momentum loss in sparse scenarios. Experimental results on the CIFAR-10 data set show that, compared with ASGD and GS-ASGD, the proposed strategy performs better: it not only improves the model training speed but also improves the model accuracy to a certain extent.
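Under the same illustrative assumptions as the previous sketch, the following shows the downstream direction of (2): the parameter server keeps a gradient accumulation buffer and pushes out only the entries above the threshold, and each worker subtracts the learning rate times those values from its local parameters. The class and function names (`ParameterServer`, `apply_sparse_update`) are hypothetical, and the sparsification-aware momentum is omitted for brevity.

```python
import numpy as np

class ParameterServer:
    """Keeps a gradient accumulation buffer instead of broadcasting full
    model parameters; only entries above the threshold are pushed out."""

    def __init__(self, num_params, compression_ratio=0.01):
        self.accumulated = np.zeros(num_params)
        self.ratio = compression_ratio

    def receive(self, indices, values):
        # Aggregate a worker's sparse gradient upload.
        self.accumulated[indices] += values

    def broadcast(self):
        k = max(1, int(self.accumulated.size * self.ratio))
        threshold = np.partition(np.abs(self.accumulated), -k)[-k]
        # Zero entries carry no update, so exclude them even when the
        # derived threshold degenerates to zero.
        mask = (np.abs(self.accumulated) >= threshold) & (self.accumulated != 0)
        indices = np.nonzero(mask)[0]
        values = self.accumulated[indices].copy()
        self.accumulated[indices] = 0.0  # sent entries restart accumulating
        return indices, values

def apply_sparse_update(local_params, indices, values, lr):
    # Worker side: updated params = local params - lr * accumulated grads.
    local_params[indices] -= lr * values
    return local_params

# Round trip: a sparse upload is aggregated, then only the surviving
# accumulated entries travel back and are applied to the local copy.
server = ParameterServer(num_params=10_000)
params = np.random.randn(10_000)
server.receive(np.array([3, 42]), np.array([0.5, -1.2]))
idx, vals = server.broadcast()
params = apply_sparse_update(params, idx, vals, lr=0.1)
```

With a compression ratio of 0.01, the server-to-worker message shrinks from a full parameter vector of n values to roughly 0.01·n (index, value) pairs per round.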