
Efficient Neural Architecture Search: Algorithms and Applications

Posted on: 2022-04-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R Q Luo
Full Text: PDF
GTID: 1488306323963729
Subject: Computer application technology
Abstract/Summary:
In recent years, the rapid development of artificial intelligence has largely benefited from advances in deep learning. Deep learning, based on deep artificial neural networks and the back-propagation algorithm, automatically extracts features from data and learns patterns. Thanks to progress in hardware and computer systems technology, the speed of modern computers keeps increasing, while the theory of deep learning continues to mature. Together, these developments make training deep and complex neural networks practical, and motivate researchers to design ever more complex networks with better performance. However, designing neural networks is laborious work. First, it requires designers to have a good knowledge of deep learning in order to design appropriate neural networks that can be successfully trained and deployed. Second, it requires designers to have domain knowledge of the target task. This creates challenges and difficulties when applying deep learning to specific tasks.

Neural architecture search aims to automatically design well-performing neural network architectures for a specific target task, to discover architectures that outperform those designed by human experts, and to greatly reduce human effort. However, previous neural architecture search methods share one key problem: high cost due to their inefficiency. This thesis studies neural architecture search in depth, proposes a series of works covering the search algorithm, the training method, the learning method, and more, to improve the efficiency and performance of neural architecture search algorithms, and applies the technology to real applications:

1. Considering the low search efficiency of conventional search algorithms in discrete search spaces, this thesis proposes to search for neural architectures in a continuous space based on gradient information. The proposed method encodes discrete neural architectures into a continuous space via an encoder, builds a mapping between architectures and their corresponding accuracy, and obtains better architectures via gradient ascent. This greatly increases both efficiency and accuracy, and the method has become one of the most popular in the area. Experiments on image classification and language modeling show that the proposed method achieves the best performance, surpassing models designed by human experts and those discovered by other neural architecture search methods.

2. Considering the weak stability and relatively low accuracy of one-shot neural architecture search, this thesis proposes a balanced one-shot neural architecture optimization method. A detailed analysis identifies the insufficient-optimization and imbalanced-training problems in one-shot training, which are further verified through experiments. Consequently, this thesis proposes balanced training, which samples architectures during one-shot training in proportion to their model sizes. This improves the stability and accuracy of one-shot neural architecture search.

3. Considering the huge cost of training and evaluating the large number of neural networks required by the algorithm, this thesis proposes a neural architecture search algorithm based on semi-supervised learning. It leverages numerous untrained neural network architectures to aid training by building an accuracy predictor composed of an encoder, a decoder, and a predictor, learned through supervised, self-supervised, and unsupervised learning. This reduces the cost and improves both efficiency and accuracy.

4. Considering the importance of accuracy predictors in neural architecture search and the issues with previous neural-model-based accuracy predictors, this thesis proposes a non-neural-model-based accuracy predictor. It analyzes the characteristics of the specific task of predicting neural network architecture accuracy, points out that a tabular data representation is better suited to non-neural models, and consequently proposes to use a gradient boosting decision tree as the accuracy predictor. Experiments show that this improves prediction performance and, in turn, search performance. Meanwhile, exploiting the better interpretability of the accuracy predictor, this thesis uses it to prune the huge search space according to the influence of different architecture features on the model output, making the space smaller and better, which further improves search efficiency and accuracy.

5. This thesis applies neural architecture search technology to various areas and tasks (natural language processing and text-to-speech) and achieves remarkable results. It successfully uses the technology to compress large-scale pre-trained language models by proposing or utilizing block-wise training, progressive pruning, and performance approximation. Thanks to its flexibility in model architecture, it can meet diverse compression requirements and achieves better performance than other methods. This thesis also creatively applies the technology to compress text-to-speech models: it profiles the bottlenecks of current models, accordingly designs a search space containing lightweight operations, and uses an appropriate search algorithm to find lightweight models that better satisfy many resource-constrained conditions. The searched lightweight models achieve excellent compression ratios and speedups.
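The core idea of the first contribution, optimizing an architecture embedding by gradient ascent against a learned accuracy predictor, can be illustrated with a minimal sketch. This is not the thesis's implementation: the "predictor" below is a toy sigmoid surrogate standing in for the trained neural predictor, and all names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy accuracy surrogate f(z) = sigmoid(w.z + b), standing in for a trained
# neural predictor that maps a continuous architecture embedding to accuracy.
w = rng.normal(size=8)
b = 0.1

def predicted_accuracy(z):
    return 1.0 / (1.0 + np.exp(-(w @ z + b)))

def grad_predicted_accuracy(z):
    # Analytic gradient of sigmoid(w.z + b) with respect to z.
    p = predicted_accuracy(z)
    return p * (1.0 - p) * w

# Start from the embedding of an existing architecture (here, random) and
# move uphill on predicted accuracy; a decoder would then map the improved
# embedding back to a discrete architecture.
z = rng.normal(size=8)
start = predicted_accuracy(z)
for _ in range(100):
    z = z + 0.5 * grad_predicted_accuracy(z)  # gradient ascent step
```

In the actual method, the encoder, predictor, and decoder are trained jointly, so the gradient step moves through a learned embedding space rather than this hand-written surrogate.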
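The balanced-training idea of the second contribution, sampling candidate architectures in proportion to their model sizes during one-shot (supernet) training, reduces to weighted sampling. A minimal sketch, with purely illustrative architecture names and parameter counts:

```python
import random

# Illustrative candidates: architecture name -> parameter count.
# Under size-proportional sampling, larger sub-networks are drawn more
# often, countering the under-training of large architectures.
candidates = {"small": 1.0e6, "medium": 5.0e6, "large": 20.0e6}

def sample_balanced(rng):
    archs = list(candidates)
    weights = [candidates[a] for a in archs]
    return rng.choices(archs, weights=weights, k=1)[0]

rng = random.Random(0)
counts = {a: 0 for a in candidates}
for _ in range(10_000):
    counts[sample_balanced(rng)] += 1
# Expect counts ordered by size: large > medium > small.
```

Each sampled architecture would then receive one training step of the shared supernet weights, so total gradient updates per sub-network track its size.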
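The fourth contribution's premise, that a tabular representation suits non-neural predictors such as gradient boosting decision trees, can be sketched by flattening a discrete architecture into a one-hot feature vector. The operator set and helper below are hypothetical, not taken from the thesis:

```python
# Illustrative candidate operations for each layer of an architecture.
OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]

def encode(arch):
    """Encode a list of per-layer op names as a flat one-hot feature vector.

    Each layer contributes len(OPS) binary features, yielding the tabular
    row format a GBDT predictor consumes; per-feature importance scores
    from the trained trees can then guide search-space pruning.
    """
    features = []
    for op in arch:
        features.extend(1 if op == cand else 0 for cand in OPS)
    return features

vec = encode(["conv3x3", "maxpool"])
```

A GBDT regressor trained on such rows (architecture features to measured accuracy) replaces the neural predictor, and its feature importances indicate which architectural choices matter most.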
Keywords/Search Tags:Machine Learning, Deep Learning, Auto Machine Learning, Neural Architecture Search, Image Classification, Language Modeling, Text to Speech, Model Compression