
Efficient Deep Neural Architecture Search And Design With Knowledge Transfer

Posted on: 2024-09-06    Degree: Doctor    Type: Dissertation
Country: China    Candidate: J M Fang    Full Text: PDF
GTID: 1528307319964399    Subject: Information and Communication Engineering
Abstract/Summary:
The emergence of deep neural networks has greatly driven the development of artificial intelligence across many fields. In computer vision in particular, they have achieved state-of-the-art performance on tasks such as image classification, object detection, and semantic segmentation, and the related techniques have been applied in scenarios such as autonomous driving, smart cities, and industrial inspection. Deep neural networks have become the most widely used fundamental model for visual perception, and their architecture directly determines both perceptual accuracy and computational efficiency, which makes architecture design critically important. Current neural architecture optimization methods fall into two streams: automatic search and manual design. Search methods can obtain the best architecture within the current design space, while designing new architectures can introduce new computing modules and break through the limitations of existing design paradigms. This dissertation focuses on the search and design of deep neural network architectures. Based on the idea of knowledge transfer, four methods are proposed from four perspectives: cross-dataset, cross-task, and cross-scale architecture knowledge transfer, and cross-region feature knowledge transfer. They respectively address the high search cost on large datasets, the difficulty of searching for downstream tasks, the limited flexibility of search spaces, and the saturation of convolutional neural network design. The main contributions are as follows.

1. Neural architecture search must evaluate a large number of candidate models, which typically incurs a significant computational cost that becomes unaffordable on large-scale datasets. This dissertation proposes an elastic architecture search method built on cross-dataset architecture knowledge transfer. Starting from an architecture searched on a small dataset, a new population of model architectures is initialized; evolving new architectures then requires only a small adjustment cost, allowing fast transfer to larger datasets and effectively reducing the complexity of neural architecture search on large-scale data. On the ImageNet image classification dataset, at similar accuracy, the search cost is reduced by 56 times compared with the earlier work NASNet and by 89 times compared with AmoebaNet.

2. Because different visual tasks have different characteristics, the network architecture and parameters need to be adjusted to match each new task, yet downstream tasks often require large-scale pre-training, which makes architecture search for new tasks extremely expensive. This dissertation proposes a fast neural network adaptation method that achieves cross-task transfer of both architecture and parameter knowledge. The method includes a parameter remapping technique that maps parameters between heterogeneous networks: a pre-trained network is selected as a seed network, and its architecture and parameters are automatically adjusted so that it adapts efficiently and accurately to various downstream visual perception tasks, as sketched below. Compared with DPC, a representative NAS work for semantic segmentation, the overall computational cost is reduced by 1737 times while the mIoU is improved by 1.3%.
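The parameter remapping idea can be illustrated with a minimal PyTorch sketch. The function name, the zero-initialization of extra channels, and the center-alignment of kernels are illustrative assumptions, not the dissertation's exact remapping rules:

    import torch

    def remap_conv_weight(seed_w: torch.Tensor, out_ch: int, in_ch: int, k: int) -> torch.Tensor:
        """Map a seed conv weight of shape [O, I, K, K] onto a target shape [out_ch, in_ch, k, k].

        Width level: copy the overlapping channel block, zero-init any extra channels.
        Kernel level: center-crop a larger seed kernel or zero-pad a smaller one.
        """
        O, I, K, _ = seed_w.shape
        # kernel-level remapping: align kernel centers
        if k <= K:
            off = (K - k) // 2
            src = seed_w[:, :, off:off + k, off:off + k]
        else:
            off = (k - K) // 2
            src = torch.zeros(O, I, k, k)
            src[:, :, off:off + K, off:off + K] = seed_w
        # width-level remapping: copy as many input/output channels as both shapes share
        target = torch.zeros(out_ch, in_ch, k, k)
        o, i = min(out_ch, O), min(in_ch, I)
        target[:o, :i] = src[:o, :i]
        return target

    # example: remap a 64x32 3x3 seed kernel onto a wider 96x32 5x5 target layer
    seed = torch.randn(64, 32, 3, 3)
    print(remap_conv_weight(seed, out_ch=96, in_ch=32, k=5).shape)  # torch.Size([96, 32, 5, 5])

Depth-level remapping (initializing newly added layers from existing ones) can be handled in the same copy-based spirit.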
3. The search space is crucial for the performance of the resulting architecture, but directly enlarging its range and dimensionality introduces a huge search cost. This dissertation proposes a densely connected search space built on cross-scale architecture knowledge transfer. With dense connections representing different width and depth settings, the originally parallel, high-complexity width search is transformed into a serial path-search problem. This is the first method to jointly search width, depth, and downsampling positions under the differentiable paradigm, significantly improving search flexibility at low cost. A chained cost estimation algorithm is further proposed to accurately optimize the hardware cost of the model. Compared with the previous state-of-the-art NAS work ProxylessNAS, at similar accuracy, the model computation cost is reduced by 22.4% and the GPU inference latency by 19.0%.

4. Convolutional neural networks are saturating in both performance and design space. Transformers, originally developed for text tasks, excel at modeling global feature relationships, but visual tasks involve high-resolution data and objects at various scales, so applying Transformers directly incurs a huge cost. This dissertation proposes an efficient Transformer architecture for visual tasks with a novel local information interaction mechanism: messenger tokens are introduced to transfer feature knowledge across local regions, enabling efficient and flexible gathering and distribution of information (a minimal sketch of this mechanism follows the abstract). Compared with the previous state-of-the-art Swin Transformer, the computation cost is reduced by 15.6% while the ImageNet image classification accuracy is improved by 1.2%. This novel Transformer architecture also offers ideas for the design of the next generation of neural architectures, bringing more possibilities to the current design space.

Starting from the perspective of knowledge transfer, this dissertation conducts a series of studies on efficient neural architecture search and design. Knowledge transfer of architectures and features is achieved from multiple aspects, and the efficiency and flexibility of neural architecture search and design are effectively improved, laying a foundation for the application and deployment of neural network architectures.
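As a rough illustration of the messenger-token mechanism described in contribution 4, the following PyTorch sketch appends one messenger token to each attention window and then exchanges messengers across windows. The module name, the simple roll-based exchange, and the hyper-parameters are illustrative assumptions rather than the dissertation's exact design:

    import torch
    import torch.nn as nn

    class MessengerWindowAttention(nn.Module):
        """Window attention where each window carries one messenger token.

        Within a window, regular tokens and the messenger attend to each other;
        afterwards the messengers are exchanged across windows (a simple roll here)
        so that local regions share summarized information at low cost.
        """
        def __init__(self, dim: int, heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, x: torch.Tensor, msg: torch.Tensor):
            # x:   [batch, windows, tokens_per_window, dim]   local patch tokens
            # msg: [batch, windows, 1, dim]                   one messenger per window
            B, W, N, D = x.shape
            z = torch.cat([msg, x], dim=2).reshape(B * W, N + 1, D)
            z, _ = self.attn(z, z, z)                 # attention restricted to each window
            z = z.reshape(B, W, N + 1, D)
            msg, x = z[:, :, :1], z[:, :, 1:]
            msg = torch.roll(msg, shifts=1, dims=1)   # cross-window messenger exchange
            return x, msg

    # usage: 4 windows of 49 tokens each, 96-dim features
    attn = MessengerWindowAttention(dim=96)
    x = torch.randn(2, 4, 49, 96)
    msg = torch.zeros(2, 4, 1, 96)
    x, msg = attn(x, msg)

Stacking such blocks lets each window's messenger accumulate and redistribute information from other regions without computing global attention over all tokens.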
Keywords/Search Tags:Visual Recognition, Deep Neural Networks, Neural Architecture Search, Neural Architecture Design, Knowledge Transfer, Visual Transformer Networks