In recent years, the Transformer has been widely used in a variety of natural language processing tasks, because its core component, the self-attention mechanism, can capture dependencies between arbitrary word pairs well. However, this mechanism's quadratic complexity with respect to the sequence length limits its application to long-sequence tasks. To obtain a more efficient attention mechanism, this paper proposes two efficient models based on a neural clustering algorithm. We first design a clustering algorithm based on a neural network, and then propose two kinds of sparse attention built on this algorithm, namely the Block Attention Model and the Approximate Attention Model. The Block Attention Model reduces the complexity of the Transformer from O(N²d) to O(N·N^(1/2)·d). The Approximate Attention Model reduces the Transformer to linear complexity O(Nkd), where N is the sequence length, d is the dimensionality of the word embeddings in the attention, and k is the number of clusters. Both models reduce the complexity of the standard Transformer and greatly improve its time and memory efficiency. We validated the Block Attention Model on machine translation, text classification, natural language inference, text matching, and pre-training tasks. In terms of effectiveness, compared with the baseline models (Transformer, Reformer, and Routing Transformer), it achieves comparable or even better results on each task, and it has clear efficiency advantages (in time and memory) on long-sequence tasks. In addition, we also validated the Approximate Attention Model on machine translation, text classification, natural language inference, and text matching tasks. Experiments confirm that the Approximate Attention Model has great advantages over the baseline models in both effectiveness and efficiency. In particular, on the IMDB text classification dataset, the Approximate Attention Model saves at least 33.7% of memory and reduces training time by at least 32.4%.
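The complexity reduction claimed for the Block Attention Model can be illustrated with a small sketch. Assuming, purely for illustration (this is not the paper's neural clustering network), that tokens have already been assigned to roughly N^(1/2) equal-size clusters, restricting each token to attend only within its own cluster yields about N^(1/2) blocks, each costing O(N·d), for O(N·N^(1/2)·d) overall. All names below (block_attention, labels, etc.) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def block_attention(Q, K, V, labels):
    """Attention restricted to clusters: each token attends only to tokens
    with the same cluster label. With ~sqrt(N) clusters of size ~sqrt(N),
    each block costs O(N*d), giving O(N*sqrt(N)*d) in total instead of O(N^2*d)."""
    N, d = Q.shape
    out = np.zeros_like(V)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        scores = Q[idx] @ K[idx].T / np.sqrt(d)      # (m, m) block of scores
        out[idx] = softmax(scores, axis=-1) @ V[idx]  # block-local weighted sum
    return out

# Toy usage: N tokens, k ~ sqrt(N) clusters; random labels stand in for
# the learned neural clustering described in the paper.
N, d = 64, 16
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))
k = int(np.sqrt(N))
labels = rng.integers(0, k, size=N)
print(block_attention(Q, K, V, labels).shape)  # (64, 16)
```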