
Research and Analysis of Text Classification Theory Based on Deep Learning

Posted on: 2022-11-19
Degree: Master
Type: Thesis
Country: China
Candidate: X He
GTID: 2518306764476594
Subject: Automation Technology

Abstract/Summary:
As a basic and important task in natural language processing (NLP), text classification has a wide range of application scenarios in fields such as user profiling, question-answering systems, and machine translation. This thesis is based on deep learning methods: through an in-depth analysis of the attention mechanism and of the research status at home and abroad, two text classification models are proposed.

The first model combines fixed-weight synthetic attention with randomly initialized synthetic attention to obtain rich text representations, and applies an adaptive fusion strategy to extract the most valuable features, yielding a more expressive text representation. In a traditional attention module, the connections between words (the attention connection matrix) are computed with dot products between word representations, which is time-consuming and memory-intensive. The synthetic attention used in this thesis replaces the dot product with simple feed-forward layers, and on this basis builds the attention connection matrix in two ways: with fixed weights and with random initialization. Fixed-weight synthetic attention attends to consistent global semantic information, while randomly initialized synthetic attention directly learns a task-specific alignment shared across instances, at lower computational cost. The adaptive fusion strategy then extracts the key features from the two text representations according to actual needs and produces a semantically richer representation, which greatly helps the subsequent classification module. Comparison experiments against benchmark models show that the model achieves good results.

The second model mainly uses a collaborative attention mechanism to obtain text representations, together with a highway network that helps the model train. Traditional multi-head attention maps the input into different high-dimensional subspaces, computes attention separately in each, and then concatenates and merges the information from all heads; the text information obtained this way carries a certain amount of redundancy. With collaborative attention, all heads share part of the weight matrices while each head still captures its own unique information. A mixing matrix keeps the heads relatively independent when their outputs are integrated, and the information that different heads can extract can be compressed or expanded by adjusting the dimension of the mixing matrix. Because the network is deep, a highway network is used to ease gradient flow during training. Comparison experiments against benchmark models show that this model also achieves good results.
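To make the first model's mechanism concrete, here is a minimal PyTorch sketch of synthetic attention with an adaptive fusion gate. It assumes a Synthesizer-style formulation, in which the attention matrix is produced without query-key dot products: one branch is frozen (fixed-weight) and one is trainable (randomly initialized). The class and parameter names (`SyntheticAttention`, `seq_len`, the sigmoid gate) are illustrative assumptions, not the thesis's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SyntheticAttention(nn.Module):
    """Attention matrix built without query-key dot products:
    a frozen fixed-weight matrix shared by all inputs, plus a
    trainable randomly initialized matrix, adaptively fused."""

    def __init__(self, seq_len, d_model):
        super().__init__()
        # fixed-weight branch: not updated during training
        self.fixed_attn = nn.Parameter(torch.randn(seq_len, seq_len),
                                       requires_grad=False)
        # randomly initialized branch: learned end-to-end
        self.random_attn = nn.Parameter(torch.randn(seq_len, seq_len))
        self.value = nn.Linear(d_model, d_model)
        # adaptive fusion gate over the two text representations
        self.gate = nn.Linear(2 * d_model, 1)

    def forward(self, x):                        # x: (batch, seq_len, d_model)
        v = self.value(x)
        h_fixed = F.softmax(self.fixed_attn, dim=-1) @ v
        h_rand = F.softmax(self.random_attn, dim=-1) @ v
        g = torch.sigmoid(self.gate(torch.cat([h_fixed, h_rand], dim=-1)))
        return g * h_fixed + (1 - g) * h_rand    # fused text representation
```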
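The second model's collaborative attention can be sketched similarly, under a simplified reading in which all heads share a single query/key projection and a learned mixing matrix re-weights the shared dimensions per head; shrinking or growing `d_shared` compresses or expands the information the heads can extract. All names and dimension choices here are assumptions, not the thesis's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CollaborativeAttention(nn.Module):
    """Collaborative multi-head attention: heads share one query/key
    projection; a learned mixing matrix gives each head its own
    re-weighting of the shared dimensions."""

    def __init__(self, d_model, n_heads, d_shared):
        super().__init__()
        assert d_model % n_heads == 0
        self.wq = nn.Linear(d_model, d_shared)    # shared across heads
        self.wk = nn.Linear(d_model, d_shared)    # shared across heads
        self.wv = nn.Linear(d_model, d_model)
        self.mix = nn.Parameter(torch.randn(n_heads, d_shared))  # mixing matrix
        self.out = nn.Linear(d_model, d_model)
        self.n_heads = n_heads
        self.d_head = d_model // n_heads

    def forward(self, x):                         # x: (batch, seq_len, d_model)
        B, L, _ = x.shape
        q, k = self.wq(x), self.wk(x)             # (B, L, d_shared)
        v = self.wv(x).view(B, L, self.n_heads, self.d_head).transpose(1, 2)
        # per-head scores from the shared projections: (q * mix_h) k^T
        qh = q.unsqueeze(1) * self.mix.view(1, self.n_heads, 1, -1)
        scores = qh @ k.unsqueeze(1).transpose(-2, -1) / (k.shape[-1] ** 0.5)
        h = F.softmax(scores, dim=-1) @ v         # (B, heads, L, d_head)
        return self.out(h.transpose(1, 2).reshape(B, L, -1))
```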
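Finally, a sketch of the highway layer used to ease training of the deep stack. This is the standard highway formulation T(x)·H(x) + (1 − T(x))·x, not code taken from the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Highway(nn.Module):
    """Highway layer: a learned gate mixes the transformed signal with
    the untouched input, so gradients can flow through deep stacks."""

    def __init__(self, d_model):
        super().__init__()
        self.transform = nn.Linear(d_model, d_model)  # H(x)
        self.gate = nn.Linear(d_model, d_model)       # T(x)

    def forward(self, x):
        t = torch.sigmoid(self.gate(x))               # transform gate
        return t * F.relu(self.transform(x)) + (1 - t) * x
```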
Keywords/Search Tags:Natural Language Processing, Text Classification, Deep Learning, Attention Mechanisms