Research On Long Document Classification Method Based On Attention Mechanism

Posted on:2020-09-28

Degree:Master

Type:Thesis

Country:China

Candidate:L Liu

GTID:2428330623457551

Subject:Electronic and Information Engineering

Abstract/Summary:

With the rapid development of information technology, and especially the spread of the Internet, the volume of information is exploding, and there is an urgent need for technology that can organize and manage information efficiently. Text categorization can identify key information and text features accurately and efficiently, which lets people quickly obtain the information that is valuable to them. In the field of machine learning there are already many text categorization methods, and they outperform traditional methods in many respects, such as classification accuracy, flexibility, and generalization ability. Machine-learning-based text classification has become a classic example of research and application in related fields. The paper first introduces the general process and related techniques of text categorization, analyzes the state of research and development in text categorization in China and abroad, and sets out the main research content of the paper on the basis of machine learning theory. Traditional deep-learning-based document classification methods need the full text to extract features. To handle long documents, this paper proposes three methods that classify documents by aggregating local convolutional features. The first method randomly draws blocks of contiguous words from the full document. Each block is fed into a convolutional neural network to extract features, and the block features are concatenated and passed to a classifier that outputs the class probabilities. The second model improves on the first by using a recurrent neural network to capture the contextual order of the sampled blocks. The third model is inspired by the recurrent attention model (RAM): a reinforcement learning module acts as a controller that selects the position of the next block based on the recurrent state. Experiments on a four-class arXiv paper dataset that we collected show that all three proposed models perform well, and that the third model achieves the best test accuracy while using the least information.
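
As a rough illustration of the first method described in the abstract, the following Python sketch draws random blocks of contiguous words and concatenates per-block features into one document vector. The function names `sample_blocks` and `aggregate_features`, the parameters `num_blocks` and `block_len`, and the `extract` callable (a stand-in for the convolutional feature extractor) are assumptions made for illustration and do not come from the thesis. Whether the thesis allows sampled blocks to overlap is not stated; this sketch draws start positions independently, so overlaps are possible.

```python
import random
from typing import Callable, List, Optional, Sequence


def sample_blocks(
    tokens: Sequence[str],
    num_blocks: int,
    block_len: int,
    rng: Optional[random.Random] = None,
) -> List[List[str]]:
    """Draw `num_blocks` blocks of `block_len` contiguous words at random positions."""
    if block_len <= 0 or num_blocks <= 0:
        raise ValueError("num_blocks and block_len must be positive")
    if len(tokens) < block_len:
        raise ValueError("document is shorter than one block")
    rng = rng or random.Random()
    max_start = len(tokens) - block_len
    # Start positions are drawn independently, so blocks may overlap (assumption).
    starts = [rng.randint(0, max_start) for _ in range(num_blocks)]
    return [list(tokens[s:s + block_len]) for s in starts]


def aggregate_features(
    blocks: Sequence[Sequence[str]],
    extract: Callable[[Sequence[str]], List[float]],
) -> List[float]:
    """Concatenate the per-block feature vectors into a single document vector."""
    features: List[float] = []
    for block in blocks:
        features.extend(extract(block))
    return features


if __name__ == "__main__":
    text = "long documents are sampled in small blocks of contiguous words " * 20
    words = text.split()
    blocks = sample_blocks(words, num_blocks=3, block_len=6, rng=random.Random(0))
    # Toy stand-in for the CNN: one number per word (its length).
    doc_vector = aggregate_features(blocks, lambda b: [float(len(w)) for w in b])
    print(len(doc_vector))
```

In a real model the `extract` callable would be replaced by a trained convolutional network, and the concatenated vector would be passed to a classifier; both are outside the scope of this sketch.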

Keywords/Search Tags:

deep learning, long text classification, random sampling, convolution feature aggregation, recurrent attention model

Related items

1	Research And Implementation Of Chinese Long Text Classification Algorithm Based On Deep Learning
2	Application Of Improved Deep Learning Algorithm In Chinese Text Classification
3	Research On Text Classification Model Based On Deep Learning And Attention Mechanism
4	Research On Long Text Classification Algorithm Via Multi-model Fusion With Attention Mechanism
5	Research On Text Classification Of Chinese News Based On Deep Learning
6	Research Of Text Classification Methods Combining Self-attention Mechanism And Convolution Optimization
7	Research On Deep Learning Text Classification Method Based On BERT Model
8	The Research Of Text Sentiment Classification Based On Deep Learning
9	Short Text Classification Algorithm Based On Temporal Convolution And Attention Mechanism
10	Research On Text Emotion Analysis Based On BiTCN And Pre-training