
Algorithm Research On Text Classification And Named Entity Recognition Based On Deep Text Feature Representation

Posted on: 2021-05-12
Degree: Master
Type: Thesis
Country: China
Candidate: L H Yu
Full Text: PDF
GTID: 2428330611966944
Subject: Computer Science and Technology

Abstract/Summary:
With its powerful feature learning ability, deep learning has been applied effectively and has achieved breakthroughs in the field of natural language processing. Learning good text feature representations is one of the keys to judging the quality of a deep learning text representation algorithm: good representations improve the performance of text classification and recognition. Text classification and named entity recognition are two basic tasks and research hotspots in natural language processing. Based on these two tasks, this paper proposes two deep learning models that extract effective deep text feature representations and improve the performance of text classification and named entity recognition. The research work mainly includes the following two aspects:

1) A Global-Local Mutual Attention model (GLMA) for text classification is proposed. The model extracts global and local features simultaneously and uses a global-local mutual attention mechanism to learn the interaction and mutual effects between them, extracting more effective global and local features. The mechanism includes a local-guided global attention and a global-guided local attention. On the one hand, the local-guided global attention assigns weights to and combines the global features of semantically related word positions to capture their combined semantics. On the other hand, the global-guided local attention automatically assigns larger weights to relevant local features to capture key local semantics. In addition, the weighted-over-time pooling in the model effectively extracts discriminative global-local feature representations. Experimental results on 23 datasets demonstrate that the model extracts more effective global and local feature representations and improves the accuracy of text classification.

2) A Multiple-Level Topic-Aware Representation model (MLTA) for named entity recognition is proposed. The model uses a bi-directional recurrent neural network to extract sequential features and, by introducing a neural topic model, models topic representations at two levels: word-level and corpus-level. The word-level topic representation learns the relationships between words and latent topics, capturing the different semantics a word takes on in different contexts. The corpus-level topic representation extracts corpus-level global information and enables a deeper understanding of the meaning of each word. Experimental results on three named entity recognition datasets demonstrate the effectiveness of the proposed model. In addition, quantitative and qualitative experimental analysis and visualization further verify the effectiveness of the multiple-level topic-aware representation model in identifying named entities that are ambiguous or out of vocabulary.
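The global-local mutual attention idea can be sketched in a few lines of NumPy. This is a minimal illustration under my own assumptions, not the thesis's implementation: the queries are taken as mean-pooled summaries of the opposite feature stream, the function names are invented, and the feature matrices stand in for BiRNN (global) and CNN (local) outputs.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mutual_attention(global_feats, local_feats):
    """Hypothetical sketch of global-local mutual attention.

    global_feats, local_feats: (seq_len, dim) arrays standing in for
    the global (sequence-level) and local (n-gram-level) features.
    """
    # Local-guided global attention: a pooled local summary scores each
    # global position, so semantically related positions get more weight.
    local_query = local_feats.mean(axis=0)               # (dim,)
    g_weights = softmax(global_feats @ local_query)      # (seq_len,)
    attended_global = g_weights[:, None] * global_feats

    # Global-guided local attention: a pooled global summary re-weights
    # the local features toward the key local semantics.
    global_query = global_feats.mean(axis=0)
    l_weights = softmax(local_feats @ global_query)
    attended_local = l_weights[:, None] * local_feats

    # Weighted-over-time pooling: sum the weighted features over
    # positions instead of taking a hard max.
    return attended_global.sum(axis=0), attended_local.sum(axis=0)
```

The two attended vectors would then be combined and fed to a classifier; the point of the sketch is only that each feature stream's weights are computed from a summary of the other stream.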
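The two-level topic representation can likewise be sketched with fixed arrays standing in for the neural topic model's outputs. Everything here is an assumption for illustration: the function name, the use of a shared topic-embedding matrix, and the plain matrix products replacing the learned topic model.

```python
import numpy as np

def topic_aware_features(token_ids, word_topic, corpus_topic, topic_emb):
    """Hypothetical sketch of multiple-level topic-aware features.

    token_ids:    (T,) integer ids of the tokens in a sentence
    word_topic:   (vocab, K) per-word distributions over K latent topics
                  (learned by a neural topic model in the thesis)
    corpus_topic: (K,) corpus-level topic mixture
    topic_emb:    (K, dim) topic embeddings
    """
    # Word-level: each token's topic distribution mixes the topic
    # embeddings, so the same word can shift with its dominant topics.
    word_level = word_topic[token_ids] @ topic_emb   # (T, dim)
    # Corpus-level: one global vector summarizing corpus-wide topics.
    corpus_level = corpus_topic @ topic_emb          # (dim,)
    return word_level, corpus_level
```

In the model these vectors would be concatenated with the bi-directional RNN states before the NER tagging layer, giving ambiguous or out-of-vocabulary tokens extra topical context.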
Keywords/Search Tags:Text Classification, Named Entity Recognition, Text Feature Representation, Mutual Attention Mechanism, Topic Modeling