Research On Deep Learning Text Classification Method Based On HowNet

Posted on:2022-12-29

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Nie

GTID:2518306773981419

Subject:Automation Technology

Abstract/Summary:

As science and technology advance, ever more text data is produced, and processing it has become a pressing need. Text classification has developed rapidly in recent years, and methods continue to evolve in response to the growing volume of news, public opinion and other text data. However, the structure of text data varies widely between sources, and large text collections often contain missing content and incomplete semantics. Text data is also characterized by unclear semantic expression, high dimensionality and sparse content, while traditional classification methods often ignore semantic accuracy. Different kinds of text data therefore call for different classification methods, which is why text classification remains an active topic in natural language processing.

To address the shortcomings of current methods, this thesis proposes a deep learning text classification method based on HowNet (DL-TC-HN). First, semantic classification is carried out with a bidirectional LSTM neural network with an attention mechanism. Texts with sparse feature words are then sent to a knowledge base for expansion, and the results are spliced using the HowNet semantic similarity calculation method. Finally, the output is combined with a topic model and classified by a classifier. The main research work is as follows:

(1) To limit the high dimensionality and heavy computation of text data, this thesis uses a HowNet-based semantic similarity algorithm to calculate the similarity of feature word vectors. The text is preprocessed with the BERT model, and the calculation is carried out in the vector space. Considering both the spatial structure and the semantic structure of the feature vectors improves the accuracy of the similarity calculation. Data that does not meet the spatial-structure threshold is eliminated during the calculation, which reduces running time and improves efficiency. On the Stanford natural language inference corpus, the HowNet-based semantic similarity algorithm is compared with several classical algorithms in terms of efficiency and accuracy, which demonstrates the effectiveness of the method.

(2) Because traditional text classification does not consider semantic influence, this thesis uses a bidirectional LSTM model with attention to fully extract semantic-level information from the text. The parameters of each network layer are obtained through training, which yields more accurate semantic feature word vectors. To handle sparse text and incomplete feature words, the CN-DBpedia knowledge base is used: the relations between entities are read from its triples in order to expand the feature relations, and a threshold determines which entity relations qualify for the semantic expansion. The results are then sent to a classifier combined with the BTM topic model for text classification. This process avoids deviations in the calculation and errors caused by an incomplete model structure, and makes the final classification more accurate. On four data sets, the proposed deep learning text classification method is compared with several classical algorithms in terms of efficiency and accuracy, which demonstrates the effectiveness of the method.
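
The threshold-based feature expansion described in (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: plain cosine similarity stands in for the HowNet-based similarity measure, and the toy vectors, the triples, the function names `cosine` and `expand_features`, and the default threshold of 0.8 are all assumptions made for illustration. In the described pipeline the vectors would come from BERT and the triples from CN-DBpedia.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_features(features, triples, vectors, threshold=0.8):
    """Append knowledge-base neighbours of the feature words that pass the threshold.

    features  : feature words extracted from a (sparse) text
    triples   : (head, relation, tail) tuples from a knowledge base
    vectors   : word -> embedding; hypothetical stand-in for BERT vectors
    threshold : assumed cut-off; candidates below it are discarded
    """
    expanded = list(features)
    for head, _relation, tail in triples:
        # Only expand from words that are in the original text, and skip duplicates.
        if head not in features or tail in expanded:
            continue
        # Skip entities without a vector rather than guessing a similarity.
        if head not in vectors or tail not in vectors:
            continue
        # Cosine similarity is a placeholder for the HowNet-based measure.
        if cosine(vectors[head], vectors[tail]) >= threshold:
            expanded.append(tail)
    return expanded


# Toy data, invented for this example.
vectors = {
    "apple": [1.0, 0.0, 0.2],
    "fruit": [0.9, 0.1, 0.3],
    "company": [0.0, 1.0, 0.1],
}
triples = [("apple", "is_a", "fruit"), ("apple", "is_a", "company")]

print(expand_features(["apple"], triples, vectors))
```

With these toy vectors, "fruit" passes the threshold and is added, while "company" is filtered out. Discarding low-similarity candidates before any further processing mirrors the abstract's point that eliminating data below the threshold reduces computation.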

Keywords/Search Tags:

text classification, deep learning, topic model, HowNet

Related items

1	Research On Deep Learning Text Classification Based On Fusion Topic Features
2	Research On Text Classification Method Based On Topic Model And Deep Belief Network
3	Research And Implementation Of Text Classification And Recommendation System Based On The Deep Learning
4	Research On Short Text Classification Algorithm Based On LDA And Deep Learning
5	Research On Feature Expansion And Classification Of Short Text Based On Topic Model And Deep Learning
6	Research On Text Classification Based On Deep Learning And Topic-driven
7	Study On Topic Model Based Multi-label Text Classification And Stream Text Data Modeling
8	Research And Implementation Of Multilingual Text Classification System Based On Deep Learning
9	Research On Short Text Classification Based On Deep Learning And BTM Model
10	Improved Text Topic Representation And Learning Method