
Research And Implementation Of Text Classification Algorithm Based On Three-way Decision And Convolution Neural Network

Posted on: 2021-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: Q Wang
GTID: 2428330605470079
Subject: Computer technology
Abstract/Summary:
With the arrival of the big data era, the volume of text data has exploded, and text classification technology has developed rapidly as a way to make information retrieval more efficient. News text is an important channel through which people obtain information. How to classify news accurately and quickly, so that interesting and valuable information can be recommended to users accurately, has become a hot research topic. To address the high dimensionality and sparse features found in existing news text classification, this thesis improves the traditional feature extraction methods at the feature selection stage and completes the research work on the text classification algorithm. The thesis analyzes the workflow of existing text classification technology, introduces the commonly used classification algorithms, improves the traditional feature selection algorithms with three-way decision theory, and implements a convolutional neural network classifier with an attention mechanism. The main work is as follows:

(1) Existing text classification algorithms are studied. A literature review traces the development of text classification and analyzes the advantages and disadvantages of Bayesian, SVM, CNN, RNN, and other classification algorithms, which lays the foundation for the improvements that follow.

(2) An expressiveness index is added to the traditional feature selection algorithms CHI and TF-IDF to increase the weight of highly discriminative feature words. The two improved algorithms are then used as the dual evaluation functions of a three-way decision, which divides the feature words into positive, negative, and boundary regions. The feature words in the boundary region are filtered, and those that remain are merged with the positive region to determine the final feature combination. Experiments with a Bayesian classifier verify that this method improves classification accuracy.

(3) The attention mechanism is combined with a CNN: an attention layer is added after the input layer to improve the quality of the feature words. Experimental results on the THUnews dataset verify that the A-CNN proposed in this thesis improves classification accuracy, which reaches 97%.
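The three-way partition described in item (2) can be illustrated with a minimal sketch. The abstract does not give the decision rule, the thresholds, or the boundary filter, so everything here is an assumption made for illustration: the scores are taken to be normalized to [0, 1], the thresholds `alpha`, `beta`, and `gamma` are hypothetical, a word is placed in the positive region only when both scores reach `alpha` and in the negative region only when both fall to `beta` or below, and boundary words are kept when their mean score reaches `gamma`. The toy scores stand in for the improved CHI and TF-IDF values and are not taken from the thesis.

```python
from typing import Dict, List, Tuple


def three_way_partition(
    score_a: Dict[str, float],
    score_b: Dict[str, float],
    alpha: float = 0.7,  # assumed acceptance threshold
    beta: float = 0.3,   # assumed rejection threshold
) -> Tuple[List[str], List[str], List[str]]:
    """Split feature words into positive, boundary, and negative regions."""
    positive, boundary, negative = [], [], []
    for word in sorted(score_a):
        a, b = score_a[word], score_b[word]
        if a >= alpha and b >= alpha:
            positive.append(word)   # both evaluation functions accept
        elif a <= beta and b <= beta:
            negative.append(word)   # both evaluation functions reject
        else:
            boundary.append(word)   # the functions disagree or are unsure
    return positive, boundary, negative


def select_features(
    score_a: Dict[str, float],
    score_b: Dict[str, float],
    alpha: float = 0.7,
    beta: float = 0.3,
    gamma: float = 0.5,  # assumed cut-off for filtering the boundary region
) -> List[str]:
    """Keep the positive region plus boundary words with a high mean score."""
    positive, boundary, _ = three_way_partition(score_a, score_b, alpha, beta)
    kept = [w for w in boundary if (score_a[w] + score_b[w]) / 2 >= gamma]
    return positive + kept


# Toy, made-up scores standing in for the improved CHI and TF-IDF values.
chi = {"goal": 0.9, "match": 0.8, "the": 0.1, "market": 0.75, "said": 0.4}
tfidf = {"goal": 0.85, "match": 0.4, "the": 0.05, "market": 0.2, "said": 0.35}

selected = select_features(chi, tfidf)
print(selected)  # ['goal', 'match']
```

In this toy run "goal" is accepted outright, "the" is rejected outright, and "match" is rescued from the boundary region because its mean score clears the assumed cut-off. The selected words would then be passed to a classifier such as the Bayesian one mentioned in item (2).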
Keywords/Search Tags:Text Classification, Feature word extraction, Three Branches Decision, Convolutional Neural Network