Research And Its Application On Chinese Text Categorization Algorithm Based On CHI And Convolutional Neural Network

Posted on: 2019-04-20  Degree: Master  Type: Thesis
Country: China  Candidate: X Jin  Full Text: PDF
GTID: 2428330545953841  Subject: Computer technology
Abstract/Summary:
With the rapid development and spread of Internet technology, the volume and variety of Internet resources are growing explosively. Most of these resources exist as text, and how to manage and use this mass of information effectively, and how to extract valuable information from it, is an important and active research direction. Text categorization draws on Information Retrieval, Machine Learning, and Natural Language Processing, and is an important research direction in information processing and data mining.

The classical approach to text categorization combines manual feature engineering with a shallow classifier based on statistical machine learning. However, such methods depend on manual feature engineering, which is time-consuming and labor-intensive. A CNN text categorization model, by contrast, extracts features automatically during training and can capture local features of the text. As an end-to-end model, it maps the raw input data directly to the final classification result through its hidden layers, which avoids falling into a local optimum and greatly increases the likelihood of reaching a global optimum. However, the CNN is a black-box model and is therefore difficult to interpret. In short, the CHI feature selection + SVM classifier method used in text classification requires manual feature engineering and is easily trapped in local optima, while the black-box nature of the CNN makes it hard to explain.

To address these problems, this thesis uses a heuristic method and the idea of weighted fusion to combine the advantages of the two models: the prior knowledge that CHI feature selection is effective for text classification, and the CNN's ability to identify local correlations among text features and to extract features automatically without human intervention. Traditional CHI feature selection is added to the hidden black box of the Convolutional Neural Network in order to strengthen the CNN's classification ability and to explain the black-box process of CNN feature selection. The thesis proposes C-CNN, a Chinese text categorization model based on CHI feature selection and a Convolutional Neural Network. Experiments were designed using TensorFlow, Google's open-source machine learning platform, and the model was applied in an intelligent medical question answering system. The experiments verify that the C-CNN text classification algorithm greatly improves accuracy.
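
The CHI feature selection mentioned above is commonly computed as the chi-square statistic between a term and a class. The following is a minimal sketch in plain Python under that assumption. It is not the thesis's implementation, and it does not show how C-CNN fuses CHI scores with the CNN, since the abstract does not give those details. The function name `chi_square_scores` and the toy documents are hypothetical, and the input is assumed to be Chinese or other text that has already been segmented into tokens.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Return {term: chi-square score} for one target class.

    docs   -- list of token lists (text already segmented into words)
    labels -- list of class labels, one per document
    target -- the class each term is scored against

    Standard 2x2 contingency form (an assumption, not taken from the thesis):
        chi2(t, c) = N * (A*D - C*B)^2 / ((A+C) * (B+D) * (A+B) * (C+D))
    A: docs in c containing t      B: docs outside c containing t
    C: docs in c without t         D: docs outside c without t
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos

    in_class = Counter()   # document frequency of each term inside the class
    out_class = Counter()  # document frequency of each term outside the class
    for tokens, y in zip(docs, labels):
        bucket = in_class if y == target else out_class
        bucket.update(set(tokens))  # count each term once per document

    scores = {}
    for term in set(in_class) | set(out_class):
        a = in_class[term]
        b = out_class[term]
        c = n_pos - a
        d = n_neg - b
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - c * b) ** 2 / denom if denom else 0.0
    return scores


# Hypothetical toy corpus: two medical and two finance documents.
docs = [
    ["fever", "cough"],
    ["fever", "headache"],
    ["loan", "bank"],
    ["bank", "rate"],
]
labels = ["medical", "medical", "finance", "finance"]

scores = chi_square_scores(docs, labels, "medical")
top_terms = sorted(scores, key=scores.get, reverse=True)
```

In this toy corpus, "fever" and "bank" receive the same score for the "medical" class: the chi-square statistic measures how strongly a term depends on the class, not the direction of the dependence, so a term that never appears in a class can rank as high as one that always does.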
Keywords/Search Tags: text categorization, CHI feature selection, CNN, Medical QA System