Research And Application Of Text Classification Based On Natural Language Processing

Posted on: 2021-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: K He
Full Text: PDF
GTID: 2428330614965684
Subject: Logistics engineering
Abstract/Summary:
In recent years, natural language processing (NLP) has become one of the most popular research areas in machine learning, and text classification is an important problem within it. In this thesis, using text from journal papers as experimental data, we study Chinese text classification and propose two methods for it: a text pre-processing algorithm and a classification model.

First, we propose a text pre-processing algorithm called pre-processing term frequency-inverse document frequency (PRE-TF-IDF), which is based on weight pre-processing. In traditional word-frequency algorithms, the weight of a word is assigned based only on its frequency, and the position of the word is not taken into account. PRE-TF-IDF improves on the traditional term frequency-inverse document frequency (TF-IDF) algorithm by adding weight pre-processing and word density weighting steps. We show that this pre-processing helps the model achieve better classification accuracy.

Second, we propose a text classification model called the convolutional neural network and support vector machine classifier (CNNSVM), which combines a convolutional neural network (CNN) with a support vector machine (SVM). We add an attention mechanism to the traditional CNN architecture and simplify the model's parameter settings. In addition, we replace the softmax layer of the traditional CNN model with an SVM-based classifier. We show that the resulting model extracts features more effectively and generalizes better.
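
For readers unfamiliar with the baseline that the abstract improves on, the following is a minimal sketch of traditional TF-IDF weighting only. It is not the thesis's PRE-TF-IDF: the abstract does not specify the weight pre-processing or word density weighting steps, so they are not reproduced here. The function name `tf_idf`, the unsmoothed formula tf × log(N / df), and the toy English token lists are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Illustrative baseline only: weight = (count / doc length) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus; real input would be segmented Chinese text.
docs = [
    ["text", "classification", "model"],
    ["text", "mining", "survey"],
    ["image", "classification", "model"],
]
weights = tf_idf(docs)
```

In this sketch a term that occurs in fewer documents receives a higher weight than an equally frequent term that occurs in more documents, and reordering the tokens of a document leaves every weight unchanged. The second property is the position-insensitivity that the abstract identifies as a limitation of frequency-only weighting.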
Keywords/Search Tags: Natural language processing, text classification, machine learning, convolutional neural networks