
Research On Text Classification Based On Deep Learning

Posted on: 2020-10-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
GTID: 2428330572981091
Subject: Engineering
Abstract/Summary:
With the development of the Internet, a large amount of unstructured data has been generated, especially news texts that are updated daily. This thesis studies news texts from two aspects: topic classification and sentiment analysis. Classifying texts by topic makes large and diverse collections easier to manage, which helps schools, companies, hospitals, and other organizations that must sort continuously produced textual data according to specific criteria. Sentiment analysis serves a different purpose. On e-commerce platforms, user comments reflect customer satisfaction with goods; in blogs, posts reflect public attitudes toward particular events and the trend of public opinion; in film and television reviews, comments reflect how popular a work is with its audience. For news texts, sentiment analysis can indicate whether the prospects of an industry sector or enterprise are favorable or unfavorable, and whether a trending social news event is positive or negative in tone.

For topic classification, the model is trained with a long short-term memory (LSTM) neural network. First, a news corpus labeled with topic categories is crawled and cleaned according to the characteristics of the corpus. The data is then preprocessed: the text is segmented into words, stop words are removed, class labels are converted to numbers, and the text is converted into word vectors that serve as input to the LSTM network. The main hyperparameters of the network are studied, and comparison experiments over different hyperparameter values determine the settings used to train the model. Finally, a front-end interface for the text topic classification application is designed and implemented.

For sentiment analysis, the model is trained with a fastText neural network. First, the data is cleaned based on text features: advertising-like noise, overly long or overly short texts, and irregular news texts are removed. The cleaned data is then preprocessed by word segmentation and related steps and used as input to the fastText network to train the sentiment analysis model. Building on this model, the idea of ensemble learning is introduced. Multiple weak classifiers are trained on resampled training sets and then combined with the bagging ensemble learning algorithm into a strong classifier. The resulting strong classifier achieves higher accuracy than the individual weak classifiers, adapts to a wider range of data sets, and generalizes better. This work has theoretical significance and practical value for research on text sentiment analysis.
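The bagging step described above (resample the training set, train one weak classifier per resample, combine them by voting) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the weak learner here is a toy token-count scorer standing in for a fastText model, the input is assumed to be already segmented into tokens, and every function name and the sample data are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(samples, rng):
    """Draw len(samples) items with replacement (one bagging resample)."""
    return [rng.choice(samples) for _ in samples]


def train_weak_classifier(samples):
    """Toy weak learner (a stand-in for fastText): per-label token counts."""
    counts = {}
    for tokens, label in samples:
        counts.setdefault(label, Counter()).update(tokens)
    return counts


def predict_weak(model, tokens):
    """Pick the label whose training texts contained these tokens most often."""
    scores = {label: sum(c[t] for t in tokens) for label, c in model.items()}
    # Iterate labels in sorted order so ties are broken deterministically.
    return max(sorted(scores), key=scores.get)


def majority_vote(labels):
    """Return the most frequent label among the weak classifiers' votes."""
    return Counter(labels).most_common(1)[0][0]


def train_bagging(samples, n_models=15, seed=0):
    """Train n_models weak classifiers, each on its own bootstrap resample."""
    rng = random.Random(seed)
    return [
        train_weak_classifier(bootstrap_sample(samples, rng))
        for _ in range(n_models)
    ]


def predict_bagging(models, tokens):
    """Combine the weak classifiers' predictions by majority vote."""
    return majority_vote([predict_weak(m, tokens) for m in models])


if __name__ == "__main__":
    # Hypothetical, already-segmented training data: (tokens, sentiment label).
    train = [
        (["profits", "rise", "strong"], "positive"),
        (["record", "growth", "strong"], "positive"),
        (["losses", "widen", "weak"], "negative"),
        (["layoffs", "decline", "weak"], "negative"),
    ]
    models = train_bagging(train, n_models=15, seed=0)
    print(predict_bagging(models, ["strong", "growth"]))
```

Because each weak classifier sees a different resample, some may miss a label entirely or overweight a few examples; voting across many of them averages out those individual errors, which is the intuition behind the generalization gain the abstract reports.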
Keywords/Search Tags: fastText neural network, LSTM neural network, Text topic classification, Text sentiment analysis, Ensemble learning