
Research On Web Public Opinion Analysis Method Based On Online Learning

Posted on: 2019-12-17  Degree: Master  Type: Thesis
Country: China  Candidate: W Zhou  Full Text: PDF
GTID: 2428330563999160  Subject: Computer technology
Abstract/Summary:
Offline learning struggles with massive volumes of online public opinion data: memory is limited and training is too slow. This thesis proposes two online learning classification models for processing online public opinion data: a sentiment classification model and a topic classification model.

The learning rate of the online learning algorithm FTRL-Proximal gradually vanishes as the number of training iterations increases. To address this, an improved learning rate optimization algorithm is proposed. The denominator of the learning rate is set to the root mean square of the accumulated gradients, the numerator is set to the root mean square of the accumulated parameter updates, the first- and second-order moments of the gradient are estimated, and bias correction is applied to the root mean square of the parameter updates. A Doc2vec model is trained on the sentiment data to obtain feature vectors. FTRL-Proximal with the improved learning rate is then used to update the parameters of a logistic regression, yielding an online learning logistic regression classifier. Combined with the Doc2vec model, this forms a complete online learning model for sentiment classification of online public opinion. The effectiveness of both the improved learning rate algorithm and the online learning sentiment classification model is verified.

The CHI (chi-square) feature selection algorithm considers only the document frequency of feature words. An improved algorithm, TDF-CHI, is proposed, which selects features by calculating both the document frequency of feature words and the correlation between word frequency and category. TDF-CHI is first used to select features and remove redundant ones. The importance of the remaining features is then measured with the RFFS algorithm, and a second round of selection yields the optimized feature set. Features are selected with the improved algorithm, and a vector space model is built to obtain the feature vectors. FTRL-Proximal with the improved learning rate is used to solve for the parameters of a Softmax regression model. The resulting online learning Softmax model is combined with the vector space model to form a complete online learning model for topic classification of public opinion. The effectiveness of the improved feature selection algorithm and the feasibility of the online learning topic classification model are verified.
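Both classifiers in the abstract are trained with FTRL-Proximal. For orientation, the sketch below shows the standard per-coordinate FTRL-Proximal update for binary logistic regression as it is commonly published. It is not the improved learning rate variant proposed in the thesis, whose exact formulas are not given in the abstract. The class name `FTRLProximalLogReg`, the sparse dict input format, and the hyperparameter defaults are assumptions made for illustration only.

```python
import math


class FTRLProximalLogReg:
    """Standard per-coordinate FTRL-Proximal for binary logistic regression.

    Illustrative baseline only; it does NOT implement the thesis's
    improved learning rate. Inputs are sparse dicts {feature: value}.
    """

    def __init__(self, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta = alpha, beta  # learning rate schedule
        self.l1, self.l2 = l1, l2            # regularization strengths
        self.z = {}  # per-feature accumulated adjusted gradients
        self.n = {}  # per-feature accumulated squared gradients

    def _weight(self, i):
        """Closed-form weight for feature i derived from z and n."""
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0  # L1 threshold keeps the weight exactly zero
        n = self.n.get(i, 0.0)
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(n)) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict_proba(self, x):
        """Probability of the positive class for one sparse example."""
        s = sum(self._weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """One online step on example (x, y) with y in {0, 1}."""
        p = self.predict_proba(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss for feature i
            n = self.n.get(i, 0.0)
            # The per-coordinate rate is alpha / (beta + sqrt(n)); it
            # shrinks as squared gradients accumulate in n.
            sigma = (math.sqrt(n + g * g) - math.sqrt(n)) / self.alpha
            self.z[i] = self.z.get(i, 0.0) + g - sigma * self._weight(i)
            self.n[i] = n + g * g
        return p
```

A model is trained by calling `update(x, y)` once per incoming example, so only the `z` and `n` accumulators are kept in memory rather than the full data set. The effective step size `alpha / (beta + sqrt(n))` only decreases as `n` grows, which is the vanishing learning rate behaviour the thesis sets out to improve.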
Keywords/Search Tags: online learning, learning rate, Doc2vec, sentiment classification, feature selection, subject classification