| With the rapid development of the internet industry, a variety of network applications have emerged, such as microblogs, e-commerce sites, forums, and blogs. These applications generate large amounts of network text data, and the information contained in this data is valuable both to the applications and to their users. Sentiment classification has been introduced as a way to extract useful information from such massive text data. This paper uses supervised machine learning to study and implement sentiment classification of Chinese network short texts. Three open-source machine learning tools are adopted, serving three purposes: training word vectors and mining the shallow semantics between words; extracting the positions of key words as sentence-structure features; and performing sentiment classification and sentiment polarity prediction. The main research contents of this paper are as follows: 1) To further improve classification accuracy, this paper uses the word vector tool word2vec to map the massive text data to vectors in a high-dimensional space, and measures the semantic similarity between words by the cosine of the angle between their vectors. Experiments verify that this method is well suited to extracting semantically similar features. Adding these features to the sentiment feature dictionary supports the subsequent extraction of sentiment features. 2) This paper proposes a sentiment classification method based on sentence structure. An analysis of positive and negative sentiment in network text shows that such statements have certain structural features: given a particular sentence structure together with the corresponding sentiment words, the sentiment type of a short text can be determined. This paper therefore takes sentiment features together with specific syntactic structures as the sentiment characteristics and feeds them to LIBSVM for classification. Experiments show that this method classifies well. 3) This paper carries out semantics-based sentiment classification in two ways. One is regression prediction, also called sentiment polarity prediction. The other is binary sentiment classification, in which PCA is applied to reduce the dimensionality of the sentiment features before classification. Experiments show that the semantics-based sentiment classification method is effective. 4) The semantics-based sentiment classification method is used to classify the sentiment of a microblog corpus, and the results are applied to public opinion analysis, thereby realizing a microblog public opinion analysis system. |
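
The dictionary-extension step in item 1) can be illustrated with a minimal sketch. It assumes word vectors are already available as plain lists of floats; the function names, the toy vectors, and the 0.8 threshold are hypothetical and are not taken from the paper, which trains its vectors with word2vec.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def expand_lexicon(seed_words, vectors, threshold=0.8):
    """Return the seeds plus every word whose vector is close to some seed.

    `threshold` is an assumed cutoff, not a value from the paper.
    """
    expanded = set(seed_words)
    for word, vec in vectors.items():
        if word in expanded:
            continue
        for seed in seed_words:
            if seed in vectors and cosine_similarity(vec, vectors[seed]) >= threshold:
                expanded.add(word)
                break
    return expanded


# Toy 3-dimensional vectors, invented for illustration only.
toy_vectors = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "table": [0.0, 0.1, 0.9],
}
positive = expand_lexicon({"good"}, toy_vectors, threshold=0.8)
```

With these toy vectors, only "great" lies close enough to the seed "good" to join the positive lexicon. In practice the vectors would come from a model trained on the target corpus, and the threshold would need tuning against labelled data.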