Research And Application Of Securities Comments Sentiment Analysis

Posted on: 2021-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Xiong
GTID: 2428330629488449
Subject: Computer technology
Abstract/Summary:
The capital market in China continues to grow and mature, but individual investors often have limited funds and weak professional skills, and their investment decisions are affected by personal emotions. Studies of securities comment sentiment analysis (SCSA) have applied techniques such as machine learning and deep learning to investors' online securities comments in order to understand their sentiment tendency, and have achieved many results. However, these studies used only small samples and did not adopt some of the latest techniques of text sentiment analysis (TSA). This paper first collected and organized a large number of comments on the Shanghai Securities Composite Index (SSEC) from the online securities forum Eastmoney Guba; second, constructed a three-level securities comment sentiment dictionary (SCSD) based on a large research sample and recent TSA techniques; third, integrated features such as sentiment into SCSA; fourth, built a multi-feature fusion securities comment sentiment analysis model (abbreviated as MF2SCSAM) based on word vectors and part-of-speech (POS) vectors; and fifth, constructed an investor sentiment index based on the model, further enriching the study of SCSA. The details are as follows.

(1) A securities comment corpus was built. First, one year of comments (from August 1, 2018 to July 31, 2019; about 968,300 comments) was crawled from the forum Eastmoney Guba as raw data. Second, the data were cleaned and organized. Third, the comments were segmented with the jieba word segmentation tool and stop words were removed. Fourth, about 860,600 comments containing at most 32 words were selected to build the securities comment corpus. Afterwards, about 30,100 comments were selected and manually labeled into three classes, bullish sentiment (call), neutral sentiment (neutral), and bearish sentiment (pull), to form the experimental dataset.

(2) A three-level securities comment sentiment dictionary was constructed. First, emoticons in the corpus were collected and classified as securities comment emoticon words, which form the first-level securities comment sentiment words (SCSW). Second, professional words from the securities field were collected and organized as securities comment sentiment words, which form the second-level SCSW. Third, the words from three common sentiment dictionaries were merged and de-duplicated as basic sentiment words, which form the third-level SCSW. Together, these three levels of SCSW constitute the SCSD.

(3) A multi-feature fusion securities comment sentiment analysis model was proposed. First, word vectors were trained on the comments with Google's open-source Word2Vec tool. Second, sentiment words with obvious sentiment were selected from the SCSD as seed sentiment words, and the average cosine similarity between the seed sentiment words and the sentiment words at each level of SCSW was calculated to construct the sentiment vectors. Third, the POS features of the comments were tagged with the jieba word segmentation tool, and the POS vectors were created by random initialization. Fourth, the sentiment vectors, word vectors, and POS vectors were each fed into a BiGRU model for training, attention mechanisms were used to focus on important features, and a softmax function was used to determine the category. To verify its effectiveness, the MF2SCSAM model was compared experimentally against CNN, BiRNN, and BiLSTM models, both in their plain form and with the feature vectors of this paper fused in. It was also compared against a plain BiGRU model and against BiGRU variants that fuse the feature vectors of this paper in different ways. The experiments showed that the MF2SCSAM model performed best among the compared models.

(4) An SSEC sentiment index was constructed. The one-year SSEC sentiment index (from August 1, 2018 to July 31, 2019) was built from the results of applying the MF2SCSAM model to the comments in the corpus.
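
The sentiment-vector step in (3) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes one plausible reading: for each sentiment class, a word's sentiment feature is the average cosine similarity between its word vector and the vectors of that class's seed words. The function names, the two-class seed layout, the English placeholder words, and the toy 2-dimensional embeddings are all hypothetical; in the thesis the vectors would come from Word2Vec trained on the comment corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def avg_seed_similarity(word_vec, seed_vecs):
    """Average cosine similarity between one word vector and a list of seed vectors."""
    if not seed_vecs:
        return 0.0
    return sum(cosine(word_vec, s) for s in seed_vecs) / len(seed_vecs)

def sentiment_vector(word, embeddings, seeds_by_class):
    """One component per sentiment class, in the order the classes are listed.

    Assumption: seeds are grouped by class; seeds missing from the
    embedding table are skipped.
    """
    vec = embeddings[word]
    return [
        avg_seed_similarity(vec, [embeddings[s] for s in seeds if s in embeddings])
        for seeds in seeds_by_class.values()
    ]

# Toy embeddings (hypothetical values, unit length for easy checking).
embeddings = {
    "rally": [1.0, 0.0],
    "surge": [0.8, 0.6],
    "crash": [0.0, 1.0],
    "plunge": [-0.6, 0.8],
    "rebound": [0.6, 0.8],
}
seeds_by_class = {
    "bullish": ["rally", "surge"],
    "bearish": ["crash", "plunge"],
}

features = sentiment_vector("rebound", embeddings, seeds_by_class)
print(features)  # [bullish score, bearish score]
```

In this toy setup "rebound" scores higher against the bullish seeds than the bearish ones. A feature vector of this kind could then be supplied to the model alongside the word and POS vectors; how the thesis actually combines them is not specified in the abstract.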
Keywords/Search Tags: Securities Comment, Sentiment Analysis, Multi-Feature Fusion, Shanghai Securities Composite Index Sentiment Index