
Analyzing And Predicting Stock Market Using Data Mining Technology

Posted on: 2019-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: Abdulqader
Full Text: PDF
GTID: 2348330569978176
Subject: Computer application technology
Abstract/Summary:
Predicting the trend of stock market returns is a widely studied and challenging problem in modern financial theory and capital markets. With the development of information and network technology, data mining, which can process large volumes of semi-structured or structured stock comments in text form, and machine learning, which can predict stock price volatility, are both widely used in stock market research. This approach overcomes the difficulty earlier investors faced because of limited data acquisition technology and their inability to process textual stock comments, and it also uncovers additional latent information. Because it can predict the trend of stock returns more accurately, it can guide investors in making investment decisions.

The analysis is conducted on the Shanghai Stock Exchange (SSE) 180 index. The data cover April 1, 2015 to September 30, 2016 and consist of stock market data and financial news data. The market data were collected from the database at http://data.10jqka.com.cn/, while the financial news articles on the SSE-180 index were extracted from a major financial news website, http://guba.eastmoney.com/. First, text and data mining techniques were used to mine and process more than 6 million stock comments, transform the unstructured text into document vectors, and construct an investor sentiment index. Then three experiments were conducted to examine whether stock comments can be used to predict stock market prices. The details are as follows.

First, the direction of the return on the SSE-180 market index is predicted by combining news mining with stock prices. Chinese text mining techniques convert the crawled, unstructured financial news into document vectors, and a support vector regression (SVR) model is then fitted between the news document vectors and the corresponding daily stock returns in order to predict those returns. The performance of the SVR model was compared with that of a random forest (RF) model, and the results show that the SVR model predicts more accurately than the RF model.

Second, the relationship between sentiment and the market is investigated through correlation and causality tests. The emotional tendency of each sentence is first analyzed with a dictionary-based approach. The Pearson correlation is then applied to investigate the correlation between investor sentiment and financial markets, and Granger causality analysis is conducted to test whether the sentiment expressed in news articles is useful for forecasting price movements in the stock market. The results show a positive correlation between investor sentiment and stock prices. Sentiment expressed in news articles Granger-causes stock returns, so it can be used to predict and describe stock prices.

Third, to detect whether a stronger correlation is present in at least some of the time series data, the research investigates the influence of sentiment expressed in news articles on stock market prices through the event study methodology and cross-sectional regression. The event study methodology is used first to measure the influence of stock news on stock market returns. The capital asset pricing model (CAPM) is used to determine the abnormal returns (AR) and cumulative abnormal returns (CAR) that news reports produce for the related stocks. A multivariate regression model, assessed with t-tests, is then used to analyze the influence of investor sentiment on stock market volatility, including the day on which the influence on the stock market is greatest, the degree of the influence, and other significance questions that arise after the stock comments are released. More specifically, the research uses the influence factors of the emotions contained in the stock comment text as explanatory variables of the multivariate regression model and uses the cumulative abnormal returns (CAR) as the dependent variable. Statistical tests are applied to the goodness of fit, the significance of the equation, and the significance of investor sentiment and the other variables, in order to analyze the relationship between emotion factors and stock market fluctuations.

The experiment found the following. First, after stock comments are published, the corresponding stocks do show abnormal returns, indicating that the comments have a significant impact on the stock price. Second, whether measured by average abnormal returns (AAR) or cumulative abnormal returns, individual stock prices respond more strongly to negative events than to positive ones. The results show that investor sentiment has a consistently strong and immediate impact on the stock market, and that stock market returns are affected more by daily market news sentiment during periods of losses than during periods of gains.
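The abstract does not give the formulas or parameters behind the CAPM-based abnormal returns, so the following is only a minimal sketch of how AR and CAR could be computed for one stock around one event. The function name `capm_abnormal_returns`, the window lengths, the zero risk-free rate, and the synthetic return series are all assumptions made for illustration; none of them come from the thesis.

```python
import numpy as np

def capm_abnormal_returns(stock, market, est_window, event_window, rf=0.0):
    """Hypothetical helper: return (beta, AR, CAR) for one stock and one event.

    stock, market : daily return series of equal length
    est_window    : slice used to estimate beta (assumed choice)
    event_window  : slice around the news event (assumed choice)
    rf            : daily risk-free rate, assumed constant
    """
    stock = np.asarray(stock, dtype=float)
    market = np.asarray(market, dtype=float)

    # Excess returns over the risk-free rate.
    ex_stock = stock - rf
    ex_market = market - rf

    # Beta estimated on the estimation window: cov(stock, market) / var(market).
    s_est, m_est = ex_stock[est_window], ex_market[est_window]
    beta = np.cov(s_est, m_est, ddof=1)[0, 1] / np.var(m_est, ddof=1)

    # CAPM expected return in the event window: rf + beta * (Rm - rf).
    expected = rf + beta * ex_market[event_window]

    # Abnormal return = realised - expected; CAR = running sum of AR.
    ar = stock[event_window] - expected
    car = np.cumsum(ar)
    return beta, ar, car

# Synthetic data for illustration only (not the SSE-180 data).
rng = np.random.default_rng(0)
market = rng.normal(0.0, 0.01, 130)
stock = 1.2 * market      # noise-free stock with a true beta of 1.2
stock[125] += 0.03        # a 3% shock on the assumed event day

beta, ar, car = capm_abnormal_returns(
    stock, market, est_window=slice(0, 120), event_window=slice(120, 130)
)
print(f"beta={beta:.3f}, event-day AR={ar[5]:.4f}, final CAR={car[-1]:.4f}")
```

Averaging `ar` across many events day by day would give the AAR mentioned above, and the final `car` values could serve as the dependent variable of a cross-sectional regression on sentiment factors.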
Keywords/Search Tags: Data mining, SVR model, Sentiment analysis, Correlation, Causality analysis, Event study methodology, Multiple regression