
Topic Mining Based on Guba Text and Its Application to Stock Investment

Posted on: 2019-11-12
Degree: Master
Type: Thesis
Country: China
Candidate: H L Zhang
Full Text: PDF
GTID: 2429330566486432
Subject: Probability theory and mathematical statistics
Abstract/Summary:
With the rapid development of the Internet, investors increasingly use online stock forums both to express their views on the market and to obtain the stock market information they care about. This information often plays a subtle role in their investment decisions. From the perspective of big-data text mining, this paper therefore mines hot stock market topics from Guba text data crawled from the Internet and applies them to stock investment.

The first step of topic mining on Guba text is to choose an appropriate topic mining algorithm. This paper adopts the LDA model, which has rarely been used on stock market text even though it is widely applied elsewhere and has clear advantages. To evaluate the LDA model, it is compared with a traditional text clustering algorithm by mining the hot topics of the top posts on Guba_cjpl in February 2018. The results show that the topics mined by the LDA model are of better quality and that the LDA model extends better to new data.

Topic mining methods have seen few applications in stock investment, so this paper proposes a topic investment strategy built on hot topic mining of Guba text. Based on the topic data, a topic heat factor is constructed to describe the relationship between a topic and its related stocks, taking into account both topic industry heat and topic concept heat. A single-factor test shows that the topic heat factor is effective. It is then combined, as the main factor, with other common factors into a multi-factor library, from which a multi-factor quantitative stock selection model is built. Instead of the traditional sorting and scoring method, this paper treats stock selection as a binary classification problem and therefore uses a logistic regression model. Logistic regression makes it easy to obtain the optimal solution and models the classification probability directly: its predictions are approximate class probabilities, which can be used as weights for capital allocation.

The logistic regression multi-factor stock selection strategy is back-tested on the HS300 constituent stocks from April 2016 to September 2016. Its annualized yield reaches 21.1% and outperforms the benchmark. A second multi-factor stock selection model is then built with logistic regression but without the topic factor, and the results with and without the topic factor are compared: back-test periods are randomly sampled many times to build the strategy, yielding two groups of samples. Significance tests on the improvement in Sharpe ratio and annualized return between the two groups give P-values that are all close to zero, which means that the topic factor significantly improves the strategy and verifies the effectiveness of the topic mining method.

This research helps to advance the theoretical study of hot topic mining in China's stock market. It enriches the methods and techniques of topic-based stock selection and provides specific advice on stock selection strategy for investors.
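
The abstract notes that the class probabilities predicted by logistic regression can serve as capital-allocation weights. The following is a minimal sketch of that idea only, not the thesis's implementation. The synthetic two-factor data, the 0.5 selection threshold, the gradient-descent settings, and all function names are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by plain batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Predicted probability of the positive class for each row of X."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def allocation_weights(probs, threshold=0.5):
    """Keep stocks with P(up) above the threshold (an assumed rule) and
    weight each kept stock in proportion to its predicted probability."""
    picked = {i: p for i, p in enumerate(probs) if p > threshold}
    total = sum(picked.values())
    return {i: p / total for i, p in picked.items()} if total > 0 else {}


# Synthetic data (assumption): column 0 stands in for a topic heat factor,
# column 1 for another common factor. Label 1 means the stock "rose".
rng = random.Random(0)
X_train = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y_train = [1 if 1.5 * x[0] + 0.5 * x[1] + rng.gauss(0, 0.5) > 0 else 0
           for x in X_train]

w, b = fit_logistic(X_train, y_train)

# Four hypothetical stocks to score in the next period.
X_new = [[2.0, 0.0], [-2.0, 0.0], [1.0, 1.0], [0.2, -0.1]]
probs = predict_proba(X_new, w, b)
weights = allocation_weights(probs)
print(weights)
```

Here `weights` maps the index of each selected stock to its share of capital, and the shares sum to one. In practice the factor matrix would hold the topic heat factor and the other factors from the multi-factor library, and the selection rule would follow whatever the strategy specifies.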
Keywords/Search Tags:Guba Text, Topic Mining, LDA Model, Topic Investment, Logistic Regression Multi-factor Stock Selection