Research On Multi-View Topic Model And Application

Posted on: 2014-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Liu
Full Text: PDF
GTID: 2298330434972199
Subject: Computer application technology
Abstract/Summary:
Currently, there are more than 2,300 stocks in the Chinese A-shares market, and almost one new IPO share enters this emerging market every day. The rapid growth of the Chinese stock market makes it difficult for investors to manage their portfolios. A far larger volume of financial and related political news articles is released every day, so investors must read many articles to get a general idea of what is going on in the financial market. Automatically identifying active stocks from this large volume of news is a highly challenging task for both retail and institutional investors.

In this paper, we design a stock mining system, the Chinese A-shares Network (CAN), to efficiently mine active stocks using Business Scope Descriptions (BSD) and financial news from the Web. We collect the BSD documents for all A-shares companies up to September 29, 2011 and the corresponding financial news (research on industries) from January 1, 2010 to September 29, 2011. Experimental results validate the effectiveness of the CAN system: stocks in the same "sector" as defined by the CAN have higher pairwise correlations than stocks in the same sector as defined by experts, and the system obtains higher recommendation accuracy than the baseline.

The paper uses automatic web crawlers to collect the company introductions and the related news from the past two years for all Chinese A-shares stocks. The stocks are then classified using both text information and time-series price data. Compared with expert labeling, this classification yields better results and higher correlation. Hotspots in stock forums and their correlation with price and trading amount are also considered, giving investors and markets a new perspective and new suggestions based on text information, especially topic models.
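The sector comparison described above (pairwise correlations within CAN-defined sectors versus expert-defined sectors) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the use of Pearson correlation, and the toy return series are all assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_within_sector_correlation(returns, sectors):
    """Average pairwise correlation over all stock pairs sharing a sector.

    returns: dict mapping ticker -> list of returns (hypothetical input format)
    sectors: dict mapping ticker -> sector label (hypothetical input format)
    """
    groups = {}
    for ticker, label in sectors.items():
        groups.setdefault(label, []).append(ticker)
    corrs = [
        pearson(returns[a], returns[b])
        for members in groups.values()
        for a, b in combinations(sorted(members), 2)
    ]
    if not corrs:
        raise ValueError("no sector contains at least two stocks")
    return sum(corrs) / len(corrs)


# Toy data (invented): A and B move together, C and D move together.
toy_returns = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [4, 3, 2, 1],
    "D": [8, 6, 4, 2],
}
grouping_1 = {"A": "s1", "B": "s1", "C": "s2", "D": "s2"}
grouping_2 = {"A": "s1", "C": "s1", "B": "s2", "D": "s2"}

score_1 = mean_within_sector_correlation(toy_returns, grouping_1)
score_2 = mean_within_sector_correlation(toy_returns, grouping_2)
```

Under this measure, a grouping whose score is higher places more strongly co-moving stocks in the same sector; in the toy data, `grouping_1` scores higher than `grouping_2`.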
Keywords/Search Tags: Topic Model, Latent Dirichlet Allocation, Correspondence LDA, Web Crawler, Stock Recommendation