
Clustering For Stock Data Analysis On Hadoop

Posted on: 2019-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: W Y Chen
Full Text: PDF
GTID: 2428330572954082
Subject: Applied Mathematics
Abstract/Summary:
With the advent of the big data era, people have come to realize the importance of data. Data is not only a resource but also a treasure. Among the application fields of big data, financial data analysis is regarded as a promising one. Stock analysis has long been a popular topic in financial analysis and involves multiple knowledge domains. Previously, people relied on fundamental analysis to predict stock movements, examining factors such as macroeconomic and microeconomic policies, the development of the relevant industries, investor attitudes, and indicators of a company's development. With the development of big data techniques, predicting stock trends by discovering patterns in massive historical stock data has become an active research subject. This thesis studies large-scale stock data by cluster analysis. The main work is as follows:

Data collection. We collected 800 GB of stock data using a web crawler and TuShare (an open-source Python package), including basic company information and historical stock data (both daily records and records captured at each moment).

Construction of the platform. We set up a Hadoop cluster of 6 machines in our laboratory. One machine is the Master node (NameNode in HDFS, JobTracker in MapReduce), and the other five are Slave nodes (DataNode in HDFS, TaskTracker in MapReduce).

Cluster analysis. We implemented two clustering algorithms on MapReduce, K-means and NMF (Non-negative Matrix Factorization), and analyzed the clustering results. The results indicate that the price trends of stocks in the same cluster are similar.
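The abstract does not give the thesis's implementation, so the following is only a minimal sketch of how one K-means round can be split into a map step and a reduce step. It runs locally in plain Python rather than on Hadoop, and it assumes each stock is already represented as a fixed-length numeric vector (for example, a normalized price series). The function names `kmeans_map`, `kmeans_reduce`, `kmeans_iteration`, and `kmeans` are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from math import dist


def kmeans_map(point, centroids):
    """Map step: emit (index of the nearest centroid, (point, 1))."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, (point, 1)


def kmeans_reduce(cluster_id, values):
    """Reduce step: average all points that were assigned to one centroid."""
    total = [0.0] * len(values[0][0])
    count = 0
    for point, n in values:
        total = [t + p for t, p in zip(total, point)]
        count += n
    return cluster_id, [t / count for t in total]


def kmeans_iteration(points, centroids):
    """Simulate one MapReduce round locally: map, group by key, reduce."""
    groups = defaultdict(list)
    for point in points:
        key, value = kmeans_map(point, centroids)
        groups[key].append(value)
    # Assumption: a centroid with no assigned points is left unchanged.
    new_centroids = [list(c) for c in centroids]
    for key, values in groups.items():
        _, new_centroids[key] = kmeans_reduce(key, values)
    return new_centroids


def kmeans(points, centroids, max_iter=20, tol=1e-9):
    """Repeat rounds until the centroids stop moving or max_iter is reached."""
    for _ in range(max_iter):
        updated = kmeans_iteration(points, centroids)
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            break
    return centroids


if __name__ == "__main__":
    # Toy vectors standing in for stock feature vectors (not real data).
    toy_points = [[0, 0], [0, 2], [10, 10], [10, 12]]
    print(kmeans(toy_points, [[0, 0], [10, 10]]))
```

On a real cluster, the grouping done here with a dictionary would be handled by the MapReduce shuffle phase, and each round would typically be submitted as a separate job with the current centroids shared with every mapper.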
Keywords/Search Tags: Big data, Web crawler, Hadoop, Cluster analysis, NMF