Gold Price Forecasting Application Based On Text Network

Posted on: 2020-10-18
Degree: Master
Type: Thesis
Country: China
Candidate: R Wang
GTID: 2370330596486007
Subject: Statistics
Abstract/Summary:
With the advent of the big data era, the volume of data on the internet is huge and its types are abundant. Most of this data is unstructured, and among unstructured data, text offers relatively high value at low cost compared with audio, video, and images. Within online text data, news is the public's main source of information and is available everywhere. In recent years, analysis of traditional structured data has gradually failed to meet research needs, and analysis of unstructured data such as news text has begun to emerge. Words in text data are linked by different semantic associations. The research method adopted in this paper converts the text into a semantic network for subsequent analysis and uses the network information in a prediction model.

As an application, the methodology is extended to the field of financial investment, with news about gold futures as the object of study. In the financial investment market, gold futures are a mature financial product favored by investors. Fluctuations in the gold price affect investors' decisions and are therefore a focus of their attention, so gold price prediction has become a hot research topic in recent years. This paper starts from an exploratory analysis of crawled news texts related to gold futures, with the aim of using unstructured news text data to predict the gold futures price. On the one hand, keywords, which are otherwise hard to quantify, can be added to the model as variables to enhance its interpretability. On the other hand, network information obtained through weighted text network analysis is added to the prediction model to improve prediction accuracy.

The specific research contents are as follows. First, using Python crawler technology, about nine years of news related to gold futures, together with gold futures prices for the corresponding period, are crawled from the web. Then, using text mining in R, the crawled text data is cleaned and the news text is converted into a document-term matrix. Next, based on the WGCNA algorithm, the document-term matrix is used for weighted network analysis to explore the structure of the network and analyze how the network properties change over time, and Gephi is used to visualize the network structure. Finally, the text information is added to an SGLS-Logistic model to predict fluctuations in gold futures prices; the model is also compared with Lasso-Logistic and MCP-Logistic models, and its classification results are found to be much better.
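
The step from a document-term matrix to a weighted term network can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the thesis performs this step in R, and it uses only the unsigned soft-threshold adjacency commonly associated with WGCNA, a_ij = |cor(term_i, term_j)|^beta. The toy headlines, the function names, and the choice beta=2 are assumptions made for illustration; none of them come from the thesis, and no stop-word removal or other cleaning is applied.

```python
import math
from collections import Counter

# Toy headlines, invented for illustration (not from the thesis corpus).
docs = [
    "gold price rises as dollar falls",
    "dollar falls and gold demand rises",
    "fed rate hike pressures gold price",
    "fed rate decision lifts dollar",
]

def document_term_matrix(texts):
    """Return (sorted vocabulary, one row of term counts per document)."""
    counts = [Counter(t.split()) for t in texts]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[w] for w in vocab] for c in counts]

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(dtm, beta=6):
    """Unsigned soft-threshold adjacency: a_ij = |cor(term_i, term_j)| ** beta.

    The diagonal is left at zero so that row sums give each term's
    connectivity to the other terms only.
    """
    cols = list(zip(*dtm))  # one count vector per term
    p = len(cols)
    adj = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            a = abs(pearson(cols[i], cols[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj

def connectivity(adj):
    """Weighted degree of each term: the sum of its adjacency row."""
    return [sum(row) for row in adj]

vocab, dtm = document_term_matrix(docs)
adj = soft_threshold_adjacency(dtm, beta=2)  # beta=2 is an arbitrary choice
k = dict(zip(vocab, connectivity(adj)))
```

Raising the absolute correlation to a power keeps the network fully weighted while shrinking weak associations toward zero, so no hard cutoff on edges is needed. Connectivity values such as `k` are one example of the kind of network information that could then enter a classification model as additional features.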
Keywords/Search Tags:gold futures, web crawler, text analysis, network analysis, SGLS-Logistic model