Web Text Mining And Its Application In Correlation Analysis Between Events

Posted on: 2009-09-05  Degree: Master  Type: Thesis
Country: China  Candidate: S W Qi
GTID: 2178360278956678  Subject: Systems Science
Abstract/Summary:
The World Wide Web (WWW) contains rich information resources, but acquiring useful information from it quickly has long been a difficult problem. Web text mining can extract useful knowledge from web text, so it is important both in theory and in practice. This thesis studies web text mining and its application to event correlation analysis, and makes contributions in the following three aspects:

(1) Analysis of three web text classification methods. The thesis compares three text classification methods, SVM, KNN and Bayes, both analytically and experimentally. It runs stability tests on two feature selection methods, IG and CHI, using the samples studied in this thesis, and selects SVM as the classifier best suited to the subsequent work.

(2) A new method of web event correlation mining based on Chinese word features. A web event graph model is constructed based on the F-D algorithm. Each event is represented by a feature word vector, and the correlation between events is described by the correlation between their feature word vectors. Experiments are carried out on web texts about the 5.12 earthquake to study the correlation between events that have happened.

(3) A method based on web information flow for studying the correlation between web information and stock price. The concepts of web information intensity and stock price intensity are proposed and defined. A model is constructed to describe the correlation between web information and stock price, in order to study price inflection in the stock market. With this method, the latent correlation between web information and stock price inflection can be exploited effectively.
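
The idea in (2), representing each event by a feature word vector and comparing events through those vectors, can be illustrated with a minimal Python sketch. The abstract does not say which correlation measure the thesis uses, so cosine similarity is an assumption here. The function names and the toy word lists are also hypothetical and are not data from the thesis.

```python
import math
from collections import Counter


def feature_vector(words):
    """Build a sparse feature word vector: feature word -> occurrence count."""
    return Counter(words)


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors (an assumed measure,
    not necessarily the one used in the thesis)."""
    shared = set(u) & set(v)
    dot = sum(u[w] * v[w] for w in shared)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an event with no feature words correlates with nothing
    return dot / (norm_u * norm_v)


# Hypothetical toy events, each reduced to its extracted feature words.
event_a = feature_vector(["earthquake", "rescue", "rescue", "road"])
event_b = feature_vector(["earthquake", "rescue", "donation"])
event_c = feature_vector(["stock", "price"])

if __name__ == "__main__":
    print(cosine_similarity(event_a, event_b))
    print(cosine_similarity(event_a, event_c))
```

Under this assumed measure, events that share many feature words score close to 1 and events with no feature words in common score 0. In a real pipeline the word lists would come from Chinese word segmentation and feature selection rather than being written by hand.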
Keywords/Search Tags: Web text mining, event correlation, information intensity