
Data Warehouse Subject Extraction Based on Community Structure Discovery

Posted on: 2012-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: B Kong
GTID: 2218330371451823
Subject: Computer software and theory
Abstract/Summary:
Subject orientation is one of the defining characteristics of a data warehouse: its data is organized by subject. Selecting subjects accurately is therefore the first condition for designing a data warehouse successfully. If the subjects are not chosen properly, the data cannot be organized reasonably, and building the warehouse loses its practical significance. At present, data warehouse design generally relies on requirement analysis to determine the subjects. However, this approach depends too much on the designer's experience and on the accuracy of the requirement analysis, so it is often difficult to ensure that the subject elements are selected reasonably.

Against this background, the thesis studies the extraction of data warehouse subjects. It presents a new method, based on complex network theory and community structure discovery, for extracting data warehouse subjects from a large body of literature.

The thesis first studies community discovery in weighted networks. Based on the idea of information spreading, it presents an algorithm for discovering communities in a weighted network. The algorithm transforms the nodes of the weighted network into vectors through the spreading of information between nodes. Network clustering is thereby transformed into vector clustering, which gives an effective way to discover communities in weighted networks.

Following this subject extraction scheme, the thesis extracts related words from the sea ice literature and constructs a weighted word association network. The complex network characteristics of this network are analyzed. The community discovery algorithm is then applied to the word association network for the marine domain to find its community structure. Finally, subject extraction for an ocean data warehouse is implemented. The method provides supplementary information for subject extraction in data warehouses.
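The pipeline described in the abstract can be illustrated with a small sketch. The abstract does not give the thesis's actual spreading rule or clustering method, so everything below is an assumption made for illustration: edge weights are document co-occurrence counts, information spreads in proportion to edge weight for a fixed number of steps (a random-walk style rule), and the resulting vectors are clustered with a plain k-means. The function names and the toy keyword lists are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict
from itertools import combinations
import math


def build_cooccurrence(docs):
    """Edge weight = number of documents in which two terms co-occur."""
    weights = defaultdict(float)
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            weights[(a, b)] += 1.0
    return dict(weights)


def spread_vectors(weights, steps=3):
    """Turn each node into a vector by spreading one unit of information.

    Assumed rule: at every step a node passes everything it holds to its
    neighbours in proportion to edge weight. A node's vector records how
    much information each node has held, summed over all steps
    (including the starting unit).
    """
    nodes = sorted({n for edge in weights for n in edge})
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    adj = [[0.0] * size for _ in range(size)]
    for (a, b), w in weights.items():
        adj[index[a]][index[b]] = w
        adj[index[b]][index[a]] = w
    strength = [sum(row) for row in adj]

    vectors = {}
    for src in nodes:
        held = [0.0] * size
        held[index[src]] = 1.0
        total = held[:]
        for _ in range(steps):
            nxt = [0.0] * size
            for i, amount in enumerate(held):
                if amount == 0.0 or strength[i] == 0.0:
                    continue
                for j, w in enumerate(adj[i]):
                    if w:
                        nxt[j] += amount * w / strength[i]
            held = nxt
            total = [t + h for t, h in zip(total, held)]
        vectors[src] = total
    return vectors


def kmeans(vectors, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    names = sorted(vectors)
    centers = [vectors[names[0]]]
    while len(centers) < k:
        far = max(names, key=lambda n: min(math.dist(vectors[n], c) for c in centers))
        centers.append(vectors[far])
    labels = {}
    for _ in range(iterations):
        labels = {
            n: min(range(k), key=lambda c: math.dist(vectors[n], centers[c]))
            for n in names
        }
        for c in range(k):
            members = [vectors[n] for n in names if labels[n] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy keyword lists standing in for terms extracted from literature (invented data).
docs = [
    ["sea ice", "thickness", "concentration"],
    ["sea ice", "thickness", "concentration"],
    ["thickness", "concentration"],
    ["temperature", "salinity", "current"],
    ["temperature", "salinity", "current"],
    ["concentration", "temperature"],  # weak bridge between the two groups
]

weights = build_cooccurrence(docs)
vectors = spread_vectors(weights, steps=3)
labels = kmeans(vectors, k=2)

communities = defaultdict(list)
for term, label in labels.items():
    communities[label].append(term)
for label, terms in sorted(communities.items()):
    print(label, sorted(terms))
```

In this sketch each community of terms would be read as one candidate subject. The number of spreading steps and the number of clusters `k` are free parameters here; how the thesis chooses them is not stated in the abstract.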
Keywords/Search Tags: complex networks, community discovery, data warehouse, subject extraction