Clustering and Integration Method for Metadata Based on Improved Glowworm Swarm Optimization Algorithm

Posted on: 2019-09-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y P Li
GTID: 1368330602482901
Subject: Management Science and Engineering
Abstract/Summary:
The continuous development of information technology has made traditional information systems and business intelligence systems ever more widely used. As data resources are continuously generated and accumulated, their value has drawn increasing attention, and data mining technology has emerged in response. As a core tool for building databases and data warehouses, metadata plays a crucial role in data organization, data management, data warehouse construction, and data mining in traditional databases. As an important data mining method, the clustering and integration of metadata also offers a more effective way to mine data in a data warehouse. However, there is little research on the clustering and integration of metadata. In the context of Big Data, and of data flow in particular, data structures are more complex, data scale is larger, and data generation is more dynamic. These trends create new challenges for the clustering and integration of metadata. This dissertation therefore applies a Swarm Intelligence Algorithm with global optimization ability to these problems, with the aim of making metadata clustering more accurate and metadata integration more effective in the construction and management of data warehouses. It uses an improved Glowworm Swarm Optimization (GSO) algorithm to address the clustering and integration of metadata, discusses clustering methods for both metadata records and metadata trees, and discusses integration methods for both isomorphic and heterogeneous metadata. The main research content and innovations are summarized as follows:
(1) An integration method for static metadata management is designed based on a dynamic index tree. To solve the integration problems of traditional static metadata sets, the work focuses on an integration strategy for multi-attribute heterogeneous metadata. First, a method for integrating isomorphic metadata based on a dynamic index tree is designed, and the corresponding cleaning operation is carried out during metadata integration. On this basis, an integration method for multi-attribute heterogeneous metadata is constructed by designing a multi-level metric algorithm for metadata similarity.
(2) A clustering method for metadata records is designed based on the improved Glowworm Swarm Optimization algorithm. By object, metadata clustering problems can be divided into the clustering of metadata records and the clustering of metadata trees. For metadata records, good-point set theory is introduced into the GSO algorithm to optimize its initial population distribution and improve its clustering effect, and the improved GSO algorithm is combined with the K-means and K-prototypes algorithms to produce two new algorithms for clustering metadata records (the GSOK algorithm and the GSOKP algorithm).
(3) A clustering method for metadata trees is designed based on the GSOKP algorithm. Given the structural characteristics of a metadata tree, maximum frequent path similarity is used to measure the similarity between metadata trees and so improve computational efficiency, and the feature vector of the maximum frequent path is updated by combining the GSOKP algorithm with maximum frequent path technology to improve clustering accuracy. On this basis, a clustering method for metadata trees, the GSOKP-FP algorithm, is designed.
(4) Building on the clustering methods for metadata records, a clustering and integration method for dynamic metadata in a data flow environment is studied. It is based on an incremental decision tree and borrows the basic idea of the dynamic index tree used in static metadata integration. First, a construction method for an incremental decision tree for metadata is designed. Second, specific ways of realizing clustering, branching, and pruning in the incremental decision tree are designed. Third, the GSOKP and GSOKP-FP algorithms are introduced to solve the clustering of metadata records and metadata trees during the clustering operation, while branching and pruning are realized through information gain and the error rate marked by category, so that the scale of the incremental decision tree for metadata is controlled and metadata integration is managed more effectively.
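The good-point set initialization mentioned in contribution (2) can be illustrated with a minimal sketch. The abstract does not say which good-point construction the dissertation uses, so the code below assumes the common cyclotomic-field construction, in which point i has coordinates frac(i * 2cos(2*pi*k/p)) for k = 1..dim and p is the smallest prime with p >= 2*dim + 3. The function names and the bounds arguments are hypothetical, chosen only for illustration.

```python
import math


def smallest_prime_at_least(n):
    """Return the smallest prime p with p >= n (trial division)."""
    def is_prime(k):
        if k < 2:
            return False
        for d in range(2, math.isqrt(k) + 1):
            if k % d == 0:
                return False
        return True

    while not is_prime(n):
        n += 1
    return n


def good_point_set(n_points, dim, lower, upper):
    """Deterministic, evenly spread initial positions for a swarm.

    Assumed construction (not confirmed by the source):
    r_k = 2*cos(2*pi*k/p), with p the smallest prime >= 2*dim + 3.
    Point i has unit-cube coordinates frac(i * r_k), which are then
    scaled into the search box [lower, upper].
    """
    p = smallest_prime_at_least(2 * dim + 3)
    r = [2.0 * math.cos(2.0 * math.pi * k / p) for k in range(1, dim + 1)]
    points = []
    for i in range(1, n_points + 1):
        # Fractional part; Python's % maps negative products into [0, 1].
        unit = [(r_k * i) % 1.0 for r_k in r]
        points.append(
            [lower[j] + unit[j] * (upper[j] - lower[j]) for j in range(dim)]
        )
    return points


# Example: 20 glowworm start positions in a 2-D box.
initial_positions = good_point_set(20, 2, [0.0, 0.0], [10.0, 5.0])
```

Compared with uniform random sampling, such a point set is deterministic and tends to cover the search space more evenly for the same number of points, which is the usual motivation for using it to seed a swarm before the clustering step.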
Keywords/Search Tags:Glowworm Swarm Optimization algorithm, Metadata, Clustering, Frequent path, Incremental decision tree