Design And Application Of Metadata For Service On Big Data Management And Extraction

Posted on: 2022-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Chen
GTID: 2518306506463394
Subject: Computer technology
Abstract/Summary:
With the spread of big data and artificial intelligence technology, the volume of multi-source heterogeneous data, and of unstructured data in particular, has grown sharply. When processing such data, traditional big data platforms face challenges including insufficient data acquisition and processing capacity, difficulty unifying data structures, and difficulty operating and maintaining data, all of which hinder insight into the value of the data. One reason is that the data comes from many kinds of systems, each built to its own data standards, so the data sources differ. Data source management must therefore be dynamically extensible, which requires organizing and managing metadata. To gather heterogeneous data sources onto a unified platform and ensure the safe use of big data, unified standards management and data access control management, such as data host management, are needed. Big data management must also address the following application problems: hot and cold data scheduling based on the data life cycle, data storage security and replica management, backup management for abnormal data, and partitioning and sharding storage strategies for mass data storage and scheduling optimization. All of these problems need to be managed effectively with the help of metadata and related strategy design. This thesis therefore focuses on metadata methods and applications that serve big data.

This thesis studies data management problems in big data and uses an RDB metadata extension method to dynamically define and manage data tables, data views, data access, and data organization in big data, which solves problems encountered in data source expansion, data heterogeneity, data reorganization, and data migration. The main work is divided into two parts. First, the thesis proposes a data table metadata structure based on data versions to address the management of original data during data extraction and data update; designs a data source metadata structure with data mapping to extend data sources dynamically and analyze heterogeneous data; designs a host-authorized dynamic view metadata structure to address the definition of multiple forms of data organization; and puts forward a data listening metadata structure with timestamps to address the timeliness of data loading and migration in big data. Second, it studies an incremental update method based on time blocks, which can extract metadata incrementally. Finally, the thesis develops a metadata management demo system to verify the proposed research.
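
The abstract does not spell out how the time-block incremental update works, so the following is only a minimal illustrative sketch of one common way such a scheme can be built, not the thesis's actual method. It assumes each metadata record carries an update timestamp, that time is cut into fixed-size blocks, and that the extractor remembers a watermark marking the end of the last fully processed block. The names `Record`, `block_start`, and `incremental_extract` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record with a last-update timestamp."""
    key: str
    updated_at: datetime


def block_start(ts, block, epoch=datetime(1970, 1, 1)):
    """Floor a timestamp to the start of the fixed-size time block containing it."""
    return epoch + ((ts - epoch) // block) * block


def incremental_extract(records, watermark, now, block):
    """Extract records from time blocks that closed since the last run.

    Assumption: only fully elapsed blocks are extracted, so the block that
    contains `now` is left for a later run. Returns the extracted records
    (oldest first) and the new watermark.
    """
    upper = block_start(now, block)
    if upper <= watermark:
        # No new block has closed since the previous run.
        return [], watermark
    picked = [r for r in records if watermark <= r.updated_at < upper]
    return sorted(picked, key=lambda r: r.updated_at), upper
```

Each run passes the watermark returned by the previous run, so a record is extracted exactly once, in the first run after its block has closed. Leaving the still-open block untouched trades a delay of up to one block length for not having to re-read a block that is still receiving updates.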
Keywords/Search Tags: Big data, Metadata, Data source, Heterogeneous data