
The Preliminary Research And Implementation Of Data Registration Methods Based On DOA

Posted on: 2016-12-22
Degree: Master
Type: Thesis
Country: China
Candidate: M J Wang
Full Text: PDF
GTID: 2308330461455569
Subject: Computer technology
Abstract/Summary:
With the advent of the big data era, data has changed substantially: from traditional, structured data to data that is massive, multi-source, heterogeneous, dynamic, real-time, and complex. Such data spans many disciplines and fields. Big data technologies, chiefly extraction, mining, analysis, and management, are now common across research and production work. Data volumes are also growing explosively: the hundreds of terabytes, or even hundreds of petabytes, held by an industry or enterprise are far beyond the processing capacity of traditional management technology and information systems. Finding effective techniques, methods, and platforms for managing big data is therefore an urgent practical need. To address these problems in a way that fits software architecture, Data Oriented Architecture (DOA) emerged; it provides a unified definition of data, data management, and data services. It also offers a new and effective approach to information sharing, system expansion, data management, support for data analysis and data mining, software engineering, information security, and enterprise data security.

This paper studies each function and design step in the data registration process, with particular attention to data registration methods, building on DOA and on the characteristics of big data and data management. Taking data as the core, it adopts the big data mindset that "everything is data". On this view, it defines the concept of generalized data: anything that can be registered is called generalized data.
With generalized data registration as the goal, and considering the characteristics of data in a data registration center that provides unified data management, the paper defines a general metadata standard. According to the type, source, access rights, etc. of the data source, it defines and designs three data registration methods. Because the same or similar data may be registered more than once, and because data must be retrieved efficiently, it discusses data storage and index construction. Finally, it focuses on implementing the three registration methods, especially automatic data registration. This paper covers the following aspects:
(1) Definition and classification of registered data. Because what is registered is generalized data, the definition of data here is the definition of generalized data. To support later storage and retrieval, the principles of data classification must also be studied at registration time.
(2) General metadata and its standard. The paper first introduces the concept, functions, and types of metadata; then, drawing on domestic and international metadata research and the actual requirements of this work, it designs a metadata specification to serve as a general template for DOA.
(3) The data registration process. The paper proposes three registration methods: manual, semi-automatic, and fully automatic. The appropriate method is selected according to the type, structure, and access rights of the data. Because the same data may be registered repeatedly, duplicates must be checked for before the metadata is stored.
The paper designs similarity detection methods for different types of data resources, including voice, image, and text; video is not yet supported.
(4) Directory structure and metadata storage. To manage metadata better, the paper designs a metadata storage directory with a B+ tree as the underlying data structure. In designing the directory structure, three important metadata elements are selected as key information. Metadata is stored in XML documents, and an auxiliary database is proposed for querying. With this database, metadata can be queried and located more efficiently, and repeated registrations can be detected.

The main results and innovations of this paper:
(1) A general metadata specification based on the DOA framework. After comparing several existing metadata standards, the paper analyzes both the basic and the special attributes of various data types under the concept of unified data management, and establishes a universal metadata standard that meets the unified data management needs of the data registration center.
(2) Three data registration methods built on the Data Register Center. The paper defines manual, semi-automatic, and automatic registration. According to data type, source, and access properties, it implements the three methods with different technologies, including web crawling and natural language processing.
(3) A visual user interface for data registration and management. Data registration under the DOA framework is a prerequisite for implementing the core component, the data registration center.
To make managing data more convenient for users of the data registration center, the paper uses interface design tools, compares the differences and similarities of the three registration methods, and designs an interface for each.
(4) Three data registration methods proposed on the basis of the DOA framework. According to data type, source, access rights, etc., and based on DOA's uniform treatment of data (the definition of data, the management of data, and data services), the paper puts forward three registration methods: manual, semi-automatic, and fully automatic.
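
The abstract describes storing metadata as XML and checking for repeated registration before the metadata is stored, but it does not give the metadata fields or the similarity algorithm. The Python sketch below is therefore only an illustration under stated assumptions: the three fields (`title`, `type`, `source`), the `Registry` class, the word-shingle Jaccard measure, and the 0.8 threshold are all hypothetical and are not taken from the thesis.

```python
import xml.etree.ElementTree as ET


def build_metadata_xml(title, data_type, source):
    """Serialize a hypothetical three-field metadata record to an XML string."""
    root = ET.Element("metadata")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "type").text = data_type
    ET.SubElement(root, "source").text = source
    return ET.tostring(root, encoding="unicode")


def word_shingles(text, k=2):
    """Return the set of k-word shingles of a lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(a, b, k=2):
    """Jaccard similarity of the two texts' shingle sets, in [0, 1]."""
    sa, sb = word_shingles(a, k), word_shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


class Registry:
    """In-memory stand-in (assumption) for the registration center's metadata store."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed cut-off, not from the thesis
        self.records = []  # list of (text, metadata_xml) pairs

    def register(self, title, data_type, source, text):
        """Store the record unless a sufficiently similar text is already registered."""
        for existing_text, _ in self.records:
            if jaccard_similarity(existing_text, text) >= self.threshold:
                return False  # treated as a repeated registration
        self.records.append((text, build_metadata_xml(title, data_type, source)))
        return True
```

The linear scan over stored records keeps the sketch short; the thesis instead describes a B+ tree directory and an auxiliary database for locating metadata, which this example does not attempt to reproduce. Voice and image similarity would need different features from the text measure shown here.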
Keywords/Search Tags: DOA Framework, Generalized Data, Metadata Specification, Registration Methods