Research On Semantic Analysis And Mapping Method Of Semi-structured Data

Posted on:2013-01-14

Degree:Master

Type:Thesis

Country:China

Candidate:Y Fu

Full Text:PDF

GTID:2218330374466032

Subject:Computer application technology

Abstract/Summary:

The emergence of semi-structured data has driven the development of heterogeneous data integration in enterprises. Because such data is schema-less and self-describing, it is convenient to use in applications, but the blurred boundary between structure and data also makes heterogeneous data integration difficult. Determining quickly and effectively the mapping between data items in semi-structured data and data items in structured data has therefore become an important research issue in heterogeneous data integration.

The paper exploits the particular properties of data elements: it analyzes the structure of data items in semi-structured data and of data elements in structured data, and proposes a matching algorithm between data items and data elements. The matching algorithm is based on the Levenshtein distance algorithm and incorporates the ideas of longest common subsequence, weighting, and backward focus. Once the similarity between a data item and a data element has been calculated, data items in semi-structured data can be matched to data items in the database.

The paper takes Excel as a typical example of semi-structured data. A large number of Excel files are analyzed to understand how data is filled in and to summarize the common styles. With the help of the notations in Excel, information is extracted from Excel files, including headers, data items, related data items, and so on. After a data item and its related data items are extracted, the rules governing their composition structure are analyzed and summarized, together with the rules governing their contextual relationships. By studying the theory of data elements and its application in the oil field, the semantics of data elements are analyzed and the rules of their composition structure are summarized. The composition rules of data items and data elements are then compared to identify the rules that characterize high similarity. Based on these rules, a mapping algorithm is designed and the similarity between data items and data elements is calculated. Finally, the paper describes the implementation of the mapping system and demonstrates the correctness and feasibility of the proposed theory, using the standard data elements of Chinese Petroleum Company, the EPDM data dictionary, and the databases of five service companies as experimental data.
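The abstract names the ingredients of the matching algorithm (Levenshtein distance, longest common subsequence, weights, and backward focus) but not how they are combined. The following is a minimal sketch of one possible combination, not the thesis's actual algorithm. The function names, the weight values, and the reading of "backward focus" as extra credit for a shared suffix are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def common_suffix_length(a: str, b: str) -> int:
    """Number of trailing characters that a and b share."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def similarity(item: str, element: str,
               w_edit: float = 0.5, w_lcs: float = 0.3,
               w_suffix: float = 0.2) -> float:
    """Weighted similarity in [0, 1]; the weights are illustrative assumptions."""
    if not item or not element:
        return 0.0
    longest = max(len(item), len(element))
    edit_sim = 1 - levenshtein(item, element) / longest
    lcs_sim = lcs_length(item, element) / longest
    # Assumed reading of "backward focus": reward agreement at the end of the name.
    suffix_sim = common_suffix_length(item, element) / longest
    return w_edit * edit_sim + w_lcs * lcs_sim + w_suffix * suffix_sim


def best_match(item: str, elements: list[str]) -> str:
    """Return the candidate data element most similar to the data item."""
    return max(elements, key=lambda e: similarity(item, e))
```

Under these assumptions, an Excel header would be mapped by calling `best_match(header, candidate_elements)`, where the candidate list would come from a data element standard or a data dictionary. In practice the weights would need to be tuned against known correct mappings.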

Keywords/Search Tags:

Semi-structured Data, Data Element, Similarity Calculation, Data Mapping

Related items

1	Research On Integrated Technology Of Semi-structured Data
2	Research And Application Of Semi-structured Data Storage Method On Blockchain
3	Study Of Mining Data Streams Based On Semi-Structured Data
4	Research On The Data Model And The Approaches To Data Mining In The Semi-structured Data
5	Research And Application Of Conversion Between XML Which Is A Kind Of Semi-Structured Data And Structured DB
6	Clustering Research Of Semi Structured Data And Its Application In Product Design
7	Research Of Semi-structured Data Storage Technology On Xml
8	Research On Large-scale Structured And Semi-structured Biodata Query Method
9	Research On The Storing And Querying Of Semi-Structured Data On The Web
10	Research Of Multimedia Data Conversion And Storage Based On XML