Research And Implementation On Massive Unstructured Data Organization

Posted on:2009-06-03

Degree:Master

Type:Thesis

Country:China

Candidate:B Zou

GTID:2178360275971924

Subject:Computer system architecture

Abstract/Summary:

The continuous development of computer applications has led to a dramatic increase in the amount of data. Unstructured data grows far faster than structured data, because the rate at which data can be structured is limited by the speed of manual processing. Using the traditional directory hierarchy to organize and manage large-scale unstructured data has many shortcomings: a directory tree cannot express the logical relationships among massive data items, and maintaining the consistency of the directory tree becomes difficult and costly as the volume of unstructured data grows. The organization of massive unstructured data has therefore become a pressing research problem.

Based on an analysis of existing information organization and management methods (such as directory hierarchies, indexing and retrieval, databases, and semantic file systems), combined with the requirements of massive information organization and management (user participation, automation, model extraction, etc.), this thesis designs and implements MUDOMS, a massive unstructured data organization and management system. MUDOMS uses an object model to describe information and attribute-value pairs to describe its characteristics. It provides an interface through which users create attribute-value pairs and relationships among attributes based on their own understanding, so that the system records how users understand the data. The system uses a hybrid index mechanism, THLI (Tree Hash and Link-list Indexing), to index the attributes and relationships. MUDOMS also provides hot navigation, a user-friendly feature through which users can find and access data quickly. Based on user habits, it also creates a personalized logical view that uses different classifications and display orders for convenient use.

Building on user participation in attribute creation, the thesis also discusses a mechanism for obtaining attributes and relationships automatically according to time, space, and context, and then reorganizing them.

According to testing and comparison, MUDOMS provides a method for managing massive unstructured data and incorporates artificial intelligence to obtain semantic attributes. In tests against similar software (Baidu hard disk search and Google Desktop), the time MUDOMS needs to index data is 60 percent lower, and the space it uses is on average 70 percent lower. When memory capacity is large enough, the average time MUDOMS takes to retrieve data is 20 times lower than that of the similar software.
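
The attribute-value object model described in the abstract can be illustrated with a minimal sketch. This is not the thesis's THLI implementation, whose internals the abstract does not specify: a plain hash-based inverted index stands in for it, and all names here (`DataObject`, `AttributeStore`, `tag`, `relate`, `find`) and the sample paths are hypothetical.

```python
from collections import defaultdict


class DataObject:
    """One unstructured item described by attribute-value pairs (hypothetical model)."""

    def __init__(self, object_id, path):
        self.object_id = object_id
        self.path = path
        self.attributes = {}  # attribute name -> value


class AttributeStore:
    """Hash-based inverted index over attribute-value pairs.

    Assumption: this dict-based index is a stand-in for illustration only,
    not the THLI structure named in the abstract.
    """

    def __init__(self):
        self.objects = {}                # object id -> DataObject
        self.index = defaultdict(set)    # (attribute, value) -> object ids
        self.related = defaultdict(set)  # attribute -> related attributes

    def add_object(self, object_id, path):
        self.objects[object_id] = DataObject(object_id, path)

    def tag(self, object_id, attribute, value):
        """Attach a user-created attribute-value pair and keep the index in sync."""
        obj = self.objects[object_id]
        old = obj.attributes.get(attribute)
        if old is not None:
            self.index[(attribute, old)].discard(object_id)
        obj.attributes[attribute] = value
        self.index[(attribute, value)].add(object_id)

    def relate(self, attr_a, attr_b):
        """Record a user-defined, symmetric relationship between two attributes."""
        self.related[attr_a].add(attr_b)
        self.related[attr_b].add(attr_a)

    def find(self, **criteria):
        """Return sorted ids of objects matching every attribute=value criterion."""
        result = None
        for attribute, value in criteria.items():
            ids = self.index.get((attribute, value), set())
            result = set(ids) if result is None else result & ids
        return sorted(result or [])


store = AttributeStore()
store.add_object(1, "/data/report.doc")
store.add_object(2, "/data/photo.jpg")
store.tag(1, "project", "alpha")
store.tag(2, "project", "alpha")
store.tag(2, "type", "image")
store.relate("project", "type")

print(store.find(project="alpha"))
print(store.find(project="alpha", type="image"))
```

Lookups intersect the id sets of each requested pair, so a query never walks a directory tree. A real system at the scale the abstract targets would also need persistence and the tree and linked-list components that this sketch omits.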

Keywords/Search Tags:

Unstructured Data, Massive Data Organization, Attribute Semantics, Semantic Extraction

Related items

1	Research And Implementation On Indexing Mechanism For The Ocean Data Organization
2	Research On The Unstructured Data Ontology And Relevant Algorithms
3	Research On High Performance Data Cube And Its Semantics
4	Research And Application Of Massive Historical Quasi Real-time Data Management Platform
5	Research On Data Organization Algorithms Of Massive Object Storage Systems
6	Semantic Recognition Of Individual Activities Based On Social Media Data
7	Research And Application Of Techniques For Collection And Retrieval On Unstructured Data
8	Research On Organization And Visualization Of Massive 3D Laser Point Cloud Data
9	Research On Visualization Technology Of Multi-attribute Massive Geological Exploration Data
10	Research Of Image Semantics Underlying Feature Extraction