
Refinement of Data Analysis Based on Ontology

Posted on: 2016-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: X D Li
Full Text: PDF
GTID: 2308330473465475
Subject: Computer software and theory
Abstract/Summary:
With the development of Internet technology, semantic documents with embedded metadata (RDF, RDFa, Microformats, etc.) contain increasing amounts of structured and semi-structured data. Tens of thousands of such documents can be accessed, and the number is growing rapidly. To make semi-structured data readable and understandable for both machines and users, we must provide an effective means of retrieving and analyzing semi-structured information.

Metadata on the Semantic Web describes not only the properties of things, but also the hierarchical relationships between things. Traditional association rule mining considers only the things themselves and cannot take the nature of those things into account, so it generates many useless rules. Mining things at a higher level yields more useful association rules.

To present search results intelligently, computers must understand the user's real search demands through semantic data. With an entity search model, search shifts from being text-based to being object-based, which lets search engines understand the user's real needs more intelligently. The entity search model is converted into a node-labelled tree model stored in a distributed inverted index, which supports intelligent search through content queries and structure queries.

We also design an algorithm for mining frequent itemsets through the distributed inverted index. The DiiElact algorithm improves the efficiency of the intersection operation through vertical partitioning of the transaction sets and parallel computing; it is efficient and scalable. Through the entity search model, we can obtain the hierarchical relationships between things and, combined with the DiiElact algorithm, mine high-level association rules.
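The idea of mining frequent itemsets from an inverted index, and of lifting items to a higher level of the hierarchy before mining, can be illustrated with a small single-machine sketch. This is not the DiiElact algorithm itself: the abstract gives no details of its partitioning or parallel scheme, so the code below only shows the general vertical (tidset-intersection) approach. All function names, the `parent` mapping, and the sample data are hypothetical and introduced only for illustration.

```python
from collections import defaultdict


def build_inverted_index(transactions):
    """Map each item to the set of transaction ids (tidset) containing it."""
    index = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            index[item].add(tid)
    return dict(index)


def generalize(transactions, parent):
    """Replace each item by its parent category when one is known.

    `parent` is an assumed one-level hierarchy, e.g. {"apple": "fruit"}.
    """
    return [{parent.get(item, item) for item in items} for items in transactions]


def frequent_itemsets(index, min_support):
    """Depth-first search that grows itemsets by intersecting tidsets.

    Returns a dict mapping each frequent itemset (a tuple) to its support count.
    """
    result = {}

    def extend(prefix, prefix_tids, candidates):
        for i, (item, tids) in enumerate(candidates):
            # Support of prefix + item is the size of the tidset intersection.
            new_tids = tids if prefix_tids is None else prefix_tids & tids
            if len(new_tids) >= min_support:
                itemset = prefix + (item,)
                result[itemset] = len(new_tids)
                # Only later candidates are tried, so each itemset appears once.
                extend(itemset, new_tids, candidates[i + 1:])

    extend((), None, sorted(index.items()))
    return result


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the thesis.
    transactions = [
        {"apple", "milk"},
        {"pear", "milk"},
        {"apple", "bread"},
        {"pear", "milk", "bread"},
    ]
    parent = {"apple": "fruit", "pear": "fruit"}

    item_level = frequent_itemsets(build_inverted_index(transactions), 3)
    high_level = frequent_itemsets(
        build_inverted_index(generalize(transactions, parent)), 3
    )
    print(item_level)
    print(high_level)
```

With a minimum support of 3, the item-level run finds only `milk`, while the generalized run also finds the pair `fruit` and `milk`, a pattern hidden when `apple` and `pear` are counted separately. A distributed variant would presumably split the tidsets across workers and intersect them in parallel, but that part is left out here because the abstract does not specify it.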
Keywords/Search Tags: Metadata, Information Retrieval, Inverted Index, Association Rules, Frequent Itemsets