
WordNet-Based Semantic Matching of Concept Lattices in Search Engines

Posted on: 2009-10-15
Degree: Master
Type: Thesis
Country: China
Candidate: X C Li
GTID: 2178360242487784
Subject: Computer application technology
Abstract/Summary:
In recent years, the number of Internet users has grown steadily and the volume of online information has expanded rapidly. Search engines have become an important means of acquiring new knowledge. However, traditional search engines rely mostly on keyword matching, which causes problems during search, such as an overly large number of returned results and difficulty finding the information actually relevant to a query. Intelligence is the goal of future search engines, and it should show itself in two ways: understanding the query and analyzing page content. Concept-based search engines meet these needs of future information retrieval systems.

Formal Concept Analysis (FCA) is a field of applied mathematics based on the mathematization of concepts and conceptual hierarchies. It thereby brings mathematical thinking to conceptual data analysis and knowledge processing. The core task in FCA is to extract formal concepts, and the connections between them, from data given as a formal context, so as to form a lattice of formal concepts. Concept lattices have been regarded as an ideal abstraction of a knowledge system. By studying the semantic relationships among the concepts in a lattice, we can obtain the "central idea" of the corresponding formal context. As applications of concept lattices develop, the matching of concept lattices will play an increasingly important role, and it is the central task of this thesis.

This thesis proposes a WordNet-based algorithm for the semantic matching of concept lattices in search engines. The intelligence of a search engine shows in its semantic analysis of natural language and its understanding of content.
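
As an illustration of what extracting formal concepts from a formal context means, here is a minimal Python sketch. The toy context (four documents described by index terms) and all function names are hypothetical and do not come from the thesis; the brute-force enumeration is exponential in the number of objects and is meant only to make the definitions concrete, not to reproduce the lattice-building method the thesis uses.

```python
from itertools import combinations

# Hypothetical toy formal context: documents (objects) -> index terms (attributes).
CONTEXT = {
    "doc1": {"search", "engine", "index"},
    "doc2": {"search", "lattice"},
    "doc3": {"lattice", "concept"},
    "doc4": {"search", "engine", "lattice"},
}


def common_attributes(objects, context):
    """Derivation A -> A': the attributes shared by every object in A."""
    result = set().union(*context.values())  # start from all attributes
    for obj in objects:
        result &= context[obj]
    return frozenset(result)


def objects_having(attributes, context):
    """Derivation B -> B': the objects that carry every attribute in B."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every object subset.

    Brute force, suitable for a toy context only.
    """
    concepts = set()
    objs = sorted(context)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, context)
            extent = objects_having(intent, context)
            concepts.add((extent, intent))
    return concepts


def is_subconcept(lower, upper):
    """Lattice order: (A1, B1) <= (A2, B2) iff A1 is a subset of A2."""
    return lower[0] <= upper[0]


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

Each printed pair is one formal concept: a set of documents together with exactly the terms they all share. Ordering the pairs with `is_subconcept` yields the concept lattice.
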
Such intelligence depends on a semantic knowledge base, which serves as the medium and bridge that helps computers understand human language and is an important material prerequisite for making computers more capable. Among the many semantic knowledge bases, WordNet has become the most important common semantic resource and practical standard because of its simple structure and detailed content. We use WordNet to compute the semantic relations between words and then form a concept graph. By computing the similarity between pairs of nodes and between nodes and concepts, we finally obtain the similarity of the lattices.

The thesis then studies the application of the semantic matching algorithm in a search engine built on the FCA model. We mainly studied how the spider chooses its crawling direction based on the concept lattice, and the matching between queries and documents. The work covers seed URL selection, HTML parsing, indexing, attribute extraction, concept lattice construction, and so on.

Test data confirm the feasibility of the WordNet-based semantic matching algorithm for concept lattices in refining the context and constructing an FCA-based IR model. The advantage of the IR model based on formal contexts lies in how it organizes data: it reflects the latent relations between documents. Combined with the context reduction method, the model offers users a practical, lattice-based way of browsing. The practical value and function of the FCA IR model are confirmed in the FCA search engine system.
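
The abstract does not give the similarity formulas, so the following Python sketch only illustrates the general idea of going from word-level similarity to lattice-level similarity. Everything in it is an assumption: the small `HYPERNYM` table is a hypothetical stand-in for WordNet's hypernym hierarchy, the path-based word similarity is one common choice rather than the measure used in the thesis, and each lattice is simplified to a list of concept intents (sets of words).

```python
# Hypothetical hypernym links (child -> parent), standing in for WordNet.
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}


def ancestors(word):
    """Return the chain word, parent, grandparent, ... up to the root."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def word_similarity(a, b):
    """Path-based similarity: 1 / (1 + edges on the path between a and b)."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    depth_in_b = {w: i for i, w in enumerate(chain_b)}
    for i, w in enumerate(chain_a):
        if w in depth_in_b:  # first shared ancestor
            return 1.0 / (1 + i + depth_in_b[w])
    return 0.0  # no shared ancestor in the toy taxonomy


def set_similarity(items_a, items_b, sim):
    """Symmetric average of each item's best match in the other collection."""
    if not items_a or not items_b:
        return 0.0
    a_to_b = sum(max(sim(x, y) for y in items_b) for x in items_a) / len(items_a)
    b_to_a = sum(max(sim(y, x) for x in items_a) for y in items_b) / len(items_b)
    return (a_to_b + b_to_a) / 2


def concept_similarity(intent_a, intent_b):
    """Similarity of two concepts, compared through the words in their intents."""
    return set_similarity(intent_a, intent_b, word_similarity)


def lattice_similarity(lattice_a, lattice_b):
    """Similarity of two lattices, each given as a list of concept intents."""
    return set_similarity(lattice_a, lattice_b, concept_similarity)


if __name__ == "__main__":
    query = [frozenset({"dog"})]
    print(lattice_similarity(query, [frozenset({"wolf"})]))
    print(lattice_similarity(query, [frozenset({"sparrow"})]))
```

In this sketch a query lattice about "dog" scores higher against a document lattice about "wolf" than against one about "sparrow", because the shared ancestor is closer in the hypothetical taxonomy. A real implementation would replace `HYPERNYM` with lookups into WordNet itself.
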
Keywords/Search Tags: Formal Concept Analysis, Concept Lattice, Search Engine, Semantic Matching