Research On Several Key Issues In Vertical Search

Posted on: 2009-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: T Peng
Full Text: PDF
GTID: 2178360245469996
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Addressing the specific needs of vertical search engines, this thesis deals with the problems of product hierarchy extraction and product-oriented query expansion. Its main contributions are listed below:

1. This thesis proposes an algorithm for product hierarchy extraction based on page analysis. The algorithm detects repeating patterns in a pattern string that encodes each node's DOM path from leaf to root. Product URLs are then classified into several categories, and finally a name is picked from the page for each category. The method achieves an accuracy of 71% in clustering and 77.3% in naming.

2. This thesis presents a novel method, based on the concept lattice, that offers query expansions to the user. In information retrieval, the document-keyword relation can be regarded as the context in Formal Concept Analysis: a document represents an object and its keywords represent attributes, so a concept lattice can be constructed. Product-oriented query expansions are then derived from the distance between concept nodes in the concept lattice together with the product hierarchies. The results show that users can find a piece of information more concisely and quickly.

3. This thesis presents a smart information retrieval system. With this system, people can conduct their research more easily in a personal lab.

Part 1 is the preprocessing step, part 2 is the core module of the thesis, and part 3 is the engineering implementation.
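To illustrate the Formal Concept Analysis view described in contribution 2, the following is a minimal sketch, not the thesis's implementation. It treats documents as objects and keywords as attributes, enumerates the formal concepts of a toy context by brute force, and suggests expansion terms from the closure of a query (the keywords shared by every matching document). The data, the function names, and the closure-based expansion rule are all assumptions made for illustration; the thesis itself ranks expansions by distance between concept nodes and by the product hierarchy, which this sketch does not model.

```python
from itertools import combinations

# Hypothetical toy context: document (object) -> keywords (attributes).
CONTEXT = {
    "doc1": {"laptop", "notebook", "battery"},
    "doc2": {"laptop", "notebook"},
    "doc3": {"laptop", "charger"},
    "doc4": {"phone", "charger"},
}


def extent(context, keywords):
    """Documents that contain every keyword in `keywords`."""
    return frozenset(d for d, kws in context.items() if keywords <= kws)


def intent(context, docs):
    """Keywords shared by every document in `docs`."""
    docs = list(docs)
    if not docs:
        # By convention, the empty set of objects shares all attributes.
        return frozenset().union(*context.values())
    shared = set(context[docs[0]])
    for d in docs[1:]:
        shared &= context[d]
    return frozenset(shared)


def formal_concepts(context):
    """All (extent, intent) pairs, found by closing every subset of documents.

    Exponential in the number of documents, so suitable for toy data only.
    """
    concepts = set()
    docs = list(context)
    for r in range(len(docs) + 1):
        for subset in combinations(docs, r):
            shared = intent(context, subset)
            concepts.add((extent(context, shared), shared))
    return concepts


def expand_query(context, query):
    """Suggest keywords implied by the query: the closure of the query minus the query itself.

    This is a simplified stand-in for lattice-distance ranking (an assumption).
    """
    q = frozenset(query)
    matching = extent(context, q)
    if not matching:
        return []  # no document matches, so there is nothing to base an expansion on
    return sorted(intent(context, matching) - q)


if __name__ == "__main__":
    for ext, itt in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(itt))
    print(expand_query(CONTEXT, {"notebook"}))
```

In this toy context every document tagged "notebook" is also tagged "laptop", so "laptop" is offered as an expansion for the query "notebook". A fuller version would walk to neighboring concepts in the lattice and weight them by distance, as the abstract describes.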
Keywords/Search Tags: vertical search engine, concept lattice, page analysis, product hierarchy, query expansion