Product Information Extraction And Concept Lattice Structure Displaying Based On FCA

Posted on: 2009-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: F Wang
Full Text: PDF
GTID: 2178360242498352
Subject: Applied Mathematics
Abstract/Summary:
Given a web query, a search engine usually returns thousands of results, which are dynamic and brief. Most of them are irrelevant to a specific user, so users have to browse through a long list or a complex tree view to find what they want. A list view is easy to understand but cannot reflect the relations and differences among products, and 80% of users do not browse beyond the first three pages of results. A tree view shows the relations among products easily, but the products are pre-classified, so users must follow a fixed path to find product information, which lacks flexibility. Therefore, to help users obtain the product information that genuinely interests them, we must design a simple and practical method for searching and displaying product information.

After studying the theory of formal concept analysis (FCA), we find that a concept lattice is in fact a network of relationships among concepts. It contains many direct and indirect relations, such as inheritance relations, binary relations, and similarity relations between lattice nodes. Applied in practice, FCA can display product information together with other related information. This not only improves the accuracy of the information but also enriches the content of the relevant search results. Therefore, this paper presents a method for extracting formal concepts from web information and a display strategy based on the concept lattice of FCA.

The paper has two parts: web product information extraction, and product information display with its optimization strategies based on the concept lattice.

The main task of the information extraction part is to extract the attribute information of products from web pages.
We extract the needed product information from HTML code using regular expressions. The main steps are as follows: first, we find the web pages that contain the target information and obtain their HTML code; next, we analyze the structure of the pages and write appropriate regular expressions; finally, we extract the target information by pattern matching, so that the system can analyze pages and extract the information users need automatically.

The main idea of the product information display and optimization strategies is to show the relations and differences among products based on the order relation of the concept lattice, and thereby to support users' purchase decisions. First, this part presents three kinds of implied relations in concept lattices. It then explains how to display product information based on the concept lattice structure. A concept lattice can not only show the relevance of information but also eliminate irrelevant information, reduce browsing complexity, and improve the usability of information. Accordingly, this paper defines the key lattice and core concepts and gives a method for mining core concepts. Finally, it proposes a method for optimizing the integrated information using the relationships among attributes.

The main contributions are as follows:
(1) To obtain the results users need, we selectively extract product parameter information using regular expressions.
(2) We explain how to display product information based on the concept lattice structure. This structure not only shows users the relations among concepts clearly but also reveals the relations and differences among products based on the order relation of the concept lattice.
(3) We define key concepts and the key lattice, and give a method for mining core concepts from the key lattice.
(4) We propose a concept clustering method. Its main idea is to measure the relationship among concepts with a similarity measure and to complete the classification based on attribute membership.
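
The extraction step described in the abstract (obtain the HTML, write a regular expression for the page structure, extract by pattern matching) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the HTML fragment, the row layout `<tr><th>name</th><td>value</td></tr>`, and the names `SAMPLE_HTML`, `ROW_PATTERN`, and `extract_attributes` are all assumptions made for the example.

```python
import re

# Hypothetical HTML fragment; real product pages will differ in structure,
# and the pattern below would have to be rewritten for each page layout.
SAMPLE_HTML = """
<table class="spec">
  <tr><th>Brand</th><td>Acme</td></tr>
  <tr><th>Screen Size</th><td>2.4 inch</td></tr>
  <tr><th>Camera</th><td> 3.2 MP </td></tr>
</table>
"""

# Pattern written for the assumed row layout:
# <tr><th>attribute name</th><td>attribute value</td></tr>
ROW_PATTERN = re.compile(
    r"<tr>\s*<th>(?P<name>.*?)</th>\s*<td>(?P<value>.*?)</td>\s*</tr>",
    re.IGNORECASE | re.DOTALL,
)


def extract_attributes(html):
    """Return a dict of attribute name -> value found by pattern matching."""
    return {
        match.group("name").strip(): match.group("value").strip()
        for match in ROW_PATTERN.finditer(html)
    }


attributes = extract_attributes(SAMPLE_HTML)
print(attributes)
```

A pattern like this is tied to one page layout, which is consistent with the abstract's step of analyzing each page's structure before writing the expression.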
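
To make the idea of a concept lattice over products more concrete, the sketch below enumerates the formal concepts of a tiny formal context and lists the covering pairs of the order relation (the edges of the lattice diagram). It uses a naive intersection-closure enumeration chosen for brevity, not necessarily the algorithm used in the thesis. The product names, attributes, and function names are hypothetical, and the thesis-specific notions of key lattice and core concepts are not modeled because the abstract does not define them.

```python
# Hypothetical formal context: product -> set of attributes it has.
CONTEXT = {
    "phone_a": {"camera", "bluetooth"},
    "phone_b": {"camera", "gps"},
    "phone_c": {"camera", "bluetooth", "gps"},
}


def extent(context, intent):
    """All products that have every attribute in `intent`."""
    return frozenset(o for o, attrs in context.items() if intent <= attrs)


def all_intents(context):
    """Close the product attribute sets under intersection.

    The full attribute set is included as the intent of the bottom concept.
    """
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for attrs in context.values():
        intents |= {frozenset(attrs) & intent for intent in intents}
    return intents


def concepts(context):
    """Formal concepts as (extent, intent) pairs, smallest extents first."""
    pairs = ((extent(context, i), i) for i in all_intents(context))
    return sorted(pairs, key=lambda c: (len(c[0]), sorted(c[0])))


def hasse_edges(cs):
    """Pairs (lower, upper) where upper covers lower in the extent order."""
    return [
        (lo, hi)
        for lo in cs
        for hi in cs
        if lo[0] < hi[0] and not any(lo[0] < mid[0] < hi[0] for mid in cs)
    ]


lattice = concepts(CONTEXT)
edges = hasse_edges(lattice)
for ext, intent in lattice:
    print(sorted(ext), sorted(intent))
```

In this toy context, the concept with intent `{"camera"}` sits at the top because every product has a camera, and more specific concepts sit below it. This is the kind of inheritance relation among products that the abstract proposes to display.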
Keywords/Search Tags:concept lattice, information extraction, key lattice, core concept, relationship