
Research on Several Key Technical Issues in Product Review Retrieval

Posted on: 2011-01-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W X Tian
Full Text: PDF
GTID: 1228360305483571
Subject: Computer application technology
Abstract/Summary:
Product reviews are an important source of information for both making and buying products. For producers, reviews help improve product performance and quality; for customers, they serve as a reference when deciding what to buy. However, current commercial search engines are limited in retrieval granularity, in identifying reviews, and in classifying reviews by polarity, so product reviews cannot easily be obtained through them, and users must do a great deal of filtering and classifying on the returned document set. Meanwhile, current research on product review retrieval still relies on general-purpose search engines. No systematic solution has yet been proposed for the difficult points, such as characterizing reviews, selecting the retrieval granularity, calculating the relevance between reviews and keywords, and analyzing evaluation units.

Against this background, the thesis addresses the technical issues of product review retrieval, paying particular attention to the application of linguistic theory. It covers syntactic analysis techniques, a model for retrieving product reviews, and methods for obtaining product properties, analyzing evaluation units, obtaining evaluation words, and determining the polarity of an evaluation. The goal is to offer users product review retrieval over documents collected, organized and stored from the web. The contributions of the thesis are as follows.

1. A method of syntactic analysis of natural language based on conjunctive relations between words is proposed.
The method observes the dependency axiom and uses the direction of conjunctive relations, converting the syntactic analysis of natural language into the analysis of conjunctive relations between the words of a sentence. It also exploits the concept-level properties that conjunctive relations reflect. Syntax is analyzed by constructing a knowledge base of conjunctive relations between words and then determining the relations between the words of a sentence by analogy with the relations in the knowledge base. The method unifies lexical form, syntax and semantics and is highly practical. It is applied here to analyzing evaluation units and obtaining evaluation words, and it can be applied to other text information processing tasks.

2. A product review retrieval model is proposed. By jointly considering the key elements of text retrieval, subjectivity/objectivity classification, and polarity analysis, we propose a retrieval model specialized for product reviews. The model avoids the two-phase approach adopted in traditional subjectivity retrieval techniques, laying a foundation for systematically improving the performance of product review retrieval. Using dynamic programming, the model divides a document into sentence groups as defined in linguistics. Taking the sentence group as the processing unit gives the model a sounder retrieval granularity and provides a basis for characterizing reviews.

3. A method of analyzing evaluation units based on the knowledge base of word conjunctive relations is proposed. The analysis of evaluation units is treated as a sequence labeling problem and solved with a maximum entropy model. The probabilities used to train the maximum entropy model come from the word conjunctive relation knowledge base introduced in the syntactic analysis method, which improves training efficiency and yields high precision.

4. A method of obtaining the properties of a given product is proposed.
By analyzing product text data on the web, the pages that contain product properties are divided into three types: introduction, summary and table. A page classifier then assigns each page to one of these types. We define a different template for each page type to extract the properties of the product, and the properties from the different sources are finally merged into the property set of the given product.

5. A bootstrapping method of acquiring evaluation words is proposed. The method iterates through mutual verification of extraction templates and seed words, expanding the evaluation word set. When scoring candidate words or candidate templates, it considers not only the proportion of candidate words in the seed set but also the similarity between candidate words and seed words in the conjunctive relation knowledge base, which improves the precision and stability of the method.

Finally, we assemble all the key techniques into a product review retrieval system that retrieves product reviews by product name or model number, summarizes the results by evaluated object and polarity, and returns them to the user.
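
The bootstrapping idea in item 5 can be illustrated with a small sketch. It is not the thesis's implementation: the abstract gives no formulas, so the template shape (the left and right neighbour of a word), the Jaccard-based similarity, the toy knowledge base `kb`, the weight `alpha` and both thresholds are all assumptions made here for illustration. The sketch shows only the general loop: seeds verify templates, templates propose candidate words, and a knowledge-base similarity term filters candidates that merely share a context with the seeds.

```python
from collections import defaultdict


def extract_slots(sentences):
    """Map each (left, right) context template to the words seen in its slot."""
    fillers = defaultdict(set)
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        for i in range(1, len(padded) - 1):
            fillers[(padded[i - 1], padded[i + 1])].add(padded[i])
    return fillers


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def kb_similarity(word, seeds, kb):
    """Assumed similarity: best Jaccard overlap of KB neighbours with any seed."""
    neighbours = kb.get(word, set())
    return max((jaccard(neighbours, kb.get(s, set())) for s in seeds), default=0.0)


def bootstrap(sentences, seeds, kb, alpha=0.5, template_min=0.5,
              word_min=0.7, max_rounds=10):
    """Grow the evaluation word set from seeds (all parameters are hypothetical)."""
    fillers = extract_slots(sentences)
    words = set(seeds)
    for _ in range(max_rounds):
        # Seeds verify templates: keep templates whose slot is mostly known words.
        good = {t for t, ws in fillers.items()
                if len(ws & words) / len(ws) >= template_min}
        # Record which templates each not-yet-accepted word occurs in.
        seen_in = defaultdict(set)
        for t, ws in fillers.items():
            for w in ws - words:
                seen_in[w].add(t)
        # Templates verify words, combined with the knowledge-base similarity.
        new = set()
        for w, ts in seen_in.items():
            template_score = len(ts & good) / len(ts)
            score = alpha * template_score + (1 - alpha) * kb_similarity(w, words, kb)
            if score >= word_min:
                new.add(w)
        if not new:
            break
        words |= new
    return words


sentences = [
    ["the", "screen", "is", "good", "."],
    ["the", "battery", "is", "bad", "."],
    ["the", "camera", "is", "excellent", "."],
    ["the", "phone", "is", "black", "."],
]
# Toy stand-in for the conjunctive relation knowledge base (invented data).
kb = {
    "good": {"screen", "camera", "quality"},
    "bad": {"battery", "quality"},
    "excellent": {"camera", "quality", "screen"},
    "black": {"colour"},
}
result = bootstrap(sentences, {"good", "bad"}, kb)
```

On this toy input, "excellent" and "black" both fill the slot of the template verified by the seeds, but only "excellent" also resembles a seed in the toy knowledge base, so it is the only word added. This is the role the abstract assigns to the similarity term: it keeps words that merely share a context with the seeds out of the evaluation word set.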
Keywords/Search Tags: product reviews, sentence group, conjunctive relation, syntactic analysis, retrieval model, affiliation relation