Studies On Answering Imprecise Queries Over E-commerce Web Databases

Posted on: 2011-07-20 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: X Li | Full Text: PDF
GTID: 1118330332962467 | Subject: Management Science and Engineering
Abstract/Summary:
In recent years, with the rapid expansion of the World Wide Web, E-commerce has developed rapidly as well. Presenting product information on a web site has become key to e-business. Such a web site is usually supported by an underlying online database; this type of database is referred to as an E-commerce Web database and is accessible only through a web form-based interface. With the widespread use of the Internet and the fast growth in the size of E-commerce Web databases, querying these databases has become an important way for people to obtain product information.

Existing query processing models for E-commerce Web databases usually assume that users know exactly what they want, and they support only strict query matching. However, as the users of E-commerce databases have shifted from professionals who know the application area to lay users who demand "instant gratification", this precise query processing model no longer suits their query style. Such users have insufficient knowledge of the structure and content of the database, and their query intentions are often vague or imprecise, so the query conditions describe those intentions only approximately. Consequently, besides the results that match the query conditions exactly, users also need items that are relevant to the query conditions. To obtain these relevant items, the user has to reformulate the query conditions repeatedly until she or he gets satisfactory answers or gives up. Techniques for answering imprecise queries over E-commerce Web databases are therefore important for the large number of users who need to obtain more relevant information from a large E-commerce Web database in a single query.

In this dissertation, the open problems of imprecise querying that arise when searching Web databases are investigated.
Also, to satisfy users' imprecise query needs, an efficient imprecise query solution and the corresponding techniques for E-commerce Web databases are proposed, covering, in order, imprecise query answering, query result ranking, and top-k retrieval. The main contributions of this dissertation are summarized as follows:

(i) To deal with the problem of imprecise queries over E-commerce Web databases, an imprecise query answering approach based on approximate functional dependencies is proposed. Based on the concept of the agree set, the maximal sets are derived, and from them the minimal nontrivial functional dependency sets are generated, which are used to find the approximate dependency relations. Using these approximate dependency relations, an approach for measuring attribute weights is proposed. The first attribute to be relaxed is the least important one, and it receives the maximum relaxation degree. Next, based on the ideas of association rules, a method for measuring the semantic similarity of categorical attribute values is proposed. According to the relaxation threshold, the attribute weights, and the semantic similarities of attribute values, an adaptive query relaxation and rewriting algorithm is proposed. Experimental results demonstrate that the proposed attribute weight and attribute value similarity measures perform stably and give reasonable results, and that the proposed query relaxation method achieves higher recall and effectively resolves the problem of imprecise queries over E-commerce Web databases.

(ii) To deal with the problem of too many answers being returned from an E-commerce Web database in response to an imprecise query, a query result ranking approach based on the probabilistic information retrieval model is proposed.
Based on the database and the query history, this approach first uses the probabilistic information retrieval (PIR) model to capture the correlations between unspecified and specified attribute values as well as user preferences, and then constructs a scoring function and ranks the query results by their ranking scores. Experimental results demonstrate that the proposed ranking method meets users' needs and preferences effectively, which improves the ranking quality of imprecise query results over E-commerce Web databases.

(iii) To improve the efficiency of the query result ranking algorithm, a top-k retrieval method based on the threshold algorithm (TA) is proposed. Building on the monotonic scoring function over attribute values constructed with the PIR model, a TA-based top-k retrieval solution is designed. Algorithms for creating tuple orders, clustering tuple orders, and retrieving the top-k tuples are then presented. Experimental results demonstrate that the tuple order clustering algorithm finds the cluster centers correctly, and that the top-k retrieval method has higher precision and better efficiency, improving retrieval efficiency on large datasets.
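
As a rough illustration of the idea behind TA-based top-k retrieval, the sketch below implements the classic threshold algorithm with a monotonic sum as the scoring function. It is not the dissertation's method: the function name threshold_algorithm, the tuple identifiers, and the integer scores are hypothetical, and the tuple order creation and clustering steps are not modeled. It assumes one score list per attribute, sorted in descending order, with every tuple present in every list.

```python
import heapq


def threshold_algorithm(sorted_lists, k):
    """Classic TA over per-attribute score lists, aggregated by sum.

    sorted_lists: one list per attribute; each is a list of
    (tuple_id, score) pairs sorted by score, highest first.
    Assumption: every tuple_id appears in every list.
    Returns up to k (tuple_id, total_score) pairs, best first.
    """
    # Random-access index: lookup[i][tuple_id] -> score in list i.
    lookup = [dict(lst) for lst in sorted_lists]
    seen = set()
    top = []  # min-heap of (total_score, tuple_id), at most k entries

    for pos in range(len(sorted_lists[0])):
        last_scores = []
        for lst in sorted_lists:
            tid, score = lst[pos]  # sorted access
            last_scores.append(score)
            if tid in seen:
                continue
            seen.add(tid)
            # Random access: fetch this tuple's score in every list.
            total = sum(table[tid] for table in lookup)
            if len(top) < k:
                heapq.heappush(top, (total, tid))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, tid))

        # No unseen tuple can score above the sum of the scores
        # last read under sorted access (the scoring is monotonic).
        threshold = sum(last_scores)
        if len(top) == k and top[0][0] >= threshold:
            break

    return [(tid, total) for total, tid in sorted(top, key=lambda p: (-p[0], p[1]))]


# Hypothetical per-attribute scores for four tuples.
price_scores = [("t1", 9), ("t2", 8), ("t3", 3), ("t4", 1)]
brand_scores = [("t2", 7), ("t3", 6), ("t1", 2), ("t4", 1)]

print(threshold_algorithm([price_scores, brand_scores], 2))
```

On this toy data the scan stops at the third position of each list: the current second-best total (11) already reaches the threshold 3 + 2 = 5, so t4 is never read under sorted access.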
Keywords/Search Tags: E-commerce Web database, imprecise query, approximate functional dependence, query results ranking, top-k retrieval