The Research Of Personalized Categorization Approach For Web Database Query Results

Posted on: 2010-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: K Shao
GTID: 2218330368999989
Subject: Computer software and theory
Abstract/Summary:
The rapid expansion of the Internet has made a wide variety of online databases accessible to a large number of users. A database that users reach through the query interfaces of a Web site is called a Web database. Most ordinary users, however, know little about the structure and contents of a Web database and often have only vague or imprecise ideas of what they are looking for, so they may be unable to formulate queries that accurately express their intent. The query a user submits should therefore not be treated as a rigid constraint on the results; that is, the query is exploratory. An exploratory query over a large database, however, can return too many irrelevant answers. To resolve this problem, existing work either categorizes or ranks the results to help users locate the interesting ones. Most existing work assumes that all users share the same preferences, whereas in practice different users often have different preferences. This thesis therefore proposes an improved solution, based on the decision tree algorithm, that addresses the diversity of user preferences in the categorization approach. The first step analyzes the query history of all users in the system offline and generates a set of clusters over the data, each corresponding to one type of user preference. The second step, performed when a user issues a query, presents a navigational tree over the clusters generated in the first step, so that the user can easily select the subset of clusters matching his or her needs. The user can then browse, rank, or categorize the results in the selected clusters.
The navigational tree is built with an improved decision tree algorithm that accounts for the cost of visiting both intermediate nodes and leaf nodes, with the aim of providing the best categorization at the lowest navigation cost. An empirical comparison with other categorization solutions demonstrates that the approach achieves high categorization quality and efficiency.
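The idea of a navigation cost that counts both intermediate and leaf nodes can be illustrated with a minimal sketch. The abstract does not give the thesis's actual cost formula, so everything below is an assumption made for illustration: the `Node` class, the per-node exploration probability `prob`, and the unit costs `LABEL_COST` and `TUPLE_COST` are hypothetical. The sketch charges the user for reading every child label at an intermediate node and for reading every tuple at a leaf, weights each subtree by the probability that the user enters it, and picks the candidate tree with the lowest expected cost.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical unit costs; the thesis's real cost model is not given in the abstract.
LABEL_COST = 1.0   # cost of reading one child label at an intermediate node
TUPLE_COST = 1.0   # cost of reading one result tuple at a leaf node


@dataclass
class Node:
    label: str
    prob: float = 1.0          # assumed probability that the user explores this node
    tuples: int = 0            # number of result tuples (used only for leaves)
    children: List["Node"] = field(default_factory=list)


def navigation_cost(node: Node) -> float:
    """Expected cost of exploring `node`, given that the user has entered it."""
    if not node.children:
        # Leaf: the user reads every tuple stored here.
        return TUPLE_COST * node.tuples
    # Intermediate node: the user reads all child labels, then enters
    # each child with that child's exploration probability.
    cost = LABEL_COST * len(node.children)
    for child in node.children:
        cost += child.prob * navigation_cost(child)
    return cost


def cheapest_tree(candidates: List[Node]) -> Node:
    """Return the candidate tree with the lowest expected navigation cost."""
    return min(candidates, key=navigation_cost)


# Three made-up ways of presenting the same 100 result tuples.
flat = Node("all results", tuples=100)
by_price = Node("by price", children=[
    Node("cheap", prob=0.8, tuples=20),
    Node("expensive", prob=0.2, tuples=80),
])
by_city = Node("by city", children=[
    Node("city A", prob=0.5, tuples=50),
    Node("city B", prob=0.5, tuples=50),
])

best = cheapest_tree([flat, by_price, by_city])
print(best.label, navigation_cost(best))
```

Under these made-up numbers, the split that isolates a small, frequently visited group of tuples comes out cheaper than either an even split or the flat list, which is the kind of trade-off a cost-aware tree construction would exploit.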
Keywords/Search Tags: Web database, user preference, decision tree algorithm, tuples clustering, query result categorization