Research On Object-Level Retrieval Results Clustering Presentation Method Over Relational Databases

Posted on: 2014-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Zheng
GTID: 2248330398952382
Subject: Computer technology
Abstract/Summary:
In presenting retrieval results, tuple-level information retrieval systems over relational databases usually return a large number of relevant results, and the Top-k results often overlap. There are two main reasons. First, tuples containing the keywords may come from different data tables, and these tables can be connected in various ways, so the system generates many similar results and the final result list is redundant. Second, keywords are often ambiguous and the semantic information captured during retrieval is limited, so the results cover multiple topics and users spend a long time excluding irrelevant results to find those that match their interests.

This thesis proposes an object-level retrieval results clustering presentation method over relational databases to solve these problems. It treats and processes data from the viewpoint of objects and seamlessly integrates the relevance and diversity of keyword retrieval results. The method comprises two parts: an object-level retrieval results clustering algorithm, named the O2RC algorithm, and three presentation patterns: thumbnail form, tree form and link directory form. The former clusters retrieval results by two aspects, their structure and their content, and the latter presents the clustered results to users from different perspectives.

Firstly, the retrieval result set is clustered into M different structural clusters according to the object label tree and the information code algorithm; this step aims to remove the structural redundancy caused by the multiple ways in which data tables can be connected. Secondly, the content similarity between every two retrieval results in the same isomorphic cluster is calculated from four aspects: object subtopic, object description, the proportion of keywords and the content contained in proper trees. This phase addresses the second problem, namely that results expressing the same theme are presented repeatedly. Finally, the three visual forms mentioned above present the clustered retrieval results, each attached with a simple semantic description, to users.

With the O2RC algorithm as its core, this thesis designs a framework for object-level retrieval results clustering presentation over relational databases and implements a prototype system. Results are clustered with the O2RC algorithm and the Buckshot algorithm respectively, and the clustered results are compared and analyzed in terms of response time, accuracy rate, F-measure and presentation effect. The experimental results indicate that the O2RC algorithm has the better clustering effect and the higher clustering efficiency of the two. It not only enriches the categories of the returned retrieval results, but also reduces structural redundancy and content overlap and makes the results more intelligible. The three presentation patterns improve the navigability of the results presentation from different perspectives and improve the user experience. In summary, the object-level retrieval results clustering presentation method over relational databases is highly practical.
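
The two-stage idea described above (structural clustering first, then content clustering inside each structural cluster) can be sketched roughly as follows. This is a minimal illustration and not the O2RC algorithm itself: the `Result` fields, the plain tuple used as a structural signature (standing in for the object label tree and information code), the Jaccard measure, the equal weights and the 0.5 threshold are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One retrieval result (hypothetical shape, not the thesis's data model)."""
    structure: tuple        # stand-in for the structural code, e.g. table labels
    subtopic: frozenset     # tokens of the object subtopic
    description: frozenset  # tokens of the object description
    keyword_ratio: float    # proportion of query keywords matched, in [0, 1]
    content: frozenset      # tokens of the remaining tree content


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def content_similarity(x, y, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted mix of the four aspects; the weights are assumed, not from the thesis."""
    parts = (
        jaccard(x.subtopic, y.subtopic),
        jaccard(x.description, y.description),
        1.0 - abs(x.keyword_ratio - y.keyword_ratio),
        jaccard(x.content, y.content),
    )
    return sum(w * p for w, p in zip(weights, parts))


def two_stage_cluster(results, threshold=0.5):
    """Group by structure first, then by content inside each structural group."""
    by_structure = defaultdict(list)
    for r in results:
        by_structure[r.structure].append(r)

    clusters = []
    for group in by_structure.values():
        content_clusters = []
        for r in group:
            # Greedy single pass: join the first cluster whose seed is similar enough.
            for cluster in content_clusters:
                if content_similarity(cluster[0], r) >= threshold:
                    cluster.append(r)
                    break
            else:
                content_clusters.append([r])
        clusters.extend(content_clusters)
    return clusters
```

Results that share a structural signature but differ in content land in separate clusters, and results with different signatures are never merged, which mirrors the order of the two steps in the abstract. The greedy seed comparison is only one simple way to form content clusters.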
Keywords/Search Tags: Relational Databases, Object-Level, Retrieval Results, Clustering Algorithm, Results Presentation Method