
Research On Semantic Similarity Metric Based On WordNet And Its Application In Query Suggestion

Posted on: 2015-01-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L L Meng
Full Text: PDF
GTID: 1268330431959106
Subject: Computer application technology
Abstract/Summary:
Semantic similarity measurement has been an active research topic for many years in artificial intelligence, psychology, and cognitive science, and it has been applied successfully in many fields. It is a key issue in natural language processing, where the most important resource is a semantic dictionary: a dictionary that expresses the relations between concepts is indispensable. WordNet, developed at Princeton University, is an excellent example. Its basic idea is simple and clear. WordNet has become a de facto international standard, and the soundness of its framework is recognized in both lexical semantics and computational lexicography.

At the same time, with the explosive growth of data, more and more people rely on search engines to obtain information. Query suggestion, which helps users articulate their query intent, has become an active research topic. As query suggestion grows in importance, the sparsity of query information poses many challenges and seriously restricts its further application. Using semantic similarity measures to advance query suggestion is an effective solution and an important direction for further research.

Based on the discussion above, this dissertation represents data at the semantic level and focuses on the semantic similarity of concepts. The semantic similarity measure is then applied to measuring the similarity of queries. The main contributions of this dissertation are as follows.

1. The dissertation proposes an information content (IC) model for WordNet based on the topology of a concept. Unlike previous work, the new model is corpus independent: the information content of a concept is a function of its own topology and that of its descendants. Experiments show that the new model provides more accurate similarity evaluation and performs significantly better than related work.

2. The dissertation proposes an effective algorithm for measuring the semantic similarity of word pairs in WordNet. Unlike previous work, the new algorithm takes into account not only path length but also IC values, which lets it distinguish different concept pairs effectively. The algorithm is evaluated on the Rubenstein and Goodenough data set, a traditional and widely used benchmark. Correlation coefficients between human similarity ratings and the scores of seven algorithms are calculated. Experiments show that the correlation between the proposed algorithm and human judgment is 0.8820, which demonstrates that the new algorithm significantly outperforms the others.

3. The dissertation proposes a query similarity algorithm based on semantic analysis. Unlike previous work, the new algorithm represents data at the semantic level. It takes full account of keyword information and user clickthrough data to mine the relations between queries. Experiments show that clustering queries with the new algorithm captures similar queries more accurately than related work.

4. The dissertation presents a topic-oriented query suggestion algorithm. Unlike previous work, the new algorithm takes full account of the semantic relations between queries, their similarity values, the query context, and so on, and then suggests similar queries to the user. Experiments show that the new algorithm effectively improves the precision of Web search.

The results of this dissertation have high academic value. They have been applied successfully in information retrieval and can be extended to web page classification, question answering systems, advertisement delivery, e-commerce, and so on, which indicates considerable commercial value and broad application prospects.
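The abstract does not give the formulas behind the proposed IC model or similarity algorithm, so the following Python sketch only illustrates the general idea of a corpus-independent IC. It uses two well-known stand-ins, not the dissertation's own method: the intrinsic IC of Seco et al., which depends only on how many descendants a concept has, and Lin's IC-based similarity. The toy taxonomy `PARENT` and all function names are hypothetical and are not taken from WordNet.

```python
import math

# Hypothetical toy taxonomy (child -> parent); a stand-in for WordNet's
# hypernym hierarchy, not real WordNet data.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}


def descendants_count(concept):
    """Count the concepts strictly below `concept` in the taxonomy."""
    children = [c for c, p in PARENT.items() if p == concept]
    return sum(1 + descendants_count(c) for c in children)


def intrinsic_ic(concept):
    """Seco-style intrinsic IC: 1 - log(hypo + 1) / log(N).

    Uses only the taxonomy's topology, so no corpus is required.
    Leaves get IC 1.0 and the root gets IC 0.0.
    """
    n = len(PARENT)
    return 1.0 - math.log(descendants_count(concept) + 1) / math.log(n)


def ancestors(concept):
    """Return `concept` followed by its ancestors, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def lin_similarity(a, b):
    """Lin similarity: 2 * IC(lcs) / (IC(a) + IC(b))."""
    ancestors_of_b = set(ancestors(b))
    # In a tree, the first shared ancestor is the lowest common subsumer.
    lcs = next(c for c in ancestors(a) if c in ancestors_of_b)
    denom = intrinsic_ic(a) + intrinsic_ic(b)
    return 2.0 * intrinsic_ic(lcs) / denom if denom else 0.0


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "tree")]:
        print(pair, round(lin_similarity(*pair), 4))
```

In this toy hierarchy, "dog" and "cat" share the subsumer "animal" and therefore score higher than "dog" and "tree", whose only shared ancestor is the root with zero information content. An evaluation like the one described above would then correlate such scores with human ratings for the same word pairs.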
Keywords/Search Tags:information content of concept, semantic similarity, query topic, similar query metric, query suggestion