The Research On Long Query Expansion On The Concept Of Semantic Similarity

Posted on:2014-11-30

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Yang

Full Text:PDF

GTID:2348330485494957

Subject:Information Science

Abstract/Summary:

With the rapid development of the Internet, web information retrieval is also developing quickly. Search engines are currently the main form of information retrieval and have become the second most widely used network service after e-mail. Most present search engines retrieve information by keyword, but a few input words cannot fully express the meaning of a query. Word ambiguity causes the search engine to return a large number of irrelevant documents, which greatly reduces recall and precision. On the other hand, users sometimes submit long queries, and because the topic of a long query tends to drift, the retrieval results are not ideal. To address these problems, scholars proposed query expansion techniques, which modify the original query terms to improve retrieval precision and recall. These techniques have achieved some results, but mostly for shorter queries. In recent years, however, researchers abroad have paid more attention to long queries, because natural language sentences can better express complex and specific information needs and are likely to be the trend in how users express queries. The rich semantic relationships in a long query also provide a better basis for semantic query expansion and should help in understanding users' language features and syntactic habits.

Therefore, to address topic drift, low precision, and the low ranking of relevant documents (low recall) for long queries in search engines, this paper proposes long query expansion based on concept semantic similarity. First, AAlesk is used to find the correct sense of each query word, and the semantic concepts of the query words in WordNet are added to the original long query. Second, the concepts are clustered by semantic similarity to obtain a set of query clusters; the overall semantic relevancy of each cluster and the semantic importance of each concept are then calculated to obtain the best candidate concepts. Finally, keywords are selected according to their scores in the concept set and used to represent the original long query. In addition, this paper applies the KeyGraph keyword extraction method to the long queries and feeds the two kinds of results into three different types of retrieval models for search experiments. The experimental results show that the improved long queries give better retrieval results. In particular, the method proposed in this paper expresses users' real information needs at the semantic level, greatly improves the precision and recall of long queries, and is more suitable for application to existing mainstream language retrieval models.
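
The steps described above (sense disambiguation, concept clustering by semantic similarity, scoring, keyword selection) can be illustrated with a small sketch. It is not the thesis's implementation: the abstract gives no formulas, so everything here is an assumption. A tiny hand-made sense inventory (SENSES) and hypernym tree (PARENT) stand in for WordNet, a simplified Lesk gloss overlap stands in for AAlesk, similarity is a plain inverse path length, and the cluster relevancy and concept importance scores are simple sums of pairwise similarities. All names are hypothetical.

```python
from itertools import combinations

STOP = {"a", "that", "and", "with", "by", "at", "on", "in", "which", "the", "of"}

# Hypothetical toy sense inventory standing in for WordNet: word -> {sense: gloss}
SENSES = {
    "bank": {
        "bank.finance": "institution that holds money and grants a loan with interest",
        "bank.river": "sloping land beside a river",
    },
    "loan": {"loan.finance": "money lent by a bank at interest"},
    "interest": {
        "interest.finance": "fee charged by a bank on a loan",
        "interest.hobby": "activity that attracts curiosity",
    },
    "weather": {"weather.climate": "state of the atmosphere"},
}

# Hypothetical toy hypernym tree: concept -> parent concept
PARENT = {
    "bank.finance": "finance", "loan.finance": "finance", "interest.finance": "finance",
    "bank.river": "nature", "weather.climate": "nature", "interest.hobby": "activity",
    "finance": "entity", "nature": "entity", "activity": "entity",
}

def tokens(text):
    """Lowercase, split on whitespace, drop stop words."""
    return [t for t in text.lower().split() if t not in STOP]

def disambiguate(query):
    """Simplified Lesk: choose the sense whose gloss overlaps most with the other query words."""
    words = tokens(query)
    chosen = []
    for w in words:
        if w not in SENSES:
            continue  # words without an entry in the toy inventory are skipped
        context = set(words) - {w}
        best = max(SENSES[w], key=lambda s: len(set(tokens(SENSES[w][s])) & context))
        chosen.append((w, best))
    return chosen

def ancestors(concept):
    """Path from a concept up to the root of the toy tree."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def similarity(a, b):
    """Inverse path length through the lowest common ancestor (assumed measure)."""
    pa, pb = ancestors(a), ancestors(b)
    for i, node in enumerate(pa):
        if node in pb:
            return 1.0 / (1 + i + pb.index(node))
    return 0.0

def cluster(concepts, threshold=0.3):
    """Greedy clustering: join the first cluster containing a sufficiently similar member."""
    clusters = []
    for item in concepts:
        for group in clusters:
            if any(similarity(item[1], other[1]) >= threshold for other in group):
                group.append(item)
                break
        else:
            clusters.append([item])
    return clusters

def extract_keywords(query, top_k=3):
    """Pick the most cohesive cluster, then rank its concepts by similarity to the rest."""
    clusters = cluster(disambiguate(query))
    if not clusters:
        return []
    relevancy = lambda g: sum(similarity(a[1], b[1]) for a, b in combinations(g, 2))
    best = max(clusters, key=relevancy)
    importance = lambda it: sum(similarity(it[1], o[1]) for o in best if o is not it)
    ranked = sorted(best, key=importance, reverse=True)
    return [word for word, _ in ranked[:top_k]]

print(extract_keywords("which bank offers a loan with low interest in rainy weather"))
# prints ['bank', 'loan', 'interest']
```

In this toy run the off-topic word "weather" ends up in its own cluster and is dropped, which mirrors the stated goal of reducing topic drift in long queries. A real system would replace the toy tables with WordNet glosses and hypernym paths.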

Keywords/Search Tags:

Long query, Semantic similarity, Retrieval model, WordNet

Related items

1	Development And Application Of Domain Specific Semantic Retrieval System Of Institutional Repository
2	Research On Semantic Similarity Metric Based On WordNet And Its Application In Query Suggestion
3	Conceptual Semantic Similarity Calculation Based On WordNet And Its Application Research
4	Research On Semantic Similarity Between Words And Between Short Texts Based On WordNet
5	Research And Application Of Wordnet-Based Semantic Similarity Measurement
6	Research On Domain Semantic Retrieval Model Based On Ontology
7	The Research Of Semantic Similarity Between Short Text Based On WordNet
8	Research Of English Sentence Similarity Measure Based On Wordnet
9	Ranking Algorithm Based On The Semantic Retrieval Of Lexical Semantic Tree
10	Multiple Semantic-based Similarity And Relatedness Measurements In WordNet