Research On The Key Technology Of Search Engine Query Error Correction

Posted on: 2015-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: G H Dou
Full Text: PDF
GTID: 2268330428972713
Subject: Computer application technology
Abstract/Summary:
Query error correction processes the query words that a user enters into a search engine. If a user types an erroneous query word, query error correction can repair it: based on the user's input, the search engine offers the best corrected query words, and the user can then choose what to search for. A search engine that does this improves the user's search experience as well as its own usability and fault tolerance. Many current methods use a language model to compute the similarity between a query word and its context, and decide from the similarity score whether the query word is wrong. If it is, they propose a corrected query word. These methods, however, do not take other factors among the query terms into account, which reduces their error correction capability. Moreover, a single error correction strategy cannot be effective for the variety of error types that arise during correction.

To solve these problems, the paper uses a query error correction method based on web logs. By studying users' web logs, the paper identifies the types of errors found in users' everyday query words and then performs query error correction. First, a web log corpus is built and evaluated with multiple test corpora, which yields a threshold for each test set. The analysis of these thresholds also incorporates the effect of word-count factors (the number of words and the word number). Second, the paper analyzes the different error types and applies a different correction strategy to each. A word-spelling model is therefore built to handle spelling errors.
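The threshold-based check described above can be illustrated with a small sketch. It is not the thesis's implementation: the function names, the add-one smoothing, and the per-transition averaging used as a stand-in for a length factor are all assumptions made for illustration, and the abstract does not specify how the actual impact factors are computed.

```python
import math
from collections import Counter


def train_bigram_counts(queries):
    """Count unigrams and bigrams over a list of logged query strings."""
    unigrams = Counter()
    bigrams = Counter()
    for query in queries:
        tokens = ["<s>"] + query.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def avg_log_prob(query, unigrams, bigrams):
    """Average log-probability per bigram transition, with add-one smoothing.

    Dividing by the number of transitions is an assumed, simple way to keep
    long and short queries comparable; it is not the thesis's formula.
    """
    tokens = ["<s>"] + query.lower().split() + ["</s>"]
    vocab_size = len(unigrams)
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        prob = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        total += math.log(prob)
    return total / (len(tokens) - 1)


def looks_erroneous(query, unigrams, bigrams, threshold):
    """Flag a query whose average log-probability falls below the threshold."""
    return avg_log_prob(query, unigrams, bigrams) < threshold
```

In practice the threshold would be tuned on held-out test sets, as the abstract describes. A query containing an unseen word scores lower than the same query spelled correctly, because its bigrams fall back to the smoothed floor.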
For the extra-word, missing-word, and non-word error types, the paper combines the strengths of the minimum edit distance algorithm and fuzzy matching to perform the correction.

The main innovations of this paper are as follows:

1. The paper proposes a new query error correction method that improves the N-gram language model by adding the effect of word-count factors (the number of words and the word number). Adding these impact factors effectively improves query error correction capability.
2. The paper proposes error correction strategies tailored to the different error types, which further improves query error correction capability.

The experimental results show that:

1. The proposed method effectively improves the overall accuracy of query error correction. Compared with the traditional probabilistic-statistical model, the paper's method achieves higher accuracy.
2. When correcting the various error types, combining minimum edit distance with fuzzy matching effectively improves the accuracy of error correction.
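As an illustration of the minimum edit distance technique named in the abstract, the sketch below computes the standard Levenshtein distance and uses it to pick a replacement from a vocabulary. The `suggest` helper, its `max_distance` cutoff, and the frequency tie-break are hypothetical simplifications standing in for the fuzzy matching step, whose details the abstract does not give.

```python
def edit_distance(source, target):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def suggest(word, word_counts, max_distance=2):
    """Return the closest vocabulary word within max_distance, or None.

    word_counts maps known words to their frequency (for example, taken
    from a web log corpus). Ties in distance go to the more frequent word.
    This selection rule is an assumption made for illustration.
    """
    if word in word_counts:
        return word
    best_key, best_word = None, None
    for candidate, count in word_counts.items():
        distance = edit_distance(word, candidate)
        if distance > max_distance:
            continue
        key = (distance, -count)
        if best_key is None or key < best_key:
            best_key, best_word = key, candidate
    return best_word
```

A cutoff of two edits is a common default for short query words; a larger value admits more candidates but raises the risk of replacing a rare but valid word.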
Keywords/Search Tags: Query Error Correction, N-gram Model, Web Log