Research And Implementation Of Chinese Error Correction Method Based On Search Engine

Posted on: 2020-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: S W Yang
Full Text: PDF
GTID: 2428330611497511
Subject: Computer technology
Abstract/Summary:
Information retrieval is currently one of the most important ways to obtain and query information, and it is a basic service of the Internet. It makes obtaining information convenient for users, but it also has drawbacks. When a user enters an incorrect query string, the results returned by the retrieval system may deviate from the user's true intention, resulting in loss of user traffic. Query error correction is therefore a difficult problem that information retrieval needs to solve.

Through a study of Chinese error correction methods, this paper finds that the methods proposed by current researchers have the following shortcomings: 1. They focus only on some common types of errors and overlook the less common error types that occur in practice, so the error correction results are unsatisfactory. 2. They pay more attention to the error correction method itself and ignore the influence of the ranking model on the correction result, or they score the candidate set with only a single feature. As a result, the candidate presented to the user may not be the best option, which seriously degrades the effect of error correction.

In view of these deficiencies, this paper proposes a Chinese error correction method based on a search engine. First, it studies users' web logs, analyzes the causes of input word errors, and classifies the errors by cause. It then applies a different correction strategy to each error class.

The work mainly makes the following contributions: 1. It improves an error correction strategy that covers multiple error types. Keyboard keys are assigned weights according to their relative positions, which addresses the weakness of the Pinyin error correction method in handling adjacent-key errors. 2. It improves the way the ranking model is built. In addition to integrating four factors (the N-gram model, query click rate, similarity of form, and edit distance), the ranking model introduces a Pinyin similarity feature to improve the accuracy of the error correction method.
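
The abstract does not give the actual key weights or the form of the ranking model, so the following is only a minimal sketch of how the two ideas might fit together. It assumes a standard QWERTY layout, a substitution cost that grows with the physical distance between keys, and a simple weighted sum over the five features. All function names, row offsets, the distance scaling, and the weights are hypothetical and are not taken from the thesis.

```python
# Illustrative sketch only: layout, cost function, and weights are assumptions.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
# Assumed horizontal stagger of each keyboard row, in key widths.
ROW_OFFSETS = [0.0, 0.25, 0.75]

# Map each letter to an approximate (row, column) position on the keyboard.
KEY_POS = {
    ch: (row, col + ROW_OFFSETS[row])
    for row, keys in enumerate(QWERTY_ROWS)
    for col, ch in enumerate(keys)
}


def substitution_cost(a, b):
    """Return a cost in [0, 1]: 0 for the same key, low for nearby keys."""
    if a == b:
        return 0.0
    if a not in KEY_POS or b not in KEY_POS:
        return 1.0
    (r1, c1), (r2, c2) = KEY_POS[a], KEY_POS[b]
    dist = ((r1 - r2) ** 2 + (c1 - c2) ** 2) ** 0.5
    # Assumed scaling: keys three or more key widths apart cost a full edit.
    return min(1.0, dist / 3.0)


def weighted_edit_distance(s, t):
    """Levenshtein distance with keyboard-aware substitution costs."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [float(i)]
        for j in range(1, len(t) + 1):
            cur.append(min(
                prev[j] + 1.0,                                       # deletion
                cur[j - 1] + 1.0,                                    # insertion
                prev[j - 1] + substitution_cost(s[i - 1], t[j - 1]), # substitution
            ))
        prev = cur
    return prev[-1]


def pinyin_similarity(typed, candidate):
    """Normalize the weighted distance into a similarity in [0, 1]."""
    longest = max(len(typed), len(candidate), 1)
    return 1.0 - weighted_edit_distance(typed, candidate) / longest


def rank_candidates(typed_pinyin, candidates, weights):
    """Sort candidates by a weighted sum of their feature scores.

    Each candidate is a dict with 'text', 'pinyin', and 'features', where
    'features' holds precomputed scores in [0, 1] for the four other factors.
    """
    def score(candidate):
        features = dict(candidate["features"])
        features["pinyin_sim"] = pinyin_similarity(typed_pinyin, candidate["pinyin"])
        return sum(weights[name] * value for name, value in features.items())

    return sorted(candidates, key=score, reverse=True)
```

With this cost function, a typed Pinyin string such as `zhonh` is closer to `zhong` than `zhonp` is, because `h` sits next to `g` on the keyboard while `p` does not. A weighted sum is only one possible way to combine the features; the thesis may use a different ranking model.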
Keywords/Search Tags: weblog, cause of wrong words, Pinyin error correction, sorting model