Keyword Weight Calculation for Web Page Ranking

Posted on: 2014-08-21
Degree: Master
Type: Thesis
Country: China
Candidate: T L Gao
Full Text: PDF
GTID: 2268330392462521
Subject: Computational Linguistics
Abstract/Summary:
With the development of information technology and the growing popularity of the Internet, search engines have attracted increasing attention. Most mainstream search engines in recent years are keyword-based. In a keyword-based search engine, the accuracy of the weight calculated for each word in a query directly affects the quality of page ranking, so calculating the weights of the query terms correctly is crucial.

This study seeks a user-oriented method for calculating the weights of query keywords for web page ranking, so that the page ranking of keyword-based search engines can reach a higher level and lay a good foundation for subsequent retrieval processing. To this end, the work consists primarily of the following three parts:

Query analysis. The vocabulary characteristics of a modern Chinese corpus and of queries are compared. 100,000 queries and 100,000 sentences from a modern Chinese corpus were manually annotated and compared in terms of the number of words they contain, the proportion of each part of speech, the part-of-speech sequences within a sentence, and the use of stop words. In addition, 5,000 queries annotated with core words were analyzed to relate the characteristics of the queries themselves to word weights.

Calculating keyword weights for page ranking. User query logs are segmented and part-of-speech tagged, and keyword extraction is treated as a classification task. Taking the characteristics of queries into account, eight context features are finally selected for each term as the features for decision tree forest classification, and the calculation method for each feature is described. Based on an error analysis of the experimental results, some rules are added to post-process the output of the classification model.

Analysis of experimental results. Decision tree classification is compared with traditional keyword extraction and weight calculation methods. A random sample of about 1,000 queries is selected from the user query log for manual evaluation, and cross-validation is used to calculate precision and recall. The win rate of the model using traditional web ranking weights is compared with that of the decision tree classification method. A few queries are also searched on "www.baidu.com" to compare the page ranking effect of searching with the sequence determined by the keyword model against searching without processing the keywords. The experimental results show that the keyword extraction and weight calculation method used in this paper is feasible for weight calculation in page ranking.
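The evaluation measures named above (precision, recall, and win rate) can be illustrated with a minimal Python sketch. It is not the thesis's evaluation code. The function names, the micro-averaging over queries, the win-rate definition (wins divided by non-tied comparisons), and the toy data are all assumptions made for illustration, since the abstract does not define them.

```python
from typing import List, Set, Tuple


def precision_recall(predicted: List[Set[str]],
                     gold: List[Set[str]]) -> Tuple[float, float]:
    """Micro-averaged precision and recall over per-query core-word sets.

    predicted[i] holds the words a model marks as core words for query i;
    gold[i] holds the manually annotated core words for the same query.
    """
    true_pos = sum(len(p & g) for p, g in zip(predicted, gold))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    return precision, recall


def win_rate(judgments: List[str]) -> float:
    """Share of non-tied side-by-side comparisons won by the new model.

    Each judgment is "win", "loss", or "tie" (hypothetical labels).
    """
    wins = judgments.count("win")
    losses = judgments.count("loss")
    decided = wins + losses
    return wins / decided if decided else 0.0


# Hypothetical toy data, not taken from the thesis.
predicted = [{"beijing", "weather"}, {"cheap", "flights"}, {"python"}]
gold = [{"beijing", "weather"}, {"flights", "shanghai"}, {"python", "tutorial"}]

p, r = precision_recall(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f}")
print(f"win rate={win_rate(['win', 'win', 'tie', 'loss']):.3f}")
```

In a cross-validation setting, these functions would be applied to each held-out fold and the per-fold values averaged. Ties are excluded from the win-rate denominator here only as one common convention.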
Keywords/Search Tags: Keyword Extraction, Weight Calculation, Page Rank