The Evaluation Of Keywords Classification And Recommendation In Search Engine

Posted on:2016-02-13
Degree:Master
Type:Thesis
Country:China
Candidate:W B Zhong
Full Text:PDF
GTID:2308330479994806
Subject:Software engineering
Abstract/Summary:
With the rapid development and wide adoption of the Internet, users play an increasingly important role. The user-centered mode of information production has caused an explosion of online information, and people now face "information overload" in daily life. Search engines give people a way to find the information they need quickly, and helping a search engine understand users' search intent more precisely has become an important research topic that calls for in-depth study.

This thesis studies text categorization in Chinese search engines, focusing on the classification of keyword text, and summarizes prior achievements in text classification. It reviews the key techniques that text categorization requires: Chinese word segmentation, text feature extraction and representation, and classification methods. It first discusses the characteristics of keyword text. Compared with long text, short text has distinctive characteristics; common forms of short text include online forum (BBS) posts, Weibo posts, reviews, and search terms. The thesis then compares the Rocchio algorithm, the k-nearest-neighbor algorithm, and the linear SVM algorithm on Chinese keyword text categorization. To address the sparsity of keyword text features, it adopts a multi-feature combination method in feature extraction, which dramatically improves classification accuracy. In addition, a weighting scheme based on information entropy represents the information carried by each feature more accurately and, to a certain extent, improves classification performance.

As data volumes grow explosively, the computing capacity of a single server can no longer meet the requirements of processing massive text data. This thesis therefore proposes a strategy that combines machine learning techniques with Hadoop distributed processing to classify massive numbers of keywords. Using records of advertisers' keyword purchases, it adopts a content-based recommendation method in which the classification model for Chinese search keywords addresses the keyword cold-start problem. On this basis it designs and implements a personalized model that recommends keywords to search advertising buyers, so that new search keywords can be recommended precisely to those buyers. Precise classification and labeling of search keywords enables accurately targeted advertising, better search results, and a more satisfying search experience for users.
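The abstract does not spell out how the content-based recommendation is computed, so the following is only a minimal sketch of one way the idea could work: each advertiser is profiled by the categories of the keywords it has bought, and a new keyword with no purchase history is routed to advertisers through its predicted category. The function names, the toy data, and the rule-based `toy_classify` stand-in for the trained keyword classifier are all assumptions made for illustration, not the thesis's implementation.

```python
from collections import Counter, defaultdict


def build_advertiser_profiles(purchases, keyword_category):
    """Return {advertiser: {category: share of that advertiser's purchases}}.

    purchases: list of (advertiser, keyword) pairs from the purchase log.
    keyword_category: {keyword: category} for already-classified keywords.
    """
    counts = defaultdict(Counter)
    for advertiser, keyword in purchases:
        category = keyword_category.get(keyword)
        if category is not None:
            counts[advertiser][category] += 1
    profiles = {}
    for advertiser, counter in counts.items():
        total = sum(counter.values())
        profiles[advertiser] = {c: n / total for c, n in counter.items()}
    return profiles


def recommend_buyers(new_keyword, classify, profiles, top_n=3):
    """Rank advertisers for a keyword that has no purchase history yet.

    The keyword's predicted category replaces the missing history,
    which is how the classifier sidesteps the cold-start problem here.
    """
    category = classify(new_keyword)
    scored = [(profile.get(category, 0.0), advertiser)
              for advertiser, profile in profiles.items()]
    scored = [(score, adv) for score, adv in scored if score > 0]
    # Highest share first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [adv for _, adv in scored[:top_n]]


def toy_classify(keyword):
    """Hypothetical stand-in for the trained keyword classifier."""
    travel_terms = ("hotel", "flight")
    return "travel" if any(t in keyword for t in travel_terms) else "sports"


# Hypothetical purchase log and category labels.
purchases = [
    ("shop_a", "running shoes"),
    ("shop_a", "sneakers"),
    ("shop_a", "flights"),
    ("agency_b", "flights"),
    ("agency_b", "hotel booking"),
]
keyword_category = {
    "running shoes": "sports",
    "sneakers": "sports",
    "flights": "travel",
    "hotel booking": "travel",
}

profiles = build_advertiser_profiles(purchases, keyword_category)
ranked = recommend_buyers("cheap hotel", toy_classify, profiles)
```

In this sketch the ranking score is simply the share of an advertiser's past purchases that fall in the new keyword's category; a real system would likely combine it with bids, budgets, or finer-grained similarity.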
Keywords/Search Tags:Chinese search engine, keyword, classification, distributed, recommendation
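The entropy-based feature weighting mentioned in the abstract is not given as a formula there. As a hedged illustration, the sketch below uses one common formulation, assumed here rather than taken from the thesis: a feature whose occurrences are spread evenly across categories has high entropy and receives a low weight, while a feature concentrated in a single category receives a weight near 1. The function name and toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy_weights(labeled_docs):
    """Return {feature: weight in [0, 1]} from a list of (features, label).

    weight = 1 - H(label | feature present) / log(number of classes),
    where H is the Shannon entropy of the feature's class distribution.
    This formula is an assumption chosen for illustration.
    """
    num_classes = len({label for _, label in labeled_docs})
    # feature -> label -> number of documents containing the feature
    class_counts = defaultdict(Counter)
    for features, label in labeled_docs:
        for feature in set(features):
            class_counts[feature][label] += 1

    weights = {}
    for feature, counter in class_counts.items():
        if num_classes < 2:
            weights[feature] = 1.0  # entropy is undefined with one class
            continue
        total = sum(counter.values())
        entropy = -sum((n / total) * math.log(n / total)
                       for n in counter.values())
        weights[feature] = 1.0 - entropy / math.log(num_classes)
    return weights


# Hypothetical segmented keywords with category labels.
docs = [
    (["cheap", "flight"], "travel"),
    (["cheap", "shoes"], "sports"),
    (["flight", "hotel"], "travel"),
]
weights = entropy_weights(docs)
```

Here "cheap" appears equally in both categories and ends up with a weight of about 0, whereas "flight" appears only under "travel" and keeps a weight of 1. Such weights could then scale the feature vector before it is passed to a classifier like Rocchio, k-nearest-neighbor, or a linear SVM.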