
A User Intent Analysis Method Based On The Query Log

Posted on: 2017-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: T Y Zhu
Full Text: PDF
GTID: 2348330518470925
Subject: Engineering
Abstract/Summary:
With the rapid spread of the Internet, the amount of data online is growing explosively, and Internet users depend more and more on search engines. As large numbers of users access search engines, they generate huge volumes of query logs, which contain many user behavior patterns. Using data mining techniques to discover user intent from these query logs can therefore bring great commercial value.

Many classification algorithms are available for the task of classifying users' query intent; one of the most commonly used is the decision tree classification model. The classical decision tree classification algorithm is ID3. It has many advantages: its running time is short, its data utilization rate is high, and there is no risk of the problem being unsolvable. Its shortcomings are equally obvious: ID3 has no pruning process, it cannot process continuous attributes, and its solution may be a local optimum rather than the global optimum. This thesis analyzes the advantages and disadvantages of the classical ID3 algorithm in detail and improves on two of these shortcomings: "excessive logarithmic computation time affects performance" and "multi-valued attributes reduce classification accuracy". Public UCI datasets are used to verify the validity and performance of the improved algorithm.

The thesis also introduces the user query log and puts forward a log preprocessing method together with the results of a statistical analysis. Finally, the improved ID3 algorithm is applied to the user query intent classification task: a decision tree classification model is put forward and its classification performance is analyzed.
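The multi-valued attribute shortcoming named in the abstract can be illustrated with a minimal Python sketch of the information gain criterion used by classical ID3. This is not the thesis's improved algorithm, which the abstract does not specify. The toy rows and the names `QUERIES`, `query_id`, `has_url_token`, and `intent` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction from splitting rows on one categorical attribute."""
    base = entropy([row[target] for row in rows])
    # Group the target labels by the value each row has for the attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted average entropy of the partitions after the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_attribute(rows, attributes, target):
    """Classical ID3 split choice: the attribute with the highest gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy data, not taken from the thesis or from any real query log.
QUERIES = [
    {"query_id": "q1", "has_url_token": "yes", "intent": "navigational"},
    {"query_id": "q2", "has_url_token": "yes", "intent": "navigational"},
    {"query_id": "q3", "has_url_token": "no", "intent": "informational"},
    {"query_id": "q4", "has_url_token": "yes", "intent": "informational"},
]

if __name__ == "__main__":
    for name in ("query_id", "has_url_token"):
        print(name, round(information_gain(QUERIES, name, "intent"), 4))
    print("chosen:", best_attribute(QUERIES, ["query_id", "has_url_token"], "intent"))
```

In this toy data every row has a distinct `query_id`, so splitting on it yields partitions of size one and the maximum possible gain, even though such an attribute cannot generalize to new queries. Each entropy evaluation also calls `math.log2` once per class per partition, which is the kind of logarithm cost the abstract lists as a performance concern.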
Keywords/Search Tags: Query log, User intent classification, Decision tree, ID3 algorithm