A Study Of Transactional Search Intention Classification Model

Posted on: 2013-04-27  Degree: Master  Type: Thesis
Country: China  Candidate: H Z Duan  Full Text: PDF
GTID: 2248330371466686  Subject: Computer Science and Technology
Abstract/Summary:
This thesis studies a transactional search intention classification model (TSICM). A search intention can be understood as the goal a user pursues when searching for information or accessing resources on the Internet; it can also be quantified as the set of search results the user wants to obtain. The field of search intention classification currently has no single, unified category system. Based on Broder's search intention category system, and with reference to that of Rose and Levinson, we present a new category system for transactional search intention (TSI) with five sub-classes: download, entertainment, interaction, obtain, and shopping online. Each sub-class is further divided into a number of detailed search intentions drawn from users' real-world applications. In this work, we focus on acquiring classification features and on the method of building the model. According to the source from which features are acquired, TSICM is divided into a Beforehand Model (BM) and an Afterwards Model (AM). The BM acquires features mainly from the content of the user's query string, including unigrams, bigrams, trigrams, and named entity recognition (NER), while the AM obtains features from extended resources of the user query, such as the user query log and temporary webpage results crawled by a search engine (SE). The query log yields features such as URL snippets, related terms (RT), and user click-through behavior (UCT), while the crawled results mainly provide webpage titles, snippets, and terms in links to other webpages. A series of experiments is conducted on the features of the TSICM, covering the BM features, the AM features, and their combinations.
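As a rough illustration of the query-string features used by the BM, the sketch below counts unigrams, bigrams, and trigrams in a query. It is only a minimal sketch under stated assumptions: the function name `ngram_features`, the whitespace tokenization, and the sample query are hypothetical and not taken from the thesis, and the NER features are omitted.

```python
from collections import Counter


def ngram_features(query, max_n=3):
    """Count word n-grams (n = 1..max_n) in a query string.

    Assumption: tokens are separated by whitespace. The thesis does not
    state its tokenization, so this is only an illustrative choice.
    """
    tokens = query.lower().split()
    features = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the query.
        for i in range(len(tokens) - n + 1):
            features[" ".join(tokens[i:i + n])] += 1
    return features


# Hypothetical sample query, chosen only for illustration.
features = ngram_features("download free mp3 player")
for gram, count in sorted(features.items()):
    print(gram, count)
```

A four-word query produces four unigrams, three bigrams, and two trigrams, so even short queries give a classifier several features to work with.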
The experimental results show that the bulk of the features drawn from the query string itself play an important role in classifying transactional search intention, that the rich set of features acquired from extended resources of the query is also helpful, and that combining all features is more effective. In addition, we introduce three commonly used classification algorithms, Decision Tree (DT), K Nearest Neighbors (KNN), and Support Vector Machine (SVM), and run experiments with each classifier using the same combined features and our TSI category system. Comparing the results shows that the three classifiers perform differently on the TSICM given the same features, and that the SVM classifier achieves the best overall performance.
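To make the classification step concrete, the following sketch shows one of the three algorithms named above, KNN, applied to bag-of-words query vectors. It is a hedged sketch, not the thesis's implementation: the cosine similarity measure, the helper names, and the tiny training set are all assumptions made up for illustration, with labels borrowed from the TSI sub-class names.

```python
import math
from collections import Counter


def vectorize(query):
    """Bag-of-words vector for a query (whitespace tokens, an assumption)."""
    return Counter(query.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def knn_predict(query, training, k=3):
    """Label a query by majority vote among its k most similar examples."""
    q = vectorize(query)
    ranked = sorted(training,
                    key=lambda ex: cosine(q, vectorize(ex[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training data; not from the thesis's query log.
TRAINING = [
    ("download free mp3 player", "download"),
    ("download pdf reader", "download"),
    ("buy cheap laptop online", "shopping online"),
    ("buy running shoes", "shopping online"),
    ("play chess online", "entertainment"),
    ("watch movie online", "entertainment"),
]

predicted = knn_predict("download free video player", TRAINING, k=3)
print(predicted)
```

The same feature vectors could be passed to a decision tree or an SVM instead, which is the kind of like-for-like comparison the abstract describes.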
Keywords/Search Tags: search intention, transactional search intention category system, intention feature acquisition, intention classification model