Analysis Of Web Users' Query Intent

Posted on:2015-10-22

Degree:Master

Type:Thesis

Country:China

Candidate:H Q Zhang

Full Text:PDF

GTID:2298330452453275

Subject:Computer Science and Technology

Abstract/Summary:

Since the emergence of the internet, the amount of web content has grown rapidly every day. This content exists in traditional media such as web pages, documents, multimedia (images, audio and video), BBS, email and blogs, as well as in popular social networks such as Twitter and Facebook. It is becoming more and more difficult for internet users to find the information they need in such a vast and varied internet world, so accurately predicting the underlying intent behind a query poses a huge challenge to search engines.

Each query carries the specific goal of the user who issued it, so a search engine must return results that satisfy the requirements of that individual user rather than results that merely match the query terms. How to accurately predict the underlying intent behind the queries submitted by web users is now the focus of modern search engines. Early pioneering studies identified users' query intent mainly by manual means. In this paper, however, users' query intent is identified automatically. To implement this, we do the following work.

1. The classification standard is based on Broder's classification. Navigational and Transactional queries behave almost identically, since a web site has to be reached before any further activity can take place on it. The two types also share some classification features, while both differ considerably from Informational queries. Navigational and Transactional queries are therefore treated as one category, in contrast to the Informational type.

2. To integrate successfully with the search engine, a classification algorithm based on machine learning is used. Each classification algorithm has its advantages and disadvantages, so several common classification algorithms need to be analyzed carefully. Given the vast amount of data on the internet, the classification model must have low time complexity, and we choose the support vector machine (SVM) as the classification algorithm.

3. The experimental data set comes from real web search engine logs: about 2 million queries, together with 1,935 manually annotated typical queries.

4. The key to building a good classification model is an adequate set of classification features. Obtaining effective features requires not only the search engine logs, which supply users' click-through features including nCS, nRS and mRank, but also other information. From observing how users employ the search engine to obtain information, the average number of queries per session (AveQuery) is proposed as an effective classification feature. Some features derived from the query terms are also added to the classification features. These features are analyzed statistically on the data set. For some features the differences between classes are clear; for others they are not, which may mean that they are not linear classification features.

5. Precision and recall, the common evaluation measures in the information retrieval field, are used to evaluate the classification model. Because informational and non-informational queries are unevenly distributed, the F-value is added as a further measure. The results show that combining multiple features helps identify query intent, with a classification accuracy of up to 80%.
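The evaluation measures named in item 5 can be illustrated with a minimal sketch. It assumes the balanced F-value (F1, the harmonic mean of precision and recall), which the abstract does not specify. The function name, the label strings and the eight toy labels are hypothetical and are not taken from the thesis data.

```python
def precision_recall_f(y_true, y_pred, positive):
    """Return (precision, recall, F1) for the class named `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Assumption: balanced F-value (F1); the abstract only says "F-value".
    denom = precision + recall
    f_value = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_value


# Hypothetical, deliberately unbalanced toy labels (6 informational, 2 not).
y_true = ["info"] * 6 + ["non_info"] * 2
y_pred = ["info"] * 5 + ["non_info", "info", "non_info"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)
for label in ("info", "non_info"):
    print(label, precision_recall_f(y_true, y_pred, label))
```

On these toy labels the overall accuracy is 0.75, while precision, recall and F-value for the minority class are each only 0.5. This is the kind of gap that motivates reporting the F-value alongside accuracy when the classes are unbalanced.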

Keywords/Search Tags:

search engine, query intent, query classification, feature combination

Related items

1	A Method That Mines User's Search Intent And Recommends Related Queries Base On Search Engine Logs
2	Automatic Classification And Analysis Of Query Intent
3	The Research On Getting The Search Intent Of Users
4	Research On Domain Classification Of Search Engine Queries
5	Research On Search Engine Oriented Natural Language Processing Technology
6	Research On Query Intent Identification
7	Research On A Method Of Mining User's Search Intent Based On Knowledge Graph
8	Research On Topic Based Query Intent Identification
9	Relevant Techniques Of Named Entity Query Processing For Search Engine
10	A Framework for User Guidance in Web Search Engine Interfaces Based on Past Users' Behaviour