Study of Field-Oriented High-Quality Information Retrieval Based on Web Data Mining

Posted on: 2009-12-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z R Yang
Full Text: PDF
GTID: 2178360248452543
Subject: Computer software and theory
Abstract/Summary:
In today's information world, Internet and web technology is flourishing, and the volume of information on the WWW continues to grow. Finding the content that users are really interested in has become a focal point for organizations and specialists in related fields. A search engine can find documents related to the keywords, but it returns too many results and its precision is not high. Traditional search engine technology meets some of people's needs, but because it is general-purpose it cannot satisfy the personalized demands of users with different backgrounds, different purposes, and different times of use. The purpose of this research is to make full use of the user's personal information through flexible means, such as a user interest homing mechanism or diverse retrieval schemes, to collect information from the Web and make full use of network information, thereby improving query accuracy and retrieval quality and meeting the demands of specific users.

Firstly, this paper analyzes search engine technology and Web data mining technology, including the principles of search engines and clustering analysis in data mining. Characterizing users' access behavior and classifying users are preconditions for high-quality personalized information retrieval. Access activity is stored in the web log, and that information can be used for mining user interests only after the web log data has been preprocessed. This paper therefore discusses in depth problems such as web page filtering and user access paths, so that web log preprocessing becomes more complete.

Secondly, we explore how a retrieval system captures high-quality web pages, how to build the page index, and how to offer high-quality retrieval services. We focus on two basic measures: the efficiency of retrieval and the results of retrieval.
On the basis of search, we put forward the idea of mining the classification information of a website's pages according to the position of each web page, and of clustering dynamically on this information to provide the user with a dynamic clustering directory search service. We also deduce the user's search preferences, build a model of user searching, and finally generate user access patterns based on it.

Finally, the paper summarizes the research and discusses possible directions for further research.
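
As a rough illustration of the idea of deriving page categories from a page's position in the website, the sketch below groups URLs by their leading path directories. It is only a minimal sketch under stated assumptions: the abstract does not say how "position" is defined, so treating it as the URL path prefix is an assumption, and the names `directory_key` and `group_by_directory` and the example URLs are hypothetical. It is plain grouping by path prefix, not the clustering method of the thesis.

```python
from collections import defaultdict
from urllib.parse import urlparse


def directory_key(url, depth=1):
    """Return the first `depth` path directories of a URL as a category key.

    Assumption: a page's "position" is approximated by its URL path prefix.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Drop the last segment when it looks like a file name, not a directory.
    if parts and "." in parts[-1]:
        parts = parts[:-1]
    return "/" + "/".join(parts[:depth])


def group_by_directory(urls, depth=1):
    """Group URLs into a directory-style listing keyed by path prefix."""
    groups = defaultdict(list)
    for url in urls:
        groups[directory_key(url, depth)].append(url)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    pages = [
        "http://example.com/sports/football/a.html",
        "http://example.com/sports/b.html",
        "http://example.com/news/c.html",
        "http://example.com/index.html",
    ]
    for key, members in group_by_directory(pages).items():
        print(key, len(members))
```

Raising `depth` gives a finer directory listing, which is one simple way a directory view could be adjusted to the user's query.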
Keywords/Search Tags: Web Data Mining, Information Retrieval, Personalized Services, Web Log, Data Preprocessing, Clustering Analysis, Accessing Activity