With the explosive growth of information on the Internet, information retrieval technology has become increasingly mature. However, current information retrieval systems return the same results to different users who submit the same query terms, and pay little attention to users' personal interests. In an era that values individuality and a people-oriented approach, there is an urgent need for more effective retrieval technology built on new ideas and techniques, and researchers have therefore proposed personalized retrieval. Personalized retrieval introduces the concept of personalized service: it is user-centered and proactively provides a full range of services tailored to each user and that user's needs. The core of personalized retrieval is the construction and updating of the user interest model. This model underlies many personalization processes, and its quality directly determines the quality of personalized retrieval. How to construct a high-quality user interest model and apply it to other personalization processes has therefore become an urgent problem.

This paper takes the 13115 Program of the Shaanxi Science and Technology Department, "Documents Digitization and Resources Sharing Platform Construction", as its research background and science and technology document resources as its research object. It introduces the concept of the group into the study of user interest models and query expansion, presents a group-based personalized retrieval technology, and applies it to a retrieval system for science and technology document resources. The main work of this paper is as follows:

1. Reviews the current state of research on user interest modeling, query expansion, and personalized retrieval systems; introduces the theory of information retrieval and personalized retrieval; and analyzes in depth user interest models, query expansion, cluster analysis, and related technologies. This lays the theoretical foundation for the later work.

2. To address the shortcomings of user interest modeling based on the traditional TF-IDF algorithm, this paper improves TF-IDF by introducing the position of keywords within the document and by combining content analysis with behavior analysis. On this basis it presents an algorithm for constructing the user interest model; the resulting model is more accurate and comprehensive.

3. To address the problem that only the individual user's behavior is considered when the user interest model is updated, this paper uses cluster analysis to construct the user's interest group and compute the user's group interest, then combines the group interest with the user's individual interest. The resulting group-based updating algorithm tracks the user's interests in a timely and accurate manner.

4. Based on the above model, and to address the "topic drift" caused by query expansion based on pseudo feedback, this paper combines implicit feedback with pseudo feedback and selects the top-ranked retrieved documents according to the user's degree of interest. The resulting query expansion algorithm effectively improves the relevance of the expansion terms to the user's retrieval intent and reduces topic drift.

5. Building on this research, this paper designs and implements a personalized retrieval system for science and technology document resources. Test results show that the system's retrieval results better meet users' personalized needs and help users find the information they need easily and quickly.
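
To make the ideas in items 2 and 3 more concrete, the following is a minimal Python sketch of a position-weighted TF-IDF interest vector and of blending an individual interest vector with a group interest vector. It is an illustration under stated assumptions, not the algorithm developed in this paper: the field weights in `FIELD_WEIGHTS`, the smoothed IDF formula, the per-document `behavior_weights`, and the linear mixing parameter `alpha` are all hypothetical choices, since the abstract does not specify them.

```python
import math
from collections import Counter

# Hypothetical position weights: a keyword in the title is assumed to
# signal interest more strongly than one in the body text.
FIELD_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}


def weighted_tf(doc):
    """Position-weighted, normalized term frequency for one document.

    `doc` maps a field name (e.g. "title") to a list of tokens.
    """
    counts = Counter()
    for field, tokens in doc.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        for token in tokens:
            counts[token] += weight
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def idf(docs):
    """Smoothed inverse document frequency over a list of documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update({tok for tokens in doc.values() for tok in tokens})
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def interest_vector(docs, behavior_weights=None):
    """Sum position-weighted TF-IDF vectors over a user's documents.

    `behavior_weights` (assumed) scales each document by how strongly the
    user's behavior, such as saving or long reading time, indicates interest.
    """
    if behavior_weights is None:
        behavior_weights = [1.0] * len(docs)
    idf_scores = idf(docs)
    profile = Counter()
    for doc, b in zip(docs, behavior_weights):
        for term, tf in weighted_tf(doc).items():
            profile[term] += b * tf * idf_scores[term]
    return dict(profile)


def blend(individual, group, alpha=0.7):
    """Linear mix of individual and group interest (assumed combination rule)."""
    terms = set(individual) | set(group)
    return {
        t: alpha * individual.get(t, 0.0) + (1 - alpha) * group.get(t, 0.0)
        for t in terms
    }


docs = [
    {"title": ["retrieval"], "body": ["retrieval", "model", "user"]},
    {"title": ["cluster"], "body": ["user", "model"]},
]
profile = interest_vector(docs)
```

In this toy example, "retrieval" appears in a title and so outweighs "model" in `profile`, even though "model" occurs in both documents. In a fuller version, the group vector passed to `blend` would come from clustering users by their interest vectors, which this sketch leaves out.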