Network Information Extraction System Key Technology Research And Implementation

Posted on: 2009-12-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Dai
Full Text: PDF
GTID: 2208360245483028
Subject: Computer application technology
Abstract/Summary:
With the development of the Internet and information technology, the problems of information overload and misleading information have arisen. How to manage the tremendous amount of information on the Internet and provide personalized services has become a research focus in the field of information services. Through research on user modeling, this thesis proposes a new hybrid model that addresses existing problems with search efficiency and precision.

In this hybrid model, a user's interests consist of a long-term model and a short-term model. The thesis elaborates on how the user profile is established and updated. To describe the user accurately, a tree model based on a time vector is proposed to represent the user profile; the resulting user model is hierarchical and can distinguish long-term from short-term interests. Moreover, the thesis introduces a clustering algorithm that combines K-Means and G-HAC to improve clustering efficiency. To capture the user's browsing behavior, the minimal set of behaviors needed to estimate the degree of interest is obtained through correlation analysis. Furthermore, a method for tracking drift in the user's interests based on time-window optimization is described: the user profile tracks interest drift through the classification error rate and optimizes the time-window size accordingly.

The system builds the user model by combining collected Web content with browsing behavior and adjusting user preferences, and it updates the user model through time-window optimization. The Web information extraction system filters retrieval results according to the user model and meets users' personalized needs by raising the accuracy and recall rates.
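
As an illustration only, the interplay between the long-term model, the short-term model, and the time window can be sketched in Python. The class name HybridProfile, the mixing weight, the window bounds, and the error threshold are all assumptions introduced here. The abstract does not give the thesis's tree structure or update rules, so this sketch substitutes flat term frequencies and a simple halve-or-grow rule for the window.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class HybridProfile:
    """Hypothetical hybrid profile: long-term plus short-term term weights."""
    window: int = 5            # pages covered by the short-term model (assumed)
    min_window: int = 2        # assumed lower bound on the window
    max_window: int = 20       # assumed upper bound on the window
    short_weight: float = 0.6  # assumed mixing weight, not from the thesis
    history: list = field(default_factory=list)  # term lists, oldest first

    def observe(self, terms):
        """Record the terms of one page the user showed interest in."""
        self.history.append(list(terms))

    @staticmethod
    def _weights(pages):
        """Normalized term frequencies over the given pages."""
        counts = Counter(t for page in pages for t in page)
        total = sum(counts.values())
        return {t: c / total for t, c in counts.items()} if total else {}

    def score(self, terms):
        """Blend short-term (recent window) and long-term (all history) interest."""
        long_w = self._weights(self.history)
        short_w = self._weights(self.history[-self.window:])
        unique = set(terms)
        long_s = sum(long_w.get(t, 0.0) for t in unique)
        short_s = sum(short_w.get(t, 0.0) for t in unique)
        return self.short_weight * short_s + (1 - self.short_weight) * long_s

    def adapt_window(self, error_rate, threshold=0.3):
        """Halve the window when the error rate suggests drift, else grow by one."""
        if error_rate > threshold:
            self.window = max(self.min_window, self.window // 2)
        else:
            self.window = min(self.max_window, self.window + 1)
        return self.window


profile = HybridProfile(window=2)
for page in (["python", "code"], ["python", "web"],
             ["football", "goal"], ["football", "league"]):
    profile.observe(page)

# The recent pages are about football, so it outranks the older topic.
print(profile.score(["football"]) > profile.score(["python"]))
```

A high classification error rate is treated here as a sign that the user's interests have drifted, so the window shrinks to forget older pages faster; a low error rate lets the window grow so the short-term model becomes more stable.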
Keywords/Search Tags: hybrid model, drifting user's interests, tree-model, correlation analysis