Research On Key Technologies Of Web Information Extraction System

Posted on: 2010-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: W Tang
Full Text: PDF
GTID: 2178330332960973
Subject: Computational Mathematics
Abstract/Summary:
With the rapid development of web technology, the web has changed the way people work and live and has become an important channel for accessing information. Search engines ease the conflict between the sheer volume of web information and the needs of information users. However, because search engines lack intelligence, they cannot meet users' personalized needs. This dissertation investigates personalized information service technology to address this problem. First, the dissertation collects users' browsing content and browsing behavior by combining explicit and implicit methods. After the collected web pages are processed, the text is represented by fine-grained interests augmented with a time factor, and cluster analysis of the text is used to discover users' interests. Second, after establishing why user interests need to be updated, a mixed interest update method and an implicit update method are applied to update users' interests according to their interest drift. Based on these results, we design and implement a system prototype and verify the effectiveness of the text clustering, the interest drift algorithm, and the web interest calculation. Finally, the verification shows that the user interest model designed in this dissertation can improve recall and precision, and can therefore provide more accurate and effective personalized recommendations for users.
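The abstract does not give the formula behind the time factor or the interest update, so the following is only a minimal sketch of one common way such an update could look: existing interest weights decay exponentially with elapsed time, and the term weights of a newly browsed page are then added. The function name `update_interest`, the `half_life_days` parameter, and the exponential decay itself are assumptions made for illustration, not the dissertation's method.

```python
def update_interest(profile, page_terms, elapsed_days, half_life_days=30.0):
    """Return a new interest profile after one browsing event.

    profile        -- dict mapping term -> current interest weight
    page_terms     -- dict mapping term -> weight of the term in the new page
    elapsed_days   -- days since the profile was last updated
    half_life_days -- hypothetical parameter: time for a weight to halve
    """
    # Assumed time factor: exponential decay, so older interests fade.
    decay = 0.5 ** (elapsed_days / half_life_days)
    updated = {term: weight * decay for term, weight in profile.items()}

    # Reinforce (or introduce) the interests seen in the new page.
    for term, weight in page_terms.items():
        updated[term] = updated.get(term, 0.0) + weight
    return updated


profile = {"python": 1.0, "football": 0.4}
new_page = {"travel": 0.5, "python": 0.25}
profile = update_interest(profile, new_page, elapsed_days=30.0)
```

Under these assumptions, an interest that is never revisited shrinks toward zero while a recurring one is repeatedly reinforced, which is one simple way to model interest drift.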
Keywords/Search Tags: Fine-grained interest, User interest model, Implicit update, Interest drift