Related Studied On Information Extraction And Information Recommendation Based On Web Data Mining

Posted on: 2011-08-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Shao
Full Text: PDF
GTID: 2178360305977847
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the Internet, a large amount of information has become available. However, so much information makes it even harder for users to find what interests them. To provide better service, discovering users' potential interests is the key. One approach to this problem is to apply traditional data mining to Web pages. How to obtain the desired information to guide users' decision-making has become an important and urgent issue. It has become necessary to extract and recommend the information users need comprehensively, accurately, and efficiently by studying web page data models. Against this background, information extraction and recommendation technology (especially personalized technology) emerged. Along with the flourishing development of e-commerce, it has become a new branch of web data mining technology since the emergence of the Internet. This paper first introduces the research background and significance and the state of research in China and abroad, followed by a study of website information extraction and personalized information recommendation systems.
The discussion of information extraction technology covers its functions, classification, and related technologies, as well as block-based information extraction techniques and algorithms. The discussion of recommendation technology covers a brief introduction to recommender systems, their classification, inputs and outputs, and performance, as well as the key technologies of information recommendation (including underlying ideas, classification, and algorithm steps). This paper focuses on block-based information extraction from websites and on site information recommendation, especially collaborative filtering recommendation algorithms for e-commerce websites. Solutions are proposed for several issues in information extraction and recommendation technology. Efficiency and accuracy issues in extraction and recommendation are resolved to a certain extent, and experiments demonstrate the feasibility and effectiveness of the algorithm. Extraction methods that treat the entire page as the smallest unit of information cannot keep up with the rapid development of web page extraction. Instead, a web page can be divided into several regions (blocks) according to certain algorithms, and these regions are treated as the basic units of information processing and extraction. Each block is assigned a weight, so that important information can be extracted efficiently. Based on implicit information extracted from web user logs, this paper uses a clustering method to group users with similar interests into the same cluster. This work can be carried out offline. This method can significantly save online data-processing time and improve the recommendation efficiency of a site (especially an e-commerce website) that uses collaborative filtering.
This approach offers a good solution to problems such as data sparsity, system scalability, and cold start. However, this paper still has shortcomings and room for improvement: 1. The balance between real-time performance and information quality in extraction and recommendation technology: accuracy and real-time performance in web page mining conflict with each other. How to effectively improve information quality while also improving speed needs further study. 2. Information privacy protection and information security: how to protect users' privacy while providing information to guide their decision-making is worth further research.
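The abstract describes clustering users offline and then running collaborative filtering within a cluster, but it does not give the algorithm itself. The following is only a minimal sketch of that general idea, not the thesis's method. Everything concrete in it is an assumption made for illustration: the function names (cosine, pick_seeds, cluster_users, recommend), the use of cosine similarity, the k-means-style clustering with farthest-first seeding, and the toy profiles, which stand in for per-item visit counts mined from web logs.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def pick_seeds(users, profiles, k):
    """Deterministic farthest-first seeding: each new seed is the user least similar to the seeds so far."""
    seeds = [users[0]]
    while len(seeds) < k:
        candidates = [u for u in users if u not in seeds]
        seeds.append(min(candidates,
                         key=lambda u: max(cosine(profiles[u], profiles[s]) for s in seeds)))
    return seeds

def cluster_users(profiles, k, rounds=10):
    """Offline step (assumed k-means-style): group users whose profiles are similar."""
    users = sorted(profiles)
    centroids = [list(profiles[s]) for s in pick_seeds(users, profiles, k)]
    assignment = {}
    for _ in range(rounds):
        # Assign every user to the most similar centroid.
        new_assignment = {
            u: max(range(k), key=lambda c: cosine(profiles[u], centroids[c]))
            for u in users
        }
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Recompute each centroid as the mean profile of its members.
        for c in range(k):
            members = [profiles[u] for u in users if assignment[u] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

def recommend(target, profiles, assignment, top_n=3):
    """Online step: score items the target has not visited, using same-cluster neighbours only."""
    neighbours = [u for u in profiles
                  if u != target and assignment[u] == assignment[target]]
    scores = {}
    for u in neighbours:
        sim = cosine(profiles[target], profiles[u])
        for item, weight in enumerate(profiles[u]):
            if weight > 0 and profiles[target][item] == 0:
                scores[item] = scores.get(item, 0.0) + sim * weight
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]

# Hypothetical implicit-feedback profiles: visit counts for five items per user.
profiles = {
    "u1": [5, 3, 0, 0, 0],
    "u2": [4, 0, 2, 0, 0],
    "u3": [0, 0, 0, 4, 5],
    "u4": [0, 1, 0, 5, 3],
}
assignment = cluster_users(profiles, k=2)   # done offline
print(recommend("u1", profiles, assignment))  # done online, per request
```

Because the online step compares the target user only with members of the same cluster, the number of similarity computations per request depends on cluster size rather than on the total number of users, which is the time saving the abstract attributes to doing the clustering offline.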
Keywords/Search Tags:Web Data Mining, Information Extraction, Information Recommendation, Collaborative Filtering, Block Topic Extraction, User Clustering, Personalized Recommendation