Research On Related Technology Of Dynamic WEB Information Monitoring

Posted on: 2012-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: H B Liu
GTID: 2218330362450262
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, the number of sites and web pages has grown explosively. Faced with this mass of information, users find it very difficult to discover new information in real time by browsing the web. Real-time monitoring of web information requires picking out valuable items from an enormous number of pages, yet users can monitor only a single site or a few sites at a time, and with low efficiency. In this context, this paper surveys techniques related to web information monitoring and proposes a novel method for discovering hot events through dynamic clustering. The main content covers the following aspects.

This paper studies the basic framework of a web information monitoring system. At present there is no mature framework for such systems, and traditional information monitoring systems are implemented in C/S (client/server) mode. This paper proposes a novel framework based on B/S (browser/server) mode: the data are separated and stored according to a three-layer structure, and a web service provides real-time storage of the information.

We also propose a basic implementation of web information monitoring. The information channel is the basic unit of monitoring. The channels are built automatically by users. A channel consists of one or more sites and the columns of those sites. Ajax is used to monitor multiple channels at once.

This paper proposes a page-based cache design combined with the server-side user Session, which lightens the server-side load and reduces both the response time and the monitoring traffic. The monitoring page keeps the set of information obtained in the previous round, and the serial number of the last item is saved in the Session. In the current round, the server returns only the updated information rather than all the information in the current channel.

To meet the differing requirements of monitoring users, we also study user personalization.
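The Session-based incremental update described above can be illustrated with a minimal sketch. It assumes that each item in a channel carries an increasing serial number and that the Session can be modelled as a plain dictionary; the names `Item`, `Channel`, and `poll` are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    serial: int  # increasing serial number within one channel (assumed)
    title: str


@dataclass
class Channel:
    items: list = field(default_factory=list)

    def publish(self, title):
        # Assign the next serial number and store the new item.
        self.items.append(Item(len(self.items) + 1, title))


def poll(channel, session, channel_id):
    """Return only the items newer than the serial number saved in the session."""
    last_seen = session.get(channel_id, 0)
    fresh = [item for item in channel.items if item.serial > last_seen]
    if fresh:
        # Remember the newest serial number for the next monitoring round.
        session[channel_id] = fresh[-1].serial
    return fresh


channel = Channel()
session = {}  # stands in for the server-side user Session
channel.publish("a")
channel.publish("b")
first = poll(channel, session, "news")   # first round: everything is new
channel.publish("c")
second = poll(channel, session, "news")  # second round: only the new item
third = poll(channel, session, "news")   # nothing changed since last round
```

Because only the serial number is kept per user and channel, the server does not need to resend or re-compare the whole channel on every Ajax request.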
We adopt the information channel as the basic representation of users' interests and implement an information filtering scheme within each channel, driven by the keywords that users enter. A keyword map table maps users' keywords to event descriptions, and a Boolean model is used to filter information. In the same way, keyword mapping is used to retrieve texts, and a monitoring mechanism is proposed.

We also discuss information clustering for web information monitoring. Clustering the information from a given period can reveal hotspots, which can then be used to guide monitoring. The Lingo algorithm is adopted to cluster retrieval results. Lingo is a description-first method, which generates better cluster labels. We propose an improved Single-Pass method, based on a weighted combination of semantic similarity and cosine similarity, to perform cluster fusion and cluster re-discovery on the Lingo clustering results. Experiments show that our approach discovers more categories and describes them better. Users can take the clustering results as the basis and direction for information monitoring.
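As a rough illustration of Single-Pass clustering with a weighted similarity, the sketch below assigns each text to the most similar existing cluster or opens a new one. It is a simplification under stated assumptions, not the thesis method: it works on short texts rather than on Lingo clusters, the semantic similarity is a placeholder (concept overlap after a hypothetical synonym table), and the weight `alpha` and `threshold` values are arbitrary.

```python
import math
from collections import Counter

# Hypothetical synonym table standing in for a real semantic resource.
SYNONYMS = {"quake": "earthquake", "tremor": "earthquake", "football": "soccer"}


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic(a, b):
    """Placeholder semantic similarity: Jaccard overlap after synonym mapping."""
    ca = {SYNONYMS.get(t, t) for t in a}
    cb = {SYNONYMS.get(t, t) for t in b}
    union = ca | cb
    return len(ca & cb) / len(union) if union else 0.0


def combined(a, b, alpha=0.5):
    """Weighted combination of semantic and cosine similarity."""
    return alpha * semantic(a, b) + (1 - alpha) * cosine(a, b)


def single_pass(docs, threshold=0.4, alpha=0.5):
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for idx, text in enumerate(docs):
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = combined(vec, cluster["centroid"], alpha)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Merge into the closest cluster and update its summed centroid.
            best["members"].append(idx)
            best["centroid"].update(vec)
        else:
            # No cluster is similar enough: start a new one.
            clusters.append({"centroid": Counter(vec), "members": [idx]})
    return [c["members"] for c in clusters]


docs = [
    "earthquake hits city",
    "quake hits city centre",
    "football final tonight",
    "soccer final tonight",
]
groups = single_pass(docs)
```

In this toy input the semantic term lifts the similarity of texts that use different words for the same event, which is the motivation for combining it with plain cosine similarity.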
Keywords/Search Tags: information monitoring, clustering, personalization, information filtering, page caching