Application and Research of Webpage Noise Reduction in Interactive Television

Posted on: 2012-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: A Song
Full Text: PDF
GTID: 2178330338484162
Subject: Communication and Information System
Abstract/Summary:
Since the promotion of the tri-network integration policy, interactive television has become increasingly popular, and the demand for program-related information in interactive TV has drawn growing attention. Program-related information is information that is closely related to a channel or its program content and that viewers want to access. Very few studies currently address this need, which is the background and main topic of this thesis.

One major source of such information is the Internet, but webpages often contain navigation bars, advertisements, unrelated links, and other irrelevant content, which we call webpage noise. The main objective of this thesis is to reduce webpage noise so that the main content of a webpage can be extracted and program-related information obtained. Applying program-related information in interactive television also involves its description and storage, its display technology, and its synchronization with the program; the rest of the thesis addresses these topics.

The thesis first surveys existing standards for media content description. After showing that, given the characteristics of program-related information, these standards are not suitable for describing it, we propose a description scheme based on HTML, XML, and a database. HTML provides the presentation, while XML and the database provide the data, which separates presentation from data.

After reviewing the literature, both domestic and international, and studying existing webpage noise reduction algorithms, we propose a maximum similarity matching algorithm for webpage noise reduction based on LCS (longest common subsequence). It takes advantage of the relatedness of webpages from the same website and can effectively remove noise blocks from a webpage.
The algorithm first preprocesses the target webpage and finds a page similar to it. It parses the target page and the similar page into two characteristic trees and maps them to two characteristic node sequences. The LCS algorithm then yields the longest common subsequence, which is a globally optimal solution, and the characteristic nodes that differ between the two trees form a candidate set. The candidate set is clustered and scored to identify the important informative blocks of the webpage. Experiments show that the proposed algorithm achieves good results.

In the final part of the thesis, we implement an example system that applies webpage noise reduction to interactive TV, building the whole model system from the server side to the client side. We first study display technology for program-related information: based on Ajax, we examine the dynamic loading of program-related information and its parsing in the local browser. We also discuss the synchronization of program-related information with video and multi-terminal compatibility issues, and propose solutions. Finally, we implement the modules of the example system.
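
The LCS matching step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the node labels, the function names `lcs_pairs` and `candidate_nodes`, and the choice to take the target-page nodes left unmatched by the LCS as the candidate set are all assumptions made for illustration. The preprocessing, similar-page search, clustering, and scoring steps are not shown.

```python
from typing import List, Sequence, Tuple


def lcs_pairs(a: Sequence[str], b: Sequence[str]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) of one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # dp[i][j] = length of the LCS of the suffixes a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the front to recover one optimal matching.
    pairs: List[Tuple[int, int]] = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def candidate_nodes(target: Sequence[str], similar: Sequence[str]) -> List[str]:
    """Target-page nodes not matched by the LCS (assumed candidate set)."""
    matched = {i for i, _ in lcs_pairs(target, similar)}
    return [node for i, node in enumerate(target) if i not in matched]


# Hypothetical node sequences for two pages of the same website.
target_seq = ["div#nav", "ul.menu", "div#content",
              "h1:Show A", "p:Plot of show A", "div#footer"]
similar_seq = ["div#nav", "ul.menu", "div#content",
               "h1:Show B", "p:Plot of show B", "div#footer"]

print(candidate_nodes(target_seq, similar_seq))
```

In this toy input the shared template nodes (navigation, menu, content wrapper, footer) fall into the common subsequence, so only the page-specific heading and paragraph remain as candidates for the later clustering and scoring.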
Keywords/Search Tags: Program-related information, Noise reduction in webpages, LCS, Interactive television, Webpage display technology