
Web Portal Navigation Structure Extraction For Visually Impaired Persons

Posted on: 2011-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: L Lin
Full Text: PDF
GTID: 2178360302974578
Subject: Computer software and theory
Abstract/Summary:
In this era of information explosion, web portals, as a main carrier of Internet content, have become richer and richer. Current mainstream web portals have numerous channels with complex relations among them, which makes it challenging for visually impaired persons to perceive the contents of their pages.

Most pages in current web portals contain links for navigation, namely navigational links. Extracting these links and organizing them into a tree-shaped navigation structure that represents the content structure of the website would help blind users browse the site. This is the motivation of this thesis.

Some pages of a web portal, such as the front page of a channel, contain many navigational links. Such pages, namely navigational pages, can serve a navigation purpose. They have an obvious feature: the location and content of their navigational links remain static for a long time, while other links change. In other words, navigational links are contained in the template shared by snapshots of the page taken at different times.

Based on the observations above, this thesis proposes a template detection and extraction algorithm to extract the navigational links. The template of a page usually also contains navigational links for upper-level pages, which cannot be regarded as navigational links for the current level, so this thesis introduces a leveled extraction strategy to reduce their impact. In addition, the overall process uses classifiers to identify the navigational links among the candidates and to distinguish navigational pages.

Algorithm analysis and experimental results indicate that the leveled extraction strategy can significantly improve the precision of navigational link extraction. The overall results are good when classifiers are used to recognize navigational links and pages.
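The two ideas described above, template detection across snapshots and leveled extraction, can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: it assumes each link is represented by a hypothetical `(DOM path, anchor text, href)` tuple, treats the template as the plain set intersection of all snapshots of a page, and models leveled extraction as removing links already found in upper-level templates. The function names and sample data are invented for illustration, and the classifier stage is omitted.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical link representation: (DOM path, anchor text, href).
Link = Tuple[str, str, str]


def template_links(snapshots: List[Iterable[Link]]) -> Set[Link]:
    """Return links whose location and content are identical in every snapshot."""
    if not snapshots:
        return set()
    return set.intersection(*(set(s) for s in snapshots))


def leveled_candidates(page_template: Set[Link],
                       upper_templates: List[Set[Link]]) -> Set[Link]:
    """Drop links inherited from upper-level pages (assumed leveled strategy)."""
    inherited: Set[Link] = set().union(*upper_templates)
    return page_template - inherited


# Invented sample data: two snapshots of a portal front page and of a channel page.
NEWS = ("/body/nav/a[1]", "News", "/news")
SPORTS = ("/body/nav/a[2]", "Sports", "/sports")
WORLD = ("/body/subnav/a[1]", "World", "/news/world")
TECH = ("/body/subnav/a[2]", "Tech", "/news/tech")

home_snapshots = [
    [NEWS, SPORTS, ("/body/main/a[1]", "Headline A", "/a")],
    [NEWS, SPORTS, ("/body/main/a[1]", "Headline B", "/b")],
]
news_snapshots = [
    [NEWS, SPORTS, WORLD, TECH, ("/body/main/a[1]", "Story 1", "/s1")],
    [NEWS, SPORTS, WORLD, TECH, ("/body/main/a[1]", "Story 2", "/s2")],
]

home_template = template_links(home_snapshots)
news_template = template_links(news_snapshots)
news_level_links = leveled_candidates(news_template, [home_template])
```

In this toy data the changing headline links fall out of the intersection, and the top-level "News" and "Sports" links are removed from the channel page's candidates, leaving only "World" and "Tech" for that level. A real system would need fuzzier matching than exact tuple equality, since page layouts shift slightly between snapshots.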
Keywords/Search Tags: data extraction, navigation structure, template detection, classifier