World Wide Web site summarization

Posted on: 2003-07-22
Degree: M.C.Sc
Type: Thesis
University: Dalhousie University (Canada)
Candidate: Zhang, Yongzheng
Full Text: PDF
GTID: 2468390011986491
Subject: Computer Science
Abstract/Summary:
As the size and diversity of the World Wide Web grow rapidly, it has become increasingly difficult for users to skim a Web site and get an idea of its contents. Manually constructed summaries written by a large number of volunteer experts are currently available, for example through the DMOZ Open Directory Project. This research is directed towards automating the summarization task. We describe an approach that applies machine learning and natural language processing techniques to summarize a Web site automatically. We compare the automatically generated summaries with DMOZ summaries, home page browsing, and time-limited site browsing, separately, for a number of academic and commercial Web sites. The automatically generated summaries generally convey the same information to the reader as the DMOZ summaries do, and they can save users browsing time in understanding the main contents of a Web site. This is confirmed by a comparative evaluation by human subjects.
Keywords/Search Tags: Web