
Research And Application Of The Techniques Of Text Media Analysis On World Wide Webs

Posted on: 2007-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Wu
GTID: 2178360182966726
Subject: Computer application technology
Abstract/Summary:
The amount of information in science and in society at large is growing explosively. People expect information retrieval technology to cover more material, to be more specialized, and to filter results better. This raises new problems: learning has moved onto the Internet and the Web, while the volume of knowledge keeps expanding rapidly. To address these problems, many researchers and engineers are trying to devise an efficient framework for navigating this information warehouse.

This thesis focuses on a framework for handling information in the WWW setting. It explains the relevant theories and describes some implementations.

Chapter 1 gives an overview of the background of information retrieval.

Chapter 2 presents work on knowledge classification and multi-document summarization based on the vector space model. In detail, a method called Fuzzy KNN is proposed and compared with the traditional KNN and SVM methods. A multi-document summarization method that integrates text classification and link analysis is then described. Finally, an original concept named Relevant Degree is proposed for use in data mining.

Chapter 3 presents the implementation of a web search engine system, called the Antares Web Search System, that integrates the techniques described in Chapter 2 with other existing methods. The chapter gives many details on how to implement a web search engine with information filtering and further processing. The system serves as an example for future work in this field.

Chapter 4 introduces the Latent Semantic Analysis and Indexing probability model and the WordNet project. A new framework for future study is proposed at the end of the chapter.

Chapter 5 gives the conclusion and the outlook.
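The abstract does not specify how the thesis's Fuzzy KNN works, so the following is only a rough illustration of the general idea of fuzzy k-nearest-neighbour classification over a vector space model. It assumes crisp training labels, raw term-frequency vectors, cosine distance, and the inverse-distance weighting commonly used in fuzzy k-NN; the function names, the fuzzifier parameter `m`, and the toy data are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: (document text, class label).
TRAINING = [
    ("search engine index web crawler", "ir"),
    ("web search query ranking", "ir"),
    ("neural network training gradient", "ml"),
    ("gradient descent learning rate", "ml"),
]

def tf_vector(text):
    """Represent a text as a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def fuzzy_knn(query, training, k=3, m=2.0):
    """Return a dict of class memberships for the query (values sum to 1).

    Each of the k most similar training documents votes for its class
    with weight 1 / distance**(2 / (m - 1)), so closer neighbours count
    more. Plain k-NN would instead give every neighbour one equal vote.
    """
    q = tf_vector(query)
    scored = sorted(
        ((cosine(q, tf_vector(text)), label) for text, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    weights = defaultdict(float)
    for similarity, label in scored:
        distance = max(0.0, 1.0 - similarity)  # clamp floating-point noise
        # Small constant avoids division by zero for identical documents.
        weights[label] += 1.0 / (distance ** (2.0 / (m - 1.0)) + 1e-9)
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

memberships = fuzzy_knn("web search engine ranking", TRAINING)
```

Unlike plain KNN, which returns a single label, this variant returns a graded membership per class, which can be useful when a document plausibly belongs to more than one category.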
Keywords/Search Tags: search engine, WWW, classification, summarization, link analysis, latent semantic indexing, concept space, WordNet