
Optimal Search Theory And Support Vector Machine Applied Research In Information Retrieval

Posted on: 2008-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2208360212975464
Subject: Computer application technology
Abstract/Summary:
With the rapid development of network technology, the number of documents on the Internet has increased exponentially. An important line of research concerns how to organize, retrieve, and process this large volume of information. Information retrieval (IR) is the process of finding the documents in a collection that are relevant to an information need. Traditional IR has concentrated mainly on precision and recall and has given little consideration to resource limits. This thesis focuses on that problem. We propose an optimal search model based on the support vector machine. The main work is as follows:

1. We analyze resource description and resource selection in the context of distributed information retrieval. We propose an optimal retrieval model based on the support vector machine and optimal search theory. This model takes both resource limits and search quality into consideration.

2. We analyze the support vector machine and kernel methods for text categorization. Building on research into string kernels and word sequence kernels, we propose sentence-level kernel methods. Two algorithms are used to compute valid kernels between documents: set-of-sentences kernels and sequence-of-sentences kernels.

3. The C# version of the LIBSVM package (version 2.6) supports only basic kernel functions, such as the linear, polynomial, and RBF kernels, and offers little support for precomputed kernels. The sentence-level kernels proposed in this thesis and the word sequence kernel are all precomputed kernels, so we modified the software to support kernels of this kind.

4. Combining these research results, we implement a text categorization and retrieval system to demonstrate our approach.
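The idea behind items 2 and 3 (a document kernel built from sentence comparisons, then handed to an SVM as a precomputed matrix) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's method: the abstract does not define its sentence kernel, so a cosine similarity over word counts stands in for it here, and the function names `split_sentences`, `sentence_kernel`, `set_of_sentences_kernel`, and `gram_matrix` are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break on '.', '!' or '?', drop empty pieces."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def sentence_kernel(s1, s2):
    """Stand-in base kernel: cosine similarity of the two word-count vectors."""
    c1 = Counter(s1.lower().split())
    c2 = Counter(s2.lower().split())
    dot = sum(c1[w] * c2[w] for w in c1 if w in c2)
    norm = math.sqrt(sum(v * v for v in c1.values()) * sum(v * v for v in c2.values()))
    return dot / norm if norm else 0.0


def set_of_sentences_kernel(doc_a, doc_b):
    """Treat each document as a set of sentences and sum the base kernel
    over every cross-document sentence pair."""
    sents_a = split_sentences(doc_a)
    sents_b = split_sentences(doc_b)
    return sum(sentence_kernel(x, y) for x in sents_a for y in sents_b)


def gram_matrix(docs):
    """Precompute the normalized kernel matrix K[i][j] for a list of documents."""
    n = len(docs)
    raw = [[set_of_sentences_kernel(a, b) for b in docs] for a in docs]
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            scale = math.sqrt(raw[i][i] * raw[j][j])
            # Documents with no usable sentences get a zero row/column.
            gram[i][j] = raw[i][j] / scale if scale else 0.0
    return gram
```

A matrix built this way is what an SVM implementation with precomputed-kernel support would take as input in place of feature vectors. Summing a valid base kernel over all sentence pairs keeps the document-level kernel valid, which is one reason a set-based construction is a convenient starting point; a sequence-of-sentences variant would additionally account for sentence order.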
Keywords/Search Tags: information retrieval, text categorization, support vector machine, sentence-level kernel methods, optimal search theory