
Research And Implementation On Automatic Indexing Method Of Texts

Posted on: 2010-11-14    Degree: Master    Type: Thesis
Country: China    Candidate: J Ma    Full Text: PDF
GTID: 2178360278459338    Subject: Computer software and theory
Abstract/Summary:
With the widespread application of computers and the rapid development of the Internet, the volume of information resources on the Internet has grown explosively, and the Internet has become today's main channel for exchanging information. Because these varied resources are located by uniform resource locator rather than by content, many people find retrieval inconvenient and inefficient, so the need for a uniform content locator for Internet information resources is increasingly pressing.

This thesis focuses on a multi-dimensional indexing method for text. It studies a more efficient automatic word segmentation algorithm, a more accurate weighting algorithm, and a method for improving recall results based on Latent Semantic Indexing.

For the first problem, after investigating mechanical word segmentation methods based on string matching, we define a word-by-word verbatim traversal method, a hybrid algorithm that combines the Maximum Matching verbatim traversal method with the first-word index method. Simulations show that this method improves the efficiency of automatic word segmentation.

For the second problem, after investigating word segmentation methods based on probability and statistics, we define a weighting factor that accounts for both the frequency and the position of a word, together with a frequency-weighted non-linear statistical method. With this method, words are assigned more appropriate weights.

For the third problem, after investigating the Latent Semantic Indexing model, we use the frequency-weighted non-linear statistical function as the local weighting function. Simulations show that the improved Latent Semantic Indexing model yields more accurate recall results.
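
As a rough illustration of the dictionary-based matching that the abstract builds on, the Python sketch below shows plain forward maximum matching driven by a first-character index. It is not the thesis's hybrid algorithm, whose details the abstract does not give. The function names `build_first_char_index` and `segment`, the tiny dictionary, and the sample sentence are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_first_char_index(words):
    """Group dictionary words by their first character, longest first.

    Hypothetical helper: the index lets the segmenter look only at
    words that can start at the current position.
    """
    index = defaultdict(list)
    for word in set(words):
        if word:
            index[word[0]].append(word)
    for candidates in index.values():
        candidates.sort(key=len, reverse=True)
    return dict(index)


def segment(text, index):
    """Forward maximum matching: take the longest dictionary word
    that starts at each position, else fall back to one character."""
    tokens = []
    i = 0
    while i < len(text):
        match = text[i]  # fallback: an unknown single character
        for candidate in index.get(text[i], ()):
            if text.startswith(candidate, i):
                match = candidate  # candidates are longest first
                break
        tokens.append(match)
        i += len(match)
    return tokens


# Illustrative dictionary and sentence (assumptions, not thesis data).
index = build_first_char_index(["研究", "研究生", "生命", "的", "起源"])
tokens = segment("研究生命的起源", index)
print(tokens)  # ['研究生', '命', '的', '起源']
```

The output shows the well-known weakness of greedy maximum matching: it prefers the longer word "研究生" and therefore misses "研究 / 生命". This is one reason segmentation is usually paired with statistical weighting, as the later parts of the abstract describe.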
Keywords/Search Tags: Automatic Indexing, String Matching Algorithm, Statistical Word Segmentation Algorithm, Local Weighting