Research and Implementation of an Optimal Approximate Matching System for Unstructured Text

Posted on: 2012-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Cao
GTID: 2178330332486031
Subject: Computer application technology
Abstract/Summary:
With the rapid spread of the Internet, society has entered the information age, in which whoever masters information holds a competitive advantage. Finding the information one needs within a large and complex body of data has therefore become a research hotspot. Most of this information takes the form of unstructured text, which has led to the development of approximate matching systems for unstructured text. Such systems are important for information retrieval, text analysis, and text mining, and have long been a focus of study. Chinese text matching is particularly difficult because of the syntactic and semantic complexity of the Chinese language, so designing an efficient and accurate matching system for unstructured Chinese text has broad practical significance. For these reasons, this thesis studies and implements an optimal approximate matching system for unstructured Chinese text. The author's main work is as follows:
(1) The thesis compares the current state of research in China and abroad, and studies and analyzes Chinese word segmentation, text feature representation and matching, clustering, and other technologies involved in matching unstructured Chinese text.
(2) Weighing the advantages and disadvantages of the various techniques, and taking into account the actual requirements of the project the author took part in, the thesis designs the corresponding software functions and system architecture.
(3) Following this design, the thesis proposes two implementations, one based on a program API and one based on a storage engine, and analyzes in detail the technologies, theory, and implementation methods involved in each.
(4) Through experiments, the thesis further compares the performance of the two schemes.
Keywords/Search Tags: text matching, Chinese word segmentation, text feature, program API, storage engine