
Chinese Fuzzy Matching Technology Based on the Edit Distance Algorithm in a Large-Data Environment

Posted on: 2014-04-13 | Degree: Master | Type: Thesis
Country: China | Candidate: T S Jie | Full Text: PDF
GTID: 2268330425978821 | Subject: Systems analysis and integration
Abstract/Summary:
The edit distance algorithm, also known as Levenshtein distance, breaks the optimal solution of a complex problem down into the optimal solutions of a series of simpler problems, which are broken down further until the best solution to each is obvious at a glance. It has a wide range of applications, such as DNA analysis, spell checking, speech recognition, plagiarism detection, and similarity calculation. This thesis details its application to similarity calculation, referred to later in the thesis as Chinese fuzzy matching. It describes how the Zenghezhishui (comprehensive tax administration) information service platform collects disorganized raw data from 24 units (excluding the local tax bureau) and organizes it into standardized, usable data. Because the primary keys of the data from these 24 units are not uniform and the Chinese names are inconsistent, Chinese fuzzy matching is proposed to match the names across the different units. The collected data are compared with the local tax data using Chinese text similarity as the measure, and the comparison results are published to promote tax collection and administration, with a focus on the abnormal records the comparison reveals, so that the data of all units is ultimately put to use and the Zenghezhishui target is achieved. The thesis then introduces the application step by step: the basis of Chinese fuzzy matching (word segmentation), its core (the edit distance algorithm), and the use of Chinese fuzzy matching in the tax system. Initially, direct matching achieved a success rate of only 10% to 20%; Chinese fuzzy matching based on the edit distance algorithm raised the success rate to more than 85%, a significant improvement.
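
To make the similarity calculation concrete, the following is a minimal Python sketch of edit distance used for name matching. It is not the thesis's implementation: the function names, the normalization by the longer string's length, the 0.7 threshold, and the sample company names are all assumptions chosen for illustration. The sketch also compares raw characters and skips the word segmentation step that the abstract names as the basis of the method.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, via dynamic programming."""
    # prev[j] is the distance between the previous prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]. Normalizing by the longer length is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def best_match(name, candidates, threshold=0.7):
    """Return (candidate, score) for the most similar candidate, or None
    if no candidate reaches the (hypothetical) threshold."""
    best = max(candidates, key=lambda c: similarity(name, c), default=None)
    if best is None:
        return None
    score = similarity(name, best)
    return (best, score) if score >= threshold else None


# Hypothetical names: the same unit written two ways, plus an unrelated one.
tax_names = ["北京科技有限公司", "上海贸易有限公司"]
print(best_match("北京科技公司", tax_names))  # ('北京科技有限公司', 0.75)
```

An exact comparison of "北京科技公司" with "北京科技有限公司" fails, while the normalized score of 0.75 clears the threshold; this is the kind of gap between direct and fuzzy matching that the abstract reports.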
Keywords/Search Tags: word segmentation, edit distance algorithm, Zenghezhishui, Chinese fuzzy matching