Research On Similar Handwritten Chinese Character Recognition Based On Stroke Sequence Recovery

Posted on: 2018-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Zhou
GTID: 2428330596952967
Subject: Information and Communication Engineering
Abstract/Summary:
Existing methods for handwritten Chinese character recognition usually rely on statistical features for training and classification, which work well for producing a list of candidates. However, the statistical features of similar Chinese characters differ only slightly, so the success rate of picking the correct result from the candidates is not ideal. Compared with statistical features, the position and sequence information of strokes is better at distinguishing similar characters. Nevertheless, strokes are sensitive to interference and are not easy to extract. This thesis focuses on the problem of similar character recognition: linear strokes are extracted first, and the stroke information is then used to distinguish similar characters. The main contributions are as follows:

(1) Skeleton distortions make stroke extraction difficult. To solve this problem, a complex area detection method is proposed to locate the intersections and junctions of strokes. The method uses a traditional thinning algorithm to obtain the original skeleton. Points on the skeleton are classified into three types: end points, common points and complex points. Complex areas are extracted by scanning connected complex points with an 8-neighbour window. By locating and deleting complex areas, distortions are removed and the skeleton is split into several stroke segments.

(2) Building on the detection of complex areas, a stroke extraction algorithm for handwritten Chinese characters based on the optimum local correlation degree is proposed. Sub-segments are extracted from stroke segments. The local correlation degree is defined from the direction and curvature information between sub-segments. Stroke segments that satisfy the optimum correlation degree conditions are connected by interpolation, and connecting the stroke segments yields the natural strokes. The last step detects abrupt turning points to obtain the final set of linear strokes. With the local correlation degree, most distortions are corrected and the positional relationships among complex strokes are reflected correctly.

(3) After the linear strokes are extracted, a similar character recognition method based on stroke sequence recovery is proposed. Similar characters are grouped and a template library is built. The unknown target is compared with every library character in its group. During the comparison, the stroke sequence of the unknown target is recovered according to the standard stroke information in the library. The strokes are then converted into a time sequence curve. An improved segmented DTW (dynamic time warping) algorithm is designed to calculate the optimal cumulative distance between curves, and this distance is used to distinguish similar characters.

The HCL2000 database is used in the similar character recognition experiments. Fifteen groups of similar characters are selected from the database. The recognition accuracy is 93.45%, and it stabilizes with fewer than 30 training samples. The experimental results show that the stroke sequence recovery method is good at capturing the differences among similar characters, while the recognition method based on the improved segmented DTW algorithm keeps the training cost low.
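The point classification and complex area grouping described in contribution (1) can be sketched roughly as follows. The abstract does not give the exact classification rule, so this sketch assumes that a skeleton pixel is labelled only by its number of 8-neighbours (at most one: end point, exactly two: common point, three or more: complex point). The function names `classify_points` and `complex_areas` are hypothetical, and the thinning step is taken as already done.

```python
from collections import deque

# Offsets of the 8 neighbours around a pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def classify_points(skeleton):
    """Label every skeleton pixel as 'end', 'common' or 'complex'.

    skeleton: list of equal-length rows holding 0 (background) or 1 (skeleton).
    Assumption (not from the thesis): the label depends only on how many
    of the 8 neighbours are skeleton pixels.
    """
    rows, cols = len(skeleton), len(skeleton[0])
    labels = {}
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            count = sum(
                1 for dr, dc in NEIGHBOURS
                if 0 <= r + dr < rows and 0 <= c + dc < cols
                and skeleton[r + dr][c + dc]
            )
            if count <= 1:
                labels[(r, c)] = "end"
            elif count == 2:
                labels[(r, c)] = "common"
            else:
                labels[(r, c)] = "complex"
    return labels


def complex_areas(labels):
    """Group 8-connected complex points into complex areas (breadth-first)."""
    complex_points = {p for p, kind in labels.items() if kind == "complex"}
    seen = set()
    areas = []
    for start in sorted(complex_points):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        area = []
        while queue:
            r, c = queue.popleft()
            area.append((r, c))
            for dr, dc in NEIGHBOURS:
                nxt = (r + dr, c + dc)
                if nxt in complex_points and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        areas.append(sorted(area))
    return areas
```

On a small plus-shaped skeleton, the four arm tips come out as end points and the pixels around the crossing form a single complex area. Deleting that area would leave the arms as separate stroke segments, which is the splitting effect the abstract describes.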
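For the curve comparison in contribution (3), the sketch below shows classic dynamic time warping only. The thesis's improved segmented variant is not specified in the abstract and is not reproduced here. It assumes each character has already been reduced to a one-dimensional time sequence of numbers; `dtw_distance` and `closest_template` are hypothetical names, and the absolute difference is an assumed local cost.

```python
def dtw_distance(a, b):
    """Classic DTW cumulative distance between two numeric sequences.

    Assumption: the local cost is the absolute difference of two samples.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative distance aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def closest_template(target, templates):
    """Return the key of the template with the smallest DTW distance.

    templates: dict mapping a character label to its reference sequence.
    """
    return min(templates, key=lambda name: dtw_distance(target, templates[name]))
```

Because DTW lets one sequence stretch against the other, a target that draws the same curve at a different speed still reaches a small cumulative distance to the right template, whereas a template with a different stroke order accumulates a larger one.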
Keywords/Search Tags: Handwritten Chinese character, skeleton distortion, stroke extraction, similar character recognition, dynamic time warping