Research And Application Of Scene Text Detection And Recognition Based On Multi-channels MSER

Posted on: 2019-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: Z Wang
GTID: 2428330590965945
Subject: Software engineering
Abstract/Summary:
Text information in a scene can facilitate image understanding, and scene text detection and recognition is becoming a hot research field. A scene text detection and recognition scheme consists of text detection, non-text removal, and text recognition. Existing methods have the following problems: (1) The maximally stable extremal region (MSER) algorithm is a robust method for scene text detection, but traditional MSER considers only the grey channel and ignores important color information. (2) For removing non-text MSERs, neural networks are the most widely used method, but a traditional neural network requires fixed-size input, so it does not combine well with the MSER algorithm. (3) Existing text recognition methods mostly focus on single letters and do not consider the relations among the letters within a word.

In view of these problems, this thesis carries out the following research:

(1) Enhanced multi-channel MSER. Blurring and sharpening are applied as preprocessing before MSERs are detected; then, building on traditional MSER, the R, G, B, H, S, I, and Grey channels are introduced. Combining these channels yields more refined scene text detection. Experiments show that the proposed method detects more scene text.

(2) Parallel spatial pyramid pooling CNN classifier (SPP-CNN). For removing non-text MSERs, the main improvements of the proposed method are as follows. Spatial pyramid pooling (SPP) is added to the traditional CNN, which lets the network handle arbitrary-sized images and also improves classification precision. Manually designed features are embedded in the CNN to construct a parallel classifier, which exploits the advantages of both the CNN and the manually designed features.

(3) Scene text recognition model based on a recurrent neural network (RNN). During recognition, detected word regions are treated as sequence data, so an RNN is chosen to predict the word sequence. Long Short-Term Memory (LSTM), which improves the internal recurrent units of the RNN, is introduced to address the vanishing gradient problem. Experiments show that recognition accuracy is improved.
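To illustrate why spatial pyramid pooling removes the fixed-size input requirement, the following is a minimal pure-Python sketch of the pooling step alone. It is not the thesis's SPP-CNN: the function name spatial_pyramid_pool, the pyramid levels (1, 2, 4), and the use of max pooling on a single 2-D feature map are assumptions chosen for illustration, since the abstract does not give these details.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D feature map into a fixed-length vector.

    For each pyramid level n (an assumed choice, not from the thesis),
    the map is divided into an n x n grid of bins and the maximum of
    each bin is kept. The output length is sum(n * n for n in levels),
    independent of the input height and width.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end, so every bin
            # covers at least one row even when height < n.
            top = (i * height) // n
            bottom = ((i + 1) * height + n - 1) // n
            for j in range(n):
                left = (j * width) // n
                right = ((j + 1) * width + n - 1) // n
                pooled.append(
                    max(
                        feature_map[r][c]
                        for r in range(top, bottom)
                        for c in range(left, right)
                    )
                )
    return pooled


if __name__ == "__main__":
    small = [[1, 2], [3, 4]]                       # 2 x 2 region
    large = [[r * 7 + c for c in range(7)] for r in range(5)]  # 5 x 7 region
    # Both regions map to vectors of the same length (1 + 4 + 16 = 21).
    print(len(spatial_pyramid_pool(small)), len(spatial_pyramid_pool(large)))
```

Because candidate MSER regions vary in size, a pooling layer of this kind would let differently sized regions feed the same fixed-width fully connected layers without first being resized.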
Keywords/Search Tags:Scene text detection, Scene text recognition, Multi-channels MSER, Neural network