
Layout Analysis And Recognition Of Graphic And Mixed Images

Posted on: 2020-09-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z W Zou
Full Text: PDF
GTID: 2428330575455418
Subject: Computer Science and Technology
Abstract/Summary:
Nowadays, more and more users share images that combine text and pictures (hereinafter referred to as "mixed images") with their friends through social software. Mixed images carry a huge amount of information, which makes it difficult for users to obtain the important information in a short time. To help users get as much useful information as possible from mixed images, this dissertation proposes a layout analysis algorithm for mixed images. The algorithm not only quickly distinguishes the text heading area, text body area and image area of a mixed image, but also efficiently recognizes the content of the text heading area and the image area, so that the important information in a mixed image is obtained with low computational complexity. The main contents of this dissertation are as follows.
(1) A connected-domain layout segmentation algorithm based on contour projection is proposed. After the mixed image is preprocessed, the algorithm first expands the whole image into single-character regions based on 8-connectivity. Then, the different regions are roughly divided according to the regularity and periodicity of the waveform in the gray-level histogram obtained by contour projection. Finally, the connected regions are merged by introducing row (column) interval thresholds and image-text interval thresholds, which distinguishes the text heading area, text body area and image area more effectively.
(2) A word recognition algorithm based on multi-level partition is proposed. The title Chinese characters are first normalized to a 36x36 dot matrix. In the coarse partition, the first m Chinese characters that best match a title character are selected from the 7000 Chinese characters in the dictionary library according to absolute distance. In the fine partition, the first n (n<<m) best matches are selected from those m candidates according to Euclidean distance. Finally, the final matching is completed. Partitioning the title Chinese characters coarsely and then finely according to matching degree reduces the computational complexity and improves the recognition efficiency of the algorithm.
(3) Image matching based on the local features of the SIFT algorithm is studied. First, the image scale-space pyramid is constructed to find the extremum points. Then, the stable feature points are determined by screening the extremum points. Finally, the image is matched and recognized according to the local descriptors of the feature points.
A mobile layout analysis system is designed by porting the layout analysis algorithm for mixed images to a mobile device. The system segments the layout of a mixed image, accurately identifies the title area and image area, and pushes the recognition results to users through their mobile phones. Algorithm experiments and system tests demonstrate the effectiveness of the algorithm and the practicability of the system. Figures: 22; tables: 8; references: 63.
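The coarse-then-fine matching in contribution (2) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The abstract does not say which features each stage compares, so the sketch assumes binary 36x36 glyphs, per-block ink counts for the absolute-distance (coarse) stage, and raw pixels for the Euclidean (fine) stage. The names `coarse_features`, `recognize` and `random_glyph`, the block size of 6, and the default values of m and n are all hypothetical.

```python
import math
import random

SIZE = 36  # side of the normalized dot matrix mentioned in the abstract


def l1_distance(a, b):
    """Absolute (city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def coarse_features(glyph, block=6):
    """Assumed coarse feature: ink-pixel count in each block x block cell."""
    return [
        sum(glyph[r][c]
            for r in range(r0, r0 + block)
            for c in range(c0, c0 + block))
        for r0 in range(0, SIZE, block)
        for c0 in range(0, SIZE, block)
    ]


def flatten(glyph):
    """Assumed fine feature: the raw 36*36 pixel vector."""
    return [p for row in glyph for p in row]


def recognize(glyph, dictionary, m=50, n=5):
    """Return the labels of the n best matches, best first.

    dictionary is a list of (label, glyph) pairs. A real system would
    precompute the dictionary features once instead of per query.
    """
    q_coarse = coarse_features(glyph)
    # Coarse stage: keep the m entries closest in absolute distance.
    candidates = sorted(
        dictionary,
        key=lambda e: l1_distance(q_coarse, coarse_features(e[1])),
    )[:m]
    q_fine = flatten(glyph)
    # Fine stage: rank only those m candidates by Euclidean distance.
    best = sorted(
        candidates,
        key=lambda e: l2_distance(q_fine, flatten(e[1])),
    )[:n]
    return [label for label, _ in best]


def random_glyph(rng):
    """Synthetic stand-in for a dictionary glyph (illustration only)."""
    return [[rng.randint(0, 1) for _ in range(SIZE)] for _ in range(SIZE)]


if __name__ == "__main__":
    rng = random.Random(0)
    dictionary = [(f"char{i}", random_glyph(rng)) for i in range(200)]
    # Query: a copy of one dictionary glyph with a few pixels flipped.
    query = [row[:] for row in dictionary[17][1]]
    for k in range(10):
        query[k][k] ^= 1
    print(recognize(query, dictionary, m=20, n=3))
```

The computational saving comes from the coarse stage comparing 36 block counts per dictionary entry instead of 1296 pixels, so the more expensive full comparison runs on only m entries rather than on the whole dictionary.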
Keywords/Search Tags: image mixed text, layout segmentation, word recognition, image matching technology