
Layout Analysis For Ancient Chinese Book Images Based On LOF And Wave Threshold

Posted on: 2021-04-27 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Jia | Full Text: PDF
GTID: 2428330620470582 | Subject: Engineering
Abstract/Summary:
China has a long history and a rich heritage of ancient books. With the rapid development of computer technology, using computers to assist research on ancient books has become one of the best options. Because the layout structure of ancient Chinese book images is complicated, effective and accurate layout analysis is a prerequisite for recognizing and retrieving the Chinese characters in ancient books. This thesis studies ancient Chinese book images from the following two aspects.

(1) Extraction of non-text components of ancient Chinese book images

To address the poor layout-analysis accuracy caused by non-text components such as seals and annotations in ancient book images, a seal positioning method based on an adaptive Canny operator and an annotation extraction method based on Mask R-CNN were designed. For seals, the improved adaptive Canny operator was used to extract the edge contour information of the seal area, and shape parameters were selected to extract seal features, so as to separate the seal from the surrounding Chinese characters. For annotations, first, the labelme image annotation tool was used to label an annotation dataset of ancient Chinese book images. Second, the Mask R-CNN model was applied to segment the annotation images, producing predicted mask images. Then, the influence of ResNet backbones of different depths on the recognition performance and speed of Mask R-CNN was compared, and the optimal network architecture was selected. Finally, the binary k-means algorithm was used to cluster the mask images and extract the annotation components.

(2) A layout analysis method for ancient Chinese book images based on LOF and wave threshold

In view of the diverse layout composition, complex structure, and changeable style of Chinese characters in ancient books, a layout analysis method based on LOF and wave threshold was proposed. Following tilt correction of the images, first, the layout features were summarized by analyzing a large number of ancient book images. Second, an LOF-based classification algorithm was used to classify the layout areas of the projected images and to identify the candidate mixed regions that had segmentation problems. Finally, the wave (fluctuation) threshold was used to segment the parts where text adheres to the frame in the candidate mixed regions, and the text regions of the layout were output.

The experimental dataset consists of 11560 images of ancient books drawn from mainstream literature used in the study of Chinese characters in ancient Chinese books, such as "Wen Yuan for imperial Collection of Four", "Du Gong Bu Ji" and "Biography". Experiments were carried out on the layout analysis system for ancient Chinese book images, and the method was compared with layout analysis methods based on connected domain analysis, neural networks, and eigenvalues. The accuracy and recall of Chinese character image retrieval for ancient books are 87.02% and 81.31% respectively, with higher efficiency, and the main performance measures are better than those of the compared methods. The results show that, compared with similar methods, the proposed layout analysis method can effectively analyze ancient book images and locate the text and non-text areas, thus laying a foundation for the retrieval and recognition of ancient Chinese character images.
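
The LOF step in part (2) can be illustrated with a minimal sketch. The abstract does not say which features the thesis derives from the projection, so the code below assumes one feature per column (its ink-pixel count in a binarized page) and scores columns with the standard Local Outlier Factor definition. The function names, the neighbourhood size `k`, and the `threshold` value are hypothetical choices for illustration, not the author's implementation.

```python
def vertical_projection(binary_image):
    """Count ink pixels (value 1) in each column of a binarized page."""
    return [sum(column) for column in zip(*binary_image)]


def lof_scores(values, k=3, eps=1e-10):
    """Local Outlier Factor of each number in a 1-D list."""
    n = len(values)
    if n <= k:
        raise ValueError("need more than k values")

    # k-distance and k-neighbourhood of every point
    # (points tied at the k-distance are all included).
    k_dist, neighbours = [], []
    for i in range(n):
        dists = sorted((abs(values[i] - values[j]), j) for j in range(n) if j != i)
        kd = dists[k - 1][0]
        k_dist.append(kd)
        neighbours.append([j for d, j in dists if d <= kd])

    # Local reachability density: inverse of the mean reachability distance.
    # eps guards against division by zero when neighbours are exact duplicates.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], abs(values[i] - values[j])) for j in neighbours[i]]
        lrd.append(1.0 / (sum(reach) / len(reach) + eps))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / len(neighbours[i]) / lrd[i]
        for i in range(n)
    ]


def candidate_columns(profile, k=3, threshold=2.0):
    """Indices of columns whose LOF exceeds an assumed threshold."""
    scores = lof_scores(profile, k)
    return [i for i, score in enumerate(scores) if score > threshold]
```

With the profile `[10, 11, 12, 13, 14, 15, 16, 40]`, only the last column (for example a frame line spanning the page height) receives a score far above 1 and is flagged. Real projection profiles contain many identical values, which makes local densities degenerate; a practical version would need a larger `k` or richer features.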
Keywords/Search Tags: Ancient Chinese book images, Layout analysis, Seal positioning, Annotation extraction, LOF, Wave threshold