Research On Text Detection And Recognition Of Laboratory Test Sheets In Natural Scenes

Posted on: 2021-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: Q H Huang
Full Text: PDF
GTID: 2434330614956719
Subject: Computer technology
Abstract/Summary:
In recent years, the development of artificial intelligence has provided strong technical support for intelligent medical care. Intelligent interpretation of laboratory test sheets based on computer vision lets patients understand their own condition promptly, which improves the efficiency of medical treatment and relieves pressure on medical services. The first step in intelligent interpretation is to convert images of laboratory test sheets into structured text data, so text detection and recognition for laboratory test sheets in natural scenes is particularly important. Current methods for this task are not effective enough to meet practical standards. To address this, this thesis combines traditional image processing methods with deep learning methods to study text detection and recognition of laboratory test sheets in natural scenes. The research covers the following three aspects:

1) A new text detection method for natural scenes, BHS-CTPN, is proposed. It addresses two problems: current test sheet text detection methods do not reach practical standards, and regions containing sensitive information cannot be filtered out effectively. First, a series of preprocessing methods, including BRISK, Hough and Sauvola, is introduced to rectify the test sheets, remove sensitive-information regions and enhance the images. Second, the CTPN network model is improved in its convolution kernel settings and anchor settings during feature extraction. Finally, the text box merging strategy is optimized. Compared with the CTPN model, BHS-CTPN improves accuracy, recall and F1 value by 8%, 10% and 9% respectively. Compared with the Huawei API, which has the best results in test sheet text detection, BHS-CTPN improves accuracy, recall and F1 value by 6%, 3% and 5% respectively. Extensive experimental results show that BHS-CTPN detects text positions in laboratory test sheets in natural scenes effectively and accurately, which lays a solid foundation for the subsequent text recognition.

2) An improved CRNN network model is proposed for character recognition of laboratory test sheets in natural scenes. It addresses the problems of varying text box sizes, blurred characters, and frequent misrecognition of similar characters. First, the network is deepened in the feature extraction stage. Second, the text box size is set according to the data distribution. Finally, the convolution kernels are split to optimize the network model. Compared with the CRNN model, the improved CRNN method improves accuracy, recall and F1 value by 7%, 5% and 6% respectively. Compared with the Huawei API, which has the best results in text recognition of laboratory test sheets, accuracy, recall and F1 value improve by 3%, 2% and 3% respectively. Extensive experimental results show that the improved CRNN method recognizes text box sequences accurately, which lays a solid foundation for the subsequent interpretation of laboratory test sheets.

3) A post-processing correction method based on language models is proposed for character recognition of laboratory test sheets in natural scenes. It addresses the confusion between visually similar characters during recognition. First, a statistical language model is introduced to compute conditional probability statistics over the recognition region matrix and predict the recognition result that best conforms to the medical thesaurus. Then, the recognition results are corrected using the preceding and following context, according to the correspondence between test items. Finally, the recognition results are corrected with a method that fuses edit distance and the longest common subsequence. After the post-processing correction method is introduced, accuracy, recall and F1 value increase by 2%, 3% and 2% respectively. Experiments show that the language-model-based post-processing correction method further improves the recognition accuracy of text boxes.

To sum up, the three methods proposed in this thesis effectively address text detection and recognition of laboratory test sheets in natural scenes. They contribute to the intelligent interpretation of laboratory test sheets and promote the development of intelligent medical care.
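
The abstract does not say how edit distance and the longest common subsequence are fused in the final correction step, so the following Python sketch is only an illustration under stated assumptions. It assumes an equal-weight average of two length-normalized scores, and the names `fused_similarity`, `correct`, the `weight` and `threshold` parameters, and the sample lexicon are all hypothetical rather than taken from the thesis.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def fused_similarity(a: str, b: str, weight: float = 0.5) -> float:
    """Assumed fusion: weighted mean of two scores normalized to [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    edit_sim = 1 - edit_distance(a, b) / longest
    lcs_sim = lcs_length(a, b) / longest
    return weight * edit_sim + (1 - weight) * lcs_sim


def correct(text: str, lexicon: list[str], threshold: float = 0.6) -> str:
    """Replace text with the closest lexicon term if it is similar enough."""
    if not lexicon:
        return text
    best = max(lexicon, key=lambda term: fused_similarity(text, term))
    return best if fused_similarity(text, best) >= threshold else text


# Hypothetical example: the recognizer confused the letter "l" with the digit "1".
print(correct("hemog1obin", ["hemoglobin", "glucose"]))
```

In this sketch a recognized string that is not close to any lexicon term is left unchanged, so numeric values and free text are not forced onto a thesaurus entry; the threshold controlling that behavior is an arbitrary choice here.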
Keywords/Search Tags: artificial intelligence, natural scene, laboratory test sheet, BHS-CTPN, CRNN, language model