This article is based on the Civil Aviation Authority project on R&D in key technologies for an OCR-based scheduled inspection data collection system for B737 aircraft. The project is one of the prerequisites for achieving digital maintenance in civil aviation. In particular, it addresses the digitization of job card information and so lays the foundation for civil aviation digital maintenance.

The key technology of this project is the post-processing of handwritten Chinese characters, which forms a subsystem of the project mentioned above. Handwritten Chinese character post-processing uses a lexicon and a language model to simulate how a human reader judges wrong or missing characters, in order to correct the character recognition results produced by OCR (Optical Character Recognition). It is mainly used to improve the recognition of the handwritten Chinese characters that appear in scheduled inspection job cards.

The paper first analyzes the basic theory and algorithms of current Chinese character recognition post-processing and adopts a combination of word matching and a language model. Secondly, using a B737 professional dictionary of specialized vocabulary, a word matching method is applied to the specialized-vocabulary portion of the handwritten Chinese characters on the job cards, and the matching results are output. Thirdly, statistics are collected over the Chinese characters in the scheduled inspection job cards to build a statistical lexicon, which is then used for recognition. The principle is to obtain, for each candidate, the conditional probability given by the language model and the probability given by the character recognition result, add the two together in a certain way to form an integrated probability, and output the candidate with the largest integrated probability as the result.
Finally, the post-processed results are compared with the original character recognition results, which verifies the validity of the method.

On the technical side, the system is developed with VC++ and uses ADO to operate an ACCESS database. The lists to be recognized are entered manually or from TXT files and XML documents, and the character recognition results are post-processed by applying professional vocabulary matching and a statistical language model, thereby optimizing the OCR character recognition results.
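
The two post-processing steps summarized in this abstract, lexicon matching against the candidate characters and selection by integrated probability, can be illustrated with a minimal C++ sketch. It is not the project's implementation: the abstract only says the two probabilities are added "in a certain way", so the linear weight `lambda`, the bigram form of the language model, the floor value for unseen pairs, the greedy left-to-right selection, and all type and function names (`Candidate`, `BigramModel`, `termFitsAt`, `postProcess`) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One recognizer candidate for a character position: the character
// (stored as a UTF-8 string) and the recognizer's probability in [0, 1].
struct Candidate {
    std::string ch;
    double ocrProb;
};

// Candidate lists for consecutive character positions on a job card.
using Positions = std::vector<std::vector<Candidate>>;

// A lexicon term, one entry per character (hypothetical representation).
using Term = std::vector<std::string>;

// Hypothetical bigram model: P(current | previous), estimated from job card text.
using BigramModel = std::map<std::pair<std::string, std::string>, double>;

// Lexicon matching: true if every character of `term` appears among the
// candidates of the consecutive positions beginning at `start`.
bool termFitsAt(const Term& term, const Positions& positions, std::size_t start) {
    if (term.empty() || start + term.size() > positions.size()) return false;
    for (std::size_t i = 0; i < term.size(); ++i) {
        bool found = false;
        for (const auto& c : positions[start + i]) {
            if (c.ch == term[i]) { found = true; break; }
        }
        if (!found) return false;
    }
    return true;
}

// Look up P(cur | prev); unseen pairs fall back to a small floor (assumption).
double bigramProb(const BigramModel& lm, const std::string& prev,
                  const std::string& cur, double floorProb = 1e-4) {
    auto it = lm.find(std::make_pair(prev, cur));
    return it == lm.end() ? floorProb : it->second;
}

// For each position, pick the candidate with the largest integrated score
//   lambda * ocrProb + (1 - lambda) * P(cur | prev).
// This is a greedy left-to-right pass, not a full search over all paths.
std::vector<std::string> postProcess(const Positions& positions,
                                     const BigramModel& lm,
                                     double lambda = 0.5) {
    assert(lambda >= 0.0 && lambda <= 1.0);
    std::vector<std::string> out;
    std::string prev = "<s>";  // sentence-start marker
    for (const auto& cands : positions) {
        if (cands.empty()) continue;  // nothing recognized at this position
        const Candidate* best = &cands.front();
        double bestScore = -1.0;
        for (const auto& c : cands) {
            double score = lambda * c.ocrProb
                         + (1.0 - lambda) * bigramProb(lm, prev, c.ch);
            if (score > bestScore) { bestScore = score; best = &c; }
        }
        out.push_back(best->ch);
        prev = best->ch;
    }
    return out;
}
```

In this sketch, `termFitsAt` would be tried first with terms from the professional dictionary, and `postProcess` would handle the remaining text. Setting `lambda` to 1.0 reproduces the raw recognizer ranking, which gives a baseline for the comparison between post-processed and original recognition results.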