A Study On Statistics-based Chinese Character Recognition Post-processing

Posted on:2004-12-02

Degree:Master

Type:Thesis

Country:China

Candidate:T Peng

Full Text:PDF

GTID:2168360122461171

Subject:Computer applications

Abstract/Summary:

With the rapid development of computer and network technology, large amounts of everyday text on various media need to be digitized. OCR (Optical Character Recognition) technology emerged to make this more efficient and to reduce manual effort. In recent years, research on Chinese character OCR has made substantial progress, and many commercial recognition systems have entered the market successfully. However, Chinese characters are structurally complex and highly variable, which often limits the recognition rate for individual characters. Raising the recognition rate further by relying on single-character recognition alone is already very difficult. It is therefore necessary to post-process the single-character recognition results using linguistic knowledge and the contextual information of the text.

This thesis introduces the significance of research on Chinese character recognition post-processing and some of its methods, and adopts a statistics-based method to post-process the single-character recognition results. By counting all pairs of adjacent words in the full-year 2000 text of "People's Daily" (about 19,300,000 words), it obtains the probabilistic relationships between Chinese characters. Following the Markov language model, these probabilities are then applied to Chinese character recognition post-processing, which can raise the recognition rate of the whole system to a certain extent.
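The abstract gives no implementation details, so the following is only a minimal sketch of how bigram statistics and a first-order Markov model could be used to re-rank recognizer output. It assumes the recognizer returns a list of candidate characters for each position. The function names (`train_bigram`, `log_prob`, `best_sequence`), the add-one smoothing, the uniform start score, and the Viterbi search are illustrative assumptions, not the thesis's actual method.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count adjacent character pairs in an iterable of sentences."""
    contexts, pairs, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        vocab.update(sentence)
        contexts.update(sentence[:-1])             # characters that have a successor
        pairs.update(zip(sentence, sentence[1:]))  # adjacent character pairs
    return contexts, pairs, max(len(vocab), 1)


def log_prob(prev, cur, model):
    """Add-one smoothed log P(cur | prev); the smoothing is an assumption."""
    contexts, pairs, vocab_size = model
    return math.log((pairs[(prev, cur)] + 1) / (contexts[prev] + vocab_size))


def best_sequence(candidates, model):
    """Viterbi search over non-empty per-position candidate lists."""
    if not candidates:
        return ""
    # Uniform start score: a simplification, no start probabilities are modeled.
    scores = {c: 0.0 for c in candidates[0]}
    history = []
    for position in candidates[1:]:
        new_scores, pointers = {}, {}
        for cur in position:
            # Best predecessor for this candidate under the bigram model.
            prev = max(scores, key=lambda p: scores[p] + log_prob(p, cur, model))
            new_scores[cur] = scores[prev] + log_prob(prev, cur, model)
            pointers[cur] = prev
        history.append(pointers)
        scores = new_scores
    # Backtrack from the best final candidate.
    path = [max(scores, key=scores.get)]
    for pointers in reversed(history):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))
```

As a usage illustration, a model trained on a toy corpus such as `["今天天气很好", "天气预报", "今天下雨"]` could be passed to `best_sequence` together with candidate lists like `[["令", "今"], ["天", "夫"], ["天", "大"], ["气", "汽"]]`. A real system would presumably also weight each candidate by the recognizer's own confidence, which this sketch leaves out.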

Keywords/Search Tags:

Chinese character recognition, Post-processing, Text counting, Probability of adjacent word pairs, Markov model

Related items

1	OCR Error Post-correction Based On Chinese Character-level Features And Language Model
2	Bank Check Print Chinese Character String Identifying The China-africa Amount
3	Research And Implementation Of Express List Recognition Based On OCR
4	Natural Language Processing Of Chinese Text Automatic Proofreading
5	Research On Character Level Chinese Scene Text Detection And Recognition Based On Deep Learning
6	Research On Chinese Character Recognition Method Based On Deep Learning
7	Research On Hidden Markov Model And Its Application To Image Recognition
8	Chinese Postal Address Recognition
9	The Research On Attention-based Chinese Text Recognition
10	The Design Of A Dynamic Text Classifier Based On VSM