Deep Learning Based Urdu Optical Character Recognition

Posted on:2018-01-16

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Ibrar


GTID:1318330518995981

Subject:Computer Science and Technology

Abstract/Summary:


Owing to unprecedented developments in machine learning, coupled with pattern recognition and computer vision algorithms, very successful Optical Character Recognition (OCR) systems can be seen in every part of daily life.

Optical character recognition is a field of research in pattern recognition, artificial intelligence and computer vision. OCR converts typewritten, handwritten or printed text into machine-encoded text, and its use for recognizing text documents is widely applicable in both research and industry. OCR systems for many languages, such as Mandarin (Chinese), Spanish, English, Arabic, Japanese and Russian, are considerably more accurate and have numerous applications in daily life. However, some Arabic-script languages, such as Urdu, Persian, Pashto, Balochi and Sindhi, still need much more advancement and improvement in the OCR field. All these languages pose difficulties for researchers and developers because of the wide variability of character shapes and the cursive nature of the script. Thus, devising and developing OCR systems for such languages is an uphill task. Although research in this field has gained some momentum over the last decade, the scarcity of resources and researchers remains a problem.

Deep Neural Networks (DNNs) have shown superior performance on classification and recognition tasks. This better performance is primarily due to automatic feature learning: the learned features often lead to better performance than human-engineered features. Moreover, DNNs require neither expertise in feature extraction nor domain knowledge. Furthermore, the same algorithm can be used to extract features from different domains. Autoencoders, stacked autoencoders, Long Short-Term Memory (LSTM) networks and Bidirectional LSTM (BLSTM) networks are forms of DNN that incorporate multi-layered feature processing and learning.

The objective of this thesis is to improve Urdu OCR by using state-of-the-art machine learning techniques.

Firstly, segmentation is improved by putting forward line and ligature segmentation algorithms. These algorithms achieve better accuracy by combining a thresholding method with a curved-line split algorithm and improved allocation of dots and diacritics.

Secondly, recognition of Urdu text is performed at the ligature and line levels using deep learning methods. Autoencoders are employed for ligature feature extraction instead of human-crafted features. For ligature classification, softmax and SVM classifiers are employed at the output layer, achieving an accuracy of 98%. Existing work has successfully employed LSTM networks for context-based Urdu sentence recognition. A Gated BLSTM (GBLSTM) network is introduced and evaluated on the UPTI datasets, where it yields better results than other prevalent OCR techniques. The Gated BLSTM takes sentences labelled at the ligature level as input and, with a softmax output layer, recognizes sentences with 96% accuracy. By contrast, all prevalent context-based sentence recognition contributions rely on character-based LSTMs.
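As an illustration of the thresholding idea behind line segmentation, the following is a minimal sketch of a horizontal projection profile, a common baseline technique. It is not the thesis's algorithm: the curved-line split and the dot/diacritic allocation are not reproduced here, and the function name `segment_lines`, the `min_ink` threshold and the toy page are assumptions introduced only for this example.

```python
import numpy as np

def segment_lines(binary, min_ink=1):
    """Return (start, end) row ranges of text lines in a binarized page.

    binary  : 2-D array where nonzero pixels are ink (hypothetical input format).
    min_ink : rows with fewer ink pixels than this count as inter-line gaps
              (an assumed, hand-picked threshold).
    The end index of each range is exclusive.
    """
    profile = (binary > 0).sum(axis=1)   # number of ink pixels in each row
    is_text = profile >= min_ink         # threshold the projection profile
    lines, start = [], None
    for row, flag in enumerate(is_text):
        if flag and start is None:       # a text line begins
            start = row
        elif not flag and start is not None:  # the text line ends
            lines.append((start, row))
            start = None
    if start is not None:                # line touching the bottom edge
        lines.append((start, len(is_text)))
    return lines

# Toy page: two horizontal bands of ink separated by blank rows.
page = np.zeros((10, 8), dtype=np.uint8)
page[1:3, 2:6] = 1
page[6:9, 1:7] = 1
lines = segment_lines(page)
```

A straight-row profile like this fails when lines overlap vertically or curve, which is common in Nastaleeq text and is the situation a curved-line split step is meant to handle.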

Keywords/Search Tags:

Urdu text segmentation, Nastaleeq script segmentation, line and ligature segmentation, Urdu Nastaleeq ligature recognition, offline printed ligature recognition, Arabic script, Denoising Autoencoder, Deep learning Network, Classification


Related items

1	Urdu Natural Scene Text Recognition And Detection
2	Novel Word Recognition and Word Spotting Systems for Offline Urdu Handwriting
3	Offline Handwritten Arabic Segmentation Algorithm And Multi-queue Grapheme Merging Model
4	Sentiment Analysis Of Roman Urdu Sentence With Deep Neural Networks
5	A segmentation-free approach to text recognition with application to Arabic text
6	Research On Key Technologies Of Text Segmentation
7	Handwritten Chinese Text Recognition Based On Deep Convolution Model
8	Research On Deep-Learning-Based Text Recognition And Document Segmentation And Its Application
9	Segmentation And Recognition Of Images And Videos Based On Deep Learning
10	Writer-independent Unconstrained Handwritten Offline Chinese Text Line Recognition