Offline Arabic Handwriting Identification Using Language Diacritics

Posted on: 2011-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: T F Luo
Full Text: PDF
GTID: 2198330338988507
Subject: Communication and Information System
Abstract/Summary:
Offline handwriting identification is a technique used to identify the writer of a given handwritten document based on writing style. With the continuous expansion of its applications, offline handwriting identification has become a very active research topic in computer vision and pattern recognition. Handwriting identification covers almost all of the typical problems in image processing and pattern recognition, such as preprocessing, feature extraction, and classifier design. The main purpose of this thesis is to study offline Arabic handwriting identification.

After the Latin alphabet, Arabic is the second most widely used alphabet in the world. In addition to the Arabic language, it is used to write many other languages, such as Persian, Urdu, Pashto, Uyghur (in China), and Swahili (East Africa). This wide use, however, has not helped solve the problems of Arabic handwriting identification and recognition, owing to the inherent complexity of the language itself and to attempts to apply methods that succeeded for other languages directly to Arabic.

In this thesis, we take a new direction in handling Arabic handwritten document images: we split the input document by content into two parts, one containing the letters and the other containing the diacritics. The purpose is to take advantage of the simplicity of the diacritics compared with the letters, since diacritics reflect the individuality of the writing style well and are easy to segment.

Using the IFN/ENIT Arabic handwriting database, the system we designed follows the typical handwriting identification system architecture and contains the following components:

1. Preprocessing: The images in the database were already denoised and thresholded, so preprocessing consists only of segmenting the diacritics from the input document image.
2. Feature extraction: A local binary pattern (LBP) histogram is calculated for each diacritic. The histograms of the diacritics written by the same writer are then concatenated into a feature histogram.
3. Classification: We apply two nested K-NN classifiers, one for the diacritic and the other for the writer, with the χ² (chi-square) function as the distance function.

The experiments show that our approach is valid for Arabic handwriting identification. It has several advantages over other proposed methods: it does not require many input samples, and it is dedicated to Arabic, which means that the coexistence of other languages with Arabic in the same input image does not influence the identification result.
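The feature extraction and classification steps can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not specify the LBP variant, the value of K, the exact form of the χ² distance, or how the two K-NN classifiers are nested. The sketch assumes the basic 3x3 LBP operator with a 256-bin histogram, the common form χ²(p, q) = Σ (p_i − q_i)² / (p_i + q_i), and one possible reading of the nesting (a K-NN vote per diacritic, followed by a vote over diacritics to pick the writer). It does not reproduce the concatenated per-writer feature histogram. The function names `lbp_histogram`, `chi_square`, `knn_label`, and `identify_writer`, and the toy 3x3 images, are hypothetical.

```python
from collections import Counter

# Clockwise 8-neighbourhood of a pixel, starting at the top-left neighbour.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 3x3 LBP codes.

    `img` is a 2D list of pixel intensities; border pixels are skipped.
    """
    hist = [0.0] * 256
    count = 0
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # Set the bit when the neighbour is at least as bright as the center.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
            count += 1
    return [h / count for h in hist] if count else hist


def chi_square(p, q):
    """Chi-square distance between two histograms (assumed form, see lead-in)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def knn_label(query_hist, references, k=1):
    """Majority label among the k reference histograms nearest to the query.

    `references` is a list of (writer_label, histogram) pairs.
    """
    ranked = sorted(references, key=lambda ref: chi_square(query_hist, ref[1]))
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


def identify_writer(diacritic_images, references, k=1):
    """Assumed nesting: classify each diacritic, then vote across diacritics."""
    votes = Counter(
        knn_label(lbp_histogram(img), references, k) for img in diacritic_images
    )
    return votes.most_common(1)[0][0]


# Toy data: two 3x3 "diacritics" that produce different LBP codes.
dot_a = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # bright center, dark neighbours
dot_b = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]  # dark center, bright neighbours

references = [
    ("writer_A", lbp_histogram(dot_a)),
    ("writer_B", lbp_histogram(dot_b)),
]
distance_ab = chi_square(lbp_histogram(dot_a), lbp_histogram(dot_b))
predicted = identify_writer([dot_b, dot_b, dot_a], references)
```

In this toy run, two of the three query diacritics match the `writer_B` reference, so the vote across diacritics selects `writer_B`. A real system would first segment the diacritics from the thresholded document image, which this sketch leaves out.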
Keywords/Search Tags: Handwriting Identification, Arabic Diacritics, LBP