
Design And Implementation Of Uyghur Words Automatic Proofreading System

Posted on: 2014-01-05  Degree: Master  Type: Thesis
Country: China  Candidate: X G L A B D R X T Ru  Full Text: PDF
GTID: 2268330401965994  Subject: Software engineering
Abstract/Summary:
Automatic error detection and correction is an important and attractive research field in natural language processing. The development of modern internet technology has driven rapid growth in the online networks of minority groups, and the publishing industry has grown alongside it, so guaranteeing the correctness of these texts is becoming more and more important. A Uyghur automatic error detection and correction system is therefore very important in minority regions and has broad market prospects.

In natural language processing research, error detection and correction for Uyghur text has become a new research topic in recent years, and some achievements have accumulated. Traditional error detection and correction techniques for Uyghur script have many limitations: they ignore phenomena specific to the Uyghur language and rely mainly on huge corpora and word-frequency-based statistical methods. This thesis presents a Uyghur automatic error detection and correction system. Considering the unique characteristics of Uyghur text, it addresses the proofreading problem through Uyghur lexical analysis and the analysis of phonetic rules.

This thesis covers the design and implementation of the Uyghur automatic error detection and correction system, using an object-oriented approach to facilitate rapid development and maintenance. Drawing on software engineering theory, actual user demand, and a detailed requirements analysis, the Uyghur proofreading system is designed, implemented, and tested. First, the thesis describes the development of automatic text proofreading in China and abroad and, by analyzing Uyghur text errors, puts forward two automatic proofreading techniques.
Second, through a thorough analysis of the characteristics of Uyghur word formation, the thesis establishes Uyghur root and affix databases built on Access. It then presents the composition of the Uyghur automatic text proofreading system and its key technologies, and discusses the specific methods in detail: stem and affix segmentation, sub-syllable segmentation, assimilation and its processing, phonetic harmony processing, and best-candidate selection.

Error detection and candidate ranking are very important parts of language analysis. When these rules were applied to texts from the Xinjiang Daily newspaper, recall and precision were 91% and 87%, respectively. The developed Uyghur proofreading system performs well in system operation, speed, stability, and other aspects. The methods described in this thesis have promising future applications in machine translation and natural language understanding.
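
The abstract names best-candidate selection as a key method, and the keywords list minimum edit distance. The following is a minimal Python sketch of how correction candidates could be ranked by minimum edit distance. It is not the thesis's implementation: the function names `edit_distance` and `rank_candidates`, the `max_distance` threshold, and the Latin-script toy word list are all assumptions made for illustration, and the sketch does not use the root and affix databases or the phonetic rules described above.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning source into target."""
    # previous[j] holds the distance between the source prefix processed
    # so far and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # deleting i source characters yields an empty target
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def rank_candidates(word, lexicon, max_distance=2):
    """Return (distance, entry) pairs for lexicon entries within
    max_distance of word, closest first (ties broken alphabetically).
    max_distance is a hypothetical cutoff, not a value from the thesis."""
    scored = ((edit_distance(word, entry), entry) for entry in lexicon)
    return sorted(pair for pair in scored if pair[0] <= max_distance)


if __name__ == "__main__":
    # Toy lexicon, assumed for illustration only.
    toy_lexicon = ["kitab", "kitabim", "mektep"]
    print(rank_candidates("kitap", toy_lexicon))
```

A real system along the lines the abstract describes would presumably draw candidates from its root and affix databases and combine the distance with language-specific rules before choosing the best candidate; the distance-only ranking here shows just the basic idea.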
Keywords/Search Tags: Uighur scripts, Lexical analysis, Error detection and correcting, Minimum edit distance