Design And Implementation Of Uyghur Words Automatic Proofreading System

Posted on:2014-01-05

Degree:Master

Type:Thesis

Country:China

Candidate:X G L A B D R X T Ru

Full Text:PDF

GTID:2268330401965994

Subject:Software engineering

Abstract/Summary:

Automatic error detection and correction is an important and attractive research field in natural language processing. The development of modern internet technology has driven rapid growth in the online networks of minority groups, and with the growth of the publishing industry, guaranteeing the correctness of these texts is becoming more and more important. Developing a Uyghur automatic error detection and correction system is therefore very important in minority regions, and it has broad market prospects.

In natural language processing research, error detection and correction for Uyghur text is a new topic of recent years, and some results have accumulated. Traditional error detection and correction techniques for Uyghur script have many limitations: they rely mainly on huge corpora and word-frequency-based statistical methods, and they ignore Uyghur-specific linguistic phenomena. This thesis presents a Uyghur automatic error detection and correction system that takes the unique characteristics of Uyghur text into account and addresses the proofreading problem through Uyghur lexical analysis and analysis of phonetic rules.

The thesis covers the design and implementation of the system. An object-oriented approach is used to support rapid development and ease of maintenance. Drawing on software engineering theory, actual user demand, and a detailed requirements analysis, the Uyghur proofreading system is designed, implemented, and tested. First, the thesis describes the development of automatic text proofreading in China and abroad and, by analyzing Uyghur text errors, puts forward two automatic proofreading techniques. Second, through a thorough analysis of the characteristics of word formation, it establishes Uyghur root and affix databases built on Access. Furthermore, it gives the composition of the Uyghur automatic proofreading system and its key technologies, and discusses the specific methods in detail: stem and affix segmentation, sub-syllable segmentation, assimilation and its processing, sound harmony processing, and best-candidate selection.

Error detection and candidate ranking are very important parts of language analysis. When these rules were applied to texts from the Xinjiang Daily newspaper, recall and precision were 91% and 87%, respectively. The developed Uyghur proofreading system performs well in terms of speed, stability, and other aspects. The methods described in this thesis have promising future applications in machine translation and natural language understanding.
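The root and affix databases described above imply a simple detection idea: a word is accepted when it splits into a known root followed by known affixes, and flagged otherwise. The following is a minimal sketch of that idea only, not the thesis's implementation. The entries in ROOTS and AFFIXES are hypothetical toy data in Latin script, the function names are invented for illustration, and the sketch deliberately leaves out the assimilation and sound harmony processing the abstract mentions, so it would accept a root and affix pair that violates harmony.

```python
# Toy stand-ins for the root and affix databases (hypothetical entries,
# written in Latin script for readability; not taken from the thesis).
ROOTS = {"kitab", "mektep"}
AFFIXES = {"lar", "ler", "da", "de"}


def split_affixes(rest, affixes):
    """Split `rest` into a sequence of known affixes, or return None."""
    if not rest:
        return []
    for cut in range(1, len(rest) + 1):
        head = rest[:cut]
        if head in affixes:
            tail = split_affixes(rest[cut:], affixes)
            if tail is not None:
                return [head] + tail
    return None


def segment(word, roots=ROOTS, affixes=AFFIXES):
    """Return [root, affix, ...] for a decomposable word, else None."""
    for cut in range(len(word), 0, -1):  # try the longest root first
        root = word[:cut]
        if root in roots:
            tail = split_affixes(word[cut:], affixes)
            if tail is not None:
                return [root] + tail
    return None


def is_suspect(word):
    """Flag a word as a possible error when no segmentation exists."""
    return segment(word) is None
```

With this toy data, segment("mekteplerde") splits into the root "mektep" and the affixes "ler" and "de", while is_suspect("kitap") is true because "kitap" is neither a listed root nor a root plus listed affixes.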

Keywords/Search Tags:

Uyghur script, Lexical analysis, Error detection and correction, Minimum edit distance
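The "minimum edit distance" tag suggests that correction candidates are ranked by how few single-character edits separate them from the misspelled word. The abstract does not spell out the procedure, so the following is only a sketch under assumptions: it uses the standard Levenshtein distance, and the function names, the max_distance and limit parameters, and the toy lexicon in the usage note are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b, using two DP rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def suggest(word, lexicon, max_distance=2, limit=3):
    """Return up to `limit` lexicon entries closest to `word`.

    A word already in the lexicon needs no correction, so the result is empty.
    Ties in distance are broken alphabetically (an arbitrary choice here).
    """
    if word in lexicon:
        return []
    scored = [(edit_distance(word, entry), entry) for entry in lexicon]
    scored = [pair for pair in scored if pair[0] <= max_distance]
    scored.sort()
    return [entry for _, entry in scored[:limit]]
```

For example, with the toy lexicon {"kitab", "mektep"}, suggest("kitap", lexicon) returns ["kitab"], since "kitab" is one substitution away and "mektep" is beyond the distance limit. A real system would presumably combine this with frequency or morphological information when selecting the best candidate.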

Related items

1	Research On The Related Problems And Application Of BCH Codes
2	Research On Error-correcting Code And Its Application
3	Packet Error Control System Performance Analysis And The Best Error Correction Code
4	Digital Fingerprint Based On Error-correcting Codes
5	Analysis And Design For Cryptographic Technique Based On Error Correcting Codes
6	Structural Error Detection And Lexical Parsing In Rule Description Method
7	FPGA Design And Implementation Of Spacecraft EDAC System Based On TMS320VC33 Development Platform
8	Identification Key Technology Research Based On Uighur Handwriting Characteristics
9	Research On Information Error-correcting In Wireless Sensor Networks
10	The Application Of Rank Distance Codes To Cryptography