Information Retrieval Of Uyghur Language

Posted on: 2013-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Zhao
GTID: 2248330362960613
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the development of society and the progress of science and technology, and in particular the spread of computer technology, a favorable human and technological environment has emerged for the research and application of the Uyghur language, and many new requirements have been placed on traditional language research. Information retrieval for Uyghur is important work for maintaining network information security and preventing the dissemination of harmful information. In a sense, it also plays a role in strengthening national unity and building a harmonious socialist society. Retrieving the information users need efficiently and correctly requires accurate Uyghur word segmentation and stem extraction. The work comprises the following:

1. Uyghur databases are the data foundation of Uyghur stem extraction. Based on a thorough analysis of the characteristics of Uyghur word formation, this paper establishes Uyghur affix databases and stem databases in Access. Managing these databases uniformly allows them to be refined and updated.

2. Uyghur stem extraction. First, text is split into words on the space separator and the words are stored. The paper then combines Forward Maximum Matching and Reverse Maximum Matching to segment each Uyghur word. Finally, the stem and its related information are obtained and the results are saved.

3. A strong and comprehensive background information database is the most powerful support for comprehensive, accurate, and efficient information retrieval. Once the background database is established, the vector space model is used to retrieve Uyghur text matching a query.

In summary, the background database can be improved continuously through self-learning, so that Uyghur information retrieval becomes more accurate and provides data support for future Uyghur studies.
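
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the word lists are small hypothetical Latin-script examples standing in for the Access stem and affix databases, the function names are invented for illustration, the rule for combining the two maximum-matching passes is one plausible choice, and TF-IDF weighting is an assumption, since the abstract names only the vector space model.

```python
import math
from collections import Counter

# Hypothetical toy data; a real system would load these from the
# stem and affix databases described in the abstract.
STEMS = {"kitab", "mektep"}
AFFIXES = {"lar", "ler", "da", "de"}


def forward_max_match(word, stems):
    """Return the longest prefix of `word` that is a known stem, or None."""
    for end in range(len(word), 0, -1):
        if word[:end] in stems:
            return word[:end]
    return None


def reverse_max_match(word, affixes):
    """Repeatedly strip the longest known suffix from the end of `word`.

    Returns (remainder, stripped_suffixes_in_order).
    """
    stripped = []
    while True:
        # Smaller `start` means a longer suffix; start >= 1 keeps the
        # remainder non-empty.
        for start in range(1, len(word)):
            if word[start:] in affixes:
                stripped.insert(0, word[start:])
                word = word[:start]
                break
        else:
            return word, stripped


def extract_stem(word):
    """Combine both passes (assumed rule): prefer a dictionary stem,
    otherwise fall back to what remains after suffix stripping."""
    fwd = forward_max_match(word, STEMS)
    rev, _ = reverse_max_match(word, AFFIXES)
    return fwd if fwd is not None else rev


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors for documents given as lists of stems."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs):
    """Rank documents against a query; returns (score, doc_index) pairs."""
    stemmed = [[extract_stem(w) for w in doc.split()] for doc in docs]
    vectors, idf = tfidf_vectors(stemmed)
    q_counts = Counter(extract_stem(w) for w in query.split())
    q_vec = {t: c * idf.get(t, 0.0) for t, c in q_counts.items()}
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    return sorted(scores, reverse=True)


if __name__ == "__main__":
    print(extract_stem("mekteplerde"))
    print(search("kitab", ["mektepler", "kitablar kitablarda"]))
```

Because both the documents and the query are reduced to stems before vectors are built, inflected forms of the same word contribute to the same vector dimension, which is why accurate stem extraction matters for retrieval quality.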
Keywords/Search Tags:affixation, stem, Access database, vector space model