
Gzip-U: A Compression Algorithm Technique Research For Uyghur Text

Posted on: 2018-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: L N R Y S F Ai
GTID: 2348330533456558
Subject: Computer technology
Abstract/Summary:
With the rapid growth of information generated by mobile Internet terminals, the volume of data carried by networks has become extremely large, which makes data compression highly valuable. The spread of 4G networks has further accelerated the growth of data shared by mobile clients, and the combination of growing network traffic and limited bandwidth has become a bottleneck for data transmission. As a result, data compression algorithms are receiving increasing attention in massive data processing. Data compression reduces the space that data occupies without losing the effective information, making communication faster and more economical.

This thesis summarizes the importance of data compression in the era of big data and presents the basic concepts and methods of data compression, together with the basic ideas behind lossless text compression and several lossless text compression algorithms. It also analyzes the distribution of the Uyghur alphabet in the Unicode encoding and, based on the current state of Uyghur text compression, presents a method for compressing Uyghur text. To make the core parts of the client development easier to understand, the Xcode development environment, the characteristics of the development language, and the basics of development are briefly introduced.

Building on Gzip coding over the HTTP protocol, the thesis proposes Gzip-U, an improved compression algorithm designed specifically for the Uyghur language. Its main idea is first to split a long string into individual characters and convert each character to its code value from the Unicode table. Each code value is then compared with the prefix 06 (the Uyghur text area, 06). If the code value has the prefix 06, the prefix is removed, and the resulting values are assembled into an integer array. Finally, experiments on the same data show that Gzip-U is more effective than applying Gzip to the original Unicode text, and that the compression ratio is greatly improved.
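The prefix-removal idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The function names, the `ESCAPE` marker, and the 3-byte escape format for characters outside the 06 area are all assumptions made for illustration, since the abstract does not say how such characters are handled. Python's standard `gzip` module stands in for Gzip coding over HTTP.

```python
import gzip

# Hypothetical escape marker (an assumption, not from the thesis): it flags
# a character whose code point does not carry the 06 prefix.
ESCAPE = 0x00


def strip_06_prefix(text: str) -> bytes:
    """Map each character in U+0601..U+06FF to its low byte only."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if 0x0601 <= cp <= 0x06FF:
            # High byte is 06: drop it and keep the low byte.
            out.append(cp & 0xFF)
        else:
            # Any other character (including U+0600, whose low byte would
            # collide with ESCAPE) is stored as marker + 3-byte code point.
            out.append(ESCAPE)
            out += cp.to_bytes(3, "big")
    return bytes(out)


def restore_06_prefix(data: bytes) -> str:
    """Inverse of strip_06_prefix."""
    chars = []
    i = 0
    while i < len(data):
        if data[i] == ESCAPE:
            chars.append(chr(int.from_bytes(data[i + 1:i + 4], "big")))
            i += 4
        else:
            # Re-attach the 06 prefix to the stored low byte.
            chars.append(chr(0x0600 | data[i]))
            i += 1
    return "".join(chars)


def gzip_u_compress(text: str) -> bytes:
    """Strip the 06 prefix, then apply ordinary Gzip."""
    return gzip.compress(strip_06_prefix(text))


def gzip_u_decompress(blob: bytes) -> str:
    """Undo Gzip, then restore the 06 prefix."""
    return restore_06_prefix(gzip.decompress(blob))
```

Under these assumptions, each Uyghur letter in the 06 area occupies one byte before Gzip is applied, instead of the two bytes it takes in UTF-8 or UTF-16. Any gain over plain Gzip depends on the input text and would have to be measured.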
Keywords/Search Tags: Mobile Internet, Uyghur language, data compression, Gzip