Design And Implementation Of Language LP For Log Parser

Posted on: 2013-11-15  Degree: Master  Type: Thesis
Country: China  Candidate: J L Liu  Full Text: PDF
GTID: 2268330392969554  Subject: Software engineering
Abstract/Summary:
With the growing number of people using the Internet and network applications, global Internet technology has been developing steadily toward cloud computing. Cloud computing is growing rapidly because the volume of Internet data is growing fast. The amount of data produced annually by the entire Internet is about 800 million TB. Fast processing and analysis of this data has become key to the survival of Internet companies.

As the most heavily used Internet applications, search engines generate massive log data on their servers. The log data is usually semi-structured text with a somewhat regular structure. Only after the semi-structured data is transformed into structured data and stored in a distributed storage system (such as Tencent's XFS) or a database system (such as Tencent's XCUBE) does the data become usable and useful, and data mining can then produce better results.

This thesis studies the design of a script language that can parse (ETL) log data more simply and efficiently. The script language avoids the need to write tedious conversion source code, which reduces code maintenance cost. To account for the distributed processing of huge amounts of data, machines can be added to scale processing in parallel without changing the ETL code.

LP (Log Parser) is a DSL (Domain Specific Language). The purposes of designing LP include the following:
(1) Supporting basic language features, including lexical analysis, syntax, semantics, dynamically loaded plug-in libraries, and code optimization. The LP language focuses on the extraction, transformation, and cleaning of log data.
(2) The design and implementation of library functions that provide ETL functionality. These functions are designed and implemented for efficiency.
(3) The design and implementation of a distributed compiler based on cloud computing. It supports ETL over huge amounts of data.
Keywords/Search Tags: DSL, script language, log parser, ETL, distributed system
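
The abstract does not show LP syntax, so the following is only a minimal Python sketch of the extract, transform, and clean steps it describes, and not the thesis's own language or implementation. The log layout, the field names (ts, uid, query, cost_ms), and the functions parse_line and parse_lines are all assumptions made for illustration.

```python
import re
from datetime import datetime
from typing import Iterable, List, Optional

# Hypothetical search-log layout (an assumption, not taken from the thesis):
#   2013-11-15 10:02:33 | uid=42 | query=Log Parser | cost_ms=17
LINE_PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \| "
    r"uid=(?P<uid>\d+) \| "
    r"query=(?P<query>.*?) \| "
    r"cost_ms=(?P<cost_ms>\d+)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Extract fields from one semi-structured line; return None if it is malformed."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None  # cleaning: the line does not follow the expected layout
    try:
        timestamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None  # cleaning: well-formed digits but an impossible date or time
    # Transformation: convert text fields to typed, normalized values.
    return {
        "ts": timestamp,
        "uid": int(match.group("uid")),
        "query": match.group("query").strip().lower(),
        "cost_ms": int(match.group("cost_ms")),
    }


def parse_lines(lines: Iterable[str]) -> List[dict]:
    """Parse many lines and keep only the records that survived cleaning."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record is not None:
            records.append(record)
    return records
```

Each line is parsed independently of the others, which is the property that would let the same parsing logic run over separate chunks of a log on separate machines. Hand-written conversion code of this kind is what the abstract says a dedicated script language is meant to replace.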