Implementation of an ECTL Tool Based on the Transformation Process

Posted on: 2005-01-08    Degree: Master    Type: Thesis
Country: China    Candidate: X H Zhang    Full Text: PDF
GTID: 2168360125450774    Subject: Computer applications
Abstract/Summary:
In recent years, as global industrial competition has intensified, more and more corporations are no longer satisfied with consulting surface-level data. They need to extract the deeper meaning of their data by integrating all of their information sources, so that it can guide management decision-making and analysis. For most corporations, however, the hardest problem is using data in a centralized way, for three reasons: first, data is stored in separate places, which creates isolated information islands; second, storage formats vary widely, which makes them hard to unify; finally, data left over for historical reasons is redundant, stale, and inaccurate, and is not easy to integrate and analyze... Corporations can therefore only stay at the stage of managing data and cannot advance to mining it deeply to distill usable information that drives their development, so a precious information resource is wasted.
Clearly, data source integration is the foundation of the data warehouse, OLAP, and data mining (DM). Its quality affects the validity of the upper-layer uses of the data, and it is also very important for data integration applications. Because data source integration involves very large data volumes, it cannot be completed by hand, which creates the need for a data source integration tool that performs batch processing. But the state of data sources differs greatly from one corporation to another, so how to design a universal tool is the most important issue studied here. The process of data source integration can be divided into four parts: data extraction, cleaning, transformation, and loading. This process is called ECTL. Because cleaning and transformation can be carried out together, many books and publications merge cleaning into transformation and call the process ETL. This is why we call this tool an ECTL tool.
Many existing ECTL products are still immature: they have no data cleaning process and are limited to data conversion between databases. Foreign products are complete in function and integration, but in our country's market they are so complex to apply that professionals who can use them fluently are hard to train. Users in our country therefore depend on project implementation experts who provide packaged services, at a price so high that enterprises cannot bear it. It is thus urgent for our country to produce an ECTL product with complete functions, strong extensibility, easy operation, and a reasonable price, which can meet the data integration requirements of domestic enterprises. Because data sources are diverse and the processing procedures are equally diverse, we abandon the old design approach of fixing the system's process framework. With a fixed framework we cannot enumerate every combination of requirements, so the tool is confined within a fixed functional boundary and falls back into the old pattern: it implements only some functions and is limited whenever it needs to be extended. Our idea is to design a process that can be changed according to the actual case, so that it adapts to all kinds of data source integration requirements and forms a truly universal ECTL tool. To realize this idea, we decompose the data source integration procedure and use a transformation graph to define the implementation process: the nodes of the graph process data, the edges carry data between nodes, and transformation types implement the data handling. In this way, a process whose structure can be defined at any time can realize complex, specific applications.
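The graph-based process definition described above, with nodes that process data and edges that pass data between nodes, might be sketched as follows. This is only an illustrative sketch and not the thesis's implementation: the class name TransformationGraph, the node functions, and the sample rows are all hypothetical, and running the nodes in dependency (topological) order is an assumption about how such a graph could be executed.

```python
from collections import deque
from typing import Callable, Dict, List

Rows = List[dict]


class TransformationGraph:
    """Hypothetical process definition: nodes process rows, edges pass them on."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Callable[[Rows], Rows]] = {}
        self.edges: Dict[str, List[str]] = {}

    def add_node(self, name: str, func: Callable[[Rows], Rows]) -> None:
        self.nodes[name] = func
        self.edges.setdefault(name, [])

    def add_edge(self, src: str, dst: str) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both nodes must be added before connecting them")
        self.edges[src].append(dst)

    def run(self) -> Dict[str, Rows]:
        # Assumed execution strategy: Kahn's algorithm, so every node runs
        # only after all of its upstream nodes have produced their output.
        indegree = {name: 0 for name in self.nodes}
        for dsts in self.edges.values():
            for dst in dsts:
                indegree[dst] += 1
        inputs: Dict[str, Rows] = {name: [] for name in self.nodes}
        ready = deque(name for name, deg in indegree.items() if deg == 0)
        outputs: Dict[str, Rows] = {}
        while ready:
            name = ready.popleft()
            outputs[name] = self.nodes[name](inputs[name])
            for dst in self.edges[name]:
                inputs[dst].extend(outputs[name])
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
        if len(outputs) != len(self.nodes):
            raise ValueError("the process definition contains a cycle")
        return outputs


# Hypothetical node functions, one per ECTL stage.
def extract(_rows: Rows) -> Rows:
    return [{"name": " Li ", "age": "30"}, {"name": "", "age": "x"}]


def clean(rows: Rows) -> Rows:
    # Drop rows with an empty name or a non-numeric age.
    return [r for r in rows if r["name"].strip() and r["age"].isdigit()]


def transform(rows: Rows) -> Rows:
    return [{"name": r["name"].strip(), "age": int(r["age"])} for r in rows]


loaded: Rows = []  # stands in for the target store


def load(rows: Rows) -> Rows:
    loaded.extend(rows)
    return rows


graph = TransformationGraph()
for stage in (extract, clean, transform, load):
    graph.add_node(stage.__name__, stage)
graph.add_edge("extract", "clean")
graph.add_edge("clean", "transform")
graph.add_edge("transform", "load")
result = graph.run()
```

Because the process lives in data (the node and edge tables) rather than in fixed control flow, a different integration job can be expressed by registering other node functions or rewiring the edges without changing the runner.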
When a transformation runs, a process-driven engine executes it, monitors the running state of every node as a whole, and tracks format control through the metadata; a log file is then generated to verify the correctness of the process. In general, this design has the following characteristics: 1. The ECTL procedure is generated from the process definition; 2. ECTL cour...
Keywords/Search Tags:Transformation's