Discussion On Data Quality Management And Realization Of Data Warehouse

Posted on: 2008-12-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Hou
Full Text: PDF
GTID: 2178360242960097
Subject: Software engineering
Abstract/Summary:
Enterprises depend more and more on data as informatization progresses, and data has become an increasingly important strategic resource. The quality of data directly determines the accuracy of information and affects the survival and competitiveness of the enterprise. In data warehouses and the various kinds of MIS systems, quality control has to be considered during data extraction, data conversion and data merging. A key to this problem is scientific and efficient management and control of data quality: an enterprise-grade scheme for data handling should be set up to guarantee high-quality data warehouses that support managers' decision making. High-quality data should at least meet the following requirements: accuracy, completeness, consistency, currency and reliability, the so-called 'C4R data quality measurements'.

Data quality management maturity models, as shown in the following figure, are usually applied to evaluate the stage of development an enterprise has reached in data quality. The model enables enterprises to identify and quantify their maturity level in data quality management, and helps in decision making. The levels of the model are: Unaware, Reactive, Proactive, Predictive.

1. The framework of the data quality management system.
The framework defines that the requirements of data quality management are led by business needs, and defines the complete process for formulating the management strategy.

2. Formulating the data quality management strategy.
The data quality management strategy is the specific and practical policy, and is the prerequisite for carrying out data quality management work.
The original drive of data quality management: the original drive is the business need.
The data quality management pattern: the pattern is mainly inter-departmental cooperation.
In this pattern, the business department puts forward the data quality management requirements. The administrative department consolidates the data quality requirements from the different departments into a consistent data quality standard. The technical department sets data quality monitoring points in the corresponding application systems and periodically provides data quality evaluation reports to the administrative department to help it evaluate and analyze data quality.
The data quality evaluation mechanism: continuous evaluation is the basis for keeping data quality improving. The evaluation is carried out through meetings convened periodically by the administrative department. Participants should include data specialists from the business department, data quality administrators and data management staff from the technical department. The evaluation aims to produce a clear analysis report on the current state of data quality and to give solutions and suggestions for improving it.

3. The framework of data organizing management.
The data quality management group is at the center of the framework of data quality management. It is the basic working unit for data quality management activities. Groups can be established according to business application topics, and consist of people responsible for business, technology and data quality.

4. The framework of the data quality examination system.
The data quality examination system is made up of an examination execution platform, a management configuration platform and a result presentation platform.
The system works in the following stages:
Stage 0: Initialization. The system performs operations such as importing the data dictionary from the metadata knowledge base and initializing code tables.
Stage 1: Configuring data quality examination tasks. New examination task requests are proposed first; once the requests are approved, the quality examiner develops scripts for the tasks.
Stage 2: Releasing the examination tasks. Scripts that have passed testing are configured by the ETL administrator and deployed to an ETL Automation (NCR task management and dispatching tool) server as scheduled Automation tasks.
Stage 3: Checking the data files by calling file checking programs, which generate reports.
Stage 4: Examining the database. This runs in parallel with the tasks in Stage 3 but in a different way; the detailed results are stored in the database.
Stage 5: Manual re-examination of the file and database examination results. If any problem is found, a data quality problem report is filed.
Stage 6: Generating the standard file examination report from template files. This can be done manually or periodically.

5. The overview of data quality examination implementation.
The examination in the data warehouse mainly consists of two stages.
1st Stage: The data are loaded from the ODS (operational data storage) into temporary storage, and meanwhile examination is performed at the file, record and field levels.
This is the examination in the loading process.
2nd Stage: The data are moved from the temporary storage to the data storage area of the data warehouse (pdata), and are then examined by scheduled examination scripts on specific items, such as the consistency of general ledger and ledger, interests and terms.
After the two-stage examination, data quality and the consistency of business definitions are guaranteed at the business level, and the consistency of files, records and codes is maintained at the technical level, achieving higher data availability.
When exceptional data are encountered during data quality examination, the handling procedure is: after data quality problems are found, revise the data according to the data checking report and reload the revised data to improve data quality. This procedure must be based on the data quality management system. The responsibilities of everyone involved must be clarified, and people must cooperate so that the procedure finishes smoothly and on time. Otherwise the data cannot be revised in time; both the data quality management system and the data quality examination procedure are therefore indispensable.

6. Data quality responsibility is an important way to guarantee data quality.
The data quality management system of CCB sets up a sound data management responsibility system, formulates rules on data quality management, regulates the responsibility of each position, and clearly defines the responsibility at each stage of the data quality management procedure. In the daily work of data quality management, all the rules and regulations should be followed strictly and the responsibility of every stage should be fulfilled. Data responsibility regulations are an important way to guarantee data quality. Once the regulations are neglected, the data quality management system exists in name only.
Only with clear data quality responsibility regulations can enterprises acquire accurate information for decision making and save enormous time and cost in developing new applications and data queries.

In sum, the data quality management system of CCB is a management and control system that involves all parties. The departments involved in data application and management play different roles and have clear-cut responsibilities within the whole system. Integrated cooperation patterns and a corresponding coordination scheme have been formed. On the one hand, the system can satisfy the requirements of fast-developing business; on the other hand, it helps CCB form a unified data quality management capability and a good working pattern for future data application services. The practice of CCB in data quality management and control sets a good example for other enterprises.
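
The two-stage examination outlined in item 5 can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the implementation described in the thesis: the file layout (`account_id`, `currency`, `balance`), the code table `VALID_CURRENCIES`, the reading of "general ledger and ledger" as account totals versus detail entries, and the function names `check_file` and `reconcile` are all hypothetical.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

# Hypothetical file layout and code table; the thesis does not specify them.
EXPECTED_FIELDS = ["account_id", "currency", "balance"]
VALID_CURRENCIES = {"CNY", "USD"}


def check_file(text, expected_rows):
    """1st stage: examine a loaded file at file, record and field level.

    Returns a list of problem descriptions (empty if the file is clean).
    """
    problems = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file: empty"]
    header, records = rows[0], rows[1:]

    # File level: header layout and declared record count.
    if header != EXPECTED_FIELDS:
        problems.append("file: unexpected header")
    if len(records) != expected_rows:
        problems.append(
            f"file: expected {expected_rows} records, found {len(records)}"
        )

    for line_no, record in enumerate(records, start=2):
        # Record level: each record must have the expected number of fields.
        if len(record) != len(EXPECTED_FIELDS):
            problems.append(f"record {line_no}: wrong field count")
            continue
        account_id, currency, balance = record

        # Field level: mandatory value, code-table membership, numeric format.
        if not account_id:
            problems.append(f"field {line_no}: empty account_id")
        if currency not in VALID_CURRENCIES:
            problems.append(f"field {line_no}: unknown currency {currency!r}")
        try:
            Decimal(balance)
        except InvalidOperation:
            problems.append(f"field {line_no}: non-numeric balance {balance!r}")
    return problems


def reconcile(general_ledger, detail_entries):
    """2nd stage: compare general-ledger totals with summed detail entries.

    general_ledger maps account -> Decimal total; detail_entries is an
    iterable of (account, Decimal amount). Returns the sorted accounts
    whose totals disagree or that appear on only one side.
    """
    sums = {}
    for account, amount in detail_entries:
        sums[account] = sums.get(account, Decimal("0")) + amount
    accounts = general_ledger.keys() | sums.keys()
    return sorted(
        acct for acct in accounts if general_ledger.get(acct) != sums.get(acct)
    )
```

In this sketch, the list returned by `check_file` plays the role of the data checking report used to revise and reload the data, and `reconcile` stands in for one scheduled examination script run after the data reach the warehouse storage area. `Decimal` is used instead of floating point so that ledger amounts compare exactly.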
Keywords/Search Tags: Realization