Application Of Data Warehouse Technique In Communications

Posted on: 2010-10-02  Degree: Master  Type: Thesis
Country: China  Candidate: M Liu  Full Text: PDF
GTID: 2178360272496636  Subject: Software engineering
Abstract/Summary:
In today's era of rapid information growth, information plays a vital role in the survival and development of enterprises. Because database technology is widely applied and enterprise information systems produce large amounts of data, extracting the information that is useful for decision-making from this data has become a major difficulty for enterprise decision-makers and managers. This calls for technical support from BI tools such as data warehousing and data mining.

The main goal of the project is to establish an operation analysis system without affecting the normal operation of the current business systems of Jilin Mobile Communication Co., Ltd.: to extract, transform and integrate the data in the multiple existing business systems, to use parallel computing to mine the resulting detail data and summary data, and to help DSS analysts and senior decision-makers make strategic decisions for the future development of the enterprise. As its functions are continuously enriched, the data warehouse will develop into the information center and nerve center of the enterprise. Once a comprehensive data warehouse system is built, the telecom enterprise can obtain customer data from many aspects. Based on this information, a "customer demand" management portal can be developed: customer demands are identified in time, suppliers in fields such as retail are then contacted according to those demands, and satisfactory, inexpensive products and services are provided to the customers. On this basis, the telecom operator can develop into an information operator.
Information operation can open up new profit models and allow more profit to be made.

The main technical means applied in the project are the introduction of an ODS (Operational Data Store) layer and the application of parallel database technology. The ODS layer is added as a new layer to the DB-DW architecture, so the influence on the current business systems is kept to a minimum. Because the ODS layer also helps the enterprise complete routine decision-making data analysis and processing, the performance of the current systems is improved to some extent. Parallel database technology is applied in the development and construction of the system, and data partitioning is used to distribute the load evenly across all nodes and to minimize network traffic between nodes, which greatly improves query speed and accelerates decision-making analysis.

The theory and design methods of data warehousing are studied. The requirements analysis work for a data warehouse system is examined and the relevant data models are studied. The construction approaches for data warehouse systems are studied, the advantages and disadvantages of the top-down and bottom-up methods are compared in full, and both methods are used to carry out the data warehouse project of Jilin Mobile Communication Co., Ltd.

The ODS theory is analyzed and studied, and an improvement to it is presented. The improved ODS theory is applied in the construction of the data warehouse system of Jilin Mobile Communication Co., Ltd., with good results. The design of data granularity and data partitioning in the data warehouse is studied and their influence on the performance of the data warehouse system is analyzed; parallel database theory is studied, and the data warehouse system is built on the IBM DB2 EEE parallel database.
The design of parallel database applications is examined in depth and various design methods are analyzed.

The design of the data warehouse project of Jilin Mobile Communication Co., Ltd. is classified along several dimensions. In terms of business, it includes two major analysis themes: major-customer analysis and competitor analysis. In terms of function, it includes OLAP analysis and data mining. In terms of data sources, the data mainly come from the telecom business systems, including the billing system, operation system, customer service system, financial system, settlement system, network management system and other systems.

Because the project involves an especially large amount of data and the emphasis is on the performance of the database in the data warehouse platform, IBM DB2 EEE is used as the data warehouse. DB2 EEE is a fully parallel database with strong computing capability. The system has good scalability. For OLAP analysis, we adopt PowerPlay, the Cognos OLAP analysis tool, which is simple and easy to use, offers strong drill-down, slicing, rotation and interactive graphical analysis capabilities, and lets users conveniently access, explore and analyze the data in multi-dimensional data sources. For data mining, SAS Enterprise Miner is adopted. It is an enterprise-level integrated data mining environment, with powerful functions and high operating efficiency but a high price.

In the ETL part, all ETL programs are developed in-house. For extraction, a general data extraction program based on the SYBASE database is designed. It can efficiently extract data from the production system according to the requirements of the DB2 database and perform some simple data format transformations during extraction.
Data transformation is performed when the high-granularity data are generated and is implemented through table joins. For loading, a general DB2 data loading program is written, which mainly calls IMPORT, the DB2 import utility, to load the data.

The project adopts a three-layer architecture based on ODS: a large amount of detail data is stored in the ODS layer, while high-granularity summary and statistical data are mainly stored in the data warehouse layer, which directly serves the subsequent OLAP analysis and data mining. Because the data warehouse of Jilin Mobile Communication Co., Ltd. holds a massive amount of data, some detail-level tables, such as the GoTone voice call detail list, are split across multiple tables, which allows each table to be processed efficiently and historical data to be deleted quickly. To distribute the load evenly across all nodes and to improve scalability, the DB2 EEE data partitioning method is adopted: one or more suitable partitioning keys are chosen and hashing is used to distribute the data across the multiple database partitions.

The star schema is adopted for the design of the whole system. The fact tables and dimension tables needed for the analysis and the relations between them are determined first, and then the data source of each field in the fact tables is determined, so that the ETL programs can produce the required data smoothly.

The detailed implementation process of the project is as follows. For the general data extraction program of the data warehouse of Jilin Mobile Communication Co., Ltd., the data source is the SYBASE database and the target database is DB2 EEE.
The program mainly implements two functions: data in SYBASE can be extracted according to a time field, and data formats are converted to suit DB2 EEE during extraction. The program first reads basic information from the configuration file, such as the service name, database name, user name, password, table name and extraction conditions. If an error occurs while reading, the relevant error information is reported to the user.

The service name, database name, user name and password that were read are then used to connect to the database, and the program checks whether the connection succeeded. If the connection fails, the program exits and gives the user a clear error message, for example an incorrect service name, a nonexistent database or a wrong password. Initialization requires the structure of the table to be extracted and the extraction conditions; the extraction conditions are read from the configuration file. The main items determined here are the time field used for extraction, the type of the time expression and the corresponding value. If an error occurs in this process, error handling is performed.

Next, the table structure is read. The main idea is to obtain each field's name, type and length from the syscolumns system table. If the program lacks permission to read the table, or the table does not exist, this is treated as an error. The table structure that was read is stored in a data structure.

Finally, the corresponding query statement is generated from the table structure. BCP calls are used to read the data into an in-memory buffer, where field types are processed and datetime values are converted into a form that DB2 recognizes. When the buffer reaches 5000 records, the buffered data are written to the file.
This operation is repeated until all data meeting the conditions have been read out.

Taking the application of data warehouse technology in Jilin Mobile Communication Co., Ltd. as an example, the project describes the design and development process of the system and the implementation of some of its functions. It also summarizes shortcomings that remain to be addressed, such as improving the performance stability of the data warehouse system, implementing a data supplement program based on the database log, and reducing data redundancy, so that the design can be improved in the future.
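
The buffered extraction loop described in the abstract (convert datetime values for DB2, collect records in memory, write them out in batches of 5000) can be illustrated roughly as follows. This is a minimal Python sketch, not the thesis's program: the names `extract` and `to_db2_value` are hypothetical, the rows come from any Python iterable rather than from SYBASE through BCP, and the comma-delimited output and the `YYYY-MM-DD-HH.MM.SS.NNNNNN` timestamp rendering are assumptions about what the DB2 loader would be given.

```python
import csv
import io
from datetime import datetime

FLUSH_EVERY = 5000  # batch size mentioned in the abstract


def to_db2_value(value):
    """Render one field as text; datetimes become DB2-style timestamp strings."""
    if isinstance(value, datetime):
        return value.strftime("%Y-%m-%d-%H.%M.%S.%f")
    return "" if value is None else str(value)


def extract(rows, out, flush_every=FLUSH_EVERY):
    """Write rows to `out` as delimited text, one buffered batch at a time.

    `rows` stands in for the result of the generated query (assumption).
    Returns the number of batches written.
    """
    writer = csv.writer(out, lineterminator="\n")
    buffer, flushes = [], 0
    for row in rows:
        buffer.append([to_db2_value(v) for v in row])
        if len(buffer) >= flush_every:
            writer.writerows(buffer)  # write the full batch to the file
            buffer.clear()
            flushes += 1
    if buffer:  # write the final, partial batch
        writer.writerows(buffer)
        flushes += 1
    return flushes


if __name__ == "__main__":
    sample = [(i, datetime(2010, 10, 2, 8, 30, 0)) for i in range(12000)]
    target = io.StringIO()
    print(extract(sample, target), "batches written")
```

Batching keeps memory use bounded regardless of how many records match the extraction condition, which is presumably why the original program flushes at a fixed record count.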
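
The hash-based data partitioning mentioned in the abstract can also be illustrated with a small sketch. It assumes nothing about DB2 EEE's internal hash function: `zlib.crc32` is used purely as a stand-in, and `partition_for` and `distribution` are hypothetical helper names. The point is only to show how a partitioning key determines which database partition a row lands on, and how the resulting load can be inspected.

```python
import zlib
from collections import Counter


def partition_for(key, n_partitions):
    """Map a partitioning-key value to a partition number.

    Uses CRC32 as a stand-in hash; this is not DB2's algorithm.
    """
    return zlib.crc32(str(key).encode("utf-8")) % n_partitions


def distribution(keys, n_partitions):
    """Count how many rows each partition would receive."""
    return Counter(partition_for(k, n_partitions) for k in keys)


if __name__ == "__main__":
    # Hypothetical subscriber identifiers used as the partitioning key.
    subscriber_ids = [f"1390000{i:04d}" for i in range(10000)]
    for partition, rows in sorted(distribution(subscriber_ids, 4).items()):
        print(partition, rows)
```

A key with many distinct values tends to spread rows across partitions, whereas a key with only a few distinct values concentrates rows on a few nodes, which is why the choice of partitioning key matters for load balance.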
Keywords/Search Tags: Data warehouse, ODS, Operational data store, Data granularity, Data partitioning