In recent years, the domestic insurance industry has continued to grow and competition has intensified. At the same time, as insurance enterprises have deepened their informatization, they have accumulated large volumes of business data, but traditional relational databases can no longer meet the storage and analysis needs of massive historical data. Moreover, as enterprises grow, each department maintains its own data without unified management or planning, so connectivity between datasets is poor. As a result, redundant data development, inefficient data use, and data silos have become increasingly serious. In today's data era, how to effectively integrate the data scattered across business systems and extract valuable information from massive data, so that enterprises can make decisions quickly and maintain their core competitiveness, has become an urgent problem for the operation and development of insurance enterprises. A data warehouse system based on big data technology offers a better solution to this problem. Based on an actual project at an insurance enterprise, this paper describes and analyzes the design and implementation of a Hive-based insurance data warehouse system. Taking the enterprise's actual business requirements and existing problems as the starting point, the paper carries out a detailed requirements analysis for the insurance data warehouse system and, on this basis, designs and implements the system in five areas: data warehouse modeling, data processing, data management, data analysis, and permission management. For data warehouse modeling, subject-area division, business bus matrix design, data layer design, and data model design are completed using dimensional modeling in combination with the company's business. For data processing, Hadoop, Flume, Sqoop, and other big data technologies are used to complete data collection, transformation, and loading, and
synchronize data from the business systems to the Hive data warehouse. For data management, data standard management, metadata management, and data quality management are used to reduce duplicated construction in the data warehouse, improve development efficiency, keep data definitions consistent, and improve data quality. For data analysis, the OLAP engine Impala is used for fast querying and analysis of data in the Hive data warehouse, and Django and ECharts are used for graphical display of the data. For permission management, user management and role management are used to isolate system permissions and ensure data security. Finally, test cases are written and executed to verify that the system's functions meet the requirements. At present, the Hive-based insurance data warehouse system has been formally deployed in the enterprise. It effectively solves problems such as insufficient data processing capacity, non-standard data, and a limited range of data analysis forms; it achieves enterprise-wide data collection and unified management; and it visually displays and analyzes the data in the data warehouse through a variety of charts, providing a basis for business units and senior leaders to manage and make decisions.