With the growth of the Internet, e-commerce has been developing for some 20 years. Academic studies of e-commerce appear in a steady stream, and many of them examine consumer behavior on e-commerce sites. The ability to process huge amounts of data quickly and to analyze them effectively in real time determines whether an enterprise can respond quickly to market changes, make decisions, and thus seize opportunities for development. In this context, extracting valuable information from vast amounts of data as early as possible is itself valuable. SAP HANA (SAP High-Performance Analytic Appliance) emerged to meet this need. This real-time platform offers powerful functions for analyzing, storing and processing large data sets; it can bring out the commercial value of data and help enterprises seize opportunities and make real-time decisions.

Based on the HANA database and the corresponding components installed on HANA, this research uses one year of transaction data, provided by Ponpare, a leading Japanese group-buying website, on the data science competition platform Kaggle, to carry out predictive analysis. The main work of this thesis is as follows:

1. Design the overall system architecture and ensure that all functions can be implemented smoothly in HANA. The architecture mainly comprises the data extraction, data warehouse, and data processing and analysis layers. The data initially stored in an Oracle database serve as the data source, EIM serves as the tool that extracts the data into HANA, and PAL and the R language, both running on HANA, serve as the data preprocessing and analysis tools. Data can move between these components without barriers, which keeps the system continuous.

2. Examine the customers' shopping and browsing information and their personal information, analyze the original coupon information, and preprocess the data. The initial data provided by the website go through data integration, data cleaning, data transformation, data fusion, missing-value filling and numerical normalization so that they can be used for the study. The thesis also describes how the HANA PAL and AFM tools are used to achieve this. Preprocessing the data before data mining improves mining efficiency and reduces the time mining requires.

3. Implement the recommendation system in the HANA database environment, using the R language environment based on HANA. The cbind function combines vectors and matrices into a new matrix, and different weights are assigned to the matrix according to attribute importance. Cosine similarity is then used to measure how closely each user's attributes match each coupon, and the coupons are ranked by this score to obtain the coupon IDs each customer is most likely to buy. Comparing the type and area of the products users actually purchased with those of the recommended products gives an accuracy of more than 80%.

This thesis combines e-commerce data mining, a popular topic in recent years, with SAP HANA, a recently introduced database. Its latest components, Enterprise Information Management (EIM) and the Predictive Analysis Library (PAL), are used to complete data migration, data preprocessing, and predictive analysis.
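The missing-value filling and numerical normalization mentioned in the preprocessing step can be illustrated with a minimal sketch. It is written in plain Python rather than with the HANA PAL functions used in the thesis, and the choice of mean imputation, the function names and the sample prices are all assumptions made for illustration only.

```python
def fill_missing(values):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed) if observed else 0.0
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical coupon prices with one missing entry.
prices = [1000, None, 3000, 5000]
filled = fill_missing(prices)
normalized = min_max_normalize(filled)
```

Normalizing numerical attributes to a common range keeps attributes with large raw values, such as prices, from dominating a later similarity calculation.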
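The weighted cosine-similarity ranking described in the recommendation step can be sketched as follows. This is a plain Python illustration under stated assumptions, not the R code run on HANA in the thesis: the feature layout (two genre flags followed by two area flags), the weight values, the coupon IDs and the function names are all hypothetical.

```python
import math


def apply_weights(vector, weights):
    """Scale each attribute by its importance weight."""
    return [v * w for v, w in zip(vector, weights)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user_vector, coupons, weights, top_n=2):
    """Return the IDs of the top_n coupons most similar to the user."""
    weighted_user = apply_weights(user_vector, weights)
    scored = [
        (cosine_similarity(weighted_user, apply_weights(features, weights)), coupon_id)
        for coupon_id, features in coupons.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [coupon_id for _, coupon_id in scored[:top_n]]


# Hypothetical layout: [genre_food, genre_travel, area_tokyo, area_osaka]
weights = [2.0, 2.0, 1.0, 1.0]  # assumed: genre matters more than area
user = [1, 0, 1, 0]
coupons = {
    "c1": [1, 0, 1, 0],
    "c2": [1, 0, 0, 1],
    "c3": [0, 1, 0, 1],
}
top = recommend(user, coupons, weights)
```

In this sketch the weights are applied to both the user vector and the coupon vectors before the similarity is computed, so a mismatch on a heavily weighted attribute lowers the score more than a mismatch on a lightly weighted one.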