
Research And Design Of E-commerce Big Data Analysis System Based On Spark

Posted on: 2021-10-03  Degree: Master  Type: Thesis
Country: China  Candidate: J Q Wu  Full Text: PDF
GTID: 2518306017473624  Subject: Computer Science and Technology
Abstract/Summary:
In recent years, big data has become one of the hottest research topics in the Internet industry. In the daily operation of e-commerce platforms, big data analysis is needed to support global, systematic decision-making. However, because they lack sufficient business data and professional data analysis capabilities, small and medium-sized e-commerce companies lag significantly in customizing and adjusting their operating strategies, which severely limits their progress toward standardized and intelligent operation. Against this background, this thesis designs a one-stop, general-purpose big data system that provides data crawling, topic-based big data analysis, strategy customization guidance, and user management for small and medium-sized e-commerce platforms. First, based on the data needs of small and medium-sized e-commerce companies, this thesis integrates several crawler techniques to make up for their data shortfall, completes the pre-processing of the collected data, and builds a crawler tool that supports rule templates, parallel multitask execution, cyclic crawling, and scheduled crawling. The crawler uses a third-party IP pool to get past the collection limit of a single node, and uses XPath and regular expressions to filter out noise, providing stable and reliable base data for analysis. Second, this thesis improves the big data cluster framework and builds a complete computing cluster on the improved framework, which remedies the old architecture's inability to handle high-intensity data analysis tasks and decouples the business components. A newer data warehouse layering approach is adopted and adapted to the system; on this basis, topic analysis of product series is carried out and the potential value of the product data is explored in depth. Finally, this thesis builds a complete visualization web service on the Spring framework, analyzes the product series topics according to business needs, and presents the results with ECharts commercial-grade charts, including area charts, stacked line charts, and bar charts. Following the data middle platform idea, the visualization web tier is deployed independently on a cloud server and is fault-isolated from the computing cluster nodes, which greatly improves the scalability and stability of the system. Comprehensive functional and non-functional tests of each module of the system cluster verified the advantages of the improved data framework design and the process optimization scheme. The system can meet the growing production data requirements and strategic guidance needs of small and medium-sized e-commerce companies, and has high application value.
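The XPath-plus-regular-expression filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the crawler libraries or the page structure, so everything below is an assumption made for illustration: the sample markup, the `item`, `title`, and `price` class names, and the `extract_items` helper are all hypothetical, and the standard library's `xml.etree.ElementTree` (which supports only a subset of XPath) stands in for whatever parser the thesis actually uses.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical, well-formed product listing. A real crawler would fetch this
# over HTTP (possibly through a proxy from an IP pool) before parsing it.
SAMPLE_PAGE = """
<html><body>
  <div class="item"><span class="title">  Wireless Mouse </span><span class="price">Price: 59.90 CNY (hot!)</span></div>
  <div class="ad">Sponsored banner</div>
  <div class="item"><span class="title">USB-C   Cable</span><span class="price">12 CNY</span></div>
</body></html>
"""

# Keeps only the numeric part of a price string and drops surrounding noise.
PRICE_RE = re.compile(r"\d+(?:\.\d+)?")


def extract_items(page: str) -> List[Dict[str, object]]:
    """Select product nodes with XPath, then clean their fields with regex."""
    root = ET.fromstring(page.strip())
    items = []
    # The XPath predicate keeps product nodes and skips the advertisement div.
    for node in root.findall(".//div[@class='item']"):
        title = node.findtext("span[@class='title']", default="")
        price_text = node.findtext("span[@class='price']", default="")
        match = PRICE_RE.search(price_text)
        if match is None:
            continue  # no usable price, treat the record as noise
        items.append({
            "title": " ".join(title.split()),  # collapse stray whitespace
            "price": float(match.group(0)),
        })
    return items


if __name__ == "__main__":
    for item in extract_items(SAMPLE_PAGE):
        print(item)
```

In this sketch the XPath expression handles structural filtering (which nodes count as products), while the regular expression handles textual filtering (which characters inside a field are kept); real pages are rarely well-formed XML, so a production crawler would likely need a tolerant HTML parser instead.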
Keywords/Search Tags: Big data, CDH, Data warehouse, Crawler, E-commerce